Abstract

The Bank of England fan chart of inflation visualizes the uncertainty of the bank’s inflation projections. Visualization is a way to tame uncertainty, in the sense that uncertainty is brought under the measure of a probability distribution, in this case a (two-piece) normal distribution. As such, taming is a process of homogenization, that is, a process of translating various heterogeneous items into a common medium. In this case, the common medium is a specific curve, the shape of which is determined by principles of ignorance: one starts with a simple symmetrical and smooth shape and deviates from it only if there is reason to do so. These principles of visualization work epistemologically in the same way that gestalt principles are used to perceptually structure visual information. This article shows that the principles of symmetry, proximity, and smoothness are the underlying heuristics that shape the unknown future into a fan chart.

1 Introduction

Ian Stewart (2019), in his recent book on the history of the mathematics of uncertainty, distinguishes among what he terms six ages of uncertainty. In the first age of uncertainty, humans invented belief systems, in which the uncertainty of nature was seen as the will of gods. The second age of uncertainty is characterized by the rise of science. Science “gave way to the belief that most things would be explicable if we could tease out the underlying laws” and therefore that uncertainty is “merely temporary ignorance” (5). The third age of uncertainty is the age in which the study of uncertainty became a new branch of mathematics, and uncertainty became quantified in terms of probability. Up to this point, all forms of uncertainty reflect human ignorance. In the fourth age of uncertainty, scientists, in particular physicists, discovered that the world “is made from uncertainty” (8). The fifth age of uncertainty emerged when scientists realized that even a deterministic system can be unpredictable, indeed chaotic.

According to Stewart, we have now entered the sixth age of uncertainty, “characterized by the realisation that uncertainty comes in many forms, each being comprehensible to some extent. We now possess an extensive mathematical toolkit to help us make sensible choices in a world that’s still horribly uncertain” (10). He briefly discusses two examples to clarify this age: weather forecasts and the Bank of England’s forecasts of changes to the rate of inflation, presented as fan charts. A fan chart (see Fig. 1) plots the predicted inflation rates over time, not as a single line but as a shaded band. As time passes, the band gets wider, indicating increasing uncertainty. The density of the ink indicates the level of probability: a dark region is more likely than a fainter one. The shaded area covers 90 percent of the probable forecasts.

The fan chart was presented for the first time in the Bank of England Inflation Report of February 1996 (Fig. 1). This report shows two fan charts: one is a projection for twelve-month RPIX inflation, the other for twelve-month RPIY inflation.1 Because forecasts will never be “precisely accurate” (Bank of England 1996: 46), both charts are “designed to illustrate the distribution of possible outcomes of inflation over the next two years” (48). The projection of inflation is not presented as a line but as a “distribution of possible outcomes” because “inflation, like other economic phenomena, is inherently uncertain” (46) for two reasons: first, “the economy is too complex and too rapidly changing for its behaviour to be captured in any fixed set of equations or ‘model’ of the economy,” and second, “inflation is subject to unpredictable shocks, which can vary greatly in size” (46). In fact, the sources of inaccuracy are in principle unlimited; in other words, they can be anything.

The charts are in fact quantitative representations of uncertainty. The central band of figure 1, colored deep red, represents the area that is “judged to [have] about a 10% chance that inflation will be within [it] at any date” (48). The next deepest shade, on both sides of the central band, takes the distribution out to 20 percent, and so on, in steps of 10 percentage points. In other words, the shades are visual expressions of probabilities.

The charts therefore actually visualize two kinds of uncertainty. One is the “central tendency” of the future inflation rate, represented by the imaginary center line of the central band.2 This visualization is based on an economic model that maps choices about economic assumptions onto an inflation forecast. The other kind of uncertainty, represented by the shaded bands around the expected inflation rates, is not based on any economic model, or on any other extrapolation of our economic knowledge. While the first kind of forecast takes “known unknowns” into account, the latter also attempts to account for the “unknown unknowns.”

The inherent uncertainty of economic phenomena (Stewart’s fourth age of uncertainty) came to be acknowledged in the early 1900s and was therefore the subject of writings by economists of the time. Frank Knight’s (1921) distinction between “risk” and nonmeasurable, noncalculable uncertainty, today usually indicated by “Knightian uncertainty,” is the best known. In relation to the Bank of England’s fan chart, however, John Maynard Keynes’s discussion of different kinds of uncertainty is more directly relevant to economic policy:

By “uncertain” knowledge, let me explain, I do not mean merely to distinguish what is known for certain from what is only probable. The game of roulette is not subject, in this sense, to uncertainty: nor is the prospect of a Victory bond being drawn. Or, again, the expectation of life is only slightly uncertain. Even the weather is only moderately uncertain. The sense in which I am using the term is that in which the prospect of a European war is uncertain, or the price of copper and the rate of interest twenty years hence, or the obsolescence of a new invention, or the position of private wealth-owners in the social system in 1970. About these matters there is no scientific basis on which to form any calculable probability whatever. We simply do not know. (Keynes 1937: 213–14)

The necessity for economic policy, however, compels us, as Keynes continues, “to do our best to overlook this awkward fact and to behave exactly as we should if we had behind us a good Benthamite calculation of a series of prospective advantages and disadvantages, each multiplied by its appropriate probability, waiting to be summed” (214).

A contemporary of Keynes, Karl Menger, made almost the same observations three years earlier when discussing the concept of uncertainty in economics:

Quantitative precision, even in a restricted way, for describing the dependence of the evaluation of a change upon its probability and the potential gain can be achieved only in the case of games of chance. In other domains of economic actions, uncertainty also plays a very important role indeed but can rarely be made numerically precise. This is most obvious in the case of general economic and political uncertainty, however important their influence on economic actions may be. If some piece of real estate lends itself only to a special use, say, the development of a luxury hotel or an armament factory, then its evaluation will largely depend upon the evaluator’s views on the economic development of the country or the prospect of war—thus on his views about uncertain circumstances. But even if he can make precise his personal judgement of likelihoods . . . one cannot speak of probability in a stricter sense. ( [1934] 1979: 273)

Notwithstanding these distinctions between risk and probability on the one side and nonmeasurable and noncalculable uncertainty on the other, the Bank of England’s fan chart suggests that uncertainty can be tamed, that is, brought “under the control of natural or social law” (Hacking 1990: 10), by which Ian Hacking meant that it is covered by a statistical distribution. Since 1996 the Bank of England has published graphs that depict the uncertain future as a measurable phenomenon. More precisely, the Bank of England Inflation Report presents the uncertainty of future inflation as a fan of differently shaded red areas, where each area represents 10 percent of the total area in which future inflation could appear. While the sources of uncertainty are various and heterogeneous and in large part unknown, these charts present uncertainty as a homogeneous and measurable phenomenon. Because there are no theories, models, or “Benthamite calculations” available to graph uncertainty, this visualization of ignorance must be based on other heuristic principles. This article will investigate what these heuristic principles are. It will be shown that in visualizations of the unknown, these heuristic principles play a role similar to that of gestalt principles in vision. In the case of the fan charts, these heuristic principles are similar to the gestalt principles of symmetry, proximity, and continuity.

2 The Making of the Fan Charts

To investigate the question of how the Bank of England arrives at a visualization of uncertainty, even at a quantification of it, I will first discuss what the bank itself reports about the procedures behind its published fan charts. These reports show that the bank’s visualization is based on a combination of a specific probability distribution, past mistakes, and expert judgments.

In the first reports in which the fan chart appeared, the probabilities attached to inflation falling within certain ranges were obtained from thirty-eight “outside forecasters” (Bank of England 1996: 47). After the establishment of the Monetary Policy Committee (MPC) in June 1997, its members play the major role in deciding what the probability distribution should be:

Judgment has always been key to the forecast process in the Bank. But whose judgment and whose forecast? A distinctive feature of the Report process prior to May 1997 was the involvement of the Governors and Directors of the Bank in agreeing key assumptions and risks, on the basis of advice from Bank staff. With the advent of the Monetary Policy Committee (MPC), the Report and the forecast represent the views of the MPC members, again aided by advice from Bank staff. (Britton, Fisher, and Whitley 1998: 31)

The fan chart “portrays a probability distribution that approximates to the MPC’s subjective assessment of inflationary pressures evolving through time” (31).

The shape of the distribution, however, is not determined by the MPC. Erik Britton, Paul Fisher, and John Whitley (1998) clarify that the specific form of the distribution—a (two-piece) normal distribution—is chosen in advance (see Fig. 2).3 The MPC’s subjective assessment is limited to the “calibration” of this distribution, derived by deciding what its variance and skewness are. Because the “central tendency” (the center line of the central band) is based on a model, one could ideally use this model to evaluate all possible shocks that might affect the inflation forecast and thus map out this probability distribution. In practice, however, only a limited number of shocks are evaluated. These shocks are used to determine the mode of this distribution. The mode is the most likely outcome and is chosen as a measure of the central tendency because it has the advantage that it is not affected by extreme outcomes and outliers, unlike the mean (32).

The variance of the distribution represents the “degree of uncertainty” (32). The uncertainty is the “subjective assessment” of how likely it is that “the future events will differ from the central view” (32). The initial value of this uncertainty is based on the forecast errors from the previous ten years. Then the MPC is required to form a view as to whether or not this uncertainty is greater or less than in the past. To give an example of such an assessment, in February 2003, the MPC judged that the threat of a military conflict with Iraq added substantially to the risks facing the UK economy and so it temporarily widened the fan charts (Elder et al. 2005: 330): “The unusual uncertainty relating to the duration and impact of a possible war in Iraq has led the Committee to widen the range of possible outcomes” (Bank of England 2003: iii). The skewness represents the “balance of the risks” (32). The assessment of the skewness is based on the MPC’s view on the difference between the mean and the mode of the forecast distribution.

This calibration process is carried out in three meetings between the MPC and the bank’s forecast team that lead to the publication of the report.4 Instead of considering a potentially unlimited number of shocks that might affect the inflation forecast, the MPC focuses only on a selection of “major issues of the day.” At the first meeting, this selection of main issues is determined. Following this meeting, the forecast team maps the decisions of the MPC onto a central projection and uncertainty distribution. A second meeting with the MPC considers this draft forecast. The quantification of the mapping from each “major issue of the day” and the uncertainty assessment are reviewed, new data are incorporated, and changes are requested. A third meeting gives the MPC an opportunity to fine-tune the revised forecast distribution and bring it up to date. The final forecast, published in the report, includes adjustments in response to the arrival of market-related data in the period up to the relevant monthly MPC meeting, and reflects any change in interest rates made by the committee in that last meeting.

The variance of inflation is thus derived from the underlying variances of the shocks that have been considered, using the mapping provided by the econometric model. But instead of taking a weighted sum, where the weights are determined by the model, the past inflation forecast error variance is taken as a starting point and then adjusted upward or downward, based on changes to a limited number of variance assumptions. “By adjusting the basic variances, the forecast variance of inflation is thus changed to match the degree of uncertainty as viewed by the MPC” (Britton, Fisher, and Whitley 1998: 33).
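To make the adjustment step concrete, here is a minimal sketch, in Python, of the general logic as described above: the variance of past forecast errors provides a baseline, which is then scaled up or down according to a judgmental factor. The data, the factor, and the variable names are all hypothetical and are not taken from the Bank's actual procedure.

```python
import numpy as np

# Hypothetical illustration of the calibration logic described above:
# the spread of past forecast errors gives a baseline degree of uncertainty,
# which judgment then widens or narrows. None of these numbers are real.
past_forecast_errors = np.array(
    [0.4, -0.3, 0.6, -0.5, 0.2, -0.1, 0.7, -0.4, 0.3, -0.2]  # percentage points, invented
)
sigma_past = past_forecast_errors.std(ddof=1)  # baseline standard deviation

judgment_factor = 1.2   # >1 widens the fan (e.g., unusual geopolitical risk), <1 narrows it
sigma_forecast = judgment_factor * sigma_past  # calibrated "degree of uncertainty"

print(f"baseline sigma = {sigma_past:.2f}, calibrated sigma = {sigma_forecast:.2f}")
```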

3 Homogenization of Uncertainty

The starting point for the calibration process as described in the previous section is the choice of the probability distribution, which is a two-piece normal distribution. While the calibration of this distribution is clarified in detail in the Bank of England publications, the choice of this distribution is not. For example, Britton, Fisher, and Whitley (1998), when clarifying the forecasting process at the Bank of England, only briefly discuss in an appendix what a two-piece normal is.

To gain a more complete understanding of the epistemic activity that leads to the fan chart, the two-piece normal will be briefly explained before its underlying assumptions are discussed. As the name already indicates, it is a distribution that consists of two pieces, where each piece is one half of a normal distribution, each with a different standard deviation, σ1 and σ2, respectively. The reason for having two different pieces is that the normal distribution is not skewed; it is symmetrical. Skewness is therefore introduced by piecing together two different normal distributions:

 
$$f_X(x; \mu, \sigma_1, \sigma_2) = \begin{cases} C \exp\left\{-\frac{1}{2\sigma_1^2}(x-\mu)^2\right\} & \text{for } -\infty < x \le \mu, \\ C \exp\left\{-\frac{1}{2\sigma_2^2}(x-\mu)^2\right\} & \text{for } \mu < x < \infty, \end{cases}$$

where μ is the mode and C is the normalizing constant such that the two-piece distribution integrates to unity.5 The skewness of this distribution is determined by the difference between the two standard deviations.6 So probabilities on either side of the mode are considered to follow a normal distribution of the same kind, except that the variances on the two sides differ.
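To illustrate how such a distribution translates into the shaded bands of a fan chart, the following Python sketch evaluates and samples a two-piece normal and reads off central bands in steps of 10 percent of probability. The parameter values are invented, and the equal-tail bands used here are only a rough stand-in for the Bank's actual band construction.

```python
import numpy as np

# A two-piece normal: mode mu, standard deviations sigma1 (left) and sigma2 (right).
def two_piece_pdf(x, mu, sigma1, sigma2):
    c = np.sqrt(2.0 / np.pi) / (sigma1 + sigma2)   # normalizing constant (note 5)
    sigma = np.where(x <= mu, sigma1, sigma2)
    return c * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def two_piece_sample(n, mu, sigma1, sigma2, seed=0):
    # Probability mass below the mode is sigma1 / (sigma1 + sigma2);
    # conditional on the side, the draw is a half-normal with that side's sigma.
    rng = np.random.default_rng(seed)
    left = rng.random(n) < sigma1 / (sigma1 + sigma2)
    z = np.abs(rng.standard_normal(n))
    return np.where(left, mu - sigma1 * z, mu + sigma2 * z)

mu, sigma1, sigma2 = 2.5, 0.6, 0.9                 # hypothetical inflation projection (percent)
draws = two_piece_sample(100_000, mu, sigma1, sigma2)

# Central bands in steps of 10 percent of probability, up to the 90 percent fan.
for coverage in (10, 30, 50, 70, 90):
    lo, hi = np.percentile(draws, [50 - coverage / 2, 50 + coverage / 2])
    print(f"central {coverage}% band: {lo:.2f} to {hi:.2f} percent inflation")
```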

The application of the normal distribution is, however, conditioned on certain assumptions about the distributed variable. The implicit assumptions for the application of the normal distribution are the ones that underlie the Central Limit Theorem, that is, the theorem that justifies the use of the normal distribution when no other explicit justifications are offered, such as an underlying mechanism that determines the shape of the distribution.7 The Central Limit Theorem says that “under very general conditions the sum of n independent variables, distributed in whatever form, tends to normality as n tends to infinity” (Kendall and Stuart 1963: 223–24). Therefore, the crucial assumption for the normal distribution to be a good approximation of the probability distribution of uncertainty is that the component variables have the same distribution (“identically distributed”), though “in whatever form,” are independent, and are large in number.
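The theorem itself is easy to illustrate with a small simulation, given here only as a sketch: sums of many independent draws from a deliberately non-normal (here exponential) distribution end up nearly symmetric and bell-shaped.

```python
import numpy as np

# Sums of many independent, identically distributed components look normal,
# whatever the shape of the component distribution (here exponential shocks).
rng = np.random.default_rng(1)
n_components, n_sums = 200, 50_000

components = rng.exponential(scale=1.0, size=(n_sums, n_components))
sums = components.sum(axis=1)

standardized = (sums - sums.mean()) / sums.std()
print("skewness of standardized sums:", round(float((standardized ** 3).mean()), 3))  # near 0
print("excess kurtosis:", round(float((standardized ** 4).mean() - 3), 3))            # near 0
```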

For certain specific conditions, the requirements of independence and identical distribution can be weakened. But because one usually does not know which conditions are met for the various types of sources of uncertainty, one cannot assume that these specific conditions hold. Hence, the application of the normal distribution, whether in one or two pieces, assumes that the different sources of uncertainty, such as the complexity and rapid change of economic behavior and unpredictable shocks varying greatly in size, all have the same distribution, are independent, and are very large in number. Because of the diversity of the nature of these sources, these assumptions are obviously too strong.

However, the normal distribution is not used because the legitimizing assumptions apply; rather, conversely, the normal distribution is used to induce these assumptions. Whenever the normal distribution is used as the model for uncertainty in cases for which it is not known whether its legitimating assumptions apply, these assumptions actually function as homogenizing assumptions. They assume away the heterogeneity of the various sources of uncertainty and assume instead that these sources are identical, independent, and large in number.

To better understand how this homogenization works, it is revealing to see which assumptions Carl Friedrich Gauss used to infer the normal (Gaussian) distribution. The subject of his investigation, however, was not uncertainty but error. Yet, as with the sources of economic uncertainty, we are in principle ignorant about the sources of “random errors.” To arrive at his theory of error, Gauss first discussed the nature of errors. To that end, he distinguished two kinds of errors, “random or irregular errors” and “constant or regular errors.” This distinction was based on the assumed sources of error. A random error “depends on varying circumstances that seem to have no essential connection with the observation itself. . . . Such errors come from the imperfections of our senses and random external causes, as when shimmering air disturbs our fine vision. Many defects in instruments, even the best, fall in this category; e.g., a roughness in the inner part of a level, a lack of absolute rigidity, etc.” (Gauss 1995: 3). These errors are the ones that were to be considered in his theory of error; regular errors were explicitly excluded from it. In other words, the theory of error is an account of those errors about whose sources we are most ignorant.

Taking this concept of randomness as a synonym for ignorance, Gauss assumed that the possible values of these errors, ɛ, have probabilities given by a function φ(ɛ). Gauss noted that a priori he could only make general statements about this function. “In practice we will seldom, if ever, be able to determine φ a priori” (7). These general statements were the following: The function is “zero outside the limits of possible errors while it is positive within those limits. . . . We may assume that positive and negative errors of the same magnitude are equally likely, so that φ(-x) = φx [and] since small errors are more likely to occur than large ones, φx will generally be largest for x = 0 and will decrease continuously with increasing x” (7). He then adopted as postulate that when any number of equally good direct observations of an unknown magnitude are given, the most probable value is their arithmetic mean, and subsequently he proved that the distribution must have the form of what later came to be called the Gaussian, or normal, curve:

 
$$\varphi(\varepsilon) = \frac{h}{\sqrt{\pi}}\, e^{-h^2 \varepsilon^2},$$

for some positive constant h, where h came to be viewed as a measure of precision of observation.
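As a quick symbolic check of this result (mine, not part of Gauss's derivation), one can verify that the curve integrates to one and that its variance is 1/(2h²), so that a larger h indeed corresponds to a more precise observation:

```python
import sympy as sp

# Verify normalization and variance of phi(eps) = (h / sqrt(pi)) * exp(-h**2 * eps**2).
h = sp.symbols('h', positive=True)
eps = sp.symbols('epsilon', real=True)
phi = h / sp.sqrt(sp.pi) * sp.exp(-h**2 * eps**2)

total_mass = sp.integrate(phi, (eps, -sp.oo, sp.oo))          # -> 1
variance = sp.integrate(eps**2 * phi, (eps, -sp.oo, sp.oo))   # -> 1/(2*h**2)
print(total_mass, sp.simplify(variance))
```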

Thus, the derivation of the bell-shaped curve was based on four assumptions:

  1. The curve is zero outside the limits of possible errors while it is positive within those limits.

  2. Positive and negative errors of the same magnitude are equally likely, so that φ(-x) = φx.

  3. Since small errors are more likely to occur than large ones, φx will generally be largest for x = 0 and will decrease continuously with increasing x.

  4. The distribution of errors is represented by a curve, that is, by a smooth line.

The first assumption was actually not used in the derivation, which can be seen from the fact that the bell-shaped curve does not show any such limits; the curve is positive over the entire, infinitely wide domain of possible events, and this assumption will therefore not be discussed further. The second is a symmetry assumption, the third a clustering assumption, and the fourth a smoothness assumption.

The symmetry assumption used for errors and uncertainty is actually a principle of insufficient reason, or the principle of indifference, as Keynes ([1921] 1973) called it, or the principle of the equal distribution of our ignorance, as Norwood Russell Hanson (1969) called it.

In order that numerical measurement [of probability] may be possible, we must be given a number of equally probable alternatives. The discovery of a rule, by which equiprobability could be established, was, therefore, essential. A rule, adequate to the purpose, introduced by James Bernouilli, who was the real founder of mathematical probability, has been widely adopted, generally under the title of the principle of non-sufficient reason, down to the present time. (Keynes [1921] 1973: 44)

The basic idea of this principle is that a number of possibilities are equally probable when we know no reason why one should occur rather than any of the others. It is a principle of ignorance that is also applied in Gauss’s theory of error, from which he inferred the normal distribution.

Gauss’s three assumptions about the nature of random errors enable the homogenization of these errors. Despite the variety and differences of their sources, they could be covered by one shape. These assumptions enable the structuring and ordering of heterogeneous materials. As such, they function in a way similar to the gestalt principles that enable us to see shapes and forms in the continuous torrent of light stimuli that hits our eyes. Therefore, homogenization will be explored further with the aid of Gestalt theory.

4 Gestalt Theory and Epistemology

Gestalt psychologists study perceptual organization: “how all the bits and pieces of visual information are structured into larger units of perceived objects and their interrelations” (Palmer 1999: 255). A naive realist explanation of this organization could be that it simply reflects the structure of the external world. A problem with this explanation is that the visual system does not have direct access to how the environment is structured; it only has access to the image projected onto the retina, the “array of light that falls on the retinal mosaic” (257). This optic array allows for an infinite variety of possible organizations. The question therefore is how the visual system picks out one of them.

To answer this question Max Wertheimer, one of the founders of Gestalt psychology, studied the stimulus factors that affect perceptual grouping: “how various elements in a complex display are perceived as ‘going together’ in one’s perceptual experience” (Palmer 1999: 257). It should be noted that this perceptual experience is spontaneous and effortless. Therefore, it is often assumed that these “groupings” represent real objects and are not results of mental activity. This study resulted in several principles of grouping, such as:

  • Proximity: Items close together are perceptually grouped together.

  • Similarity: Elements of similar form tend to be grouped together.

  • Continuity: Connected or continuous visual elements tend to be grouped.

  • Symmetry: Symmetrical elements are perceived as belonging together.

  • Closure: Closed contours tend to be seen as objects.

  • Relative size: Smaller components of a pattern tend to be perceived as objects.

The theoretical approach of the Gestalt psychologists is that perceptual organization is based on the hypothesis of maximizing simplicity, or equivalently, minimizing complexity. They called this hypothesis the principle of Prägnanz, today also called the minimum principle. It states that the percept will be as good as the prevailing conditions allow. The term good refers to the degree of figural simplicity or regularity; prevailing conditions refers to the structure of the current stimulus image (289). The Gestalt psychologists saw symmetry as a global property with which figural goodness could be analyzed.

In the 1960s Gestalt theory gained interest among philosophers of science, most famously with Norwood Russell Hanson and Thomas Kuhn. Kuhn (1962) used the idea of gestalt switch to explain his idea of paradigm shift, and so implicitly connected the idea of gestalt with his concept of paradigm. Hanson (1958, 1969) employed the concept of gestalt to explain the idea of theory-ladenness of scientific observation, and thereby linked the idea of gestalt with theory. To both, observation in science is not effortless; instead it requires training within a specific scientific discipline. “Indeed, a textbook diagram, in biology, in chemistry, or in physics, is a kind of perceptual scheme. It provides a set that influences one’s visual experience markedly” (Hanson 1969: 167).

This idea of theory-ladenness of scientific observation has been dominant since the 1960s, and is also assumed to apply in research practices where there is not much theory to cover the data, in particular those practices dealing with big data. Typical of this dominant view is the following statement, from a Science article titled “Economics in the Age of Big Data” (Einav and Levin 2014). It should, however, be noted that in economics the spectacle behind the eyes (Hanson’s 1969 terminology) is not a theory but a model:

As data sets become richer and more complex and it is difficult to simply look at the data and visually identify patterns, it becomes increasingly valuable to have stripped-down models to organize one’s thinking about what variables to create, what the relationship between them might be, and what hypotheses to test and experiments to run. Although the point is not usually emphasized, there is a sense that the richer the data, the more important it becomes to have an organizing theory to make any progress. (6)

Although philosophers of science have used Gestalt theory to show the organizing role of theory in observations, the claim of this article is that heuristic rules similar to gestalt principles play the same role when there is no theory available in principle, particularly in cases of errors and uncertainty.

5 Clustering

One of Gauss’s three assumptions for shaping the error function is that small errors are more likely to occur than large ones. This assumption is the implication of the presumption that the measurements cluster around the true value. The faith in this assumption is revealed in the common confusion of “accuracy” and “precision.” In measurement science, accuracy is defined as the “closeness of agreement between a measured quantity value and a true quantity value of the measurand” (JCGM 200 2012: 21); and precision as “closeness of agreement between indications or measured quantity values obtained by replicate measurements on the same or similar objects under specified conditions” (22). That both are commonly confused seems to be the reason that both definitions are accompanied by similar warnings: “the term ‘measurement precision’ should not be used for ‘measurement accuracy’” (21) and “sometimes ‘measurement precision’ is erroneously used to mean measurement accuracy” (22).

Because true values can only be inferred from measurements, while assessing the accuracy of those measurements requires knowing the true values, accuracy is in principle indeterminate. But precision is not. According to JCGM 200 (2012: 22), precision can even be expressed numerically by measures of imprecision, such as standard deviation or variance.

An often used way to show the difference between accuracy and precision is a set of dartboards with different distributions of hits, such as shown in figure 3. To clarify the difference between precision and accuracy, the bull’s-eye represents the true value. But this is a view on truth we never have, for we do not know where the bull’s-eye is located. (If we did, we would not need these measurements.) If one asks students (which I often do when discussing this issue with them) to assess which measurement system they consider most reliable by showing them two different distributions (the right-top and left-bottom ones, but without the underlying dartboards), they always choose the one representing high precision. Stronger clustering is considered to be a positive indication of the reliability of the measurement system.
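The pull of precision can be shown with a toy simulation, offered here only as an illustration of the point and not as data from any real measurement: a biased but tightly clustered set of measurements reads as more “reliable” than an unbiased but scattered one, even though only the latter is accurate.

```python
import numpy as np

# The "true value" is known here only because we simulate it; in real
# measurement it is exactly what we do not have access to.
rng = np.random.default_rng(2)
true_value = 10.0

precise_but_biased = rng.normal(loc=true_value + 0.5, scale=0.05, size=20)  # tight cluster, off target
accurate_but_imprecise = rng.normal(loc=true_value, scale=0.50, size=20)    # scattered around the truth

for name, m in [("precise, inaccurate", precise_but_biased),
                ("imprecise, accurate", accurate_but_imprecise)]:
    print(f"{name:20s}  mean = {m.mean():5.2f}   spread (std) = {m.std(ddof=1):.2f}")
# Judged by spread alone, the biased cluster looks like the more reliable
# measurement system, which is the proximity heuristic at work.
```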

The gestalt principle that seems at work here is the principle of proximity. Items that are close together are grouped together. Proximity could be an indication that the items have something in common, which would explain their closeness. What other reason would they have for clustering around one point than that the point around which the cluster appears represents, for example, the effect of a dominant common cause? The workings of many other tiny causes would then only explain the relatively small variations of these measurements. Unless there is some (additional) information about any systematic bias, there is no reason not to assume that the items are clustered around the truth.

6 Smoothness

Gauss assumed that the distribution should be represented by a function, by which he meant a continuous function: “most causes of random error obey a continuous law” (1995: 5). A continuous function is a function without interruptions and jumps, but also without kinks, “the continuity of the error . . . means that the probability of an error lying between two infinitely close limits x and x + dx is φx.dx” (7). As a result the drawing of a continuous function is a smooth curve.

This assumption is based on the ontological principle that nature is smoothly shaped, or in other words, that the laws of nature are smoothly shaped. This ontological principle legitimizes the heuristic that truth can be found by smoothness, that is to say, smooth curves are assumed to be closer to true patterns, more “accurate,” than irregular curves.8

The epistemic activity that is paired with this ontological principle is pattern recognition. A pattern is usually considered to be an underlying regularity, and the recognition is the epistemic activity of “noise” reduction such that the pattern becomes clear. The basic problem of this activity is to decide what belongs to the pattern and what can be considered as noise, in other words, to draw a line between pattern and noise. If the law (or laws) of the mechanism that is supposed to generate the pattern is known, this knowledge could be used to distinguish between pattern and noise, but such cases are rather rare. In most cases this distinction is based on the judgment of an expert on the phenomenon in question.

This judgment is based on vision, and is about whether the pattern “looks right,” or to be more precise, whether it looks smooth enough. In machine learning, the activity of arriving at this judgment is called regularization.

Regularization is a technique to control smoothness by adding a term to an error function that penalizes irregularity (see Bishop 2006). To clarify how this technique works, we assume the simplest case. Assume that one wishes to infer the pattern of a variable xt, and that the values of xt have to be inferred from a set of available data yt, which are supposed to involve irregularities ɛt, where t denotes time. The pattern of xt is found by identifying a model M for which the data yt function as input and x̂t, the estimate of xt, functions as output:

$$\hat{x}_t = M(y_t, \alpha),$$

where α denotes the set of parameters of the model. x̂t is found by minimizing an error function, which usually takes the form of a sum of squared errors:

 
$$\sum_t \left(\hat{x}_t - y_t\right)^2.$$

The problem with this kind of fitting is that the result is also tuned to the irregularities, which is called overfitting. To control for overfitting, a regularization term, R(yt, α), is added that penalizes irregularity:

 
$$E(\alpha) = \sum_t \left(\hat{x}_t - y_t\right)^2 + \lambda R(y_t, \alpha),$$

where the coefficient λ governs the relative importance of the regularization term compared with the sum-of-squares error term.

The regularization term R can have different forms, depending on the nature of the phenomenon whose pattern one tries to find and on what is already known about it. For example, in actuarial science, where this approach is called graduation, and where one is quite ignorant about the relevant laws, the regularization term is a measure of smoothness,

 
$$R = \sum_t \left(\Delta^z \hat{x}_t\right)^2,$$

where Δ^z denotes the z-th order difference operator and z, the degree of smoothness, has to be set in advance. If there is reason to assume that the underlying law has the shape of a polynomial function of the form

 
$$M(x, \alpha) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_m x^m,$$

then the regularization term takes the form of the sum of the squares of all the coefficients α. Here m is the degree of the polynomial, which also has to be set in advance.

While the regularization term is based on knowledge of the phenomenon under study, the λ coefficient has to be determined in another way. Its value can only be decided after one has visualized the model outcome in a graph. Only then can a judgment be made on whether the graph looks like it has the right shape. The heuristic based on smoothness can thus be seen as similar to the gestalt principle of continuity: it is based on a judgment of smoothness, but also of how smooth the curve must be to look right.
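As a concrete sketch of the procedure described in this section, the following Python code implements the difference-based smoothness penalty in its simplest closed form (a Whittaker-style graduation) and varies λ to show where the judgment of “looking right” enters. The data are simulated and the function name is mine; none of this is taken from the cited sources.

```python
import numpy as np

def whittaker_smooth(y, lam, z=2):
    """Minimize sum (x - y)^2 + lam * sum (D x)^2, with D the z-th order difference operator."""
    n = len(y)
    D = np.diff(np.eye(n), n=z, axis=0)          # z-th order difference matrix
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

t = np.linspace(0, 1, 100)
rng = np.random.default_rng(3)
y = np.sin(2 * np.pi * t) + rng.normal(scale=0.3, size=t.size)   # a pattern plus "noise"

for lam in (0.1, 10.0, 1000.0):                  # larger lambda -> smoother curve
    x_hat = whittaker_smooth(y, lam)
    roughness = np.sum(np.diff(x_hat, n=2) ** 2)
    print(f"lambda = {lam:>6}: roughness of fitted curve = {roughness:.4f}")
# Which lambda "looks right" is decided by plotting x_hat against y and judging
# the smoothness of the curve by eye, as described above.
```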

7 Conclusions

The case of the fan charts that visualize uncertainty about future developments illustrates that we deal with ignorance by employing a heuristic that Gestalt psychologists have called the principle of Prägnanz: assume the simplest shape that the conditions allow. It is, however, not only a simplicity principle, as Rudolf Arnheim (1987: 102–5) emphasizes, otherwise “it would dissolve all visual material into complete homogeneity, which is obviously the maximum simplicity available. . . . When the simplicity principle is given sufficiently free play, it will indeed produce shapes as simple as a perfect sphere or a symmetrical pattern.” The principle of simplicity is counterbalanced by the stimuli projected on the retinae, which influence how much the eventual gestalt will depart from a symmetrical pattern. This principle of Prägnanz is equivalent to the principle of insufficient reason in probability theory: assume an equal or symmetric distribution of probability unless there is a reason to depart from that distribution.

Besides symmetry, there are two other simplicity principles at work in this case of visualizing ignorance, namely, smoothness and clustering. Both appear to be based on specific ontological conceptions that legitimize these principles.9 Smoothness is based on the assumption that nature is smoothly structured, and clustering on the assumption that in nature the set of large and dominant causes is very small. As a result smoothness and precision are indicators of truth.

These principles of ignorance enable us to map terra incognita, that is, to arrive at representations, visual or otherwise, of what we do not know, or of which we have at best a theory in its most preliminary stage.

It is these principles that were used to give a shape to uncertainty, which subsequently could be calibrated by the Monetary Policy Committee of the Bank of England. This shape was not given by an economic theory or model. Knowledge, embodied in an economic model and assumptions about future shocks, only determined where the center of the shape should be located. The shape itself is determined by a trade-off between different degrees of ignorance. The more ignorant we are about the future, the more symmetric, clustered, and smooth the visualization will be. Every addition of the MPC’s “subjective assessment” about the future—called calibration—introduces a departure from this initial, most Prägnant gestalt; it introduces complexity.

Acknowledgments

This article was first presented as a paper at the conference Reasoning and Representation with Diagrams: History and Philosophy of Science and Technology in East and West, at the National Tsing Hua University in Taiwan, 24–25 November 2016. I would like to thank the participants for their valuable comments. I also thank Hsiang-Ke Chao and Harro Maas for their encouragement to improve the article’s argument and the two anonymous referees for their constructive remarks.

Notes

1. RPIX is the retail price index excluding mortgage interest payments; RPIY is the RPIX excluding indirect taxes (VAT, local authority taxes, and excise duties).

2. While the Bank of England deliberately does not draw this line in its charts, the current fan charts of most of the central banks present the point estimations of future inflation rates as a line.

3. See section 3 for an explanation of this distribution.

4. The following description of the process is based on Britton, Fisher, and Whitley 1998.

5. $C = \sqrt{2/\pi}\,(\sigma_1 + \sigma_2)^{-1}$.

6. $\sqrt{2/\pi}\,(\sigma_2 - \sigma_1)$.

7. An explicit use of the Central Limit Theorem as justification for the (two-piece) normal distribution can be found in Blix and Sellin 1998 (7), which discusses the subjective adjustment of the two-piece normal distribution in relation to inflation uncertainty, but without any reference to the fan charts: “Since [the two-piece normal distribution] is closely related to the standard Gaussian, central limit theorems can be used as the basic rationale.”

8. See Boumans 2015 and 2016 for detailed accounts of practices in which this heuristic is applied.

9. A similar account has been developed by Hasok Chang (2009) to show that intelligibility consists of “a kind of harmony between our ontological conceptions and our epistemic activities” (64). Chang argues that we need ontological principles to legitimate epistemic activities. For example, the ontological principle of discreteness enables the activity of counting.

References

Arnheim, Rudolf. 1987. “Prägnanz and Its Discontents.” Gestalt Theory 9, no. 2: 102–7.

Bank of England. 1996. Bank of England Inflation Report. February. www.bankofengland.co.uk/inflation-report/1996/february-1996.

Bank of England. 2003. Bank of England Inflation Report. February. www.bankofengland.co.uk/inflation-report/2003/february-2003.

Bishop, Christopher M. 2006. Pattern Recognition and Machine Learning. New York: Springer.

Blix, Mårten, and Sellin, Peter. 1998. “Uncertainty Bands for Inflation Forecasts.” Sveriges Riksbank Working Paper Series 65. Stockholm: Sveriges Riksbank.

Boumans, Marcel. 2015. Science Outside the Laboratory: Measurement in Field Science and Economics. New York: Oxford University Press.

Boumans, Marcel. 2016. “Graph-Based Inductive Reasoning.” Studies in History and Philosophy of Science Part A 59: 1–10.

Britton, Erik, Fisher, Paul, and Whitley, John. 1998. “The Inflation Report Projections: Understanding the Fan Chart.” Bank of England Quarterly Bulletin, 1 March. www.bankofengland.co.uk/quarterly-bulletin/1998/q1/the-inflation-report-projections-understanding-the-fan-chart.

Chang, Hasok. 2009. “Ontological Principles and the Intelligibility of Epistemic Activities.” In Scientific Understanding, edited by de Regt, Henk W., Leonelli, Sabina, and Eigner, Kai, 64–82. Pittsburgh: University of Pittsburgh Press.

Einav, Liran, and Levin, Jonathan. 2014. “Economics in the Age of Big Data.” Science 346, no. 6210. doi.org/10.1126/science.1243089.

Elder, Rob, Kapetanios, George, Taylor, Tim, and Yates, Tony. 2005. “Assessing the MPC’s Fan Charts.” Bank of England Quarterly Bulletin, 26 September. www.bankofengland.co.uk/quarterly-bulletin/2005/q3/assessing-the-mpcs-fan-charts.

Gauss, Carl Friedrich. 1995. Theory of the Combination of Observations Least Subject to Error, translated by Stewart, G. W. Philadelphia: Society for Industrial and Applied Mathematics.

Hacking, Ian. 1990. The Taming of Chance. Cambridge: Cambridge University Press.

Hanson, Norwood Russell. 1958. Patterns of Discovery. London: Cambridge University Press.

Hanson, Norwood Russell. 1969. Perception and Discovery: An Introduction to Scientific Inquiry. San Francisco: Freeman, Cooper, and Company.

Joint Committee for Guides in Metrology. 2012. “VIM—International Vocabulary of Metrology—Basic and General Concepts and Associated Terms.” 3rd ed. www.ceinorme.it/en/normazione-en/vim-en.html.

Kendall, Maurice G., and Stuart, Alan. 1963. Distribution Theory. Vol. 1 of The Advanced Theory of Statistics. 3 vols. London: Charles Griffin and Company.

Keynes, John Maynard. 1937. “The General Theory of Employment.” Quarterly Journal of Economics 51, no. 2: 209–23.

Keynes, John Maynard. (1921) 1973. A Treatise on Probability. London: Macmillan.

Knight, Frank H. 1921. Risk, Uncertainty, and Profit. New York: Houghton Mifflin.

Kuhn, Thomas. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Menger, Karl. (1934) 1979. “The Role of Uncertainty in Economics.” In Selected Papers in Logic and Foundations, Didactics, Economics, translated by Schoellenkopf, W. and Mellon, W. G., 259–78. Dordrecht: Reidel.

Palmer, Stephen E. 1999. Vision Science. Cambridge, MA: MIT Press.

Stewart, Ian. 2019. Do Dice Play God? London: Profile.

Wallis, Kenneth F. 2014. “The Two-Piece Normal, Binormal, or Double Gaussian Distribution: Its Origin and Rediscoveries.” Statistical Science 29, no. 1: 106–12.