LAWS OF ERROR
Section I.—The Law of Error.
98. (1) The Normal Law of Error.—The simplest and best recognized statement of the law of error, often called the "normal law," is the equation

$$z = \frac{1}{\sqrt{\pi}\,c}\, e^{-(x-a)^2/c^2},$$

more conveniently written $(1/\sqrt{\pi}\,c)\exp[-(x-a)^2/c^2]$, where x is the magnitude of an observation or "statistic," z is the proportional frequency of observations measuring x, a is the arithmetic mean of the group (supposed indefinitely¹ multiplied) of similar statistics; c is a constant sometimes called the "modulus"² proper to the group; and the equation signifies that if any large number N of such a group is taken at random, the number of observations between x and x + Δx is (approximately) equal to the right-hand side of the equation multiplied by NΔx. A graphical representation of the corresponding curve, sometimes called the "probability-curve," is here given (fig. 10), showing the general shape of the curve, and how its dimensions vary with the magnitude of the modulus c. The area being constant (viz. unity), the curve is furled up when c is small, spread out when c is large. There is added a table of integrals, corresponding to areas subtended by the curve, in a form suited for calculations of probability, the variable, τ, being the length of the abscissa referred to (divided by) the modulus.³ It may be noted that the points of inflexion in the figure are each at a distance from the origin of 1/√2 × modulus, a distance equal to the square root of the mean square of error, often called the "standard deviation." Another notable value of the abscissa is that which divides the area on either side of the origin into two equal parts, commonly called the "probable error." The value of τ which corresponds to this point is 0.4769. . . .
[Fig. 10: the probability-curve for different values of the modulus c.]
99. An a priori proof of this law was given by Herschel⁴ as follows: "The probability of an error depends solely on its magnitude and not on its direction;" positive and negative errors are equally probable. "Suppose a ball dropped from a given height with the intention that it should fall on a given mark," errors in all directions are equally probable, and errors in perpendicular directions are independent. Accordingly the required law, "which must necessarily be general and apply alike in all cases, since the causes of error are supposed alike unknown,"⁵ is for one dimension of the form $\phi(x^2)$, for two dimensions $\phi(x^2+y^2)$; and $\phi(x^2+y^2) = \phi(x^2)\times\phi(y^2)$; a functional equation of which the solution is the function above written. A reason which satisfied Herschel is entitled to attention, especially if it is endorsed by Thomson and Tait.⁶ But it must be confessed that the claim to universality is not, without some strain of interpretation,⁷ to be reconciled with common experience.
¹ On this conception see below, par. 122.
² E.g. in the article on "Probability" in the 9th ed. of the Ency. Brit.; also by Airy and other authorities. Bravais, in his article "Sur la probabilité des erreurs . . ." Mémoires présentés par divers savants (1846), p. 257, takes as the "modulus or parameter" the inverse square of our c. Doubtless different parameters are suited to different purposes and contexts; c when we consult the common tables, and in connexion with the operator, as below, par. 160; k (= ½c²) when we investigate the formation of the probability-curve out of independent elements (below, par. 104); h (= 1/c²) when we are concerned with weights or precisions (below, par. 134). If one form of the coefficient must be uniformly adhered to, probably σ (= c/√2), for which Professor Pearson expresses a preference, appears the best. It is called by him the "standard deviation."
³ Fuller tables are to be found in many accessible treatises. Burgess's tables in the Trans. of the Edin. Roy. Soc. for 1900 are carried to a high degree of accuracy. Thorndike, in his Mental and Social Measurements, gives, among other useful tables, one referred to the standard deviation as the argument. New tables of the probability integral are given by W. F. Sheppard, Biometrika, ii. 174 seq.
⁴ Edinburgh Review (1850), xcii. 19.
⁵ The italics are in the original. The passage continues: "And
Table of the Values of the Integral $I = \frac{2}{\sqrt{\pi}}\int_0^{\tau} e^{-x^2}\,dx$.

   τ       I         τ       I         τ       I         τ       I
  0.00  0.00000     0.2   0.22270     1.3   0.93401     2.4   0.99931
  0.01  0.01128     0.3   0.32863     1.4   0.95229     2.5   0.99959
  0.02  0.02256     0.4   0.42839     1.5   0.96611     2.6   0.99976
  0.03  0.03384     0.5   0.52050     1.6   0.97635     2.7   0.99986
  0.04  0.04511     0.6   0.60386     1.7   0.98379     2.8   0.99992
  0.05  0.05637     0.7   0.67780     1.8   0.98909     2.9   0.99996
  0.06  0.06762     0.8   0.74210     1.9   0.99279     3.0   0.99998
  0.07  0.07886     0.9   0.79691     2.0   0.99532      ∞    1.00000
  0.08  0.09008     1.0   0.84270     2.1   0.99702
  0.09  0.10128     1.1   0.88020     2.2   0.99814
  0.1   0.11246     1.2   0.91031     2.3   0.99886
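The tabulated integral is the function now commonly called the error function, so the entries can be checked with a few lines of code. The following is a minimal sketch in Python (a language chosen for illustration, not one used by the article); the function names are hypothetical. It also recovers by bisection the argument 0.4769 at which the integral equals one half, the "probable error" expressed in units of the modulus.

```python
import math


def probability_integral(tau: float) -> float:
    """I(tau) = (2/sqrt(pi)) * integral from 0 to tau of exp(-x^2) dx."""
    # This is exactly the error function provided by the standard library.
    return math.erf(tau)


def probable_error_argument(tol: float = 1e-12) -> float:
    """Bisection for the tau at which I(tau) = 1/2."""
    lo, hi = 0.0, 1.0  # I(0) = 0 and I(1) > 1/2, so the root lies between
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if probability_integral(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Print a few rows in the layout of the table: tau, then I to five places.
    for tau in (0.1, 0.5, 1.0, 2.0, 3.0):
        print(f"{tau:4.1f}  {probability_integral(tau):.5f}")
    print(f"probable error at tau = {probable_error_argument():.4f}")
```

Because the argument τ is the abscissa divided by the modulus c, a table referred to the standard deviation would use τ√2 in place of τ.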
100. There is, however, one class of phenomena to which Herschel's reasoning applies without reservation. In a "molecular chaos," such as the received kinetic theory of gases postulates, if a molecule be placed at rest at a given point and the distance which it travels from that point in a given time, driven hither and thither by colliding molecules, is regarded as an "error," it may be presumed that errors in all directions are equally probable and errors in perpendicular directions are independent. It is remarkable that a similar presumption with respect to the velocities of the molecules was employed by Clerk Maxwell, in his first approach to the theory of molecular motion, to establish the law of error in that region.
101. The Laplace-Quetelet Hypothesis.—That presumption has, indeed, not received general assent; and the law of error appears to be better rested on a proof which was originated by Laplace. According to this view, the normal law of error is a first approximation to the frequency with which different values are apt to be assumed by a variable magnitude dependent on a great number of independent variables, each of which assumes different values in random fashion over a limited range, according to a law of error, not in general the normal law, nor in general the same for each variable. The normal law prevails in nature because it often happens (in the world of atoms, in organic and in social life) that things depend on a number of independent agencies. Laplace, indeed, appears to have applied the mathematical principle on which this explanation depends only to examples (of the law of error) artificially generated by the process of taking averages. The merit of accounting for the prevalence of the law in rerum natura belongs rather to Quetelet. He, however, employed too simple a formula⁸ for the action of the causes. The hypothesis seems first to have been stated in all its generality both of mathematical theory and statistical exemplification by Glaisher.⁹
102. The validity of the explanation may best be tested by first (A) deducing the law of error from the condition of numerous independent causes; and (B) showing that the law is fulfilled in a variety of concrete cases, in which the condition is probably present. The condition may be supposed to be perfectly fulfilled in games of chance, or, more generally, sortitions, characterized by the circumstance that we have a knowledge prior to specific experience of the proportion of what Laplace calls favourable cases¹⁰ to all cases; a category which includes, for instance, the distribution of digits obtained by random extracts from mathematical tables, as well as the distribution of the numbers of points on dominoes.
103. The genesis of the law of error is most clearly illustrated by the simplest sort of "game," that in which the sortition is between two alternatives, heads or tails, hearts or not-hearts, or, generally, success or failure, the probability of a success being p and that of a failure q, where p + q = 1. The number of such successes in the course of n trials may be considered as an aggregate made up of n independently varying elements, each of which assumes the values 0 or 1 with respective frequency q and p. The frequency of each value of the
it is on this ignorance, and not on any peculiarity in cases, that the idea of probability in the abstract is formed." Cf. above, par. 6.
⁶ Natural Philosophy, pt. i. art. 391. For other a priori proofs see Czuber, Theorie der Beobachtungsfehler, th. i.
⁷ Cf. note to par. 127.
⁸ He considered the effect as the sum of causes each of which obeys the simplest law of frequency, the symmetrical binomial.
⁹ Memoirs of Astronomical Society (1878), p. 105. Cf. Morgan Crofton, "On the Law of Errors of Observation," Trans. Roy. Soc. (1870), vol. clx. pt. i. p. 178.
¹⁰ Above, par. 2.
aggregate is given by a corresponding term in the expansion of $(q+p)^n$, and by a well-known theorem¹ this term is approximately equal to

$$\frac{1}{\sqrt{2\pi npq}}\, e^{-\nu^2/2npq};$$

where ν is the number of integers by which the term is distant from np (or an integer close to np); provided that ν is of (or <) the order √n. Graphically, let the sortition made for each element be represented by the taking or not taking with respective frequency p and q a step of length i. If a body starting from zero takes successively n such steps, the point at which it will most probably come to a stop is at npi (measured from zero); the probability of its stopping at any neighbouring point within a range of ±√n i is given by the above-written law of frequency, νi being the distance of the stopping-point from npi. Put νi = x and 2npq i² = c²; then the probability may be written $(1/\sqrt{\pi}\,c)\exp(-x^2/c^2)$.
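The closeness of the binomial term to its normal approximation is easily checked numerically. The sketch below, in Python with hypothetical function names, compares the exact term of the expansion of (q + p)^n with the approximate value for a term ν places from np; the particular values of n, p and ν are assumptions chosen only for illustration.

```python
import math


def binomial_term(n: int, p: float, successes: int) -> float:
    """Exact term of (q + p)^n giving the frequency of `successes` successes."""
    return math.comb(n, successes) * p**successes * (1 - p) ** (n - successes)


def normal_approximation(n: int, p: float, nu: int) -> float:
    """Approximate value of the term nu places away from np."""
    npq = n * p * (1 - p)
    return math.exp(-(nu**2) / (2 * npq)) / math.sqrt(2 * math.pi * npq)


if __name__ == "__main__":
    n, p = 1000, 0.3          # illustrative values; np = 300 is an integer
    centre = round(n * p)
    for nu in (0, 5, 10, 20):  # all of order sqrt(n) or less
        exact = binomial_term(n, p, centre + nu)
        approx = normal_approximation(n, p, nu)
        print(f"nu={nu:3d}  exact={exact:.6f}  approx={approx:.6f}")
```

The agreement deteriorates as ν grows beyond the order of √n, which is the proviso stated in the text.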
104. It is a short step, but a difficult one, from this case, in which the element is binomial (heads or tails), to the general case, in which the element has several values, according to the law of frequency; consists, for instance, of the number of points presented by a randomly-thrown die. According to the general theorem, if Q is the sum² of numerous elements, each of which assumes different magnitudes according to a law of frequency, $z = f_r(x)$, the function f being in general different for different elements, the number of times that Q assumes magnitudes between x and x + Δx in the course of N trials is NzΔx, if

$$z = \frac{1}{\sqrt{2\pi k}}\exp\left[-\frac{(x-a)^2}{2k}\right];$$

where a is the sum of the arithmetic means of all the elements, any one of which $a_r = \left[\int x f_r(x)\,dx\right]$, the square brackets denoting that the integrations extend between the extreme limits of the element's range, if the frequency-locus for each element is continuous, it being understood that $\left[\int f_r(x)\,dx\right] = 1$; and k is the sum of the mean squares of error for each element, $= \Sigma\left[\int \xi^2 f_r(a_r+\xi)\,d\xi\right]$, if the frequency-locus for each element is continuous, where $a_r$ is the arithmetic mean of one of the elements, and ξ the deviation of any value assumed by that element from $a_r$, Σ denoting summation over all the elements. When the frequency-locus for the element is not continuous, the integrations which give the arithmetic mean and mean square of error for the element must be replaced by summations. For example, in the case of the dice above instanced, the law of frequency for each element is that it assumes equally often each of the values 1, 2, 3, 4, 5, 6. Thus the arithmetic mean for each element is 3.5, and the mean square of error {(3.5 − 1)² + (3.5 − 2)² + &c.}/6 = 2.916. Accordingly, the sum of the points obtained by tossing a large number, n, of dice at random will assume a particular value x with a frequency which is approximately assigned by the equation

$$z = \frac{1}{\sqrt{5.83\,\pi n}}\exp\left[-\frac{(x-3.5n)^2}{5.83\,n}\right].$$

The rule equally applies to the case in which the elements are not similar; one might be the number of points on a die, another the number of points on a domino, and so on. Graphically, each element is no longer represented by a step which is either null or i, but by a step which may be, with an assigned probability, one or other of several degrees between those limits, the law of frequency and the range of i being different for the different elements.
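The dice example can be verified directly, since the exact frequency of each total for n dice is obtainable by repeated convolution. The following Python sketch (function names hypothetical, n = 50 an assumed illustrative value) compares the exact frequency with the normal law having a = 3.5n and k = n × 35/12, the latter being the mean square of error 2.916… of one die multiplied by n.

```python
import math


def dice_sum_distribution(n: int) -> dict[int, float]:
    """Exact probability of each total shown by n fair dice (by convolution)."""
    dist = {0: 1.0}
    for _ in range(n):
        new: dict[int, float] = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                new[total + face] = new.get(total + face, 0.0) + prob / 6
        dist = new
    return dist


def normal_law(x: float, n: int) -> float:
    """z = exp(-(x - a)^2 / 2k) / sqrt(2 pi k) with a = 3.5 n, k = n * 35/12."""
    a = 3.5 * n
    k = n * 35 / 12  # 35/12 = 2.9166..., the mean square of error of one die
    return math.exp(-((x - a) ** 2) / (2 * k)) / math.sqrt(2 * math.pi * k)


if __name__ == "__main__":
    n = 50
    exact = dice_sum_distribution(n)
    for x in (175, 180, 190, 200):
        print(f"x={x}  exact={exact[x]:.6f}  normal={normal_law(x, n):.6f}")
```

Since the totals are spaced one unit apart, the ordinate z may be compared with the exact probability without any further factor.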
105. Variant Proofs.—The evidence of these statements can only be indicated here. All the proofs which have been offered involve some postulate as to the deviation of the elements from their respective centres of gravity, their "errors." If these errors extended to infinity, it might well happen that the law of error would not be fulfilled by a sum of such elements.³ The necessary and sufficient postulate appears to be that the mean powers of deviation for the elements, the second (above written) and the similarly formed third, fourth, &c., powers (up to some assigned power), should be finite.⁴
106. (1) The proof which seems to flow most directly from this postulate proceeds thus. It is deduced that the mean powers of deviation for the proposed representative curve, the law of error (up to a certain power), differ from the corresponding powers of the actual locus by quantities which are negligible when the number of the elements is large.⁵ But loci which have their mean powers of deviation (up to some certain power) approximately equal may be considered as approximately coincident.⁶
¹ By the use of Stirling's and Bernoulli's theorems, Todhunter, History . . . of Probability.
² The statement includes the case of a linear function, since an element multiplied by a constant is still an element.
³ E.g. if the frequency-locus of each element were $1/\pi(1+x^2)$, extending to infinity in both directions. But extension to infinity would not be fatal, if the form of the element's locus were normal.
⁴ For a fuller exposition and a justification of many of the statements which follow, see the writer's paper on "The Law of Error" in the Camb. Phil. Trans. (1905).
⁵ Loc. cit. pt. i. § 1.
⁶ On this criterion of coincidence see Karl Pearson's paper "On the Systematic Fitting of Curves," Biometrika, vols. i. and ii.

107. (2) The earliest and best-known proof is that which was originated by Laplace and generalized by Poisson.⁷ Some idea of this celebrated theory may be obtained from the following free version, applied to a simple case. The case is that in which all the elements have one and the same locus of frequency, and that locus is symmetrical about the centre of gravity. Let the locus be represented by the equation $\eta = \phi(\xi)$, where the centre of gravity is the origin, and $\phi(+\xi) = \phi(-\xi)$; the construction signifying that the probability of the element having a value ξ (between say ξ − ½Δξ and ξ + ½Δξ) is $\phi(\xi)\Delta\xi$. Square brackets denoting summation between extreme limits, put $\chi(\alpha)$ for $\left[S\,\phi(\xi)\,e^{\sqrt{-1}\,\alpha\xi}\,\Delta\xi\right]$, where ξ is an integer multiple of Δξ (or Δx) = ρΔx, say. Form the mth power of $\chi(\alpha)$. The coefficient of $e^{\sqrt{-1}\,\alpha r\Delta x}$ in $(\chi(\alpha))^m$ is the probability that the sum of the values of the m elements should be equal to rΔx; a probability which is equal to $\Delta x\,y_r$, where $y_r$ is the ordinate of the locus representing the frequency of the compound quantity (formed by the sum of the elements). Owing to the symmetry of the function φ the value of $y_r$ will not be altered if we substitute $e^{-\sqrt{-1}\,\alpha r\Delta x}$ for $e^{+\sqrt{-1}\,\alpha r\Delta x}$, nor if we substitute $\tfrac12\left(e^{+\sqrt{-1}\,\alpha r\Delta x} + e^{-\sqrt{-1}\,\alpha r\Delta x}\right)$, that is $\cos \alpha r\Delta x$. Thus $(\chi(\alpha))^m$ becomes a sum of terms of the form $\Delta x\,y_r \cos \alpha r\Delta x$, where $y_{-r} = y_{+r}$. Now multiply $(\chi(\alpha))^m$ thus expressed by $\cos t\Delta x\,\alpha$, where, t being an integer, tΔx = x, the abscissa of the "error" the probability of whose occurrence is to be determined. The product will consist of a sum of terms of the form $\Delta x\,y_r\,\tfrac12\left(\cos \alpha(r+t)\Delta x + \cos \alpha(r-t)\Delta x\right)$. As every value of r − t (except zero) is matched by a value equal in absolute magnitude, −r + t, and likewise every value of r + t is matched by a value −r − t, the series takes the form $\Delta x\,y_r\,\Sigma \cos q\alpha\Delta x + \Delta x\,y_t$, where q has all possible integer values from 1 to the largest value of |r|⁸ increased by |t|; and the term free from circular functions is the equivalent of $\tfrac12\Delta x\,y_r \cos \alpha(r+t)\Delta x$, when r = −t, together with $\tfrac12\Delta x\,y_r \cos \alpha(r-t)\Delta x$, when r = +t. Now substitute for αΔx a new symbol β; and integrate with respect to β the thus transformed $(\chi(\alpha))^m \cos t\Delta x\,\alpha$ between the limits β = 0 and β = π. The integrals of all the terms which are of the form $\Delta x\,y_r \cos q\beta$ will vanish, and there will be left surviving only $\pi\Delta x\,y_t$. We thus obtain, as equal to $\pi\Delta x\,y_t$,

$$\int_0^{\pi} \{\chi(\beta/\Delta x)\}^m \cos t\beta\, d\beta.$$

Now change the independent variable to α; then as dβ = dα Δx,

$$\Delta x\,y_t = \Delta x\,\frac{1}{\pi}\int_0^{\pi/\Delta x} d\alpha\,(\chi(\alpha))^m \cos t\Delta x\,\alpha.$$

Replacing tΔx by x, and dividing both sides by Δx, we have

$$y_x = \frac{1}{\pi}\int_0^{\pi/\Delta x} d\alpha\,(\chi(\alpha))^m \cos \alpha x.$$

Now expanding the cos αξ which enters into the expression for $\chi(\alpha)$, we obtain

$$\chi(\alpha) = \left[S\,\phi(\xi)\Delta\xi\right] - \frac{1}{2!}\left[S\,\phi(\xi)\,\xi^2\Delta\xi\right]\alpha^2 + \frac{1}{4!}\left[S\,\phi(\xi)\,\xi^4\Delta\xi\right]\alpha^4 - \ldots$$

Performing the summations indicated, we express $\chi(\alpha)$ in terms of the mean powers of deviation for an element. Whence $(\chi(\alpha))^m$ is expressible in terms of the mean powers of the compound locus. First and chief is the mean second power of deviation for the compound, which is the sum of the mean second powers of deviation for the elements, say k. It is found that the sought probability may be equated to

$$\frac{1}{\pi}\int_0^{\pi/\Delta x} d\alpha\, e^{-\frac12\alpha^2 k}\cos \alpha x + \frac{1}{4!}\,k_2\,\frac{1}{\pi}\int_0^{\pi/\Delta x} d\alpha\,\alpha^4 e^{-\frac12\alpha^2 k}\cos \alpha x + \ldots,$$

where $k_2$ is the coefficient defined below.⁹ Here π/Δx may be replaced by ∞, since the finite difference Δx is small with respect to unity when the number of the elements is large;¹⁰ and thus the integrals involved become equateable to known definite integrals. If it were allowable to neglect all the terms of the series but the first the expression would reduce to $\frac{1}{\sqrt{2\pi k}}\,e^{-x^2/2k}$, the normal law of error. But it is allowable to neglect the terms after the first, in a first approximation, for values of x not exceeding a certain range, the number of the elements being large, and if the postulate above enunciated is satisfied.¹¹ With these reservations it is proved that the sum of a number of similar and symmetrical elements conforms to the normal law of error. The proof is by parity extended to the case in which the elements have different but still symmetrical frequency functions; and, by a bolder use of imaginary quantities, to the case of unsymmetrical functions.
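The inversion step of this proof can be tried on a small case. The sketch below is in Python and rests on assumptions made purely for illustration: the element takes the values −1, 0, +1 with probabilities 1/4, 1/2, 1/4 (so that Δx = 1 and the sum χ reduces to ½ + ½ cos β), and the integral from 0 to π is evaluated by the midpoint rule. The function names are hypothetical. The result is compared both with the exact probability and with the normal law having k = m/2.

```python
import math


def chi(beta: float) -> float:
    """Sum of phi(xi) cos(beta xi) for the assumed element -1, 0, +1."""
    return 0.5 + 0.5 * math.cos(beta)


def probability_by_inversion(m: int, t: int, steps: int = 4096) -> float:
    """(1/pi) * integral_0^pi chi(beta)^m cos(t beta) d beta (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        beta = (j + 0.5) * h
        total += chi(beta) ** m * math.cos(t * beta)
    return total * h / math.pi


def exact_probability(m: int, t: int) -> float:
    """Each element is (two fair coins) - 1, so the sum is Binomial(2m) - m."""
    return math.comb(2 * m, m + t) / 4**m


def normal_law(m: int, t: int) -> float:
    """exp(-t^2 / 2k) / sqrt(2 pi k), with k = m * (1/4 + 1/4)."""
    k = m * 0.5
    return math.exp(-t * t / (2 * k)) / math.sqrt(2 * math.pi * k)


if __name__ == "__main__":
    m = 20
    for t in (0, 3, 6):
        print(t, probability_by_inversion(m, t), exact_probability(m, t),
              normal_law(m, t))
```

The first two columns agree to rounding error, which illustrates that the integral picks out the coefficient of the required term; the third shows how close the first term of the series already is for m = 20.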
⁷ Laplace, Théorie analytique des probabilités, bk. ii. ch. iv.; Poisson, Recherches sur la probabilité des jugements. Good restatements of this proof are given by Todhunter, History . . . of Probability, art. 1004, and by Czuber, Theorie der Beobachtungsfehler, art. 38 and Th. 2, § 4.
⁸ The symbol | | is used to denote absolute magnitude, abstraction being made of sign.
⁹ Below, pars. 159, 160.
¹⁰ Loc. cit. app. I.
¹¹ Loc. cit. p. 53 and context.
108. (3) De Forest¹ has given a proof unencumbered by imaginaries of what is the fundamental proposition in Laplace's theory that, if a polynomial of the form

$$A_0 + A_1 z + A_2 z^2 + \ldots + A_m z^m$$

be raised to the nth power and expanded in the form

$$B_0 + B_1 z + B_2 z^2 + \ldots + B_{mn} z^{mn},$$

then the magnitudes of the B's in the neighbourhood of their maximum will be disposed in accordance with a "probability-curve," or normal law of error.
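De Forest's proposition lends itself to a direct numerical trial. The following Python sketch is offered under stated assumptions: the coefficients A are taken to be positive, the particular polynomial 1 + 3z + 2z² and the power n = 60 are arbitrary illustrations, and the function names are hypothetical. It expands the power by repeated multiplication and compares the B's near their maximum with a normal curve having the same total, mean index and mean square of deviation.

```python
import math


def poly_power(coeffs: list[float], n: int) -> list[float]:
    """Coefficients B_0 ... B_mn of (A_0 + A_1 z + ... + A_m z^m)^n."""
    result = [1.0]
    for _ in range(n):
        new = [0.0] * (len(result) + len(coeffs) - 1)
        for i, b in enumerate(result):
            for j, a in enumerate(coeffs):
                new[i + j] += a * b
        result = new
    return result


def fitted_normal_curve(coeffs: list[float], n: int):
    """Normal curve matching the total, mean index and mean square of the B's."""
    total = sum(coeffs)
    mean = sum(j * a for j, a in enumerate(coeffs)) / total
    var = sum((j - mean) ** 2 * a for j, a in enumerate(coeffs)) / total
    k = n * var          # mean squares of deviation add over the n factors
    scale = total**n     # sum of all the B's

    def curve(s: float) -> float:
        return scale * math.exp(-((s - n * mean) ** 2) / (2 * k)) / math.sqrt(
            2 * math.pi * k
        )

    return curve


if __name__ == "__main__":
    A, n = [1.0, 3.0, 2.0], 60
    B = poly_power(A, n)
    curve = fitted_normal_curve(A, n)
    peak = max(range(len(B)), key=B.__getitem__)
    for s in range(peak - 4, peak + 5, 2):
        print(s, B[s] / curve(s))  # ratios close to 1 near the maximum
```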
109. (4) Professor Morgan Crofton's original proof of the law of error is based on a datum obtained by observing the effect which the introduction of a new element produces on the frequency-locus for the aggregate of elements. It seems to be assumed, very properly, that the sought function involves as constants some at least of the mean powers of the aggregate, in particular the mean second power, say k. We may without loss of generality refer each of the elements (and accordingly the aggregate) to its respective centre of gravity. Then if y, = f(x), is the ordinate of the frequency-locus for the aggregate before taking in a new element, and y + δy the ordinate after that operation, by a well-known principle,²

$$y + \delta y = \left[S\,\phi_m(\xi)\,f(x-\xi)\,\Delta\xi\right],$$

where η, $= \phi_m(\xi)$, is the frequency-locus for the new element, and the square brackets indicate that the summation is to extend over the whole range of values assumed by that element. Expanding in ascending powers of (each value of) ξ and neglecting powers above the second, as is found to be legitimate under the conditions specified, we have (since the first mean power of the element vanishes)

$$\delta y = \tfrac12\left[S\,\xi^2\phi_m(\xi)\,\Delta\xi\right]\frac{d^2y}{dx^2}.$$

From the fundamental proposition that the mean square for the aggregate equals the sum of mean squares for the elements it follows that $\left[S\,\xi^2\phi_m(\xi)\,\Delta\xi\right]$, the mean second power of deviation for the mth element, is equal to δk, the addition to k, the mean second power of deviation for the aggregate. There is thus obtained a partial differential equation of the second order

$$\frac{dy}{dk} = \frac12\,\frac{d^2y}{dx^2}. \qquad (1)$$

A subsidiary equation is (in effect) obtained by Professor Crofton from the property that if the unit according to which the axis of x is graduated is altered in any assigned ratio, there must be a corresponding alteration both of the ordinate expressing the frequency of the aggregate and of the mean square of deviation for the aggregate. By supposing the alteration indefinitely small he obtains a second partial differential equation, viz. (in the notation here adopted)

$$y + x\frac{dy}{dx} + 2k\frac{dy}{dk} = 0. \qquad (2)$$

From these two equations, regard being had to certain other conditions of the problem,³ it is deducible that $y = Ce^{-x^2/2k}$, where C is a constant of which the value is determined by the condition that
¹ The Analyst (Iowa), vols. v., vi., vii. passim; and especially vi. 142 seq., vii. 172 seq.
² Morgan Crofton, loc. cit. p. 781, col. a. The principle has been used by the present writer in the Phil. Mag. (1883), xvi. 301.
³ For a criticism and extension of Crofton's proof see the already cited paper on "The Law of Error," Camb. Phil. Trans. (1905), pt. i. § 2. Space does not permit the reproduction of Crofton's proof as given in the 9th ed. of the Ency. Brit. (art. "Probability").
⁴ Loc. cit. pt. i. § 4; and app. 6.
⁵ Loc. cit. p. 122 seq.
vertex rises up. The change in "spread" produced by the accession of new elements is illustrated by the transition from the high to the low curve, in fig. 10, in the case of a sum; in the case of an average (arithmetic mean) by the reverse relation.
113. The proposition which has been proved for linear functions may be extended to any other function of numerous variables, each representing the value assumed by an independently fluctuating element; if the function may be expanded in ascending powers of the variables, according to Taylor's theorem, and all the powers after the first may be neglected. The matter is not so simple as it is often represented, when the variable elements may assume large, perhaps infinite, values; but with the aid of the postulate above enunciated the difficulty can be overcome.⁶
114. All the proofs which have been noticed have been extended to errors in two (or more) dimensions.⁷ Let Q be the sum of a number of elements, each of which, being a function of two variables, x and y, assumes different pairs of values according to a law of frequency $z_r = f_r(x, y)$, the functions being in general different for different elements. The frequency with which Q assumes values of the variables between x and x + Δx and between y and y + Δy is zΔxΔy, if

$$z = \frac{1}{2\pi\sqrt{km-l^2}}\exp\left[-\frac{m(x-a)^2 - 2l(x-a)(y-b) + k(y-b)^2}{2(km-l^2)}\right];$$
of x and y concurring.
If i is the distance from 0 to 1 and from 1 to 2 on the abscissa, and i' is the corresponding distance on the ordinate, the mean of the values of x for the element (Δa, as we may say) is i, and the corresponding mean square of horizontal deviations is ½i² (= Δk). Likewise Δb = i'; Δm = ½i'²; and Δl = ⅛(+i × +i' + −i × −i') = ¼ii'. Accordingly, if n such elements are put together (if n steps of the kind which the diagram represents are taken), the frequency with which a particular pair of aggregates x and y will concur, with which a particular point on the plane of xy, namely, x = ri and y = r'i', will be reached, is given by the equation

$$z = \frac{2}{\sqrt{3}\,\pi n}\exp\left\{-\frac{4}{3n}\left[(r-n)^2 - (r-n)(r'-n) + (r'-n)^2\right]\right\}.$$ ⁸
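The three-coin element can be worked through numerically. In the Python sketch below (function names hypothetical, and the units i and i' both taken as 1 for simplicity) the law of frequency of the element is enumerated, its means and mean squares and products of deviation are computed, and the exact frequency of the pair (r, r') after n steps is compared with the normal surface obtained by putting k = m = n/2 and l = n/4 in the general formula, namely 2/(√3 πn) exp{−(4/3n)[(r−n)² − (r−n)(r'−n) + (r'−n)²]}.

```python
import math
from itertools import product


def element_law() -> dict[tuple[int, int], float]:
    """Three coins: x = heads on coins 1 and 2, y = heads on coins 2 and 3."""
    law: dict[tuple[int, int], float] = {}
    for c1, c2, c3 in product((0, 1), repeat=3):
        point = (c1 + c2, c2 + c3)
        law[point] = law.get(point, 0.0) + 1 / 8
    return law


def moments(law):
    """Means a, b and mean squares/products k, m, l for one element."""
    a = sum(p * x for (x, _), p in law.items())
    b = sum(p * y for (_, y), p in law.items())
    k = sum(p * (x - a) ** 2 for (x, _), p in law.items())
    m = sum(p * (y - b) ** 2 for (_, y), p in law.items())
    l = sum(p * (x - a) * (y - b) for (x, y), p in law.items())
    return a, b, k, m, l


def aggregate_law(law, n: int):
    """Exact joint frequency of the sums of n independent elements."""
    dist = {(0, 0): 1.0}
    for _ in range(n):
        new: dict[tuple[int, int], float] = {}
        for (x, y), p in dist.items():
            for (dx, dy), q in law.items():
                key = (x + dx, y + dy)
                new[key] = new.get(key, 0.0) + p * q
        dist = new
    return dist


def normal_surface(r: int, r2: int, n: int) -> float:
    """Normal approximation to the frequency of the point (r, r') after n steps."""
    u, v = r - n, r2 - n
    return 2 / (math.sqrt(3) * math.pi * n) * math.exp(
        -4 * (u * u - u * v + v * v) / (3 * n)
    )


if __name__ == "__main__":
    law = element_law()
    print("a, b, k, m, l =", moments(law))
    n = 40
    dist = aggregate_law(law, n)
    for point in ((40, 40), (43, 43), (43, 37)):
        print(point, dist[point], normal_surface(*point, n))
```

The comparison of the points (43, 43) and (43, 37) shows the greater density along the diagonal that the positive value of l implies.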
115. A verification is afforded by a set of statistics obtained with dice by Weldon, and here reproduced by his permission. A success is in this experiment defined, not by obtaining a head when a coin is tossed, but by obtaining a face with more than three points on it when a die is tossed; the probabilities of the two events are the same, or rather would be if coins and dice were perfectly symmetrical.⁹ Professor Weldon virtually took six steps of the sort above described when, six painted dice having been thrown, he added the number of successes in that painted batch to the number of successes in another batch of six to form his x, and to the number of successes in a third batch of six to form his y. The result is represented in the annexed table, where each degree on the axis of x and y respectively corresponds to the i and i' of the preceding paragraphs, and i = i'. The observed frequencies being represented by numerals, a general correspondence between the facts and the formula is apparent.
⁶ Loc. cit. pt. ii. § 7.
⁷ The second by Burbury, in Phil. Mag. (1894), xxxvii. 145; the third by its author in the Analyst for 1881; and the remainder by the present writer in Phil. Mag. (1896), xii. 247; and Camb. Phil. Trans. (1905), loc. cit.
⁸ Compare the formula for the simple case above, § 4.
⁹ On the irregularity of the dice with which Weldon experimented, see Pearson, Phil. Mag. (1900), p. 167.
$$\int_{-\infty}^{\infty} y\,dx = 1.$$
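That the normal law satisfies both of Crofton's partial differential equations of par. 109, namely dy/dk = ½ d²y/dx² and y + x dy/dx + 2k dy/dk = 0, can be checked by finite differences. The Python sketch below is an illustration only: the function names, the step h and the trial values of x and k are assumptions, and the constant C is given the value 1/√(2πk) that makes the area unity.

```python
import math


def y(x: float, k: float) -> float:
    """Normal law with mean square of deviation k and unit area."""
    return math.exp(-x * x / (2 * k)) / math.sqrt(2 * math.pi * k)


def residuals(x: float, k: float, h: float = 1e-3) -> tuple[float, float]:
    """Left-hand sides of (1) and (2), which should both be close to zero."""
    dy_dx = (y(x + h, k) - y(x - h, k)) / (2 * h)
    d2y_dx2 = (y(x + h, k) - 2 * y(x, k) + y(x - h, k)) / (h * h)
    dy_dk = (y(x, k + h) - y(x, k - h)) / (2 * h)
    eq1 = dy_dk - 0.5 * d2y_dx2              # equation (1)
    eq2 = y(x, k) + x * dy_dx + 2 * k * dy_dk  # equation (2)
    return eq1, eq2


if __name__ == "__main__":
    for x, k in ((0.0, 1.0), (0.7, 1.3), (2.0, 0.5)):
        print(x, k, residuals(x, k))
```

The residuals are of the order of the discretization error, not exactly zero, since the derivatives are only approximated.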
110. (5) The condition on which Professor Crofton's proof is based may be called differential, as obtained from the introduction of a single new element. There is also an integral condition obtained from the introduction of a whole set of new elements. For let A be the sum of m₁ elements, fluctuating according to the sought law of error. Let B be the sum of another set of elements m₂ in number (m₁ and m₂ both large). Then Q, a quantity formed by adding together each pair of concurrent values presented by A and B, must also conform to the law of error, since Q is the sum of m₁ + m₂ elements. The general form which satisfies this condition of reproductivity is limited by other conditions to the normal law of error.⁴
111. The list of variant proofs is not yet exhausted,⁵ but enough has been said to establish the proposition that a sum of numerous elements of the kind described will fluctuate approximately according to the normal law of error.
112. As the number of elements is increased, the constant above designated k continually increases; so that the curve representing the frequency of the compound magnitude spreads out from its centre. It is otherwise if instead of the simple sum we consider the linear function formed by adding the m elements each multiplied by 1/m. The "spread" of the average thus constituted will continually diminish as the number of the elements is increased; the sides closing in as the
where, as in the simpler case, $a = \Sigma a_r$, $a_r$ being the arithmetic mean of the values of x assumed in the long run by one of the elements, b is the corresponding sum for values of y, and

$$k = \Sigma\left[\iint (x-a_r)^2 f_r(x,y)\,dx\,dy\right],$$
$$m = \Sigma\left[\iint (y-b_r)^2 f_r(x,y)\,dx\,dy\right],$$
$$l = \Sigma\left[\iint (x-a_r)(y-b_r) f_r(x,y)\,dx\,dy\right];$$

the summation extending over all the elements, and the integration between the extreme limits of each; supposing that the law of frequency for each element is continuous, otherwise summation is to be substituted for integration. For example, let each element be constituted as follows: Three coins having been tossed, the number of heads presented by the first and second coins together is put for x, the number of heads presented by the second and third coins together is put for y. The law of frequency for the element is represented in fig. 11, the integers outside denoting the values of x or y, the fractions inside probabilities of particular values
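[Fig. 11: probabilities of the pairs (x, y) for the three-coin element.]

         x = 0   x = 1   x = 2
 y = 2     0      1/8     1/8
 y = 1    1/8     2/8     1/8
 y = 0    1/8     1/8      0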
The maximum frequency is, as it ought to be, at the point x = 6i, y = 6i'. The density is particularly great along a line through that point, making 45° with the axis of x; particularly small in the complementary direction. This also is as it ought to be. For if the centre is made the origin by substituting x for (x − a) and y for (y − b), and then new co-ordinates X and Y are taken, making an angle θ with x and y respectively, the curve which is traced on the plane of zX by its intersection with the surface is of the form

$$z = J\exp\{-X^2[k\sin^2\theta - 2l\cos\theta\sin\theta + m\cos^2\theta]/2(km-l^2)\},$$

a probability-curve which will be more or less spread out according as the factor $k\sin^2\theta - 2l\cos\theta\sin\theta + m\cos^2\theta$ is less or greater. Now this expression has a minimum or maximum when $(k-m)\sin 2\theta - 2l\cos 2\theta = 0$; a minimum when $(k-m)\cos 2\theta + 2l\sin 2\theta$ is positive, and a maximum when that criterion is negative; that is, in the present case, where k = m, a minimum when θ = ¼π and a maximum when θ = ¾π.
0 1 2 3 4 5 6 7 8 9 10 11 12
12
11 1 1 5 1 1
10 2 6 28 27 19 2
9 1 2 11 43 76 57 54 15 4
8 6 18 49 116 138 118 59 25 5
7 12 47 109 208 213 118 71 23 1
6 9 29 77 199 244 198 121 32 3
5 3 12 51 119 181 200 129 69 18 3
4 2 16 55 100 117 91 46 19 3
3 2 14 28 53 43 34 17 1
2 7 12 13 18 4 1 1
1 2 4 1 2 1
0 i
116. Characteristics of the Law of Error'—As may be presumed from the examples just given, in order that there should be some approximation to the normal law the number of elements need not be very great. A very tolerable imitation of the probabilitycurve has been obtained by superposing three elements, each obeying a law of frequency quite different from the normal one,' namely, that simple law according to which one value of a variable occurs as frequently as another between the limits within which the variation is confined (y=1/2a, between limits x= +a, x= —a). If the component elements obey unsymmetrical laws of frequency, the compound will indeed be to some extent unsymmetrical, unlike the " normal " probabilitycurve. But, as the number of the elements is increased, the portion of the compound curve in the neighbourhood of its centre of gravity tends to be rounded off into the normal shape. The portion of the compound curve which is sensibly identical with a curve of the " normal " family becomes greater the greater the number of independent elements; caeteris paribus, and granted certain conditions as to the equality and the range of the elements. It will readily be granted that if one component predominates, it may unduly impress its own character on the compound. But it should be pointed out that the characteristic with which we are now concerned is not average magnitude, but deviation from the average. The component elements may be very unequal in their contributions to the average magnitude of the compound without prejudice to its " normal character, provided that the fluctuation of all or many of the elements is of one and the same order. The proof of the law requires that the contribution made by each element to the mean square of deviation for the compound, k, should be small, capable of being treated as differential with respect to k. 
It is not necessary that all these small quantities should be of the same order, but only that they should admit of being rearranged, by massing together those of a smaller order, as a numerous set of
i Experiments in part materia performed by A. D. Darbishire afford additional illustrations. See " Some Tables for illustrating Statistical Correlation," Mem. and Proc. Man. Lit., and Phil. Soc., vol. li. pt. iii.
2 Journ. Scat. Soc. (March 1900), p. 73, referring to Burton, Phil. Meg. (1883), xvi. 301.independent elements in which no two or three stand out as sui genesis in respect of the magnitude of their fluctuation. For example, if one element consist of the number of points on a domino (the sum of two digits taken at random), and other elements, each of either 1 or o according as heads or tails turn up when a coin is cast, the first element, having a mean square of deviation 16.5, will not be of the same order as the others, each having 0.25 for its mean square of deviation. But sixtysix of the latter taken together would constitute an independent element of the same order as the first one; and accordingly if there are several times sixtysix elements of the latter sort, along with one or two of the former sort, the conditions for the generation of the normal distribution will be satisfied. These propositions would evidently be unaffected by altering the average magnitude, without altering the deviation from the average, for any element, that is, by adding a greater or less fixed magnitude to each element. The propositions are adapted to the case in which the elements fluctuate according to a law of frequency other than the normal. For if they are already normal, the aforesaid conditions are unnecessary. The normal law will be obeyed_ by the sum of elements which each obey it, even though they are not numerous and not independent and not of the same order in respect of the extent of fluctuation. A similar distinction is to be drawn with respect to some further conditions which the reasoning requires. A limitation as to the range of the elements is not necessary when they are already normal, or even have a certain affinity to the normal curve. Very large values of the element are not excluded, provided they are sufficiently rare. What has been said of curves with special reference to one dimension is of course to be extended to the case of surfaces and many dimensions. 
In all cases the theorem that under the conditions stated the normal law of error will be generated is to be distinguished from the hypothesis that the conditions are fairly well fulfilled in ordinary experience.
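The domino and coin figures quoted above can be checked by direct enumeration. The following is a minimal sketch, assuming that "two digits taken at random" means two independent digits, each equally likely to be any of 0 to 9 (an interpretation chosen because it reproduces the 16.5 in the text); the function name `mean_square_deviation` is introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product


def mean_square_deviation(values):
    """Mean square of deviation (population variance) of equally likely values."""
    values = [Fraction(v) for v in values]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# "Domino" element: sum of two digits, all 100 ordered pairs equally likely
# (assumption: digits run from 0 to 9 and are independent).
domino = [a + b for a, b in product(range(10), repeat=2)]

# Coin element: 1 for heads, 0 for tails.
coin = [0, 1]

domino_msd = mean_square_deviation(domino)  # 33/2, i.e. 16.5
coin_msd = mean_square_deviation(coin)      # 1/4, i.e. 0.25

# Number of independent coin elements whose summed mean square of
# deviation equals that of a single domino element.
coins_needed = domino_msd / coin_msd        # 66

print(float(domino_msd), float(coin_msd), coins_needed)
```

Exact fractions are used so that the ratio comes out as a whole number rather than a rounded float.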
117. (B) Verification of the Normal Law. Having deduced the genesis of the law of error from ideal conditions such as are attributed to perfectly fair games of chance, we have next to inquire how far these conditions are realized and the law fulfilled in common experience.
118. Errors proper. Among important concrete cases errors of observation occupy a leading place. The theory is brought to bear on this case by the hypothesis that an error is the algebraic sum of numerous elements, each varying according to a law of frequency special to itself. This hypothesis involves two assumptions: (1) that an error is dependent on numerous independent causes; (2) that the function expressing that dependence can be treated as a linear function, by expanding in terms of ascending powers (of the elements) according to Taylor's theorem and neglecting higher powers, or otherwise. The first assumption seems, in Dr Glaisher's words, "most natural and true. In any observation where great care is taken, so that no large error can occur, we can see that its accuracy is influenced by a great number of circumstances which ultimately depend on independent causes: the state of the observer's eye and his physiological condition in general, the state of the atmosphere, of the different parts of the instrument, &c., evidently depend on a great number of causes, while each contributes to the actual error."³ The second assumption seems to be frequently realized in nature. But the assumption is not always safe. For example, where the velocities of molecules are distributed according to the normal law of error, with zero as centre, the energies must be distributed according to a quite different law. This rationale is applicable not only to the fallible perceptions of the senses, but also to impressions into which a large ingredient of inference enters, such as estimates of a man's height or weight from his appearance,⁴ and even higher acts of judgment.⁵ Aiming at an object is an act similar to measuring an object; misses are produced by much the same variety of causes as mistakes; and, accordingly, it is found that shots aimed at the same bull's-eye are apt to be distributed according to the normal law, whether in two dimensions on a target or according to their horizontal deviations, as exhibited below (par. 156). A residual class comprises miscellaneous statistics, physical as well as social, in which the normal law of error makes its appearance, presumably in consequence of the action of numerous independent influences. Well-known instances are afforded by human heights and other bodily measurements, as tabulated by Quetelet⁶ and others.⁷ Professor Pearson has found that "the normal curve suffices to describe within the limits of random sampling the distribution of the chief characters in man."⁸ The tendency of social phenomena to conform to the normal law of frequency is well exemplified by A. L. Bowley's grouping of the wages paid to different classes.¹

³ Memoirs of Astronomical Society (1878), p. 105.
⁴ Journ. Stat. Soc. (1890), p. 462 seq.
⁵ E.g. the marking of the same work by different examiners. Ibid.
⁶ Lettres sur la théorie des probabilités and Physique sociale.
⁷ E.g. the measurements of Italian recruits, adduced in the Atlante statistico, published under the direction of the Ministero de Agricultura (Rome, 1882); and Weldon's measurements of crabs, Proc. Roy. Soc. liv. 321; discussed by Pearson in the Trans. Roy. Soc. (1894), vol. clxxxv. A.
⁸ Biometrika, iii. 395. Cf. ibid. p. 141.
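The caution in paragraph 118, that normally distributed velocities do not imply normally distributed energies, can be illustrated by simulation. This is a rough sketch under stated assumptions: unit mass, energy taken as half the squared velocity, a standard normal law for the velocities, and a fixed random seed; the helper `skewness` is introduced here for illustration.

```python
import random


def skewness(xs):
    """Third central moment divided by the 3/2 power of the second."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Velocities following the normal law, centred on zero (assumed unit spread).
velocities = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Energies as a non-linear (quadratic) function of velocity; unit mass assumed.
energies = [0.5 * v * v for v in velocities]

skew_v = skewness(velocities)  # close to 0: symmetric, as the normal law requires
skew_e = skewness(energies)    # strongly positive: not a normal distribution

print(round(skew_v, 3), round(skew_e, 3))
```

The symmetric input and the markedly skewed output show why the linearity assumption (2) matters: a quadratic dependence on a normal element does not reproduce the normal law.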
119. A variant classification. The division of concrete errors which has been proposed is not to be confounded with another twofold classification, namely, observations which stand for a real objective thing, and such statistics as are not thus representative of something outside themselves, groups of which the mean is called "subjective." This division would be neither clear nor useful. On the one hand so-called real means are often only approximately equal to objective quantities. Thus the proportional frequency with which one face of a die (the six, suppose) turns up is only approximately given by the objective fact that the six is one face of a nearly perfect cube. For a set of dice with which Weldon experimented, the average frequency of a throw presenting either five or six points proved to be not 0.3333... (that is, 1/3), but 0.3377.² The difference of this result from the regulation 0.3333... is as unpredictable from objective data, prior to experiment, as any of the means called subjective or fictitious. So the mean of errors of observation often differs from the thing observed by a so-called "constant error." So shots may be constantly deflected from the bull's-eye by a steady wind or "drift."
120. On the other hand, statistics not purporting to represent a real object have more or less close relations to magnitudes which cannot be described as fictitious. Where the items averaged are ratios, e.g. the proportion of births or deaths to the total population in several districts or other sections, it sometimes happens that the distribution of the ratios exactly corresponds to that which is obtained in the simplest games of chance: "combinational" distribution, in the phrase of Lexis.³ There is unmistakably suggested a sortition of the simplest type, with a real ascertainable relation between the number of "favourable cases" and the total number of cases. The most remarkable example of this property is presented by the proportion of male to female (or to total) births. Some other instances are given by Lexis⁴ and Westergaard.⁵ A similar correspondence between the actual and the "combinational" distribution has been found by Bortkevitch⁶ in the case of very small probabilities (in which case the law of error is no longer "normal"). And it is likely that some ratios, such as general death-rates, not presenting combinational distribution, might be broken up into subdivisions, such as death-rates for different occupations or age-periods, each distributed in that simple fashion.
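One way to picture the "combinational" distribution of paragraph 120 is to compare the observed scatter of ratios across districts with the scatter that pure chance drawing would give. The sketch below is illustrative only: the district counts, the probabilities 0.51, 0.45 and 0.57, and the function name `dispersion_ratio` are all hypothetical, and the data are simulated, not taken from Lexis or any other source.

```python
import random


def dispersion_ratio(counts, n):
    """Observed variance of the ratios divided by the variance that simple
    chance drawing (binomial, n trials per district) would produce."""
    ratios = [k / n for k in counts]
    m = len(ratios)
    p = sum(ratios) / m
    observed = sum((r - p) ** 2 for r in ratios) / (m - 1)
    expected = p * (1 - p) / n
    return observed / expected


def simulate_counts(rng, p, n, districts):
    """Number of 'favourable cases' out of n in each simulated district."""
    return [sum(rng.random() < p for _ in range(n)) for _ in range(districts)]


rng = random.Random(1)  # fixed seed for repeatability
n_births = 500          # hypothetical births per district

# Homogeneous case: every district shares one underlying chance.
uniform_counts = simulate_counts(rng, 0.51, n_births, 200)
ratio_uniform = dispersion_ratio(uniform_counts, n_births)

# Mixed case: two subdivisions with different chances pooled together.
mixed_counts = (simulate_counts(rng, 0.45, n_births, 100)
                + simulate_counts(rng, 0.57, n_births, 100))
ratio_mixed = dispersion_ratio(mixed_counts, n_births)

print(round(ratio_uniform, 2), round(ratio_mixed, 2))
```

A ratio near 1 corresponds to the combinational case; a ratio well above 1 suggests a mixture that might be broken up into subdivisions each behaving in the simple fashion.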
121. Another sort of averages which it is difficult to class as subjective rather than objective occurs in some social statistics, under the designation of index-numbers. The percentage which represents the change in the value of money between two epochs is seldom regarded as the mere average change in the price of several articles taken at random, but rather as the measure of something, e.g. the variation in the price of a given amount of commodities, or of a unit of commodity.⁷ So something substantive appears to be designated by the volume of trade, or that of the consumption of the working classes, of which the growth is measured by appropriate index-numbers,⁸ the former due to Bourne and Sir Robert Giffen,⁹ the latter to George Wood.¹⁰
122. But apart from these peculiarities, any set of statistics may be related to a certain quaesitum, very much as measurements are related to the object measured. That quaesitum is the limiting or ultimate mean to which the series of statistics, if indefinitely prolonged, would converge, the mean of the complete group; this conception of a limit applying to any frequency-constant, to "c," for instance, as well as "a" in the case of the normal curve.¹¹ The given statistics may be treated as samples from which to reason up to the true constant by that principle of the calculus which determines the comparative probability of different causes from which an observed event may have emanated.¹²
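The reasoning "up to the true constant" described in paragraph 122 may be sketched numerically. The example below is a simplified illustration, not the article's own procedure: it assumes a known modulus c, a short list of candidate values for the mean a, equal prior probability for each candidate, and invented observations. The density is written in the modulus form used for the normal law in this article, exp(-(x-a)²/c²)/(√π c).

```python
import math


def normal_density(x, a, c):
    """Normal law in modulus form: exp(-(x - a)^2 / c^2) / (sqrt(pi) * c)."""
    return math.exp(-((x - a) ** 2) / c ** 2) / (math.sqrt(math.pi) * c)


def comparative_probabilities(observations, candidates, c):
    """Probability of each candidate mean given the observations,
    starting from equal prior probabilities (an assumption of this sketch)."""
    weights = []
    for a in candidates:
        likelihood = 1.0
        for x in observations:
            likelihood *= normal_density(x, a, c)
        weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]


observations = [9.8, 10.4, 10.1, 9.9, 10.3]  # hypothetical sample
candidates = [9.5, 10.0, 10.5]               # hypothetical "causes"
probs = comparative_probabilities(observations, candidates, c=0.5)

for a, p in zip(candidates, probs):
    print(a, round(p, 4))
```

With these invented figures the candidate nearest the sample mean receives by far the greatest share of the probability, which is the sense in which the sample is used to judge between causes.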
123. Thus it appears that there is a characteristic more essential to the statistician than the existence of an objective quaesitum, namely, the use of that method which is primarily, but not exclusively, proper to that sort of quaesitum: inverse probability.¹³ Without that delicate instrument the doctrine of error can seldom be fully utilized; but some of its uses may be indicated before the introduction of technical difficulties.

¹ Wages in the United Kingdom in the Nineteenth Century; and art. "Wages" in the Ency. Brit., 10th ed., vol. xxxiii.
² Phil. Mag. (1900), p. 168.
³ Cf. Journ. Stat. Soc., Jubilee No., p. 192.
⁴ Massenerscheinungen.
⁵ Grundzüge der Statistik. Cf. Bowley, Elements of Statistics, p. 302.
⁶ Das Gesetz der kleinen Zahlen.
⁷ See for other definitions Report of the British Association (1889), pp. 136 and 161, and compare Walsh's exhaustive Measurement of General Exchange-Value.
⁸ Cf. Bowley, Elements of Statistics, ch. ix.
⁹ Journ. Stat. Soc. (1874 and later). Parly. Papers [C. 2247] and [C. 3079].
¹⁰ "Working-Class Progress since 1860," Journ. Stat. Soc. (1899), p. 639.
¹¹ On this conception compare Venn, Logic of Chance, chs. iii. and iv., and Sheppard, Proc. Lond. Math. Soc., p. 363 seq.
¹² Laplace's 6th principle, Théorie analytique, intro. x.
¹³ See above, pars. 13 and 14.
124. Applications of the Normal Law. Having established the prevalence of the law of error,¹⁴ we go on to its applications. The mere presumption that wherever three or four independent causes cooperate, the law of error tends to be set up, has a certain speculative interest.¹⁵ The assumption of the law as a hypothesis is legitimate. When the presumption is confirmed by specific experience this knowledge is apt to be turned to account. It is usefully applied to the practice of gunnery,¹⁶ to determine the proportion of shots which under assigned conditions may be expected to hit a zone of given size. The expenditure of ammunition required to hit an object can thence be inferred. Also the comparison between practice under different conditions is facilitated. In many kinds of examination it is found that the total marks given to different candidates for answers to the same set of questions range approximately in conformity with the law of error. It is understood that the civil service commissioners have founded on this fact some practical directions to examiners. Apart from such direct applications, it is a useful addition to our knowledge of a class that the measurable attributes of its members range in conformity with this general law. Something is added to the truth that "the days of a man are threescore and ten," if we may regard that epoch, or more exactly for England, 72, as "Nature's aim, the length of life for which she builds a man, the dispersion on each side of this point being . . . nearly normal."¹⁷ So Herschel says: "An [a mere] average gives us no assurance that the future will be like the past. A [normal] mean may be reckoned on with the most complete confidence."¹⁸ The existence of independent causes,¹⁹ inferred from the fulfilment of the normal law, may be some guarantee of stability. In natural history especially have the conceptions supplied by the law of error been fruitful.
Investigators are already on the track of this inquiry: if those members of a species whose size or other measurable attributes are above (or below) the average are preferred—by " natural " or some other kind of selection—as parents, how will the law of frequency as regards that attribute be modified in the next generation?
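The gunnery application mentioned in paragraph 124 reduces to an area under the normal curve. As a minimal sketch, assuming deviations in one dimension that follow the normal law with modulus c and a zone centred on the aim, the proportion of shots falling within a half-width w is erf(w/c), since the density is exp(-x²/c²)/(√π c). The function names below are introduced for illustration, and the "shots per hit" figure assumes independent shots.

```python
import math


def proportion_within(half_width, modulus):
    """Proportion of deviations between -half_width and +half_width
    under the normal law with the given modulus: erf(w / c)."""
    return math.erf(half_width / modulus)


def expected_shots_per_hit(half_width, modulus):
    """Average number of independent shots needed for one hit on the zone."""
    return 1.0 / proportion_within(half_width, modulus)


c = 1.0  # modulus, in arbitrary units (assumed)

# The "probable error": half the shots fall within 0.4769 of a modulus.
print(round(proportion_within(0.4769 * c, c), 4))

# Within one standard deviation, which is the modulus divided by sqrt(2).
print(round(proportion_within(c / math.sqrt(2), c), 4))

# Ammunition expenditure for a zone one probable error wide on each side.
print(round(expected_shots_per_hit(0.4769 * c, c), 2))
```

Comparing practice under different conditions then amounts to comparing the moduli fitted to each set of shots, since a smaller modulus gives a larger proportion of hits on the same zone.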