Page updated: March 14, 2021
Author: Curtis Mobley
View PDF

Error Estimation

This page shows how to estimate the errors in Monte Carlo simulations. General results from probability theory are illustrated with numerical examples.

To be specific, suppose that we need to estimate the fraction of power emitted by a light source that will be received by a detector. The answer of course depends on the water IOPs; the size, orientation, and location of the detector relative to the light source; the angular distribution of the emitted light; and perhaps on other things like boundary surfaces that can reflect or absorb light scattered onto the boundary. The numerical results to be shown below use the source and detector geometry shown in Fig. 1. The source emits a collimated bundle of rays toward a detector that is τ = 5 optical path lengths away and is 0.1 optical path lengths in diameter. The water IOPs have a = 0.2m1 and b = 0.8m1 so that c = 1m1 and one meter is one optical path length. The albedo of single scattering is ωo = 0.8. The scattering phase function is a Henyey-Greenstein phase function with a scattering-angle mean cosine of g = 0.8.


Figure 1: Source and detector geometry used for numerical simulations.

In the simulations, N rays will be emitted from the source, each with an initial weight of w = 1. Most of those rays will miss the detector, as illustrated by the blue arrows in Fig. 1. However, some rays will hit the detector, as illustrated by the red arrows, at which time their current weight will be tallied to the accumulating total weight wd received by the detector. After all rays have been traced, the the Monte Carlo estimate of the fraction of emitted power received by the detector is simply fd = wdN.

If we do only one simulation tracing, say, N = 104 ray bundles, then the resulting estimate of fd is all we have. In particular, we have no estimate of the statistical error in the estimated fd.

Probability Theory

To develop a quantitative error estimate for the result of a Monte Carlo simulation, we begin with a review of some results from basic probability theory. Recalling the notation introduced on the Monte Carlo Introduction page, let pW (w) be the probability density function (pdf) for random variable W. Capital W represents the random variable, e.g. the ray bundle weight received by the detector, and lower case w represents a specific value of W, e.g., w = 0.72. In the present example, pW (w) is the pdf that a ray strikes the detector with a weight w, 0 w 1. rays that miss the detector do not contribute to accumulating weight or power received by the detector and do not enter into the calculations below; they are simply wasted computer time. Note that we have no idea what mathematical form pW (w) has: it results from a complicated sequence of randomly determined ray path lengths and scattering angles.

For any continuous pdf pW (w) the expected or mean value of W is defined as

mean(W) μ {W} =wpW (w)dw, (1)

where denotes expected value and the integral is over all values for which W is defined. If the random variable is discrete, the integral is replaced by a sum over all allowed values of W. Similarly, the variance of W is defined as

var(W) σ2 {(W μ)2} =(w μ)2p W (w)dw = {W2} [{W}]2.

Note that if c is a constant, then

{cW} = c{W}andvar(cW) = c2var(W).

Greek letters μ and σ2 are used to denote the true or population mean and variance of a pdf.

Suppose that Nd ray bundles actually reach the detector. The total weight received by the detector is then given by the sum of these randomly determined weights:

SNd = i=1Nd wd(i),

where wd(i) is the weight of ray bundle i when it reached the detector.

Each of the N ray bundles emitted by the source and traced to completion is independent of the others. In particular, a different sequence of random numbers is used to determine the path lengths and scattering angles for each emitted bundle. Moreover, the underlying pdfs for ray path length and scattering angles (i.e., the IOPs) are the same for each bundle. The random variables are then said to be independent and identically distributed (iid), and SNd is said to be a random sample of size Nd of random variable W.

The linearity of the expectation (i.e., the integral of a sum is the sum of the integrals) means that for iid random variables such as W,

{SNd} = Nd{W} = Ndμandvar(SNd) = Ndvar(W) = Ndσ2. (2)

In the Monte Carlo simulation, the sample mean, i.e. the estimate of the average detected weight obtained by from the Nd detected ray bundles is

mNd 1 NdSNd = 1 Nd i=1Nd wd(i).

Equation (2) now gives two extremely important results. First,

{mNd} = 1 Nd{SNd} = μ. (3)

That is, the expectation of the sample mean mNd is equal to the true mean μ. The sample mean is then said to be an unbiased estimator of the true mean of the pdf. Second,

var(mNd) = var( 1 NdSNd) = 1 Nd2var(SNd) = σ2 Nd. (4)

Thus, the variance of the sample mean goes to zero as Nd , that is, as more and more ray bundles are detected. In other words, the Monte Carlo estimate of the average power received by the detector is guaranteed to give a result that can be made arbitrarily close to the correct result if enough rays are detected. This result is known as the law of large numbers. Again, you can emit and trace all the rays you want, but if they don’t hit the detector, they don’t count.

It is often convenient to think in terms of the standard deviation, e.g., when plotting data and showing the spread of values. The standard deviation of the error in mNd is

sNd = var(mNd ) = σ Nd. (5)

The dependence of the standard deviation of the estimate on 1Nd is a very general and important result. However, this “approach to the correct value” is very slow. If we want to reduce the standard deviation of the error in the estimated average power received by the detector by a factor of 10, we must detect 100 times as many rays. That can be computationally very expensive.

It is to be emphasized that result (4) that the variance of a sample mean equals the true variance divided by the sample size holds for any situation for which the individual samples are independent and identically distributed random variables.

Note, of course, that if we knew the pdf for the received power, pW (w), then we could simply evaluate Eq. (1) to get the desired true mean μ, and no Monte Carlo simulation would be required.

Finally, it must be remembered that the discussion here assumes that ”all else is the same” when considering the number of detected rays. For example, we emit and trace more rays without changing the physics of the simulation. The page on importance sampling presents ways to increase the number of detected rays and thereby reduce the variance, but with a change in the physics that sometimes may invalidate the simple 1Nd dependence.

Numerical Examples

Numerical simulations were performed for the geometry and conditions described for Fig. 1. For these simulations, tracing type 1 of the previous page was used. That is, ray bundles were traced until they either hit the target (still with weight w = 1) or were absorbed. For a given number N of emitted ray bundles, various numbers Nrun of independent runs were done. That is, N rays were traced and the number Nd of detected ray bundles and their weights were tallied for each run. The fraction of emitted power received by the detector was then computed by the total detected weight for the run divided by N. Then another run was made with everything the same except that a different sequence of random numbers was used (i.e., each run was started with a different seed for the U[0, 1] random number generator used to determine path lengths and scattering angles).

The first set of simulations used N = 10, 000 emitted ray bundles for each run, with Nrun = 10, 100, 1000, and 10,000 runs being made. Figure 2 shows the distributions of the sample means mNd and other information for these four sets of run numbers. The upper left panel of the figure is for only Nrun = 10 runs, or trials, or simulations. In this panel, the histogram shows that one run, or 0.1 of the total number of runs, gave an estimated fraction between 0.0088 and 0.0090 of the emitted power; two runs, or 0.2 of the total, gave a fraction between 0.0104 and 0.0106, and so on. As the number of runs increases, the estimates of the fraction of power received range from slightly less than 0.008 to slightly more than 0.014, with most estimates centering somewhere near 0.0108.


Figure 2: Estimates of the fraction of detected power for four sets of runs with N = 104 emitted rays in each run.

As the number of runs increases, something very remarkable happens: the distribution of the fraction of the emitted power appears to be approaching a Gaussian or normal shape, even though the underlying pdf pW (w) is certainly not Gaussian. This distribution can be thought of as the distribution of errors in the estimated mean of the distribution, or the distribution of {mNd μ}, where μ is the unknown true mean of the distribution pW (w). This approach to a Gaussian distribution is a consequence of the central limit theorem. The central limit theorem states that the sum of a large number of independently distributed random variables with finite means and variances is approximately normally distributed regardless of what the distribution of the random variable itself may be. This is one of the most profound results in probability theory. Indeed, it explains why so many natural phenomena tend to have a Gaussian shape. Phenomena as disparate as average student exam scores, noise in electrical circuits, daily water usage in a city, or the fraction of people who develop cancer can all result from sums of many individual contributions. Such sums then tend toward a Gaussian distribution as the number of individual contributions increases. The theorem was first proved for a specific pdf in 1733, but it was not proven to hold for all pdfs (having finite means and variances) until the early 1900’s. (By the way, in spite of what you see in the tabloid press, there is nothing in probability theory called “the law of averages.” The central limit theorem is maybe the closest thing to the often involked buy mythical “law of averages.”)

Figure 3 shows the corresponding results for series of Nrun = 10, 100, 1000, and 10,000 runs being made, but with each run now having N = 100, 000 emitted rays. Now the spread in the estimated values is much less, from slightly less than 0.010 to about 0.012, again centering somewhere around 0.0108. Again, we see the approach to a Gaussian shape as more and more runs are made, but the Gaussian has a narrower width, i.e. less variance about the mean. The standard deviation of the sample estimates of the mean is usually called the “standard error of the mean.”


Figure 3: Estimates of the fraction of detected power for four sets of runs with N = 105 emitted rays in each run.

The lower right panel of Fig. 2 shows that the sample standard deviation for the case of 10,000 runs each with 10,000 emitted rays is sNd = 1.186 103 and the average number of detected rays is Nd = 107.9. The corresponding panel of Fig. 3 shows sNd = 3.718 104 and Nd = 1080.5. The ratio of these sample standard deviations is

sNd(Nd = 107.9) sNd(Nd = 1080.5) = 1.186 103 3.718 104 = 3.19.

The corresponding ratio of σ Nd values is

σ Nd =107.9 σ Nd =1080.5 = 1080.5 107.5 = 3.16.

This nicely illustrates the dependence of the sample standard deviation, or the standard error of the mean, on the square root of the number of ray bundles detected, just as predicted by Eq. (5).

Error Estimation

In the present example of the fraction of emitted power received by a detector, the central limit theorem guarantees that the errors in the fraction of received power computed by many Monte Carlo runs approaches a Gaussian. We can thus use all of the results for Gaussian, or normal, probability distributions to estimate the errors in the Monte Carlo results.

It is often desirable to know the probability that the computed sample mean mNd is within some prechosen amount, say 1 standard deviation, of the (unknown) true mean μ. Conversely, we may want to compute the error range so that the sample mean is within that error range of the true mean with some prechosen probability. Such questions can be answered starting with the statement

Prob{μ βsNd mNd μ + βsNd} = 1 α.

This equation states that the probability is 1 α that the sample mean is within a range βsNd of the true mean μ; β is a fraction of the sample standard deviation sNd. In other words, the probability is 1 α that μ = mNd ± βsNd. The central limit theorem guarantees that, if the sample size is large enough, the deviation of the sample mean from the true mean, mNd μ, is approximately normally distributed:

pdf(mNd μ) 1 2πsNd exp (mNd μ)2 2sNd2 . (6)

Assuming that we have enough samples to get a good approximation to the normal distribution, the probability that mNd is greater than μ by an amount βsNd is then

Prob{mNd μ βsNd} = 1 2πsNdβsN d exp t2 2sNd2 dt.

Letting y = tsNd gives

Prob{mNd μ βsNd} = 1 2πβ exp y2 2 dy Q(β). (7)

The Q(β) integral in the last equation cannot be performed analytically, but it is tabulated in probability texts, and software packages such as MATLAB and IDL have routines to compute it. In to also common to find tables and subroutines for

Φ(β) 1 2πβ exp y2 2 dy.

Note that Q(β) + Φ(β) = 1.

Let us now apply Eq. (7) to various examples. We first compute the probability that the sample mean is within one standard deviation of the true mean. Letting β = 1, we compute the probability α2 that mNd μ lies in the “right-hand tail” of the normal distribution beyond β = 1. This is the area shaded in red in Fig. 4.


Figure 4: The Gaussian or normal distribution of Eq. 6

This probability is

Prob{mNd μ sNd} = Q(1.0) = α 2 .

Q(1) 0.1587, so α = 2Q(β) = 0.3174. The Gaussian distribution is symmetric about the mean, so Prob{mNd μ sNd} that mNd μ lies in the left-hand (green-shaded) tail of the distribution also equals 0.1587. Thus the probability that mNd μ does not lie in either tail of the distribution, i.e. that μ mNd ± sNd is 1 α2 α2 = 0.6826.

For the simulations of Fig. 2, the lower right panel shows values of mNd = 0.0179 and sNd = 1.186 103. Thus there is a roughly 68% chance that the true fraction of received power is within the range 0.01079 ± 1.186 103. For the corresponding run of Fig. 3, which traced ten times as many ray bundles, the corresponding numbers are 0.01081 ± 3.718 104. Thus, in the last set of simulations, we are 68% certain that the sample mean is within about ± 3.4% of the true mean.

Suppose we need the probability that we are within, say, 5% of the correct value. For the lower right simulation of Fig. 3, 0.05 of 0.01081 is about 0.00054. From 0.00054 = β3.178 104 we get β = 1.454. Q(1.454) = 0.0729 = α2, so the probability of being within 5% is 1 α = 0.854. If being 85% confident that the Monte Carlo estimate of the mean is within 5% of the true value is adequate for your application, then you are done. If you need to be 95% confident that you are within 5% of correct, then you need to continue tracing rays until you get enough rays on the target to reduce the sample variance to a value small enough to achieve the desired 95% confidence.

As a final example, we might ask how big is the error so that we can say that we are within that range with 90% certainly. We now set 1 α = 0.9, and solve

Q(β) = α 2 = 0.05

Again, the inverse of Q(β) is also tabulated. This equation gives β = 1.645. From the last panel of Fig. 3 we then get βsNd = 1.645 × 3.718 104 = 6.12 104, so that μ = 0.01081 ± 6.12 104 with 90% confidence.

As a final caveat to this section, keep in mind that the central limit theorem says that the error becomes Gaussian as the number of samples, Nd in the present examples, becomes very large. How large is large enough depends on the particular problem and the user’s accuracy requirement. In the present examples, 10,000 runs each with 10,000 or more emitted rays, resulting in 100 or more detected rays for each run, gives distributions that visually appear close to Gaussian (the lower right panels of the preceding two figures). There are various ways to quantify how close a data distribution is to a Gaussian, but that is a topic for somewhere else. Just do a search on ”normality tests.”

Comments for Error Estimation:

Loading Conversation