Contents

Page updated: March 14, 2021
Author: Curtis Mobley
View PDF

Error Estimation

This page shows how to estimate the errors in Monte Carlo simulations. General results from probability theory are illustrated with numerical examples.

To be speciﬁc, suppose that we need to estimate the fraction of power emitted by a light source that will be received by a detector. The answer of course depends on the water IOPs; the size, orientation, and location of the detector relative to the light source; the angular distribution of the emitted light; and perhaps on other things like boundary surfaces that can reﬂect or absorb light scattered onto the boundary. The numerical results to be shown below use the source and detector geometry shown in Fig. 1. The source emits a collimated bundle of rays toward a detector that is $τ = 5$ optical path lengths away and is $0.1$ optical path lengths in diameter. The water IOPs have $a = 0.2 m^{- 1}$ and $b = 0.8 m^{- 1}$ so that $c = 1 m^{- 1}$ and one meter is one optical path length. The albedo of single scattering is $ω_{o} = 0.8$ . The scattering phase function is a Henyey-Greenstein phase function with a scattering-angle mean cosine of $g = 0.8$ .

Figure 1: Source and detector geometry used for numerical simulations.

In the simulations, $N$ rays will be emitted from the source, each with an initial weight of $w = 1$ . Most of those rays will miss the detector, as illustrated by the blue arrows in Fig. 1. However, some rays will hit the detector, as illustrated by the red arrows, at which time their current weight will be tallied to the accumulating total weight $w_{d}$ received by the detector. After all rays have been traced, the the Monte Carlo estimate of the fraction of emitted power received by the detector is simply $f_{d} = w_{d} ∕ N$ .

If we do only one simulation tracing, say, $N = 1 0^{4}$ ray bundles, then the resulting estimate of $f_{d}$ is all we have. In particular, we have no estimate of the statistical error in the estimated $f_{d}$ .

Probability Theory

To develop a quantitative error estimate for the result of a Monte Carlo simulation, we begin with a review of some results from basic probability theory. Recalling the notation introduced on the Monte Carlo Introduction page, let $p_{W} (w)$ be the probability density function (pdf) for random variable $W$ . Capital $W$ represents the random variable, e.g. the ray bundle weight received by the detector, and lower case $w$ represents a speciﬁc value of $W$ , e.g., $w = 0.72$ . In the present example, $p_{W} (w)$ is the pdf that a ray strikes the detector with a weight $w, 0 \leq w \leq 1$ . rays that miss the detector do not contribute to accumulating weight or power received by the detector and do not enter into the calculations below; they are simply wasted computer time. Note that we have no idea what mathematical form $p_{W} (w)$ has: it results from a complicated sequence of randomly determined ray path lengths and scattering angles.

For any continuous pdf $p_{W} (w)$ the expected or mean value of $W$ is deﬁned as

m e a n (W) \equiv μ \equiv ℰ {W} = \int w p_{W} (w) d w,

(1)

where $ℰ$ denotes expected value and the integral is over all values for which $W$ is deﬁned. If the random variable is discrete, the integral is replaced by a sum over all allowed values of $W$ . Similarly, the variance of $W$ is deﬁned as

v a r (W) \equiv σ^{2} \equiv ℰ {{(W - μ)}^{2}} = \int {(w - μ)}^{2} p_{W} (w) d w = ℰ {W^{2}} - {[ℰ {W}]}^{2} .

Note that if $c$ is a constant, then

ℰ {c W} = c ℰ {W} a n d v a r (c W) = c^{2} v a r (W) .

Greek letters $μ$ and $σ^{2}$ are used to denote the true or population mean and variance of a pdf.

Suppose that $N_{d}$ ray bundles actually reach the detector. The total weight received by the detector is then given by the sum of these randomly determined weights:

S_{N_{d}} = \sum_{i = 1}^{N_{d}} w_{d} (i),

where $w_{d} (i)$ is the weight of ray bundle $i$ when it reached the detector.

Each of the $N$ ray bundles emitted by the source and traced to completion is independent of the others. In particular, a diﬀerent sequence of random numbers is used to determine the path lengths and scattering angles for each emitted bundle. Moreover, the underlying pdfs for ray path length and scattering angles (i.e., the IOPs) are the same for each bundle. The random variables are then said to be independent and identically distributed (iid), and $S_{N_{d}}$ is said to be a random sample of size $N_{d}$ of random variable $W$ .

The linearity of the expectation (i.e., the integral of a sum is the sum of the integrals) means that for iid random variables such as $W$ ,

ℰ {S_{N_{d}}} = N_{d} ℰ {W} = N_{d} μ a n d v a r (S_{N_{d}}) = N_{d} v a r (W) = N_{d} σ^{2} .

(2)

In the Monte Carlo simulation, the sample mean, i.e. the estimate of the average detected weight obtained by from the $N_{d}$ detected ray bundles is

m_{N_{d}} \equiv \frac{1}{N_{d}} S_{N_{d}} = \frac{1}{N_{d}} \sum_{i = 1}^{N_{d}} w_{d} (i) .

Equation (2) now gives two extremely important results. First,

ℰ {m_{N_{d}}} = \frac{1}{N_{d}} ℰ {S_{N_{d}}} = μ .

(3)

That is, the expectation of the sample mean $m_{N_{d}}$ is equal to the true mean $μ$ . The sample mean is then said to be an unbiased estimator of the true mean of the pdf. Second,

v a r (m_{N_{d}}) = v a r (\frac{1}{N_{d}} S_{N_{d}}) = \frac{1}{N_{d}^{2}} v a r (S_{N_{d}}) = \frac{σ^{2}}{N_{d}} .

(4)

Thus, the variance of the sample mean goes to zero as $N_{d} \to \infty$ , that is, as more and more ray bundles are detected. In other words, the Monte Carlo estimate of the average power received by the detector is guaranteed to give a result that can be made arbitrarily close to the correct result if enough rays are detected. This result is known as the law of large numbers. Again, you can emit and trace all the rays you want, but if they don’t hit the detector, they don’t count.

It is often convenient to think in terms of the standard deviation, e.g., when plotting data and showing the spread of values. The standard deviation of the error in $m_{N_{d}}$ is

s_{N_{d}} = \sqrt{v a r (m_{N_{d}})} = \frac{σ}{\sqrt{N_{d}}} .

(5)

The dependence of the standard deviation of the estimate on $1 ∕ \sqrt{N_{d}}$ is a very general and important result. However, this “approach to the correct value” is very slow. If we want to reduce the standard deviation of the error in the estimated average power received by the detector by a factor of 10, we must detect 100 times as many rays. That can be computationally very expensive.

It is to be emphasized that result (4) that the variance of a sample mean equals the true variance divided by the sample size holds for any situation for which the individual samples are independent and identically distributed random variables.

Note, of course, that if we knew the pdf for the received power, $p_{W} (w)$ , then we could simply evaluate Eq. (1) to get the desired true mean $μ$ , and no Monte Carlo simulation would be required.

Finally, it must be remembered that the discussion here assumes that ”all else is the same” when considering the number of detected rays. For example, we emit and trace more rays without changing the physics of the simulation. The page on importance sampling presents ways to increase the number of detected rays and thereby reduce the variance, but with a change in the physics that sometimes may invalidate the simple $1 ∕ \sqrt{N_{d}}$ dependence.

Numerical Examples

Numerical simulations were performed for the geometry and conditions described for Fig. 1. For these simulations, tracing type 1 of the previous page was used. That is, ray bundles were traced until they either hit the target (still with weight $w = 1$ ) or were absorbed. For a given number $N$ of emitted ray bundles, various numbers $N_{r u n}$ of independent runs were done. That is, $N$ rays were traced and the number $N_{d}$ of detected ray bundles and their weights were tallied for each run. The fraction of emitted power received by the detector was then computed by the total detected weight for the run divided by $N$ . Then another run was made with everything the same except that a diﬀerent sequence of random numbers was used (i.e., each run was started with a diﬀerent seed for the $U [0, 1]$ random number generator used to determine path lengths and scattering angles).

The ﬁrst set of simulations used $N = 10, 000$ emitted ray bundles for each run, with $N_{r u n} = 10, 100, 1000$ , and 10,000 runs being made. Figure 2 shows the distributions of the sample means $m_{N_{d}}$ and other information for these four sets of run numbers. The upper left panel of the ﬁgure is for only $N_{r u n} = 10$ runs, or trials, or simulations. In this panel, the histogram shows that one run, or 0.1 of the total number of runs, gave an estimated fraction between 0.0088 and 0.0090 of the emitted power; two runs, or 0.2 of the total, gave a fraction between 0.0104 and 0.0106, and so on. As the number of runs increases, the estimates of the fraction of power received range from slightly less than 0.008 to slightly more than 0.014, with most estimates centering somewhere near 0.0108.

Figure 2: Estimates of the fraction of detected power for four sets of runs with

N = 1 0^{4}

emitted rays in each run.

As the number of runs increases, something very remarkable happens: the distribution of the fraction of the emitted power appears to be approaching a Gaussian or normal shape, even though the underlying pdf $p_{W} (w)$ is certainly not Gaussian. This distribution can be thought of as the distribution of errors in the estimated mean of the distribution, or the distribution of $ℰ {m_{N_{d}} - μ}$ , where $μ$ is the unknown true mean of the distribution $p_{W} (w)$ . This approach to a Gaussian distribution is a consequence of the central limit theorem. The central limit theorem states that the sum of a large number of independently distributed random variables with ﬁnite means and variances is approximately normally distributed regardless of what the distribution of the random variable itself may be. This is one of the most profound results in probability theory. Indeed, it explains why so many natural phenomena tend to have a Gaussian shape. Phenomena as disparate as average student exam scores, noise in electrical circuits, daily water usage in a city, or the fraction of people who develop cancer can all result from sums of many individual contributions. Such sums then tend toward a Gaussian distribution as the number of individual contributions increases. The theorem was ﬁrst proved for a speciﬁc pdf in 1733, but it was not proven to hold for all pdfs (having ﬁnite means and variances) until the early 1900’s. (By the way, in spite of what you see in the tabloid press, there is nothing in probability theory called “the law of averages.” The central limit theorem is maybe the closest thing to the often involked buy mythical “law of averages.”)

Figure 3 shows the corresponding results for series of $N_{r u n} = 10, 100, 1000$ , and 10,000 runs being made, but with each run now having $N = 100, 000$ emitted rays. Now the spread in the estimated values is much less, from slightly less than 0.010 to about 0.012, again centering somewhere around 0.0108. Again, we see the approach to a Gaussian shape as more and more runs are made, but the Gaussian has a narrower width, i.e. less variance about the mean. The standard deviation of the sample estimates of the mean is usually called the “standard error of the mean.”

Figure 3: Estimates of the fraction of detected power for four sets of runs with

N = 1 0^{5}

emitted rays in each run.

The lower right panel of Fig. 2 shows that the sample standard deviation for the case of 10,000 runs each with 10,000 emitted rays is $s_{N_{d}} = 1.186 \cdot 1 0^{- 3}$ and the average number of detected rays is $N_{d} = 107.9$ . The corresponding panel of Fig. 3 shows $s_{N_{d}} = 3.718 \cdot 1 0^{- 4}$ and $N_{d} = 1080.5$ . The ratio of these sample standard deviations is

\frac{s_{N_{d}} (N_{d} = 107.9)}{s_{N_{d}} (N_{d} = 1080.5)} = \frac{1.186 \cdot 1 0^{- 3}}{3.718 \cdot 1 0^{- 4}} = 3.19 .

The corresponding ratio of $\frac{σ}{\sqrt{N_{d}}}$ values is

\frac{\frac{σ}{\sqrt{N_{d} = 107.9}}}{\frac{σ}{\sqrt{N_{d} = 1080.5}}} = \sqrt{\frac{1080.5}{107.5}} = 3.16 .

This nicely illustrates the dependence of the sample standard deviation, or the standard error of the mean, on the square root of the number of ray bundles detected, just as predicted by Eq. (5).

Error Estimation

In the present example of the fraction of emitted power received by a detector, the central limit theorem guarantees that the errors in the fraction of received power computed by many Monte Carlo runs approaches a Gaussian. We can thus use all of the results for Gaussian, or normal, probability distributions to estimate the errors in the Monte Carlo results.

It is often desirable to know the probability that the computed sample mean $m_{N_{d}}$ is within some prechosen amount, say 1 standard deviation, of the (unknown) true mean $μ$ . Conversely, we may want to compute the error range so that the sample mean is within that error range of the true mean with some prechosen probability. Such questions can be answered starting with the statement

P r o b {μ - β s_{N_{d}} \leq m_{N_{d}} \leq μ + β s_{N_{d}}} = 1 - α .

This equation states that the probability is $1 - α$ that the sample mean is within a range $β s_{N_{d}}$ of the true mean $μ$ ; $β$ is a fraction of the sample standard deviation $s_{N_{d}}$ . In other words, the probability is $1 - α$ that $μ = m_{N_{d}} \pm β s_{N_{d}}$ . The central limit theorem guarantees that, if the sample size is large enough, the deviation of the sample mean from the true mean, $m_{N_{d}} - μ$ , is approximately normally distributed:

p d f (m_{N_{d}} - μ) \approx \frac{1}{\sqrt{2 π} s_{N_{d}}} exp \{- \frac{{(m_{N_{d}} - μ)}^{2}}{2 s_{N_{d}}^{2}}\} .

(6)

Assuming that we have enough samples to get a good approximation to the normal distribution, the probability that $m_{N_{d}}$ is greater than $μ$ by an amount $β s_{N_{d}}$ is then

P r o b {m_{N_{d}} - μ \geq β s_{N_{d}}} = \frac{1}{\sqrt{2 π} s_{N_{d}}} \int_{β s_{N_{d}}}^{\infty} exp \{- \frac{t^{2}}{2 s_{N_{d}}^{2}}\} d t .

Letting $y = t ∕ s_{N_{d}}$ gives

P r o b {m_{N_{d}} - μ \geq β s_{N_{d}}} = \frac{1}{\sqrt{2 π}} \int_{β}^{\infty} exp \{- \frac{y^{2}}{2}\} d y \equiv Q (β) .

(7)

The $Q (β)$ integral in the last equation cannot be performed analytically, but it is tabulated in probability texts, and software packages such as MATLAB and IDL have routines to compute it. In to also common to ﬁnd tables and subroutines for

Φ (β) \equiv \frac{1}{\sqrt{2 π}} \int_{- \infty}^{β} exp \{- \frac{y^{2}}{2}\} d y .

Note that $Q (β) + Φ (β) = 1$ .

Let us now apply Eq. (7) to various examples. We ﬁrst compute the probability that the sample mean is within one standard deviation of the true mean. Letting $β = 1$ , we compute the probability $α ∕ 2$ that $m_{N_{d}} - μ$ lies in the “right-hand tail” of the normal distribution beyond $β = 1$ . This is the area shaded in red in Fig. 4.

Figure 4: The Gaussian or normal distribution of Eq. 6

This probability is

P r o b {m_{N_{d}} - μ \geq s_{N_{d}}} = Q (1.0) = \frac{α}{2} .

$Q (1) \approx 0.1587$ , so $α = 2 Q (β) = 0.3174$ . The Gaussian distribution is symmetric about the mean, so $P r o b {m_{N_{d}} - μ \leq - s_{N_{d}}}$ that $m_{N_{d}} - μ$ lies in the left-hand (green-shaded) tail of the distribution also equals 0.1587. Thus the probability that $m_{N_{d}} - μ$ does not lie in either tail of the distribution, i.e. that $μ \leq m_{N_{d}} \pm s_{N_{d}}$ is $1 - α ∕ 2 - α ∕ 2 = 0.6826$ .

For the simulations of Fig. 2, the lower right panel shows values of $m_{N_{d}} = 0.0179$ and $s_{N_{d}} = 1.186 \cdot 1 0^{- 3}$ . Thus there is a roughly 68% chance that the true fraction of received power is within the range $0.01079 \pm 1.186 \cdot 1 0^{- 3}$ . For the corresponding run of Fig. 3, which traced ten times as many ray bundles, the corresponding numbers are $0.01081 \pm 3.718 \cdot 1 0^{- 4}$ . Thus, in the last set of simulations, we are 68% certain that the sample mean is within about $\pm 3.4 %$ of the true mean.

Suppose we need the probability that we are within, say, 5% of the correct value. For the lower right simulation of Fig. 3, 0.05 of 0.01081 is about 0.00054. From $0.00054 = β 3.178 \cdot 1 0^{- 4}$ we get $β = 1.454$ . $Q (1.454) = 0.0729 = α ∕ 2$ , so the probability of being within 5% is $1 - α = 0.854$ . If being 85% conﬁdent that the Monte Carlo estimate of the mean is within 5% of the true value is adequate for your application, then you are done. If you need to be 95% conﬁdent that you are within 5% of correct, then you need to continue tracing rays until you get enough rays on the target to reduce the sample variance to a value small enough to achieve the desired 95% conﬁdence.

As a ﬁnal example, we might ask how big is the error so that we can say that we are within that range with 90% certainly. We now set $1 - α = 0.9$ , and solve

Q (β) = \frac{α}{2} = 0.05

Again, the inverse of $Q (β)$ is also tabulated. This equation gives $β = 1.645$ . From the last panel of Fig. 3 we then get $β s_{N_{d}} = 1.645 \times 3.718 \cdot 1 0^{- 4} = 6.12 \cdot 1 0^{- 4}$ , so that $μ = 0.01081 \pm 6.12 \cdot 1 0^{- 4}$ with 90% conﬁdence.

As a ﬁnal caveat to this section, keep in mind that the central limit theorem says that the error becomes Gaussian as the number of samples, $N_{d}$ in the present examples, becomes very large. How large is large enough depends on the particular problem and the user’s accuracy requirement. In the present examples, 10,000 runs each with 10,000 or more emitted rays, resulting in 100 or more detected rays for each run, gives distributions that visually appear close to Gaussian (the lower right panels of the preceding two ﬁgures). There are various ways to quantify how close a data distribution is to a Gaussian, but that is a topic for somewhere else. Just do a search on ”normality tests.”

See comments posted for this page and leave your own.

Comments for Error Estimation:

Loading Conversation

Importance Sampling »