Inverse transform sampling
Inverse transform sampling, also known as the inverse probability integral transform, the inverse transformation method, or the Smirnov transform, is a method for generating random samples from any probability distribution given its cumulative distribution function (cdf). Provided the distribution is continuous, the method is generally applicable (and can be computationally efficient if the cdf can be inverted analytically), but it may be too computationally expensive in practice for some distributions. The Box–Muller transform, for example, is specific to the normal distribution, but it is more computationally efficient than inverting the normal cdf, which has no closed form. Even for simple distributions, inverse transform sampling can often be improved on[1]: see, for example, the ziggurat algorithm and rejection sampling.
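As a concrete illustration of that trade-off, the sketch below (Python, standard library only; the sample sizes and seed are arbitrary choices, not part of the text) draws standard normal variates in two ways: by numerically inverting the normal cdf via statistics.NormalDist.inv_cdf, and by the Box–Muller transform, which avoids the cdf entirely.

```python
import math
import random
import statistics

random.seed(0)

# Inverse transform sampling for the standard normal: the normal cdf has
# no closed-form inverse, so NormalDist.inv_cdf inverts it numerically.
norm = statistics.NormalDist(mu=0.0, sigma=1.0)
inverse_samples = [norm.inv_cdf(random.random()) for _ in range(10_000)]

def box_muller():
    # Two independent uniforms -> two independent standard normals,
    # with no cdf inversion involved.
    u1 = 1.0 - random.random()          # in (0, 1], so log(u1) is finite
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

box_muller_samples = [z for _ in range(5_000) for z in box_muller()]

# Both sets of draws should have mean near 0 and variance near 1.
print(statistics.fmean(inverse_samples), statistics.pvariance(inverse_samples))
print(statistics.fmean(box_muller_samples), statistics.pvariance(box_muller_samples))
```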
Definition
The probability integral transform states that if X is a continuous random variable with cumulative distribution function F_X, then the random variable Y = F_X(X) has a uniform distribution on [0, 1]. The inverse probability integral transform is just the inverse of this: specifically, if Y has a uniform distribution on [0, 1] and if X has cumulative distribution function F_X, then the random variable F_X^{-1}(Y) has cumulative distribution function F_X.
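The forward direction can be checked numerically. The following sketch is an illustration only; the exponential distribution, rate parameter, sample size, and seed are arbitrary choices not taken from the text. It draws exponential variates, applies their own cdf, and confirms that the result looks uniform.

```python
import math
import random

random.seed(0)
lam = 2.0  # rate of the exponential distribution (an arbitrary choice)

# Draw exponential variates X and push each through its own cdf F_X.
x = [random.expovariate(lam) for _ in range(100_000)]
y = [1.0 - math.exp(-lam * xi) for xi in x]   # Y = F_X(X)

# If the probability integral transform holds, Y is uniform on [0, 1]:
# its mean should be near 1/2 and its variance near 1/12.
mean = sum(y) / len(y)
var = sum((yi - mean) ** 2 for yi in y) / len(y)
print(round(mean, 4), round(var, 4))   # expect about 0.5 and 0.0833
```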
The method
The problem that the inverse transform sampling method solves is as follows:
- Let X be a random variable whose distribution can be described by the cumulative distribution function F.
- We want to generate values of X which are distributed according to this distribution.
Many programming languages have the ability to generate pseudorandom numbers which are effectively distributed according to the standard uniform distribution. If a random variable has that distribution, then the probability of its falling within any subinterval (a, b) of the interval from 0 to 1 is just the length b − a of that subinterval.
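For example (a minimal sketch; the endpoints 0.2 and 0.5 and the sample size are arbitrary), the fraction of standard-uniform draws that land in (a, b) approaches b − a:

```python
import random

random.seed(0)
a, b = 0.2, 0.5          # arbitrary subinterval of [0, 1]
n = 100_000
hits = sum(1 for _ in range(n) if a < random.random() < b)
print(hits / n)          # should be close to b - a = 0.3
```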
The inverse transform sampling method works as follows:
- Generate a random number from the standard uniform distribution; call this u.
- Compute the value x such that F(x) = u; call this x_chosen.
- Take x_chosen to be the random number drawn from the distribution described by F.
Expressed differently, given a continuous uniform variable U on [0, 1] and an invertible cumulative distribution function F, the random variable X = F^{-1}(U) has distribution F (that is, X is distributed according to F).
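When F has a closed-form inverse, these steps amount to a single line of code. The sketch below assumes an exponential distribution with rate lam (an arbitrary choice, not part of the text): F(x) = 1 − e^(−lam·x), so F^{-1}(u) = −ln(1 − u)/lam.

```python
import math
import random

def sample_exponential(lam: float) -> float:
    """Inverse transform sampling for an exponential distribution.

    F(x) = 1 - exp(-lam * x), so solving F(x) = u gives
    x = -ln(1 - u) / lam.
    """
    u = random.random()                 # step 1: u from Uniform[0, 1)
    return -math.log(1.0 - u) / lam     # step 2: x = F^{-1}(u)

random.seed(0)
samples = [sample_exponential(2.0) for _ in range(100_000)]
print(sum(samples) / len(samples))      # mean of Exp(2) should be near 1/2
```

Since 1 − U is itself uniform, the equivalent draw −ln(U)/lam is also commonly used; the explicit 1 − u above simply mirrors the formula F^{-1}(u).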
A treatment of such inverse functions as objects satisfying differential equations can be given.[2] Some such differential equations admit explicit power series solutions, despite their non-linearity.
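As a toy illustration of this viewpoint (a sketch under assumptions not stated above): differentiating F(w(u)) = u gives the first-order equation w′(u) = 1/f(w(u)) for the quantile function w, where f is the density. Integrating this numerically from the known value w(1/2) = 0 recovers the standard normal quantile function without inverting the cdf directly.

```python
import math

def normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_quantile(u: float, steps: int = 1000) -> float:
    """Integrate w'(u) = 1 / f(w(u)) from the known point w(0.5) = 0.

    This follows from differentiating F(w(u)) = u; it is only a toy
    illustration of treating the inverse cdf as the solution of an ODE.
    """
    w = 0.0
    h = (u - 0.5) / steps
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step for dw/du = 1/f(w).
        k1 = 1.0 / normal_pdf(w)
        k2 = 1.0 / normal_pdf(w + 0.5 * h * k1)
        k3 = 1.0 / normal_pdf(w + 0.5 * h * k2)
        k4 = 1.0 / normal_pdf(w + h * k3)
        w += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return w

print(normal_quantile(0.975))   # about 1.95996
print(normal_quantile(0.1))     # about -1.28155
```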
Proof of correctness
Let F be a continuous cumulative distribution function, and let F^{-1} be its inverse function, defined (using the infimum, so that it is well defined even where F is flat) by[3]
F^{-1}(u) = inf{ x : F(x) ≥ u },  0 < u < 1.
Claim: If U is a uniform random variable on (0, 1), then F^{-1}(U) follows the distribution F.
Proof: For any real number x, F^{-1}(u) ≤ x holds exactly when u ≤ F(x), by the definition of F^{-1} and because F is non-decreasing and right-continuous. Therefore
Pr(F^{-1}(U) ≤ x) = Pr(U ≤ F(x)) = F(x),
where the last equality uses the fact that U is uniform on (0, 1). Hence F^{-1}(U) has cumulative distribution function F.
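The claim can also be checked empirically. In the sketch below (the standard Cauchy distribution, sample size, and evaluation points are arbitrary choices), draws of F^{-1}(U) are compared with F through their empirical cdf.

```python
import math
import random

random.seed(0)
n = 200_000

# Standard Cauchy distribution: F(x) = 1/2 + arctan(x)/pi,
# so F^{-1}(u) = tan(pi * (u - 1/2)).
xs = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]

# If the claim holds, the empirical cdf of the draws should match F.
for x in (-1.0, 0.0, 1.0, 3.0):
    empirical = sum(1 for xi in xs if xi <= x) / n
    exact = 0.5 + math.atan(x) / math.pi
    print(f"x = {x}: empirical {empirical:.4f}, exact {exact:.4f}")
```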
See also
- Probability integral transform
- Copula, which is defined by means of the probability integral transform.
- Quantile function, for the explicit construction of inverse CDFs.
- Inverse distribution function, for a precise mathematical definition that covers distributions with discrete components.
References
- ^ Devroye, Luc (1986). Non-Uniform Random Variate Generation. New York: Springer-Verlag. http://cg.scs.carleton.ca/~luc/rnbookindex.html.
- ^ Steinbrecher, G.; Shaw, W. T. (2008). "Quantile mechanics". European Journal of Applied Mathematics 19 (2): 87–112.
- ^ Devroye, Luc (1986). "Section 2.2. Inversion by numerical solution of F(X) = U". Non-Uniform Random Variate Generation. New York: Springer-Verlag. http://cg.scs.carleton.ca/~luc/chapter_two.pdf.