Simulating from a gamma distribution with small shape parameter

arXiv:1302.1884v3 [stat.CO] 7 Apr 2015 Simulating from a gamma distribution with small shape parameter Chuanhai Liu Department of Statistics Purdue U...
Author: Brendan Hall
0 downloads 1 Views 121KB Size
arXiv:1302.1884v3 [stat.CO] 7 Apr 2015

Simulating from a gamma distribution with small shape parameter Chuanhai Liu Department of Statistics Purdue University [email protected] Ryan Martin and Nick Syring Department of Mathematics, Statistics, and Computer Science University of Illinois at Chicago (rgmartin, nsyring2)@uic.edu April 8, 2015 Abstract Simulating from a gamma distribution with small shape parameter is a challenging problem. Towards an efficient method, we obtain a limiting distribution for a suitably normalized gamma distribution when the shape parameter tends to zero. Then this limiting distribution provides insight to the construction of a new, simple, and highly efficient acceptance–rejection algorithm. Comparisons based on acceptance rates show that the proposed procedure is more efficient than existing acceptance–rejection methods. Keywords and phrases: Acceptance rate; acceptance–rejection method; asymptotic distribution; exponential distribution; R software.

1

Introduction

Let Y be a positive gamma distributed random variable with shape parameter α > 0, denoted by Y ∼ Gamma(α, 1). The probability density function for Y is given by 1 α−1 −y y e , y > 0, Γ(α) R∞ where the normalizing constant, Γ(α) = 0 y α−1 e−y dy, is the gamma function evaluated at α. This is an important distribution in statistics and probability modeling. In fact, since the gamma distribution is closely tied to so many important distributions, including normal, Poisson, exponential, chi-square, F, beta, and Dirichlet, one could argue that it pα (y) =

1

is one of the most fundamental (e.g., Johnson et al. 1994). Here we are particularly interested in the problem of simulating gamma random variables when the shape parameter α is small. This problem is important because the small-shape gamma, with a large scale parameter, is a simple but useful model for positive size or lifetime random variables with small mean but large variance (e.g., Kleiber and Kotz 2003). Unfortunately, the smallshape gamma distribution is not easy to work with, so, seemingly routine calculations can become inefficient or even intractable. For example, in R, the functions related to the gamma distribution—in particular, the rgamma function for sampling—become relatively inaccurate when the shape parameter is small (R Core Team 2013, “GammaDist” documentation). To circumvent these difficulties, and to move towards new and more efficient software, we show that Gamma(α, 1), suitably normalized, has a simple and nondegenerate limiting distribution as α → 0. This result is then used to develop a new and efficient algorithm for sampling from a small-shape gamma distribution. When the shape parameter α is large, it follows from the infinite-divisibility of the gamma distribution and Lindeberg’s central limit theorem (Billingsley 1995, Sec. 27) that the distribution of Y is approximately normal. Specifically, as α → ∞, α−1/2 (Y − α) → N(0, 1) in distribution. For large finite α, better normal approximations can be obtained by working on different scales, such as log Y or Y 1/3 . Our interest is in the opposite extreme case, where the shape parameter α is small, approaching zero. Here, neither infinite-divisibility nor the central limit theorem provide any help. In Theorem 1 below, we prove that −α log Y converges in distribution to Exp(1), the unit-rate exponential distribution, as α → 0. Motivated by the limit distribution result in Theorem 1, we turn to the problem of simulating from a small-shape gamma distribution. This is a challenging problem with many proposed solutions; see, for example, Best (1983), Kundu and Gupta (2007), Tanizaki (2008), and Xi et al. (2013). For small shape parameters, the default methods implemented in R and MATLAB, due to Ahrens and Dieter (1974) and Marsaglia and Tsang (2000), respectively, have some shortcomings in terms of accuracy and/or efficiency. The exponential limit in Theorem 1 for the normalized gamma distribution suggests a convenient and tight envelope function to be used in an acceptance–rejection sampler (e.g., Devroye 1986; Flury 1990). We flesh out the details of this new algorithm in Section 3 and provide R code at www.math.uic.edu/~ rgmartin. This new method is simple and, as we demonstrate in Section 4, is more efficient than existing methods in terms of acceptance rates.

2

Limit distribution result

For the gamma function Γ(z) defined above, write f (z) = log Γ(z). Then the digamma and trigamma functions are defined as f1 (z) = f ′ (z) and f2 (z) = f ′′ (z), the first and second derivatives of the log gamma function f (z). Recall that these are related to the mean and variance of log Y , with Y ∼ Gamma(α, 1): Eα (log Y ) = f1 (α) and Vα (log Y ) = f2 (α). These formulae are most directly seen by applying those well-known formulae for means and variances in regular exponential families (Brown 1986, Corollary 2.3). Next, write 2

Z = −α log Y . To get some intuition for why multiplication by α is the right normalization, consider the following recurrence relations for the digamma and trigamma functions (Abramowitz and Stegun 1966, Chap. 6): f1 (α) = f1 (α + 1) − 1/α and f2 (α) = f2 (α + 1) + 1/α2 . Then, as α → 0, Eα (Z) = −αf1 (α) = −αf1 (α + 1) + 1 = O(1), Vα (Z) = α2 f2 (α) = α2 f2 (α + 1) + 1 = O(1). That is, multiplication by α stabilizes the first and second moments of log Y . Towards a formal look at the limiting distribution of Z, define the characteristic function

where i =



ϕα (t) = Eα (eitZ ) = Eα (Y −iαt ) = Γ(α − iαt)/Γ(α),

(1)

−1 is the complex unit.

Theorem 1. For Y ∼ Gamma(α, 1), −α log Y → Exp(1) in distribution as α → 0. Proof. Set Z = −α log Y . The gamma function satisfies Γ(z) = Γ(z + 1)/z, so the characteristic function ϕα (t) for Z in (1) can be re-expressed as ϕα (t) =

Γ(α − iαt) Γ(1 + α − iαt)/(α − iαt) 1 Γ(1 + oα ) = = , Γ(α) Γ(1 + α)/α 1 − it Γ(1 + oα )

where oα are terms that vanish as α → 0. Since the gamma function is continuous at 1, the limit of ϕα (t) as α → 0 exists and is given by 1/(1 − it). This limit is exactly the characteristic function of Exp(1), so the claim follows by L´evy’s continuity theorem.

3

Small-shape gamma simulations

Simulating from a gamma distribution with small shape parameter is a challenging problem that has attracted considerable attention in the literature; see, e.g., Best (1983), Kundu and Gupta (2007), Tanizaki (2008), and Xi et al. (2013). Here we demonstrate that the limiting distribution result in Theorem 1 helps provide an improved algorithm for simulating gamma random variables with small shape parameter. For Y ∼ Gamma(α, 1) with α near zero, let Z = −α log Y . To simulate from the distribution of Z, one might consider an acceptance–rejection scheme; see, for example, Lange (1999, Chap. 20.4) or Givens and Hoeting (2005, Chap. 6.2.3). For this, one needs an envelope function that bounds the target density and, when properly normalized, corresponds to the density function of a distribution that is easy to simulate from. By Theorem 1 we know that Z is approximately Exp(1) for α ≈ 0. More precisely, the density hα (z) of Z has a shape like e−z for z ≥ 0. Therefore, we expect that a function proportional to an Exp(1) density will provide a tight upper bound on hα (z) for z ≥ 0. We shall similarly try to bound hα (z) by an oppositely-oriented exponential-type density for z < 0, as is standard in such problems. 3

0.8

1.0

1.0

hα(z)

0.2

0.4

0.6

0.8 0.6 0.4 0.0

0.0

0.2

hα(z)

−0.5

0.0

0.5

1.0

1.5

2.0

2.5

3.0

−0.5

0.0

0.5

z

1.0

1.5

2.0

2.5

3.0

z

(a) α = 0.1

(b) α = 0.05

Figure 1: Plots of the (un-normalized) target hα (z) (solid) and envelope ηα (z) (dashed) for two values of α. The particular bounding envelope function ηα (z) is chosen to be as tight an upper bound as possible. This is done by picking optimal points of tangency with hα (z). For this, we shall need a formula for hα (z), up to norming constant, which is easily found: hα (z) = ce−z−e

−z/α

,

z ∈ (−∞, ∞).

The norming constant c satisfies c−1 = Γ(α + 1). By following standard techniques, as described in Lange (1999, Chap. 20.4), we obtain the optimal envelope function ( ce−z , for z ≥ 0, ηα (z) = λz cwλe , for z < 0, where λ = λ(α) = α−1 − 1 and w = w(α) = α/e(1 − α). Plots of the (un-normalized) target density hα (z) along with the optimal envelope ηα (z), for two small values of α, are shown Figure 1. The normalized envelope function ηα (z) corresponds to the density function of a mixture of two (oppositely-oriented) exponential distributions, i.e., 1 w Exp(1) + {−Exp(λ)}, 1+w 1+w which is easy to sample from using standard tools, such as runif in R. Pseudo-code for the proposed new program, named rgamss, for simulating from a small-shape gamma distribution based on this acceptance–rejection scheme is presented in Algorithm 1. R code is also available at www.math.uic.edu/~ rgmartin, which provides the user with further options. As a side note, since numerical precision can be lost in the final exponentiation step in Algorithm 1, we recommend returning the samples on the log-scale, which is the default in our R function.

4

Algorithm 1 – Pseudo-code for the program rgamss designed to simulate from a gamma distribution with small shape parameter α. 1: set λ ← λ(α), w ← w(α), and r ← r(α) as in the text. 2: loop 3: U ← Unif #Unif denotes the random number generator 4: if U ≤ r then 5: z ← − log(U/r) 6: else 7: z ← log(Unif)/λ 8: end if 9: if hα (z)/ηα (z) > Unif then 10: Z←z 11: break 12: end if 13: end loop 14: Y ← exp(−Z/α)

4

Efficiency comparisons

All methods based on the acceptance–rejection principle will provide genuine samples from the target distribution. So, the most natural way to compare such methods is based on the acceptance rate. It is common practice nowadays to employ an acceptance– rejection sampler inside a Markov chain Monte Carlo method, so having high acceptance rates results in shorter Monte Carlo run times. The acceptance rate r(α) for the proposed method is n r(α) = {1 + w(α)}−1 = 1 +

o−1 α . e(1 − α)

(2)

It is clear from the approximation r(α) ≈ 1 − α/e for α ≈ 0, that the acceptance rate converges to 1 as α → 0. This indicates the high efficiency of the proposed method when α is small. This is to be expected based on Theorem 1: when α ≈ 0, Z is approximately Exp(1), so an algorithm that proposes Exp(1) samples with probability 1/(1 + w) ≈ 1 will likely accept the proposal. To justify the claim in Section 1 that the proposed method is more efficient than existing methods, all based on the accept–reject principle, we provide a comparison in terms of acceptance rates. The methods being compared to rgamss are Ahrens–Dieter (Ahrens and Dieter 1974), Best (Best 1983), and Algorithm 3 of Kundu– Gupta (Kundu and Gupta 2007); the methods of Tanizaki (2008) and Xi et al. (2013) were also considered, but these are not efficient enough to be compared with the others. Plots of the acceptance rates for these methods, as a function of the shape parameter α, are displayed in Figure 2. The proposed rgamss has the highest acceptance rate over a range of small α values, namely (0, 0.3], and, therefore, is most efficient. For α > 0.3, the proposed method’s efficiency drops below that of Kundu–Gupta but stays above Best and Ahrens–Dieter for α ∈ (0.3, 0.4] (not shown). There are additional numerical advantages to the proposed method beyond simulation efficiency. Indeed, when α is very small, the gamma distribution is tightly concentrated 5

1.00 0.95 0.90

Acceptance Rate

0.85

Ahrens−Dieter Best Kundu−Gupta rgamss

0.00

0.05

0.10

0.15

0.20

0.25

0.30

α

Figure 2: Plot of the acceptance rates for the four indicated methods as a function of the shape parameter α ∈ (0, 0.3]. around zero, so getting practically non-zero values can be challenging. For example, when α = 0.001, nearly half of the samples returned by the Ahrens–Dieter method are exact zeros. The Kundu–Gupta method has similar difficulties, since U 1/α , for U ∼ Unif(0, 1), is often practically zero. The proposed method, on the other hand, actually works on the log-scale, a more appropriate scale for very small numbers, so one can readily obtain genuine non-degenerate samples of log Y for very small values of α.

Acknowledgments This work is partially supported by the U.S. National Science Foundation, grants DMS– 1007678, DMS–1208833, and DMS–1208841.

References Abramowitz, M. and Stegun, I. A. (1966). Handbook of mathematical functions, with formulas, graphs, and mathematical tables. Dover Publications Inc., New York. Ahrens, J. H. and Dieter, U. (1974). Computer methods for sampling from gamma, beta, Poisson and binomial distributions. Computing (Arch. Elektron. Rechnen), 12(3):223– 246. Best, D. J. (1983). A note on gamma variate generators with shape parameter less than unity. Computing, 30(2):185–188.

6

Billingsley, P. (1995). Probability and Measure. John Wiley & Sons Inc., New York, third edition. Brown, L. D. (1986). Fundamentals of Statistical Exponential Families with Applications in Statistical Decision Theory. Institute of Mathematical Statistics Lecture Notes— Monograph Series, 9. Institute of Mathematical Statistics, Hayward, CA. Devroye, L. (1986). Nonuniform Random Variate Generation. Springer-Verlag, New York. Flury, B. D. (1990). Acceptance-rejection sampling made easy. SIAM Rev., 32(3):474– 476. Givens, G. H. and Hoeting, J. A. (2005). Computational Statistics. Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ. Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions. Vol. 1. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons, Inc., New York, second edition. Kleiber, C. and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences. Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ. Kundu, D. and Gupta, R. D. (2007). A convenient way of generating gamma random variables using generalized exponential distribution. Comput. Statist. Data Anal., 51(6):2796–2802. Lange, K. (1999). Numerical Analysis for Statisticians. Statistics and Computing. Springer-Verlag, New York. Marsaglia, G. and Tsang, W. W. (2000). A simple method for generating gamma variables. ACM Trans. Math. Software, 26(3):363–372. R Core Team (2013). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. Tanizaki, H. (2008). A simple gamma random number generator for arbitrary shape parameters. Econ. Bull., 3:1–10. Xi, B., Tan, K. M., and Liu, C. (2013). Logarithmic transformation based gamma random number generators. J. Statist. Softw., 55:Issue 4.

7

Suggest Documents