Binomial distribution
| Probability mass function | |
| Cumulative distribution function | |
| Parameters | <math>n \geq 0</math> number of trials (integer) <math>0\leq p \leq 1</math> success probability (real) |
| Support | <math>k \in \{0,\dots,n\}\!</math> |
| pmf | <math>{n\choose k} p^k (1-p)^{n-k} \!</math> |
| cdf | <math>I_{1-p}(n-\lfloor k\rfloor, 1+\lfloor k\rfloor) \!</math> |
| Mean | <math>n\,p\!</math> |
| Median | one of <math>\{\lfloor n\,p\rfloor-1, \lfloor n\,p\rfloor, \lfloor n\,p\rfloor+1\}</math> |
| Mode | <math>\lfloor (n+1)\,p\rfloor\!</math> |
| Variance | <math>n\,p\,(1-p)\!</math> |
| Skewness | <math>\frac{1-2\,p}{\sqrt{n\,p\,(1-p)}}\!</math> |
| Kurtosis | <math>\frac{1-6\,p\,(1-p)}{n\,p\,(1-p)}\!</math> |
| Entropy | |
| mgf | <math>(1-p + p\,e^t)^n \!</math> |
| Char. func. | <math>(1-p + p\,e^{i\,t})^n \!</math> |
- See binomial (disambiguation) for a list of other topics using that name.
In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. When n = 1, the binomial distribution is the Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.
A typical example is the following: assume 5% of the population is HIV-positive, and you pick 500 people at random. How likely is it that you get 30 or more HIV-positives? The number of HIV-positives you pick is a random variable X which follows a binomial distribution with n = 500 and p = 0.05, provided the people are picked with replacement so that the draws are independent. We are interested in the probability Pr[X ≥ 30].
In general, if the random variable X follows the binomial distribution with parameters n and p, we write X ~ B(n, p). The probability of getting exactly k successes is given by the probability mass function:
- <math>f(k;n,p)={n\choose k}p^k(1-p)^{n-k}\,</math>
for <math>k=0,1,2,\dots,n</math> and where
- <math>{n\choose k}=\frac{n!}{k!(n-k)!}</math>
is the binomial coefficient "n choose k" (also denoted C(n, k)), whence the name of the distribution. The formula can be understood as follows: we want k successes (<math>p^k</math>) and n − k failures (<math>(1-p)^{n-k}</math>). However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.
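As a minimal sketch, the mass function can be evaluated directly with Python's standard library. The function name `binom_pmf` is an illustrative choice, not part of any particular library; the values n = 500, p = 0.05 and the threshold 30 are those of the sampling example above.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials: C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Sampling example: X ~ B(500, 0.05), and Pr[X >= 30] = 1 - Pr[X <= 29].
n, p = 500, 0.05
prob_30_or_more = 1.0 - sum(binom_pmf(k, n, p) for k in range(30))
print(prob_30_or_more)
```

Summing the mass function term by term is adequate for moderate n; for very large n the binomial coefficient becomes a huge integer, and a log-space evaluation would probably be preferable.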
The cumulative distribution function can be expressed in terms of the regularized incomplete beta function, as follows:
- <math> F(k;n,p) = I_{1-p}(n-k, k+1)\,</math>.
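The identity can be checked numerically for integer 0 ≤ k < n. The sketch below is only an illustration: the helper names are hypothetical, and the regularized incomplete beta function is approximated with composite Simpson's rule (a stand-in chosen to avoid external libraries, not a recommended production method), using the closed form of the beta function for integer arguments.

```python
from math import comb, factorial

def binom_cdf_sum(k, n, p):
    """Pr[X <= k] obtained by summing the mass function over 0..k."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def reg_inc_beta(x, a, b, steps=2000):
    """I_x(a, b) for positive integers a and b, by composite Simpson's rule.

    The number of steps must be even.
    """
    # For integer arguments, B(a, b) = (a-1)! (b-1)! / (a+b-1)!.
    beta = factorial(a - 1) * factorial(b - 1) / factorial(a + b - 1)
    h = x / steps

    def integrand(t):
        return t**(a - 1) * (1 - t)**(b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        # Simpson weights: 4 at odd interior nodes, 2 at even ones.
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3 / beta

n, k, p = 10, 3, 0.3
direct = binom_cdf_sum(k, n, p)
via_beta = reg_inc_beta(1 - p, n - k, k + 1)
print(direct, via_beta)
```

The two printed values should agree to many decimal places, since the integrand is a polynomial and Simpson's rule converges quickly on it.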
If X ~ B(n, p) (that is, X is a binomially distributed random variate), then the expected value of X is
- <math>E[X]=np\,</math>
and the variance is
- <math>\mbox{var}(X)=np(1-p).\,</math>
The most likely value or mode of X is given by the largest integer less than or equal to (n+1)p; if m = (n+1)p is itself an integer, then m − 1 and m are both modes.
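The expected value, variance and mode stated above can be confirmed from the mass function alone. This is a small numerical check rather than a proof; the function names and the parameter choices (n = 10, p = 0.3 and n = 9, p = 0.5) are illustrative assumptions.

```python
from math import comb, isclose

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def moments_and_modes(n, p):
    """Return (mean, variance, modes) computed from the mass function alone."""
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * f for k, f in enumerate(pmf))
    variance = sum((k - mean) ** 2 * f for k, f in enumerate(pmf))
    peak = max(pmf)
    # Collect every k whose probability ties with the maximum.
    modes = [k for k, f in enumerate(pmf) if isclose(f, peak, rel_tol=1e-12)]
    return mean, variance, modes

# Compare with n p = 3, n p (1 - p) = 2.1 and floor((n + 1) p) = floor(3.3) = 3.
mean, variance, modes = moments_and_modes(10, 0.3)
print(mean, variance, modes)

# Here (n + 1) p = 5 is an integer, so both 4 and 5 should be reported as modes.
_, _, tied_modes = moments_and_modes(9, 0.5)
print(tied_modes)
```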
If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables, then X + Y is again a binomial variable; its distribution is
- <math>X+Y \sim B(n+m, p).\,</math>
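The additivity property can be illustrated by convolving the two mass functions, which gives the distribution of a sum of independent variables, and comparing the result with the mass function of B(n + m, p). The helper names and the values n = 4, m = 6, p = 0.3 are assumptions made for the example.

```python
from math import comb

def binom_pmf_table(n, p):
    """Mass function of B(n, p) as a list indexed by k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(f, g):
    """Mass function of the sum of two independent variables with mass functions f and g."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

n, m, p = 4, 6, 0.3
sum_pmf = convolve(binom_pmf_table(n, p), binom_pmf_table(m, p))
target = binom_pmf_table(n + m, p)
max_gap = max(abs(a - b) for a, b in zip(sum_pmf, target))
print(max_gap)  # expected to be at the level of floating-point rounding
```

The agreement depends on both variables sharing the same p; with different success probabilities the sum is in general not binomial.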
Two other important distributions arise as approximations of binomial distributions:
- If both np and n(1 − p) are greater than 5 or so, then an excellent approximation (provided a suitable continuity correction is used) to B(n, p) is given by the normal distribution
- <math> N(np, np(1-p)).\,</math>
- This approximation saves considerable computation; historically, it was the first use of the normal distribution, introduced in Abraham de Moivre's book The Doctrine of Chances in 1733. Nowadays, it can be seen as a consequence of the central limit theorem, since B(n, p) is a sum of n independent, identically distributed 0-1 indicator variables. Warning: the approximation gives inaccurate results unless a continuity correction is used. Note that the picture shows the normal probability density function and the binomial probability mass function, not the cumulative distribution functions.
- For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation <math>\sigma = \sqrt{p(1-p)/n}</math>. Large sample sizes n are good because the standard deviation gets smaller, which allows a more precise estimate of the unknown parameter p.
- If n is large and p is small, so that np is of moderate size, then the Poisson distribution with parameter λ = np is a good approximation to B(n, p).
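The effect of the continuity correction on the normal approximation can be seen with the earlier sampling example (n = 500, p = 0.05, Pr[X ≥ 30]). The following is a sketch using only the standard library; the helper names are hypothetical, and the normal distribution function is written in terms of `math.erf`.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mean, var):
    """Distribution function of N(mean, var)."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * var)))

def binom_cdf(k, n, p):
    """Pr[X <= k] by direct summation of the mass function."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

n, p = 500, 0.05
mean, var = n * p, n * p * (1 - p)

exact = 1.0 - binom_cdf(29, n, p)                 # Pr[X >= 30]
corrected = 1.0 - normal_cdf(29.5, mean, var)     # boundary moved to 29.5
uncorrected = 1.0 - normal_cdf(30.0, mean, var)   # no continuity correction
print(exact, corrected, uncorrected)
```

The correction replaces the integer boundary 30 by 29.5, halfway between the last excluded value and the first included one; for these parameters the corrected figure should land noticeably closer to the exact probability than the uncorrected one.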
The formula for Bézier curves was inspired by the binomial distribution.
Limits of binomial distributions
- As n approaches ∞ and p approaches 0 in such a way that np remains fixed at λ > 0, or at least np approaches λ > 0, the Binomial(n, p) distribution approaches the Poisson distribution with expected value λ.
- As n approaches ∞ while p remains fixed, the distribution of
- <math>{X-np \over \sqrt{np(1-p)\ }}</math>
- approaches the normal distribution with expected value 0 and variance 1.
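The Poisson limit can be observed numerically by holding np = λ fixed and letting n grow. The sketch below measures the largest pointwise gap between the two mass functions; the function names, the choice λ = 3, the cut-off `kmax` and the sequence of n values are all assumptions made for illustration.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

def max_pmf_gap(n, lam, kmax=30):
    """Largest pointwise gap between B(n, lam/n) and Poisson(lam) over k = 0..min(n, kmax)."""
    p = lam / n
    return max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(min(n, kmax) + 1))

lam = 3.0
gaps = [max_pmf_gap(n, lam) for n in (10, 100, 1000, 10000)]
print(gaps)  # the gaps should shrink as n grows
```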



