Negative binomial distribution

Different texts (and even different parts of this article) adopt slightly different definitions for the negative binomial distribution. They can be distinguished by whether the support starts at k = 0 or at k = r, whether p denotes the probability of a success or of a failure, and whether r counts successes or failures,^[1] so it is crucial to identify the specific parametrization used in any given text.

In probability theory and statistics, the negative binomial distribution, also called a Pascal distribution,^[2] is a discrete probability distribution that models the number of failures in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of successes $r$ occurs.^[3] For example, we can define rolling a 6 on a die as a success and rolling any other number as a failure, and ask how many failures will occur before we see the third success ($r=3$). In that case, the number of failures follows a negative binomial distribution.
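The dice example can be checked numerically. The following is a minimal sketch, assuming the parametrization in which $p$ is the success probability and $k$ counts failures, so that the probability of $k$ failures is ${k+r-1 \choose k}(1-p)^{k}p^{r}$; the function name `nb_pmf` is illustrative only.

```python
from math import comb


def nb_pmf(k: int, r: int, p: float) -> float:
    """P(exactly k failures occur before the r-th success), success probability p."""
    return comb(k + r - 1, k) * (1 - p) ** k * p ** r


# Dice example: success = rolling a 6 on a fair die, stop at the third success.
r, p = 3, 1 / 6
for k in (0, 5, 10, 15):
    print(f"P(K = {k:2d}) = {nb_pmf(k, r, p):.4f}")

# The probabilities over k = 0, 1, 2, ... should sum to 1 (truncated at k < 500).
total = sum(nb_pmf(k, r, p) for k in range(500))
print(f"sum over k < 500: {total:.6f}")
```

With $k=0$ the formula reduces to $p^{r}$, the probability that the first three rolls are all sixes.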

An alternative formulation models the total number of trials instead of the number of failures. For a specified (non-random) number of successes $r$, the number of failures $n-r$ is random because the total number of trials $n$ is random. For example, we could use the negative binomial distribution to model the number of days $n$ (random) a certain machine works (specified by $r$) before it breaks down.
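The two formulations differ only by the shift $n = k + r$. A short sketch, assuming $p$ is the success probability and using illustrative function names, compares the distribution of the total number of trials with the distribution of the number of failures:

```python
from math import comb


def pmf_failures(k: int, r: int, p: float) -> float:
    """P(exactly k failures before the r-th success)."""
    return comb(k + r - 1, k) * (1 - p) ** k * p ** r


def pmf_trials(n: int, r: int, p: float) -> float:
    """P(the r-th success happens exactly on trial n); zero when n < r."""
    if n < r:
        return 0.0
    # The last trial is a success; the other r - 1 successes fall among the first n - 1 trials.
    return comb(n - 1, r - 1) * p ** r * (1 - p) ** (n - r)


r, p = 3, 0.4
for n in range(r, r + 5):
    k = n - r  # number of failures implied by n total trials
    print(n, k, pmf_trials(n, r, p), pmf_failures(k, r, p))
```

The two columns of probabilities agree because ${n-1 \choose r-1} = {k+r-1 \choose k}$ when $n = k + r$; only the support changes, from $k \geq 0$ to $n \geq r$.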

The negative binomial distribution has variance $\mu /p$, where $\mu$ is the mean and $p\in [0,1]$ is the success probability of each Bernoulli trial. For a given mean $\mu$, it becomes identical to the Poisson distribution in the limit $p\to 1$ (i.e. when failures are increasingly rare). Because the variance exceeds the mean whenever $p<1$, the distribution can serve as an overdispersed alternative to the Poisson distribution, for example for a robust modification of Poisson regression. In epidemiology, it has been used to model disease transmission for infectious diseases where the likely number of onward infections may vary considerably from individual to individual and from setting to setting.^[4] More generally, it may be appropriate where events have positively correlated occurrences, which produce a larger variance than independent occurrences would because of a positive covariance term.
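The Poisson limit can be illustrated numerically. The sketch below fixes the mean $\mu$, sets $r = \mu p/(1-p)$ (which follows from $\mu = r(1-p)/p$ and is generally not an integer), and evaluates the probability mass function through the gamma function. The function names and the truncation point `k_max` are assumptions made for the illustration.

```python
from math import exp, lgamma, log


def nb_pmf_mean(k: int, mu: float, p: float) -> float:
    """Negative binomial pmf with mean mu and success probability p (real-valued r)."""
    r = mu * p / (1 - p)
    log_pmf = (lgamma(k + r) - lgamma(k + 1) - lgamma(r)
               + r * log(p) + k * log(1 - p))
    return exp(log_pmf)


def poisson_pmf(k: int, mu: float) -> float:
    """Poisson pmf with mean mu, computed on the log scale."""
    return exp(k * log(mu) - mu - lgamma(k + 1))


def total_variation(mu: float, p: float, k_max: int = 400) -> float:
    """Approximate total variation distance, truncating the sum at k < k_max."""
    return 0.5 * sum(abs(nb_pmf_mean(k, mu, p) - poisson_pmf(k, mu))
                     for k in range(k_max))


mu = 10.0
distances = []
for p in (0.5, 0.9, 0.99, 0.999):
    d = total_variation(mu, p)
    distances.append(d)
    print(f"p = {p}: variance = {mu / p:.3f}, distance to Poisson = {d:.5f}")
```

As $p$ approaches 1 the variance $\mu/p$ approaches the Poisson variance $\mu$, and the distance between the two distributions shrinks accordingly.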

The term "negative binomial" is likely due to the fact that a certain binomial coefficient that appears in the formula for the probability mass function of the distribution can be written more simply with negative numbers.^[5]
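As a sketch of that rewriting, assuming the coefficient in question is ${k+r-1 \choose k}$ (with $k$ failures and $r$ successes) and using the generalized binomial coefficient with a negative upper index:

$${k+r-1 \choose k}={\frac {(k+r-1)(k+r-2)\cdots r}{k!}}=(-1)^{k}\,{\frac {(-r)(-r-1)\cdots (-r-k+1)}{k!}}=(-1)^{k}{-r \choose k}.$$

Under this reading, the probability of $k$ failures can be written as ${-r \choose k}\,p^{r}\,(-(1-p))^{k}$, which has the shape of a binomial term with a negative exponent.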

[1] DeGroot, Morris H. (1986). Probability and Statistics (Second ed.). Addison-Wesley. pp. 258–259. ISBN 0-201-11366-X. LCCN 84006269. OCLC 10605205.
[2] Leemis, Larry. "Pascal distribution". Univariate Distribution Relationships.
[3] Weisstein, Eric. "Negative Binomial Distribution". Wolfram MathWorld. Wolfram Research. Retrieved 11 October 2020.
[4] e.g. Lloyd-Smith, J. O.; Schreiber, S. J.; Kopp, P. E.; Getz, W. M. (2005). "Superspreading and the effect of individual variation on disease emergence". Nature. 438 (7066): 355–359. Bibcode:2005Natur.438..355L. doi:10.1038/nature04153. PMC 7094981. PMID 16292310. The overdispersion parameter is usually denoted by the letter $k$ in epidemiology, rather than $r$ as here.
[5] Casella, George; Berger, Roger L. (2002). Statistical Inference (2nd ed.). Thomson Learning. p. 95. ISBN 0-534-24312-6.

Probability mass function	The orange line represents the mean, which is equal to 10 in each of these plots; the green line shows the standard deviation.
Notation	$\mathrm {NB} (r,\,p)$
Parameters	r > 0: number of successes until the experiment is stopped (integer, but the definition can also be extended to reals); p ∈ [0,1]: success probability in each experiment (real)
Support	k ∈ { 0, 1, 2, 3, … }: number of failures
PMF	$k\mapsto {k+r-1 \choose k}\cdot (1-p)^{k}p^{r},$ involving a binomial coefficient
CDF	$k\mapsto I_{p}(r,\,k+1),$ the regularized incomplete beta function
Mean	${\frac {r(1-p)}{p}}$
Mode	${\begin{cases}\left\lfloor {\frac {(r-1)(1-p)}{p}}\right\rfloor &{\text{if }}r>1\\0&{\text{if }}r\leq 1\end{cases}}$
Variance	${\frac {r(1-p)}{p^{2}}}$
Skewness	${\frac {2-p}{\sqrt {(1-p)r}}}$
Excess kurtosis	${\frac {6}{r}}+{\frac {p^{2}}{(1-p)r}}$
MGF	${\biggl (}{\frac {p}{1-(1-p)e^{t}}}{\biggr )}^{\!r}{\text{ for }}t<-\log(1-p)$
CF	${\biggl (}{\frac {p}{1-(1-p)e^{i\,t}}}{\biggr )}^{\!r}{\text{ with }}t\in \mathbb {R}$
PGF	${\biggl (}{\frac {p}{1-(1-p)z}}{\biggr )}^{\!r}{\text{ for }}|z|<{\frac {1}{1-p}}$
Fisher information	${\frac {r}{p^{2}(1-p)}}$
Method of moments	$r={\frac {E[X]^{2}}{V[X]-E[X]}}$; $p={\frac {E[X]}{V[X]}}$
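The method-of-moments formulas invert the mean $r(1-p)/p$ and the variance $r(1-p)/p^{2}$. A minimal sketch of that round trip follows; the function names are illustrative and the `counts` sample is made-up data used only to show the call, not a result from any source.

```python
from statistics import fmean, pvariance


def nb_moments(r: float, p: float) -> tuple[float, float]:
    """Mean and variance of the number of failures for parameters r and p."""
    mean = r * (1 - p) / p
    variance = r * (1 - p) / p ** 2
    return mean, variance


def method_of_moments(mean: float, variance: float) -> tuple[float, float]:
    """Recover (r, p) from a mean and a variance; needs variance > mean."""
    if variance <= mean:
        raise ValueError("method of moments requires variance > mean")
    r_hat = mean ** 2 / (variance - mean)
    p_hat = mean / variance
    return r_hat, p_hat


# Round trip with exact moments: r = 3, p = 0.25 gives mean 9 and variance 36.
mean, variance = nb_moments(r=3, p=0.25)
r_hat, p_hat = method_of_moments(mean, variance)
print(r_hat, p_hat)

# Applied to a sample (hypothetical counts), using the sample mean and variance.
counts = [2, 0, 7, 3, 12, 5, 1, 9, 4, 6]
print(method_of_moments(fmean(counts), pvariance(counts)))
```

The estimator is undefined when the sample variance does not exceed the sample mean, which is why the sketch raises an error in that case rather than returning a negative or infinite $r$.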