
# Mixture Distributions

In actuarial work, we sometimes model a random variable as a mixture of two or more distributions. For example, the distribution of the total claim amount in a portfolio of insurance policies can be modeled as a mixture of two distributions: one for claims below the deductible and another for claims that exceed it.

A mixture distribution is a probability distribution built by combining two or more component distributions. Its probability density function (pdf) is the weighted sum of the component pdfs, where the weights are non-negative and sum to 1.

Let $f_1(x)$ and $f_2(x)$ be the pdfs of two distributions, and let the random variable $X$ have a mixture distribution with pdf $f(x)$, formed from $f_1(x)$ and $f_2(x)$ with weights $p$ and $1-p$, respectively. Then the pdf of the mixture distribution is given by:

$$
f(x) = p \cdot f_1(x) + (1-p) \cdot f_2(x)
$$

Equivalently, the cumulative distribution function (cdf) of the mixture distribution is given by:

$$
F(x) = p \cdot F_1(x) + (1-p) \cdot F_2(x)
$$

where $F_1(x)$ and $F_2(x)$ are the cdfs of the component distributions.
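To make this concrete, here is a small R sketch (not tied to the insurance example above) that evaluates the pdf and cdf of a hypothetical two-component mixture of normal distributions, with weight $p = 0.3$ on the first component:

```r
# weight on the first component (hypothetical value, for illustration only)
p <- 0.3

# mixture of N(0, 1) and N(3, 2): weighted sums of the component pdfs and cdfs
dmix <- function(x) p * dnorm(x, mean = 0, sd = 1) + (1 - p) * dnorm(x, mean = 3, sd = 2)
pmix <- function(x) p * pnorm(x, mean = 0, sd = 1) + (1 - p) * pnorm(x, mean = 3, sd = 2)

dmix(1) # density of the mixture at x = 1
pmix(1) # P(X <= 1) under the mixture
```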

::: example Here is an example of a mixture distribution.

Suppose two uniform random variables $X_1$ and $X_2$ are independent and identically distributed with pdf $f(x) = 1$ for $0 \leq x \leq 1$. Let $X$ be a mixture of $X_1$ and $X_2$ with weights $0.5$ and $0.5$. Then $E(X)$ is given by:

$$
E(X) = 0.5 \cdot E(X_1) + 0.5 \cdot E(X_2) = 0.5 \cdot 0.5 + 0.5 \cdot 0.5 = 0.5
$$

Let $Y = \frac{X_1 + X_2}{2}$. Then $E(Y)$ is given by:

$$
E(Y) = E\left(\frac{X_1 + X_2}{2}\right) = \frac{1}{2} \cdot E(X_1 + X_2) = \frac{1}{2} \cdot E(X_1) + \frac{1}{2} \cdot E(X_2) = 0.5
$$

Thus, the expected values of $X$ and $Y$ are the same. However, the two random variables do not have the same distribution: $X$ is a mixture of $X_1$ and $X_2$ and is again uniform on $[0, 1]$, while $Y$ is the average of $X_1$ and $X_2$ and has a triangular distribution on $[0, 1]$; in particular, $\mathrm{Var}(X) = 1/12$ while $\mathrm{Var}(Y) = 1/24$.
:::
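The difference is easy to see by simulation. The following sketch draws from the mixture $X$ by first choosing a component at random, and from the average $Y$ directly; the sample means agree, but the variances do not:

```r
n <- 100000
x1 <- runif(n) # component 1: Uniform(0, 1)
x2 <- runif(n) # component 2: Uniform(0, 1)

# mixture X: pick component 1 or 2 with probability 0.5 each, then use its value
pick <- rbinom(n, size = 1, prob = 0.5)
x <- ifelse(pick == 1, x1, x2)

# average Y = (X1 + X2) / 2
y <- (x1 + x2) / 2

c(mean(x), mean(y)) # both close to 0.5
c(var(x), var(y))   # close to 1/12 and 1/24, respectively
```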

More generally, a mixture distribution can be a mixture of more than two distributions. The pdf of a mixture distribution with $n$ component distributions is given by:

$$
f(x) = p_1 \cdot f_1(x) + p_2 \cdot f_2(x) + \ldots + p_n \cdot f_n(x)
$$

where $p_1, p_2, \ldots, p_n$ are the non-negative weights of the component distributions (summing to 1), and $f_1(x), f_2(x), \ldots, f_n(x)$ are the component pdfs.
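As a sketch of how such a mixture can be simulated, the R function below (with hypothetical weights and component samplers) first draws a component index according to the weights and then samples from that component:

```r
# hypothetical three-component mixture: weights and component samplers
weights  <- c(0.5, 0.3, 0.2)
samplers <- list(
    function(n) rexp(n, rate = 2),              # component 1
    function(n) rnorm(n, mean = 5, sd = 1),     # component 2
    function(n) rgamma(n, shape = 3, rate = 1)  # component 3
)

# draw a component index for each observation, then sample from that component
rmix <- function(n, weights, samplers) {
    idx <- sample(seq_along(weights), size = n, replace = TRUE, prob = weights)
    vapply(idx, function(i) samplers[[i]](1), numeric(1))
}

x <- rmix(10000, weights, samplers)
mean(x)
```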

Suppose that for every $\theta$ in a set $\Theta$ there is a distribution $F_{\theta}$, and let $G$ be a c.d.f. on $\Theta$ with corresponding p.d.f. $g$.

Then we can define the mixture distribution $F$ by

$$
F(x) = \int_{\Theta} F_{\theta}(x) \, dG(\theta) = \int_{\Theta} F_{\theta}(x) \, g(\theta) \, d\theta
$$

In the previous section, we discussed how the Pareto distribution can be derived by mixing Exponential distributions over a Gamma-distributed rate parameter.
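As a rough check of that result, the following sketch (with illustrative parameter values) simulates Exponential random variables whose rates are drawn from a Gamma distribution; the sample mean should be close to the Pareto (Lomax) mean $\beta / (\alpha - 1)$:

```r
n     <- 100000
alpha <- 3 # Gamma shape (illustrative value)
beta  <- 2 # Gamma rate  (illustrative value)

# draw a rate for each observation, then an Exponential value with that rate
lambda <- rgamma(n, shape = alpha, rate = beta)
x      <- rexp(n, rate = lambda)

# the mixture is Pareto (Lomax) with shape alpha and scale beta,
# whose mean is beta / (alpha - 1) for alpha > 1
c(mean(x), beta / (alpha - 1))
```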

::: example Here we provide another example: the negative binomial distribution.

Suppose the claim rate $\lambda$ varies across policyholders according to a $\Gamma(\alpha, \beta)$ distribution, and let $g(\cdot)$ be the density function of that $\Gamma(\alpha, \beta)$ random variable. If, given $\lambda$, the number of claims $X$ follows a Poisson distribution with parameter $\lambda$, then the probability mass function of $X$ is given by

$$
P(X = x) = \int_0^{\infty} P(X = x \mid \lambda) \, g(\lambda) \, d\lambda
$$
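Evaluating this integral (a standard calculation, sketched here with the Gamma density in the rate parameterization) gives the negative binomial probability mass function:

$$
\begin{aligned}
P(X = x) & = \int_0^{\infty} \frac{e^{-\lambda} \lambda^x}{x!} \cdot \frac{\beta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\beta \lambda} \, d\lambda \\
& = \frac{\beta^{\alpha}}{x! \, \Gamma(\alpha)} \int_0^{\infty} \lambda^{x + \alpha - 1} e^{-(\beta + 1) \lambda} \, d\lambda \\
& = \frac{\Gamma(x + \alpha)}{x! \, \Gamma(\alpha)} \left( \frac{\beta}{\beta + 1} \right)^{\alpha} \left( \frac{1}{\beta + 1} \right)^{x},
\end{aligned}
$$

which is the pmf of a negative binomial distribution with size $\alpha$ and success probability $\beta / (\beta + 1)$.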

Using the following R code, we can check this result by simulation: for each observation we draw a claim rate $\lambda$ from a $\Gamma(5, 25)$ distribution and then draw a Poisson count with that rate.

```r
x <- rep(0, 10000) # preallocate a vector for the simulated claim counts

for (i in 1:10000) {
    # draw a claim rate from Gamma(shape = 5, rate = 25),
    # then a Poisson count with that rate
    x[i] <- rpois(
        n = 1,
        lambda = rgamma(n = 1, shape = 5, rate = 25)
    )
}

table(x)

print(mean(x))
print(var(x))
```
```text
x
   0    1    2    3    5 
8204 1619  150   26    1
```

```text
[1] 0.2002
[1] 0.2077407
```

:::
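For comparison, the theoretical negative binomial probabilities implied by the derivation above (size $5$ and success probability $25/26$) can be computed directly in R; the expected counts out of 10,000 draws should be close to the simulated table:

```r
# theoretical NB probabilities implied by the Gamma(5, 25) mixing distribution
probs <- dnbinom(0:5, size = 5, prob = 25 / 26)

# expected counts out of 10,000 draws, to compare with the simulated table
round(probs * 10000)

# theoretical mean and variance: alpha / beta and (alpha / beta) * (1 + 1 / beta)
c(5 / 25, (5 / 25) * (1 + 1 / 25))
```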
