
# Mixture Distributions

In actuarial work, we sometimes model a random variable as a mixture of two or more distributions. For example, the distribution of the total claim amount in a portfolio of insurance policies can be modeled as a mixture of two distributions: one for claims below the deductible and another for claims that exceed it.

A mixture distribution is a probability distribution built by combining two or more component distributions. Its probability density function (pdf) is the weighted sum of the component pdfs, where the weights are non-negative and sum to 1.

Let $f_1(x)$ and $f_2(x)$ be the pdfs of two distributions, and let the random variable $X$ have a mixture distribution with pdf $f(x)$, formed from $f_1(x)$ and $f_2(x)$ with weights $p$ and $1-p$, respectively. Then the pdf of the mixture distribution is given by:

$$
f(x) = p \cdot f_1(x) + (1-p) \cdot f_2(x)
$$

Equivalently, the cumulative distribution function (cdf) of the mixture distribution is given by:

$$
F(x) = p \cdot F_1(x) + (1-p) \cdot F_2(x)
$$

where $F_1(x)$ and $F_2(x)$ are the cdfs of the component distributions.
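To make this concrete, here is a small R sketch (not tied to the insurance example above) that evaluates the pdf and cdf of a hypothetical two-component mixture of normal distributions, with weight $p = 0.3$ on the first component:

```r
# weight on the first component (hypothetical value, for illustration only)
p <- 0.3

# mixture of N(0, 1) and N(3, 2): weighted sums of the component pdfs and cdfs
dmix <- function(x) p * dnorm(x, mean = 0, sd = 1) + (1 - p) * dnorm(x, mean = 3, sd = 2)
pmix <- function(x) p * pnorm(x, mean = 0, sd = 1) + (1 - p) * pnorm(x, mean = 3, sd = 2)

dmix(1) # density of the mixture at x = 1
pmix(1) # P(X <= 1) under the mixture
```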

::: example Here is an example of a mixture distribution.

Suppose two uniform random variables $X_1$ and $X_2$ are independent and identically distributed with pdf $f(x) = 1$ for $0 \leq x \leq 1$. Let $X$ be a mixture of $X_1$ and $X_2$ with weights $0.5$ and $0.5$. Then $E(X)$ is given by:

$$
E(X) = 0.5 \cdot E(X_1) + 0.5 \cdot E(X_2) = 0.5 \cdot 0.5 + 0.5 \cdot 0.5 = 0.5
$$

Let $Y = \frac{X_1 + X_2}{2}$. Then $E(Y)$ is given by:

$$
E(Y) = E\left(\frac{X_1 + X_2}{2}\right) = \frac{1}{2} \cdot E(X_1 + X_2) = \frac{1}{2} \cdot E(X_1) + \frac{1}{2} \cdot E(X_2) = 0.5
$$

Thus, the expected values of $X$ and $Y$ are the same. However, the two random variables do not have the same distribution: $X$ is a mixture of $X_1$ and $X_2$ and is again uniform on $[0, 1]$, while $Y$ is the average of $X_1$ and $X_2$ and has a triangular distribution on $[0, 1]$; in particular, $\mathrm{Var}(X) = 1/12$ while $\mathrm{Var}(Y) = 1/24$.
:::
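The difference is easy to see by simulation. The following sketch draws from the mixture $X$ by first choosing a component at random, and from the average $Y$ directly; the sample means agree, but the variances do not:

```r
n <- 100000
x1 <- runif(n) # component 1: Uniform(0, 1)
x2 <- runif(n) # component 2: Uniform(0, 1)

# mixture X: pick component 1 or 2 with probability 0.5 each, then use its value
pick <- rbinom(n, size = 1, prob = 0.5)
x <- ifelse(pick == 1, x1, x2)

# average Y = (X1 + X2) / 2
y <- (x1 + x2) / 2

c(mean(x), mean(y)) # both close to 0.5
c(var(x), var(y))   # close to 1/12 and 1/24, respectively
```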

More generally, a mixture distribution can be a mixture of more than two distributions. The pdf of a mixture distribution with $n$ component distributions is given by:

$$
f(x) = p_1 \cdot f_1(x) + p_2 \cdot f_2(x) + \ldots + p_n \cdot f_n(x)
$$

where $p_1, p_2, \ldots, p_n$ are the non-negative weights of the component distributions (summing to 1), and $f_1(x), f_2(x), \ldots, f_n(x)$ are the component pdfs.
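As a sketch of how such a mixture can be simulated, the R function below (with hypothetical weights and component samplers) first draws a component index according to the weights and then samples from that component:

```r
# hypothetical three-component mixture: weights and component samplers
weights  <- c(0.5, 0.3, 0.2)
samplers <- list(
    function(n) rexp(n, rate = 2),              # component 1
    function(n) rnorm(n, mean = 5, sd = 1),     # component 2
    function(n) rgamma(n, shape = 3, rate = 1)  # component 3
)

# draw a component index for each observation, then sample from that component
rmix <- function(n, weights, samplers) {
    idx <- sample(seq_along(weights), size = n, replace = TRUE, prob = weights)
    vapply(idx, function(i) samplers[[i]](1), numeric(1))
}

x <- rmix(10000, weights, samplers)
mean(x)
```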

Suppose that for every $\theta$ in a set $\Theta$ there is a distribution $F_{\theta}$, and let $G$ be a c.d.f. on $\Theta$ with corresponding p.d.f. $g$.

Then we can define the mixture distribution $F$ by

$$
F(x) = \int_{\Theta} F_{\theta}(x) \, dG(\theta) = \int_{\Theta} F_{\theta}(x) \, g(\theta) \, d\theta
$$

In the previous section, we discussed how the Pareto distribution can be derived by mixing Exponential distributions over a Gamma-distributed rate parameter.
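As a rough check of that result, the following sketch (with illustrative parameter values) simulates Exponential random variables whose rates are drawn from a Gamma distribution; the sample mean should be close to the Pareto (Lomax) mean $\beta / (\alpha - 1)$:

```r
n     <- 100000
alpha <- 3 # Gamma shape (illustrative value)
beta  <- 2 # Gamma rate  (illustrative value)

# draw a rate for each observation, then an Exponential value with that rate
lambda <- rgamma(n, shape = alpha, rate = beta)
x      <- rexp(n, rate = lambda)

# the mixture is Pareto (Lomax) with shape alpha and scale beta,
# whose mean is beta / (alpha - 1) for alpha > 1
c(mean(x), beta / (alpha - 1))
```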

::: example Here we provide another example: the negative binomial distribution.

Suppose the claim rate $\lambda$ varies across policyholders according to a $\Gamma(\alpha, \beta)$ distribution, and let $g(\cdot)$ be the density function of that $\Gamma(\alpha, \beta)$ random variable. If, given $\lambda$, the number of claims $X$ follows a Poisson distribution with parameter $\lambda$, then the probability mass function of $X$ is given by

$$
P(X = x) = \int_0^{\infty} P(X = x \mid \lambda) \, g(\lambda) \, d\lambda
$$
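Evaluating this integral (a standard calculation, sketched here with the Gamma density in the rate parameterization) gives the negative binomial probability mass function:

$$
\begin{aligned}
P(X = x) & = \int_0^{\infty} \frac{e^{-\lambda} \lambda^x}{x!} \cdot \frac{\beta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\beta \lambda} \, d\lambda \\
& = \frac{\beta^{\alpha}}{x! \, \Gamma(\alpha)} \int_0^{\infty} \lambda^{x + \alpha - 1} e^{-(\beta + 1) \lambda} \, d\lambda \\
& = \frac{\Gamma(x + \alpha)}{x! \, \Gamma(\alpha)} \left( \frac{\beta}{\beta + 1} \right)^{\alpha} \left( \frac{1}{\beta + 1} \right)^{x},
\end{aligned}
$$

which is the pmf of a negative binomial distribution with size $\alpha$ and success probability $\beta / (\beta + 1)$.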

Using the following R code, we can check this result by simulation: for each observation we draw a claim rate $\lambda$ from a $\Gamma(5, 25)$ distribution and then draw a Poisson count with that rate.

```r
x <- rep(0, 10000) # preallocate a vector for the simulated claim counts

for (i in 1:10000) {
    # draw a claim rate from Gamma(shape = 5, rate = 25),
    # then a Poisson count with that rate
    x[i] <- rpois(
        n = 1,
        lambda = rgamma(n = 1, shape = 5, rate = 25)
    )
}

table(x)

print(mean(x))
print(var(x))
```
```text
x
   0    1    2    3    5 
8204 1619  150   26    1
```

```text
[1] 0.2002
[1] 0.2077407
```

:::
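For comparison, the theoretical negative binomial probabilities implied by the derivation above (size $5$ and success probability $25/26$) can be computed directly in R; the expected counts out of 10,000 draws should be close to the simulated table:

```r
# theoretical NB probabilities implied by the Gamma(5, 25) mixing distribution
probs <- dnbinom(0:5, size = 5, prob = 25 / 26)

# expected counts out of 10,000 draws, to compare with the simulated table
round(probs * 10000)

# theoretical mean and variance: alpha / beta and (alpha / beta) * (1 + 1 / beta)
c(5 / 25, (5 / 25) * (1 + 1 / 25))
```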
