# More Distributions for Modeling Claims

In this section, we discuss two more distributions that are commonly used to model insurance claims: the Pareto distribution and the Weibull distribution.
## Pareto distribution

### Characteristics of the Pareto distribution

If a random variable $X$ follows a Pareto distribution, we write
$$X \sim \text{Pareto}(\alpha, \lambda),$$
where $\alpha > 0$ is the shape parameter and $\lambda > 0$ is the scale parameter.
**PDF**:
$$f(x)=\frac{\alpha\lambda^{\alpha}}{(\lambda+x)^{\alpha+1}}, \quad \text{for } x>0$$
The PDF can be derived from the following mixture representation.
Suppose $X|\theta \sim \text{Exp}(\theta)$ and $\theta \sim \text{Gamma}(\alpha, \lambda)$, where $\lambda$ is the rate parameter, i.e.,
$$f(\theta|\alpha, \lambda) = \frac{\lambda^\alpha}{\Gamma(\alpha)} \theta^{\alpha - 1} e^{-\lambda \theta}, \quad \theta > 0$$
Then the marginal density of $X$ is:
$$\begin{align*} f_X(x|\alpha, \lambda) &= \int_0^\infty f(x|\theta) f(\theta|\alpha, \lambda) d\theta \\ &= \int_0^\infty (\theta e^{-\theta x}) \left[ \frac{\lambda^\alpha}{\Gamma(\alpha)} \theta^{\alpha - 1} e^{-\lambda \theta} \right] d\theta \\ &= \int_0^\infty \frac{\lambda^\alpha}{\Gamma(\alpha)} \theta^\alpha e^{-(\lambda + x)\theta} d\theta \\ &= \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^\infty \theta^\alpha e^{-(\lambda + x)\theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(\alpha+1)}{\Gamma(\alpha)} \int_0^\infty \frac{1}{\Gamma(\alpha+1)} \theta^\alpha e^{-(\lambda + x)\theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(\alpha+1)}{\Gamma(\alpha)(\lambda + x)^{\alpha+1}} \underbrace{\left( \int_0^\infty \frac{(\lambda + x)^{\alpha+1}}{\Gamma(\alpha+1)} \theta^\alpha e^{-(\lambda + x)\theta} d\theta \right)}_{1} \\ &= \frac{\lambda^\alpha \Gamma(\alpha+1)}{\Gamma(\alpha)(\lambda + x)^{\alpha+1}} = \frac{\alpha \lambda^\alpha}{(\lambda + x)^{\alpha+1}} \end{align*}$$
**CDF**:
$$F(x)=1-\left(\frac{\lambda}{\lambda+x}\right)^{\alpha}, \quad \text{for } x>0$$
**Survival function**:
$$\bar{F}(x)=\left(\frac{\lambda}{\lambda+x}\right)^{\alpha}, \quad \text{for } x>0$$
The survival function can be derived in the same way:
$$\begin{align*} \bar{F}(x|\alpha, \lambda) &= \int_0^\infty \bar{F}(x|\theta) f(\theta|\alpha, \lambda) d\theta \\ &= \int_0^\infty e^{-\theta x} \left[ \frac{\lambda^\alpha}{\Gamma(\alpha)} \theta^{\alpha - 1} e^{-\lambda \theta} \right] d\theta \\ &= \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^\infty \theta^{\alpha - 1} e^{-(\lambda + x)\theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(\alpha)}{\Gamma(\alpha)} \int_0^\infty \frac{1}{\Gamma(\alpha)} \theta^{\alpha-1} e^{-(\lambda + x)\theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(\alpha)}{\Gamma(\alpha)(\lambda + x)^\alpha} \underbrace{\left( \int_0^\infty \frac{(\lambda + x)^\alpha}{\Gamma(\alpha)} \theta^{\alpha - 1} e^{-(\lambda + x)\theta} d\theta \right)}_{1} \\ &= \frac{\lambda^\alpha \Gamma(\alpha)}{\Gamma(\alpha)(\lambda + x)^\alpha} = \frac{\lambda^\alpha}{(\lambda + x)^\alpha} = \left( \frac{\lambda}{\lambda + x} \right)^\alpha \end{align*}$$
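As a quick sanity check, the mixture representation can be verified by simulation in base R; a minimal sketch, with illustrative parameter values:

```r
# theta ~ Gamma(alpha = 2, lambda = 3), then X | theta ~ Exp(theta)
set.seed(1)
theta <- rgamma(1e5, shape = 2, rate = 3)
x <- rexp(1e5, rate = theta)

# empirical survival P[X > 1] vs. the Pareto value (lambda / (lambda + 1))^alpha
mean(x > 1)      # simulation estimate
(3 / (3 + 1))^2  # theoretical value: 0.5625
```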
### Properties of the Pareto distribution

**Mean**:
$$E(X) = \frac{\lambda}{\alpha - 1}, \quad \alpha > 1$$
Using the mixture representation above, we have:
$$\begin{align*} E(X) &= \int_0^\infty E(X|\theta) f(\theta|\alpha, \lambda) d\theta \\ &= \int_0^\infty \frac{1}{\theta} f(\theta|\alpha, \lambda) d\theta \\ &= \int_0^\infty \frac{1}{\theta} \frac{\lambda^\alpha}{\Gamma(\alpha)} \theta^{\alpha - 1} e^{-\lambda \theta} d\theta \\ &= \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^\infty \theta^{\alpha - 2} e^{-\lambda \theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(\alpha - 1)}{\Gamma(\alpha)} \int_0^\infty \frac{1}{\Gamma(\alpha - 1)} \theta^{\alpha - 2} e^{-\lambda \theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(\alpha - 1)}{\Gamma(\alpha) \lambda^{\alpha - 1}} \underbrace{\left( \int_0^\infty \frac{\lambda^{\alpha - 1}}{\Gamma(\alpha - 1)} \theta^{\alpha - 2} e^{-\lambda \theta} d\theta \right)}_{1} \\ &= \frac{\lambda^\alpha \Gamma(\alpha - 1)}{\Gamma(\alpha) \lambda^{\alpha - 1}} = \frac{\lambda}{\alpha - 1} \end{align*}$$
**$k$-th raw moment**:
$$E(X^k) = \frac{\lambda^k \Gamma(k+1) \Gamma(\alpha - k)}{\Gamma(\alpha)}, \quad -1 < k < \alpha$$
To derive the $k$-th raw moment, recall that $E(X^k|\theta) = \Gamma(k+1)/\theta^k$ for the exponential distribution, and proceed as before:
$$\begin{align*} E(X^k) &= \int_0^\infty E(X^k|\theta) f(\theta|\alpha, \lambda) d\theta \\ &= \int_0^\infty \frac{\Gamma(k+1)}{\theta^k} f(\theta|\alpha, \lambda) d\theta \\ &= \int_0^\infty \frac{\Gamma(k+1)}{\theta^k} \frac{\lambda^\alpha}{\Gamma(\alpha)} \theta^{\alpha - 1} e^{-\lambda \theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(k+1)}{\Gamma(\alpha)} \int_0^\infty \theta^{\alpha - k - 1} e^{-\lambda \theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(k+1) \Gamma(\alpha - k)}{\Gamma(\alpha)} \int_0^\infty \frac{1}{\Gamma(\alpha - k)} \theta^{\alpha - k - 1} e^{-\lambda \theta} d\theta \\ &= \frac{\lambda^\alpha \Gamma(k+1) \Gamma(\alpha - k)}{\Gamma(\alpha) \lambda^{\alpha - k}} \underbrace{\left( \int_0^\infty \frac{\lambda^{\alpha - k}}{\Gamma(\alpha - k)} \theta^{\alpha - k - 1} e^{-\lambda \theta} d\theta \right)}_{1} \\ &= \frac{\lambda^\alpha \Gamma(k+1) \Gamma(\alpha - k)}{\Gamma(\alpha) \lambda^{\alpha - k}} = \frac{\lambda^k \Gamma(k+1) \Gamma(\alpha - k)}{\Gamma(\alpha)} = \frac{\lambda^k k! (\alpha - k - 1)!}{(\alpha - 1)!} \\ &= \frac{\lambda^k k!}{(\alpha-1) \cdots (\alpha-k)}, \quad -1 < k < \alpha \end{align*}$$
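In particular, $k = 1$ recovers the mean $E(X) = \frac{\lambda}{\alpha - 1}$, and $k = 2$ gives $E(X^2) = \frac{2\lambda^2}{(\alpha - 1)(\alpha - 2)}$, which is used below to derive the variance.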
**Variance**:
$$\text{Var}(X) = \frac{\alpha\lambda^2}{(\alpha - 1)^2 (\alpha - 2)}, \quad \alpha > 2$$
The variance of the Pareto distribution follows from the standard formula, using the second raw moment above:
$$\begin{align*} \text{Var}(X) &= E(X^2) - E^2(X) \\ &= \frac{2\lambda^2}{(\alpha - 1)(\alpha - 2)} - \left( \frac{\lambda}{\alpha - 1} \right)^2 \\ &= \frac{2\lambda^2 (\alpha - 1) - \lambda^2 (\alpha - 2)}{(\alpha - 1)^2 (\alpha - 2)} \\ &= \frac{\lambda^2 (2\alpha - 2 - \alpha + 2)}{(\alpha - 1)^2 (\alpha - 2)} \\ &= \frac{\alpha \lambda^2}{(\alpha - 1)^2 (\alpha - 2)} \end{align*}$$
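These formulas can be checked with a quick base-R simulation of the exponential-gamma mixture; a sketch with illustrative parameter values:

```r
# Pareto(alpha = 4, lambda = 3): mean = 3/(4-1) = 1, variance = 4*3^2/((4-1)^2*(4-2)) = 2
set.seed(1)
theta <- rgamma(1e6, shape = 4, rate = 3)
x <- rexp(1e6, rate = theta)
mean(x)  # close to 1
var(x)   # close to 2 (converges slowly because of the heavy tail)
```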
### Estimation of parameters

#### Method of moments (MM)

The Pareto distribution has two parameters, so the method of moments matches the first two moments to their sample counterparts: the mean $\frac{\lambda}{\alpha - 1}$ and the variance $\frac{\alpha\lambda^2}{(\alpha - 1)^2 (\alpha - 2)}$. That is, we solve the following system:
$$E(X) = \frac{\lambda}{\alpha - 1} = \frac{1}{n} \sum_{i=1}^n X_i = \bar{X}$$
$$\text{Var}(X) = \frac{\alpha\lambda^2}{(\alpha - 1)^2 (\alpha - 2)} = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2 = S^2$$
Solving the first equation for $\lambda$:
$$\hat{\lambda} = \bar{X}(\hat{\alpha} - 1)$$
Then, replacing $\lambda$ with $\bar{X}(\hat{\alpha} - 1)$ in the second equation and solving for $\alpha$:
$$\begin{align*} \frac{\hat{\alpha}\,\bar{X}^2(\hat{\alpha} - 1)^2}{(\hat{\alpha} - 1)^2 (\hat{\alpha} - 2)} & = S^2 \\ \frac{\hat{\alpha}\,\bar{X}^2}{\hat{\alpha} - 2} & = S^2 \\ \hat{\alpha}\,\bar{X}^2 & = S^2 (\hat{\alpha} - 2) \\ \hat{\alpha}\,(S^2 - \bar{X}^2) & = 2S^2 \\ \hat{\alpha} & = \frac{2S^2}{S^2 - \bar{X}^2} \end{align*}$$
Then we obtain the estimator of $\lambda$:
$$\begin{align*} \hat{\lambda} & = \bar{X}(\hat{\alpha} - 1) \\ & = \bar{X}\left(\frac{2S^2}{S^2 - \bar{X}^2} - 1\right) \\ & = \frac{\bar{X}(S^2 + \bar{X}^2)}{S^2 - \bar{X}^2} \end{align*}$$
The method-of-moments estimators of $\alpha$ and $\lambda$ are therefore:

$$\hat{\alpha} = \frac{2S^2}{S^2 - \bar{X}^2}, \quad \hat{\lambda} = \frac{\bar{X}(S^2 + \bar{X}^2)}{S^2 - \bar{X}^2}$$

Note that these estimates are positive only when $S^2 > \bar{X}^2$, which corresponds to $\hat{\alpha} > 2$, i.e., a finite variance.
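As a sketch, the moment estimators can be computed in R; here `rpareto` is assumed to come from the `actuar` package (the same simulation setup as the MLE example below), and the true parameter values are illustrative:

```r
library(actuar)  # assumed source of rpareto(n, shape = alpha, scale = lambda)

set.seed(1)
x <- rpareto(1000, shape = 4, scale = 3)  # true alpha = 4, lambda = 3

xbar <- mean(x)
s2 <- mean((x - xbar)^2)  # (1/n) sample variance, as in the text

alpha_hat  <- 2 * s2 / (s2 - xbar^2)
lambda_hat <- xbar * (alpha_hat - 1)
c(alpha = alpha_hat, lambda = lambda_hat)
```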
#### Maximum likelihood estimation (MLE)

The likelihood function of the Pareto distribution is given by:
$$L(\alpha, \lambda) = \prod_{i=1}^n \frac{\alpha\lambda^{\alpha}}{(\lambda+x_i)^{\alpha+1}}$$
Then the log-likelihood function is:
$$\begin{align*} l & = \sum_{i=1}^n \left( \log(\alpha) + \alpha \log(\lambda) - (\alpha + 1) \log(\lambda + x_i) \right) \\ & = n \log(\alpha) + n \alpha \log(\lambda) - (\alpha + 1) \sum_{i=1}^n \log(\lambda + x_i) \end{align*}$$
Taking the derivative of the log-likelihood function with respect to $\alpha$ first and setting it to zero, we have:
$$\frac{dl}{d\alpha} = \frac{n}{\alpha} + n \log(\lambda) - \sum_{i=1}^n \log(\lambda + x_i) = 0$$
Solving this equation gives the MLE of $\alpha$ as a function of $\lambda$:
$$\begin{align*} \frac{n}{\alpha} & = \sum_{i=1}^n \log(\lambda + x_i) - n \log(\lambda) \\ \hat{\alpha} & = \frac{n}{\sum_{i=1}^n \log(\lambda + x_i) - n \log(\lambda)} \\ & = \frac{n}{\sum_{i=1}^n \log(1+ \frac{x_i}{\lambda})} \end{align*}$$
Then, taking the derivative of the log-likelihood function with respect to $\lambda$ and setting it to zero:
$$\frac{dl}{d\lambda} = \frac{n\alpha}{\lambda} -(\alpha + 1)\sum_{i=1}^n \frac{1}{\lambda + x_i} = 0$$
Rearranging this equation gives a second expression for $\hat{\alpha}$ in terms of $\lambda$:
$$\begin{align*} \frac{n\alpha}{\lambda} & = (\alpha + 1)\sum_{i=1}^n \frac{1}{\lambda + x_i} \\ \left(\frac{n}{\lambda} - \sum_{i=1}^n \frac{1}{\lambda + x_i}\right)\alpha & = \sum_{i=1}^n \frac{1}{\lambda + x_i}\\ \left(\sum_{i=1}^n \left(\frac{1}{\lambda} - \frac{1}{\lambda + x_i}\right)\right) \alpha & = \sum_{i=1}^n \frac{1}{\lambda + x_i}\\ \left(\sum_{i=1}^{n}\frac{x_i}{\lambda(\lambda + x_i)}\right) \alpha & = \sum_{i=1}^{n} \frac{1}{\lambda + x_i} \\ \hat{\alpha} & = \frac{\sum_{i=1}^{n} \frac{1}{\lambda + x_i}}{\sum_{i=1}^{n}\frac{x_i}{\lambda(\lambda + x_i)}} \end{align*}$$
Equating the two expressions for $\hat{\alpha}$ yields a single equation in $\lambda$:
$$\frac{n}{\sum_{i=1}^n \log(1+ \frac{x_i}{\lambda})} - \frac{\sum_{i=1}^{n} \frac{1}{\lambda + x_i}}{\sum_{i=1}^{n}\frac{x_i}{\lambda(\lambda + x_i)}} = 0$$
```r
library(actuar)  # assumed source of rpareto(n, shape = alpha, scale = lambda)

x <- rpareto(1000, shape = 2, scale = 1)
n <- length(x)

# profile equation for lambda: the two expressions for alpha must agree
f <- function(lambda) {
  n / sum(log(1 + x / lambda)) -
    sum(1 / (lambda + x)) / sum(x / (lambda * (lambda + x)))
}
lambda <- uniroot(f, c(1e-8, 1e8))$root
alpha  <- n / sum(log(1 + x / lambda))
paste0("alpha: ", alpha)
paste0("lambda: ", lambda)
```

## Weibull distribution

### Characteristics of the Weibull distribution

If a random variable $X$ follows a Weibull distribution, we write
$$X \sim \text{Weibull}(c, \gamma),$$
where $\gamma > 0$ is the shape parameter and $c > 0$ determines the scale (in terms of the usual scale parameter $b$, we have $c = b^{-\gamma}$).
**PDF**:
$$f_X(x) = c \gamma x^{\gamma - 1} e ^{-c x ^{\gamma}}$$
The Weibull distribution can be derived from the exponential distribution by a power transformation.
Suppose $Y \sim \text{Exp}(c)$. Then $X = Y^{1/\gamma} \sim \text{Weibull}(c, \gamma)$.
First, recall the change-of-variable formula for a monotone transformation $X = g(Y)$:
$$f_X(x) = \begin{cases} f_Y(g^{-1}(x)) \left| \dfrac{d g^{-1}(x)}{dx} \right|, & \text{if } x > 0 \\ 0, & \text{otherwise} \end{cases}$$
Then, recall the density function of the exponential distribution:
$$f_Y(y) = ce^{-cy}, \quad \text{for} \quad y > 0$$
Then, we have
$$X = g(Y) = Y^{1 / \gamma}$$
The inverse of $g$ is given by:
$$Y = g^{-1}(x) = x^{\gamma}$$
Therefore, the density function of $X$ is given by:
$$\begin{align*} f_X(x) & = f_Y(g^{-1}(x)) \left| \frac{dg^{-1}(x)}{dx} \right| \\ & = ce^{-cx^{\gamma}}\gamma x^{\gamma -1} \\ & = c \gamma x^{\gamma -1} e^{-cx^{\gamma}} \end{align*}$$
**CDF**:
$$F_X(x) = P[X\leq x] = P[Y^{1/\gamma} \leq x] = P[Y \leq x^{\gamma}] = 1 - e^{-cx^{\gamma}}$$
**Survival function**:
$$\bar{F}_X(x) = P[X > x] = P[Y^{1/\gamma} > x] = P[Y > x^{\gamma}] = e^{-cx^{\gamma}}$$
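The transformation can be checked in base R. Note that `rweibull` is parameterized by a scale $b$ with $F(x) = 1 - e^{-(x/b)^{\gamma}}$, so $c = b^{-\gamma}$; a minimal sketch with illustrative values:

```r
# X = Y^(1/gamma) with Y ~ Exp(c) should match rweibull(shape = gamma, scale = c^(-1/gamma))
set.seed(1)
c0 <- 2; g0 <- 2
y <- rexp(1e5, rate = c0)
x <- y^(1 / g0)

# survival at x = 1: both estimates should be close to exp(-c0 * 1^g0) = 0.135
mean(x > 1)
mean(rweibull(1e5, shape = g0, scale = c0^(-1 / g0)) > 1)
```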
### Properties of the Weibull distribution

**Mean**:
$$E(X) = E(Y^{1/\gamma}) = \frac{\Gamma(1 + 1/\gamma)}{c^{1/\gamma}} = \Gamma(1 + 1/\gamma) c^{-1/\gamma}$$
Recall the raw moments of the exponential distribution:
$$E(Y^{k}) = \frac{k!}{c^k} = \frac{\Gamma(1 + k)}{c^k}$$
Then, extending to the $k$-th raw moment of $X$:
$$E(X^k) = E(Y^{k/\gamma}) = \frac{\Gamma(1+k/\gamma)}{c^{k/\gamma}} = \Gamma(1 + k/\gamma) c^{-k/\gamma}$$
**Variance**:
$$\begin{align*} \text{Var}(X) & = E(X^2) - E^2(X)\\ & = \Gamma(1 + 2/\gamma) c^{-2/\gamma} - (\Gamma(1 + 1/\gamma) c^{-1/\gamma})^2 \\ & = \Gamma(1 + 2/\gamma) c^{-2/\gamma} - \Gamma^2(1 + 1/\gamma) c^{-2/\gamma} \\ & = c^{-2/\gamma} (\Gamma(1 + 2/\gamma) - \Gamma^2(1 + 1/\gamma)) \end{align*}$$
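A quick simulation check of these moment formulas, with $\gamma = 2$ and $c = 1$ (i.e. `rweibull(shape = 2, scale = 1)`; the values are illustrative):

```r
# Weibull with gamma = 2, c = 1: E(X) = Gamma(1.5) and Var(X) = Gamma(2) - Gamma(1.5)^2
set.seed(1)
x <- rweibull(1e5, shape = 2, scale = 1)
c(mean(x), gamma(1.5))              # both approximately 0.886
c(var(x), gamma(2) - gamma(1.5)^2)  # both approximately 0.215
```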
### Estimation of parameters

#### Method of moments (MM)

The mean and variance of the Weibull distribution are:
$$\begin{align*} \mu_1 & = E(X) = \Gamma(1 + 1/\gamma) c^{-1/\gamma} \\ \mu_2 & = \text{Var}(X) = c^{-2/\gamma} (\Gamma(1 + 2/\gamma) - \Gamma^2(1 + 1/\gamma)) \end{align*}$$
Dividing the second equation by the square of the first eliminates $c$:

$$\frac{\mu_2}{\mu_1^2} = \frac{\Gamma(1 + 2/\gamma) - \Gamma^2(1 + 1/\gamma)}{\Gamma^2(1 + 1/\gamma)} = \frac{\Gamma(1 + 2/\gamma)}{\Gamma^2(1 + 1/\gamma)} - 1$$

The right-hand side depends only on $\gamma$, so this equation can be solved numerically for $\gamma$. Then, solving the first moment equation for $c$:

$$c^{-1/\gamma} = \frac{\mu_1}{\Gamma(1 + 1/\gamma)} \quad \Rightarrow \quad c = \left(\frac{\Gamma(1 + 1/\gamma)}{\mu_1}\right)^{\gamma}$$
Next, we replace the first and second moments with the sample moments:
$$\begin{align*} \mu_1 & = \frac{1}{n} \sum_{i=1}^n x_i = \bar{X} \\ \mu_2 & = \frac{\sum_{i=1}^n (x_i - \bar{X})^2}{n} = S^2 \end{align*}$$
Then the estimators $\hat{\gamma}$ and $\hat{c}$ satisfy:
$$\frac{S^2}{\bar{X}^2} = \frac{\Gamma(1 + 2/\hat{\gamma})}{\Gamma^2(1 + 1/\hat{\gamma})} - 1, \qquad \hat{c} = \left(\frac{\Gamma(1 + 1/\hat{\gamma})}{\bar{X}}\right)^{\hat{\gamma}}$$
Finally, we can solve the equation for $\hat{\gamma}$ numerically in R:
```r
# generate 1000 Weibull random variables
# (for rweibull, scale b corresponds to c = b^(-gamma), so the true gamma = 2, c = 1)
set.seed(1)
x <- rweibull(1000, shape = 2, scale = 1)

xbar <- mean(x)
s2 <- mean((x - xbar)^2)  # (1/n) sample variance, as in the text

# moment equation for gamma; bounds kept moderate to avoid overflow in gamma()
f <- function(g) {
  gamma(1 + 2 / g) / gamma(1 + 1 / g)^2 - 1 - s2 / xbar^2
}
g_hat <- uniroot(f, c(0.1, 100))$root
c_hat <- (gamma(1 + 1 / g_hat) / xbar)^g_hat
cat("Estimated shape parameter: ", g_hat, "\n",
    "Estimated scale parameter: ", c_hat, "\n")
```

#### Maximum likelihood estimation (MLE)

The likelihood function of the Weibull distribution is given by:
$$L = \prod_{i = 1}^{n} c \gamma x_i^{\gamma-1} e^{-cx_i^{\gamma}}$$
The log-likelihood function is then:
$$\begin{align*} l & = \sum_{i = 1}^n\log(c) +\sum_{i = 1}^n\log(\gamma) + (\gamma - 1) \sum_{i = 1}^n \log(x_i) - c\sum_{i = 1}^n x_i^{\gamma} \\ & = n \log(c) + n \log(\gamma) + (\gamma - 1) \sum_{i = 1}^n \log(x_i) - c\sum_{i = 1}^n x_i^{\gamma} \end{align*}$$
Taking the derivative of the log-likelihood function with respect to $c$ first and setting it to zero:
$$\frac{dl}{dc} = \frac{n}{c} - \sum_{i = 1}^n x_i^{\gamma} = 0$$
Solving this equation, we obtain the MLE of $c$ as a function of $\gamma$:
$$\frac{n}{c} = \sum_{i = 1}^n x_i^{\gamma} \Rightarrow \hat{c} = \frac{n}{\sum_{i = 1}^n x_i^{\gamma}}$$
Then, taking the derivative of the log-likelihood function with respect to $\gamma$ and setting it to zero:
$$\frac{dl}{d\gamma} = \frac{n}{\gamma} + \sum_{i = 1}^n \log(x_i) - c\sum_{i = 1}^n x_i^{\gamma} \log(x_i) = 0$$
Replacing $c$ with $\frac{n}{\sum_{i = 1}^n x_i^{\gamma}}$ leaves a single equation in $\gamma$ (the second line divides through by $n$):
$$\begin{align*} \frac{n}{\gamma} + \sum_{i = 1}^n \log(x_i) - \frac{n}{\sum_{i = 1}^n x_i^{\gamma}}\sum_{i = 1}^n x_i^{\gamma} \log(x_i) & = 0 \\ \frac{1}{\gamma} + \overline{\log(x_i)} - \frac{1}{\sum_{i = 1}^n x_i^{\gamma}}\sum_{i = 1}^n x_i^{\gamma} \log(x_i) & = 0 \end{align*}$$
This equation can be solved numerically in R:
```r
# generate 1000 Weibull random variables (true gamma = 2, c = 1)
set.seed(1)
x <- rweibull(1000, shape = 2, scale = 1)
aux <- mean(log(x))

# profile score equation for gamma; bounds kept moderate to avoid overflow in x^gamma
f <- function(gamma) {
  1 / gamma + aux - sum(x^gamma * log(x)) / sum(x^gamma)
}
gamma_hat <- uniroot(f, c(0.1, 100))$root
c_hat <- length(x) / sum(x^gamma_hat)
cat("Estimated shape parameter: ", gamma_hat, "\n",
    "Estimated scale parameter: ", c_hat, "\n")
```