---
title: "On the Statistical Properties and Computational Inference of the Generalized Kumaraswamy Distribution Family"
author: "José Evandeilton Lopes"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{On the Statistical Properties and Computational Inference of the Generalized Kumaraswamy Distribution Family}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

**Abstract.** We present a comprehensive mathematical treatment of the Generalized Kumaraswamy (GKw) distribution, a five-parameter family introduced by Carrasco et al. (2010) for modeling continuous random variables on the unit interval. We establish the hierarchical structure connecting GKw to several nested sub-models, including the Beta and Kumaraswamy distributions, derive closed-form expressions for the log-likelihood function, score vector, and observed information matrix, and prove asymptotic properties of maximum likelihood estimators. All analytical derivatives are derived from the compositional structure of the distribution and written in a form suitable for stable numerical implementation. The theoretical results provide the foundation for efficient numerical routines in the R package `gkwdist`.

**Keywords:** Bounded distributions, Beta distribution, Kumaraswamy distribution, Maximum likelihood estimation, Fisher information, Numerical stability

---

## 1. Introduction and Preliminaries

### 1.1 Motivation and Background

The analysis of continuous random variables constrained to the unit interval \((0,1)\) arises naturally in numerous statistical applications, including proportions, rates, percentages, and index measurements. The classical Beta distribution (Johnson et al., 1995) has long served as the canonical model for such data, offering analytical tractability and well-understood properties.
However, its cumulative distribution function (CDF) involves the incomplete beta function, requiring numerical evaluation of special functions for quantile computation and simulation. Kumaraswamy (1980) introduced an alternative two-parameter family with closed-form CDF and quantile function, facilitating computational efficiency while maintaining comparable flexibility to the Beta distribution. Jones (2009) demonstrated that the Kumaraswamy distribution exhibits similar shape characteristics to the Beta family while offering superior computational advantages.

Building upon these foundations, Carrasco et al. (2010) developed the Generalized Kumaraswamy (GKw) distribution, a five-parameter extension incorporating both Beta and Kumaraswamy structures through nested transformations (see also the related generalized families of Cordeiro and de Castro, 2011). This distribution encompasses a rich hierarchy of submodels, providing substantial flexibility for modeling diverse patterns in bounded data.

Despite its theoretical appeal, a fully explicit and internally consistent analytical treatment of the GKw family, particularly for likelihood-based inference, has remained incomplete in the literature. This vignette fills that gap by providing a rigorous development, including validated expressions for all first and second derivatives of the log-likelihood function, written in a form convenient for implementation in the `gkwdist` R package.

### 1.2 Mathematical Preliminaries

We establish notation and fundamental results required for subsequent development.

**Notation 1.1.** Throughout, we denote

- \(\Gamma(\cdot)\): the gamma function
- \(B(a,b) = \dfrac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}\): the beta function
- \(I_z(a,b) = \dfrac{B_z(a,b)}{B(a,b)}\): the regularized incomplete beta function, where \(B_z(a,b) = \int_0^z t^{a-1}(1-t)^{b-1}\,\mathrm{d}t\)
- \(\psi(x) = \Gamma'(x)/\Gamma(x) = (\ln\Gamma(x))'\): the digamma function
- \(\psi_1(x) = \psi'(x) = (\ln\Gamma(x))''\): the trigamma function
- \(\mathbf{1}_A\): the indicator function of a set \(A\)

We recall basic derivatives of the beta function.
**Lemma 1.1 (Derivatives of the beta function).** For \(a,b>0\),
\[
\begin{align}
\frac{\partial}{\partial a}\ln B(a,b) &= \psi(a) - \psi(a+b), \tag{1.1}\\[3pt]
\frac{\partial^2}{\partial a^2}\ln B(a,b) &= \psi_1(a) - \psi_1(a+b), \tag{1.2}\\[3pt]
\frac{\partial^2}{\partial a\,\partial b}\ln B(a,b) &= -\psi_1(a+b). \tag{1.3}
\end{align}
\]

*Proof.* Since
\[
\ln B(a,b) = \ln\Gamma(a) + \ln\Gamma(b) - \ln\Gamma(a+b),
\]
the identities follow immediately from the definitions of \(\psi\) and \(\psi_1\) and the chain rule. \(\square\)

We will also repeatedly use the following cascade of transformations.

**Lemma 1.2 (Cascade transformations).** Define, for \(x\in(0,1)\),
\[
\begin{align}
v(x; \alpha) &= 1 - x^\alpha, \tag{1.4}\\
w(x; \alpha, \beta) &= 1 - v(x;\alpha)^\beta = 1 - (1-x^\alpha)^\beta, \tag{1.5}\\
z(x; \alpha, \beta, \lambda) &= 1 - w(x;\alpha,\beta)^\lambda = 1 - [1-(1-x^\alpha)^\beta]^\lambda. \tag{1.6}
\end{align}
\]
Then, for \(\alpha,\beta,\lambda>0\),
\[
\begin{align}
\frac{\partial v}{\partial x} &= -\alpha x^{\alpha-1}, \tag{1.7}\\[3pt]
\frac{\partial w}{\partial x} &= \alpha\beta x^{\alpha-1}(1-x^\alpha)^{\beta-1}, \tag{1.8}\\[3pt]
\frac{\partial z}{\partial x} &= -\alpha\beta\lambda\,x^{\alpha-1}
(1-x^\alpha)^{\beta-1}\bigl[1-(1-x^\alpha)^\beta\bigr]^{\lambda-1}. \tag{1.9}
\end{align}
\]
Note the sign in (1.9): since \(w\) is increasing in \(x\), \(z = 1 - w^\lambda\) is decreasing.

*Proof.* Direct differentiation and repeated application of the chain rule. \(\square\)

For brevity we will often write \(v(x)\), \(w(x)\) and \(z(x)\) when the dependence on \((\alpha,\beta,\lambda)\) is clear from the context.

---

## 2. The Generalized Kumaraswamy Distribution and Its Subfamily

### 2.1 Definition and Fundamental Properties

We start from the five-parameter Generalized Kumaraswamy family.
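As a computational preview of the density formalized next, the cascade maps of Lemma 1.2 translate directly into R. The sketch below is illustrative only: `v_fun`, `w_fun`, `z_fun` and `dgkw_naive` are assumed helper names for this vignette, not the exported `gkwdist` API, and the naive (non-log-scale) form is used purely for readability.

```r
## Cascade maps of Lemma 1.2 and the five-parameter density built from them.
## Illustrative names; not the exported gkwdist API.
v_fun <- function(x, alpha) 1 - x^alpha
w_fun <- function(x, alpha, beta) 1 - (1 - x^alpha)^beta
z_fun <- function(x, alpha, beta, lambda) 1 - w_fun(x, alpha, beta)^lambda

dgkw_naive <- function(x, alpha, beta, gamma, delta, lambda) {
  v <- v_fun(x, alpha)
  w <- w_fun(x, alpha, beta)
  z <- z_fun(x, alpha, beta, lambda)
  ## lbeta() is used because the argument `beta` shadows base::beta()
  lambda * alpha * beta * exp(-lbeta(gamma, delta + 1)) *
    x^(alpha - 1) * v^(beta - 1) * w^(gamma * lambda - 1) * z^delta
}

## Numerical sanity check: the density should integrate to (approximately) 1.
integrate(dgkw_naive, lower = 0, upper = 1,
          alpha = 2, beta = 3, gamma = 1.5, delta = 0.5, lambda = 2)$value
```

In the package itself these quantities are evaluated on the log scale to avoid underflow for extreme parameter values, as discussed in Section 5.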
**Definition 2.1 (Generalized Kumaraswamy distribution).** A random variable \(X\) has a Generalized Kumaraswamy distribution with parameter vector
\[
\boldsymbol{\theta} = (\alpha,\beta,\gamma,\delta,\lambda)^\top,
\]
denoted \(X \sim \mathrm{GKw}(\alpha,\beta,\gamma,\delta,\lambda)\), if its probability density function (pdf) is
\[
\boxed{
f(x; \boldsymbol{\theta})
= \frac{\lambda\alpha\beta}{B(\gamma,\delta+1)}\;
x^{\alpha-1} v(x)^{\beta-1} w(x)^{\gamma\lambda-1} z(x)^{\delta}
\,\mathbf{1}_{(0,1)}(x),
}
\tag{2.1}
\]
where
\[
v(x) = 1-x^\alpha,\qquad
w(x) = 1-(1-x^\alpha)^\beta,\qquad
z(x) = 1-w(x)^\lambda,
\]
and the parameter space is
\[
\Theta = \Bigl\{(\alpha,\beta,\gamma,\delta,\lambda)^\top :
\alpha,\beta,\gamma,\lambda>0,\ \delta\ge 0\Bigr\}.
\tag{2.2}
\]

Note that \(B(\gamma,\delta+1)\) is well-defined for all \(\gamma>0\) and \(\delta>-1\); we restrict to \(\delta\ge 0\) for convenience and consistency with the literature. We now verify that (2.1) defines a proper density.

**Theorem 2.1 (Validity of the pdf).** For any \(\boldsymbol{\theta} \in \Theta\), the function \(f(\cdot; \boldsymbol{\theta})\) in (2.1) is a valid probability density on \((0,1)\).

*Proof.* Non-negativity is immediate from the definition. To prove normalization, consider the change of variable
\[
u = w(x)^\lambda, \qquad 0<u<1,
\]
which maps \((0,1)\) onto itself monotonically. By Lemma 1.2, \(\mathrm{d}u = \lambda\alpha\beta\, x^{\alpha-1} v(x)^{\beta-1} w(x)^{\lambda-1}\,\mathrm{d}x\), and since \(w(x)^{\gamma\lambda-1} = u^{\gamma-1}\, w(x)^{\lambda-1}\) and \(z(x) = 1-u\),
\[
\int_0^1 f(x;\boldsymbol{\theta})\,\mathrm{d}x
= \frac{1}{B(\gamma,\delta+1)} \int_0^1 u^{\gamma-1}(1-u)^{\delta}\,\mathrm{d}u
= 1. \qquad\square
\]

**Corollary 2.1 (CDF and quantile function).** The same substitution shows that the CDF of \(X \sim \mathrm{GKw}(\boldsymbol{\theta})\) is
\[
F(x; \boldsymbol{\theta}) = I_{w(x)^\lambda}(\gamma,\delta+1), \qquad x\in(0,1),
\tag{2.3}
\]
and hence, writing \(y_p\) for the solution of \(I_{y}(\gamma,\delta+1)=p\), the quantile function is
\[
Q(p; \boldsymbol{\theta})
= \Bigl\{1-\bigl(1-y_p^{1/\lambda}\bigr)^{1/\beta}\Bigr\}^{1/\alpha},
\qquad 0<p<1.
\tag{2.4}
\]

### 2.2 Principal Submodels

**Proposition 2.2 (Kumaraswamy–Kumaraswamy (KKw) distribution).** Let \(\alpha,\beta,\lambda>0\) and \(\tilde\delta\ge 1\). Consider the GKw submodel
\[
X \sim \mathrm{GKw}(\alpha,\beta,\gamma=1,\delta=\tilde\delta-1,\lambda),
\quad\text{with }\tilde\delta\ge 1.
\]
Then \(X\) has pdf
\[
\boxed{
f_{\mathrm{KKw}}(x; \alpha,\beta,\tilde\delta,\lambda)
= \tilde\delta\,\lambda\alpha\beta\;
x^{\alpha-1}v^{\beta-1}w^{\lambda-1}z^{\tilde\delta-1},
}
\tag{2.7}
\]
CDF
\[
\boxed{
F_{\mathrm{KKw}}(x; \alpha,\beta,\tilde\delta,\lambda)
= 1 - z(x)^{\tilde\delta}
= 1 - \bigl[1-w(x)^\lambda\bigr]^{\tilde\delta},
}
\tag{2.8}
\]
and quantile function
\[
\boxed{
Q_{\mathrm{KKw}}(p; \alpha,\beta,\tilde\delta,\lambda)
= \left\{
1-\left[
1-\Bigl(1-(1-p)^{1/\tilde\delta}\Bigr)^{1/\lambda}
\right]^{1/\beta}
\right\}^{1/\alpha},
\qquad 0<p<1.
}
\tag{2.9}
\]

### 5.3 Analytical Versus Finite-Difference Derivatives

For an i.i.d. sample \(x_1,\dots,x_n\) from (2.1), the log-likelihood takes the form
\[
\ell(\boldsymbol{\theta})
= n\ln(\lambda\alpha\beta) - n\ln B(\gamma,\delta+1)
+ \sum_{i=1}^{n}\bigl[(\alpha-1)\ln x_i + (\beta-1)\ln v(x_i)
+ (\gamma\lambda-1)\ln w(x_i) + \delta\ln z(x_i)\bigr].
\]

**Proposition 5.1 (Central-difference error).** Consider approximating a component \(\partial\ell/\partial\theta_j\) of the score vector by a central difference, where \(\mathbf{e}_j\) denotes the \(j\)-th canonical unit vector. For step \(h>0\):
\[
D_h = \frac{\ell(\boldsymbol{\theta}+h\mathbf{e}_j) - \ell(\boldsymbol{\theta}-h\mathbf{e}_j)}{2h}.
\]
Then
\[
\left| D_h - \frac{\partial\ell}{\partial\theta_j} \right|
= O(h^2) + O\!\left(\frac{\epsilon}{h}\right),
\tag{5.4}
\]
where \(\epsilon\) is machine precision (approximately \(2.22\times10^{-16}\) in double precision). The optimal step size is \(h^* \asymp (\epsilon/M)^{1/3}\), where \(M\) bounds the third derivative of \(\ell\) in a neighborhood of \(\boldsymbol{\theta}\).

*Proof.* Standard finite-difference error analysis; see Nocedal and Wright (2006), Chapter 8. \(\square\)

In contrast, the analytical gradients of Theorem 4.1 can be evaluated with accuracy limited essentially only by floating-point roundoff, and the full gradient requires a single pass over the data, whereas a central-difference gradient requires two log-likelihood evaluations per component, i.e. \(2p\) evaluations in total (\(p=5\) here).

### 5.4 Practical Recommendations

**Guideline 5.1 (Model selection within the GKw hierarchy).**

1. Start from the simplest two-parameter models:
   - Beta\((\gamma,\delta+1)\),
   - Kumaraswamy\((\alpha,\beta)\).
2. If these are inadequate, consider three-parameter extensions:
   - EKw\((\alpha,\beta,\lambda)\),
   - McDonald\((\gamma,\delta,\lambda)\).
3. For more complex patterns, move to four-parameter models:
   - BKw\((\alpha,\beta,\gamma,\delta)\),
   - KKw\((\alpha,\beta,\delta,\lambda)\).
4.
Use the full five-parameter GKw model only when the sample size is sufficiently large (e.g. \(n\gtrsim 500\)) to avoid over-parameterization and numerical instability.
5. Compare candidate models using information criteria such as
\[
\mathrm{AIC} = -2\ell(\hat{\boldsymbol{\theta}}) + 2p,
\quad
\mathrm{BIC} = -2\ell(\hat{\boldsymbol{\theta}}) + p\ln n,
\]
where \(p\) is the number of free parameters.

**Guideline 5.2 (Diagnostics).**

1. **Q–Q plot.** Compare empirical quantiles with theoretical quantiles from the fitted model.
2. **Probability integral transform.** The transformed values \(\{F(x_i;\hat{\boldsymbol{\theta}})\}_{i=1}^n\) should be approximately i.i.d. \(\mathrm{Uniform}(0,1)\).
3. **Conditioning of the information matrix.** Check \(\kappa(\mathcal{J}(\hat{\boldsymbol{\theta}}))\), the condition number of the observed information; large values (e.g. \(>10^8\)) indicate potential identifiability problems.
4. **Positive definiteness.** All eigenvalues of \(\mathcal{J}(\hat{\boldsymbol{\theta}})\) should be strictly positive for valid standard error estimates.

### 5.5 Discussion

We have developed a rigorous mathematical framework for the Generalized Kumaraswamy (GKw) family, including:

1. **Hierarchical embedding.** The GKw family neatly contains the Beta, McDonald, Kumaraswamy, exponentiated Kumaraswamy, Beta–Kumaraswamy and Kumaraswamy–Kumaraswamy distributions as submodels, with explicit parameter mappings.
2. **Likelihood theory.** We derived explicit expressions for the log-likelihood, the score vector and the full observed information matrix in terms of the cascade transformations \(v,w,z\), in a form suitable for stable numerical implementation.
3. **Asymptotic properties.** Under standard regularity conditions, the MLEs are consistent and asymptotically normal, with variance–covariance matrix obtained from the inverse observed information.
4.
**Computational considerations.** Log-scale evaluations and carefully structured derivatives provide numerical stability and efficiency. In our C++ implementation via RcppArmadillo, analytical gradients and Hessians yield substantial speedups over finite-difference approximations, together with better numerical accuracy.

Open problems and possible extensions include:

- Closed-form expressions for moments of the full GKw distribution (currently only some sub-families, such as Kw and Beta, admit simple formulas).
- Analytic inversion of the BKw CDF (solving \(I_y(\gamma,\delta+1)=p\) for \(y\), followed by inversion of the cascade).
- Multivariate generalizations using copulas constructed from GKw marginals.
- Fully Bayesian treatments with suitable priors on \((\alpha,\beta,\gamma,\delta,\lambda)\).

The `gkwdist` R package implements all the theoretical results described in this vignette and provides a practical toolkit for likelihood-based inference in bounded continuous data models.

---

## References

Carrasco, J. M. F., Ferrari, S. L. P. and Cordeiro, G. M. (2010). A new generalized Kumaraswamy distribution. *arXiv:1004.0911*. [arxiv.org/abs/1004.0911](https://arxiv.org/abs/1004.0911)

Casella, G. and Berger, R. L. (2002). *Statistical Inference*, 2nd ed. Duxbury Press, Pacific Grove, CA.

Cordeiro, G. M. and de Castro, M. (2011). A new family of generalized distributions. *J. Stat. Comput. Simul.* **81**, 883–898.

Cox, D. R. and Hinkley, D. V. (1974). *Theoretical Statistics*. Chapman and Hall, London.

Johnson, N. L., Kotz, S. and Balakrishnan, N. (1995). *Continuous Univariate Distributions*, Volume 2, 2nd ed. Wiley, New York.

Jones, M. C. (2009). Kumaraswamy's distribution: A beta-type distribution with some tractability advantages. *Statist. Methodol.* **6**, 70–81.

Kumaraswamy, P. (1980). A generalized probability density function for double-bounded random processes. *J. Hydrol.* **46**, 79–88.

Lehmann, E. L. and Casella, G. (1998).
*Theory of Point Estimation*, 2nd ed. Springer, New York.

Mächler, M. (2012). Accurately computing \(\log(1-\exp(-|a|))\). R package vignette, https://CRAN.R-project.org/package=Rmpfr.

Nocedal, J. and Wright, S. J. (2006). *Numerical Optimization*, 2nd ed. Springer, New York.

van der Vaart, A. W. (1998). *Asymptotic Statistics*. Cambridge University Press, Cambridge.

---

**Author's address:**

J. E. Lopes
Laboratory of Statistics and Geoinformation (LEG)
Graduate Program in Numerical Methods in Engineering (PPGMNE)
Federal University of Paraná (UFPR)
Curitiba, PR, Brazil

E-mail: evandeilton@gmail.com