Two new PyPI packages
Long time without posting… Let’s try to get back on track!
I have created two simple packages implementing useful probability distributions.
The LogUniform package provides the log-uniform and the modified log-uniform distributions, while the kumaraswamy package provides an implementation of the Kumaraswamy distribution.
Both packages implement an API similar to that of the scipy.stats package.
LogUniform
The log-uniform distribution, often called the reciprocal distribution (Wikipedia) or, in some contexts, the Jeffreys prior, is commonly used as a prior for parameters which vary over several orders of magnitude.
Its probability density function (pdf) is given by
\[f( x | a,b ) = \frac{ 1 }{ x \, [ \ln( b ) - \ln( a ) ]} \quad \text{ for } a \le x \le b \text{ and } a > 0.\]
The two parameters \(a\) and \(b\) correspond to the lower and upper bounds of the support.
Here is what it looks like, for \(a=1\) and \(b=100\):

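As a quick sanity check on that formula (before using the package itself), here is a minimal sketch with numpy and matplotlib that evaluates and plots the pdf directly; the function name and plotting details are just illustrative.
import numpy as np
import matplotlib.pyplot as plt

def loguniform_pdf(x, a, b):
    """pdf of the log-uniform distribution, straight from the formula above."""
    x = np.asarray(x, dtype=float)
    pdf = 1.0 / (x * (np.log(b) - np.log(a)))
    # zero outside the support [a, b]
    return np.where((x >= a) & (x <= b), pdf, 0.0)

x = np.linspace(1, 100, 1000)
plt.plot(x, loguniform_pdf(x, a=1, b=100))
plt.xlabel('x')
plt.ylabel('pdf')
plt.show()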
The modified log-uniform distribution is a modified version of the log-uniform distribution (!), which extends its support to include \(x=0\). The pdf is
\[f( x | x_0,b ) = \frac{ 1 }{ (x + x_0) \, \ln \left( \frac{b}{x_0} + 1 \right) } \quad \text{ for } 0 \le x \le b \text{ and } 0 < x_0 < b,\]
with \(x_0\) (sometimes called the knee of the distribution) and \(b\) as parameters.
The support of this modified distribution goes from \(0\) to \(b\).
It looks like this, for \(x_0=1\) and \(b=100\):

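Again just as a sketch, the modified pdf can be written straight from the formula, and a quick numerical integration (here with scipy.integrate.quad) confirms it is finite at zero and normalized over \([0, b]\):
import numpy as np
from scipy.integrate import quad

def modified_loguniform_pdf(x, x0, b):
    """pdf of the modified log-uniform distribution, from the formula above."""
    x = np.asarray(x, dtype=float)
    pdf = 1.0 / ((x + x0) * np.log(b / x0 + 1))
    return np.where((x >= 0) & (x <= b), pdf, 0.0)

# finite at x = 0, and the pdf integrates to 1 over [0, b]
print(modified_loguniform_pdf(0, x0=1, b=100))                   # ~0.2167
print(quad(modified_loguniform_pdf, 0, 100, args=(1, 100))[0])   # ~1.0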
Ok, now some code. To use the LogUniform implementation¹, we first install the package
$ pip install LogUniform
and then use it from Python as
import loguniform
dist = loguniform.LogUniform(a=1, b=100)
The dist object has methods much like those of a scipy.stats distribution:
dist.pdf(x) # Probability density function evaluated at x
dist.cdf(x) # Cumulative distribution function (cdf) evaluated at x
dist.sf(x) # Survival function (1 - `cdf`) evaluated at x
dist.ppf(p) # Percent point function (inverse of `cdf`) evaluated at p
and a few extra properties:
>>> dist.mean
21.497576854210962
>>> dist.mode
1
>>> dist.std, dist.var
(24.96961794931886, 623.4818205349466)
>>> dist.skewness, dist.kurtosis
(1.4283873461330863, 1.0492789588735452)
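Because the API mirrors scipy.stats, the usual consistency checks between these methods are easy to run. A small sketch (assuming scalar inputs; exact return types may differ):
import numpy as np
import loguniform

dist = loguniform.LogUniform(a=1, b=100)

x = 10.0
print(np.isclose(dist.cdf(x) + dist.sf(x), 1.0))   # cdf and survival function sum to 1
print(np.isclose(dist.ppf(dist.cdf(x)), x))        # ppf is the inverse of the cdf
# the log-uniform cdf is ln(x/a) / ln(b/a)
print(np.isclose(dist.cdf(x), np.log(x / 1) / np.log(100 / 1)))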
The same thing for the modified log-uniform:
dist = loguniform.ModifiedLogUniform(knee=1, b=100)
>>> dist.pdf(0)
array(0.21667907)
>>> dist.cdf(dist.b)
1.0
>>> dist.mean
20.667906533553168
>>> dist.mode
0
>>> dist.std, dist.var
(25.210415697969275, 635.5650596644156)
>>> dist.skewness, dist.kurtosis
(1.430935496096598, 1.0573553176553114)
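The cdf of the modified distribution also has a simple closed form, \(F(x) = \ln(x/x_0 + 1) / \ln(b/x_0 + 1)\), obtained by integrating the pdf above. A quick check against the package's cdf (evaluated point by point, to avoid assuming array input):
import numpy as np
import loguniform

dist = loguniform.ModifiedLogUniform(knee=1, b=100)

def cdf_closed_form(x, x0=1, b=100):
    # closed-form cdf, from integrating the pdf above
    return np.log(x / x0 + 1) / np.log(b / x0 + 1)

for x in (0.0, 1.0, 10.0, 100.0):
    print(float(dist.cdf(x)), cdf_closed_form(x))   # the two columns should agree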
kumaraswamy
The Kumaraswamy distribution (see Wikipedia) is a little-known family of continuous probability distributions defined on the interval [0, 1]. It is interesting because of its similarity to the Beta distribution and because, unlike the Beta, its pdf and cdf have simple closed forms that are easy to evaluate.
The pdf is given by
\[f(x | a,b) = a \, b \, x^{a-1}{ (1-x^a)}^{b-1}, \quad \text{ for } 0 \le x \le 1 \text{ and } a>0, b>0,\]
with \(a\) and \(b\) the two shape parameters. Depending on the values of these parameters, the distribution can have a range of different shapes:

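To get a feel for those shapes, here is a small numpy/matplotlib sketch that evaluates the pdf formula directly, for a few illustrative \((a, b)\) pairs (the package is not needed for this):
import numpy as np
import matplotlib.pyplot as plt

def kumaraswamy_pdf(x, a, b):
    """Kumaraswamy pdf, straight from the formula above."""
    return a * b * x**(a - 1) * (1 - x**a)**(b - 1)

x = np.linspace(0.01, 0.99, 500)   # avoid the endpoints, where the pdf can diverge
for a, b in [(0.5, 0.5), (2, 2), (2, 5), (5, 1)]:
    plt.plot(x, kumaraswamy_pdf(x, a, b), label=f'a={a}, b={b}')
plt.legend()
plt.xlabel('x')
plt.ylabel('pdf')
plt.show()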
In order to use the kumaraswamy package, we first install it
$ pip install kumaraswamy
and then use it from Python, much like before
import kumaraswamy
dist = kumaraswamy.kumaraswamy(a=.5, b=.5)
with the same methods and properties now available in dist.
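For example, since the cdf has the closed form \(F(x) = 1 - (1 - x^a)^b\), it is easy to check the package against it (assuming dist.cdf behaves like the methods above):
import kumaraswamy

dist = kumaraswamy.kumaraswamy(a=.5, b=.5)

x = 0.25
print(dist.cdf(x))               # should match the closed form below
print(1 - (1 - x**0.5)**0.5)     # ~0.293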
To finish, one extra method shared by the three distributions provides random samples:
>>> dist = loguniform.LogUniform(a=1, b=100)
>>> dist.rvs(3)
array([99.88014766, 28.10705983, 34.18542689])
>>> dist = loguniform.ModifiedLogUniform(knee=1, b=100)
>>> dist.rvs(100)
array([1.30284699e+01, 3.26860325e+01, 1.10717848e+00, 8.66021941e+01,
3.66928668e+01, 6.40408882e+00, 1.50455730e+01, 3.90749222e+01,
...
>>> dist = kumaraswamy.kumaraswamy(a=.5, b=.5)
>>> dist.rvs()
0.4416781513613892
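These samples are handy for quick Monte Carlo checks, for example comparing a histogram of draws with the pdf. A sketch with the log-uniform distribution (assuming dist.pdf accepts array input; otherwise evaluate it point by point):
import numpy as np
import matplotlib.pyplot as plt
import loguniform

dist = loguniform.LogUniform(a=1, b=100)
samples = dist.rvs(100_000)

x = np.linspace(1, 100, 500)
plt.hist(samples, bins=100, density=True, alpha=0.5, label='samples')
plt.plot(x, dist.pdf(x), label='pdf')
plt.legend()
plt.show()

# the sample mean should be close to dist.mean (~21.5, see above)
print(samples.mean())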
wrap up
That’s it, two simple packages implementing three probability distributions for some of your statistical needs.
Let me know in the comments if this is helpful, wrong, or super-duper cool.
¹ Note that the log-uniform distribution is already implemented in scipy.stats.reciprocal, but not the modified distribution.