StatsLib is a templated C++ library of statistical distribution functions, featuring unique compile-time computing capabilities and seamless integration with several popular linear algebra libraries.
Features
Functions are written in constexpr format, enabling the library to operate as both a compile-time and run-time computation engine.
Author: Keith O'Hara
Functions to compute the cdf, pdf, and quantile, as well as random sampling methods, are available for the following distributions:
git clone -b master --single-branch https://github.com/kthohr/stats ./stats
StatsLib is a header-only library. Simply add the header files to your project using:
#include "stats.hpp"
The only dependencies are a C++11-compatible compiler and the GCE-Math library, which comes pre-packaged with StatsLib.
To build the test files:
# clone stats
git clone -b master --single-branch https://github.com/kthohr/stats ./stats
# compile tests
cd ./stats/tests
./setup
cd dens
./configure && make
You can test the library online using an interactive Jupyter notebook:
The following options should be declared before including the StatsLib header files.
For inline-only functionality (i.e., no constexpr specifiers):
#define STATS_GO_INLINE
OpenMP functionality is enabled by default if the _OPENMP macro is detected (e.g., by invoking -fopenmp with a GCC or Clang compiler). To explicitly enable OpenMP features, use:
#define STATS_USE_OPENMP
To disable OpenMP functionality:
#define STATS_DONT_USE_OPENMP
To use StatsLib with Armadillo, Blaze, or Eigen:
#define STATS_USE_ARMA
#define STATS_USE_BLAZE
#define STATS_USE_EIGEN
Functions are called using an R-like syntax. Some general rules:
stats::d*: density functions. For example, the Normal (Gaussian) density is called using
stats::dnorm(<value>,<mean parameter>,<standard deviation>);
stats::p*: cumulative distribution functions. For example, the Gamma CDF is called using
stats::pgamma(<value>,<shape parameter>,<scale parameter>);
stats::q*: quantile functions. For example, the Beta quantile is called using
stats::qbeta(<value>,<a parameter>,<b parameter>);
stats::r*: random sampling. For example, to generate a single draw from the Logistic distribution:
stats::rlogis(<location parameter>,<scale parameter>,<seed value or random number engine>);
All of these functions have matrix-based equivalents using Armadillo, Blaze, and Eigen dense matrices.
// Using Armadillo:
arma::mat norm_pdf_vals = stats::dnorm(arma::ones(10,20),1.0,2.0);
The random sampling functions (stats::r*) can output random matrices of arbitrary size. For example,
// Armadillo:
arma::mat gamma_rvs = stats::rgamma<arma::mat>(100,50,3.0,2.0);
// Blaze:
blaze::DynamicMatrix<double> gamma_rvs = stats::rgamma<blaze::DynamicMatrix<double>>(100,50,3.0,2.0);
// Eigen:
Eigen::MatrixXd gamma_rvs = stats::rgamma<Eigen::MatrixXd>(100,50,3.0,2.0);
will generate a 100-by-50 matrix of iid draws from a Gamma(3,2) distribution.
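All matrix-based operations are parallelizable with OpenMP. For GCC and Clang compilers, simply include the -fopenmp option during compilation.
Random number seeding is available in two formats: seed values and random number engines.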
// using a seed value
stats::rnorm(1,2,1776);
// using a random number engine
std::mt19937_64 engine(1776);
stats::rnorm(1,2,engine);
More examples with code:
// evaluate the normal PDF at x = 1, mu = 0, sigma = 1
double dval_1 = stats::dnorm(1.0,0.0,1.0);
// evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value
double dval_2 = stats::dnorm(1.0,0.0,1.0,true);
// evaluate the normal CDF at x = 1, mu = 0, sigma = 1
double pval_1 = stats::pnorm(1.0,0.0,1.0);
// evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1
double qval_1 = stats::qlaplace(0.1,0.0,1.0);
// draw from a t-distribution with dof = 30
double rval = stats::rt(30);
// matrix output
arma::mat beta_rvs = stats::rbeta<arma::mat>(100,100,3.0,2.0);
// matrix input
arma::mat beta_cdf_vals = stats::pbeta(beta_rvs,3.0,2.0);
StatsLib is designed to operate equally well as a compile-time computation engine. Compile-time computation allows the compiler to replace function calls (e.g., dnorm(0,0,1)) with static values in the source code. That is, functions are evaluated during the compilation process, rather than at run-time. This capability is made possible by the templated constexpr design of the library and can be verified by inspecting the assembly code generated by the compiler.
The compile-time features are enabled using the constexpr specifier. The example below computes the pdf, cdf, and quantile function of the Laplace distribution:
#include "stats.hpp"

int main()
{
    constexpr double dens_1  = stats::dlaplace(1.0,1.0,2.0); // answer = 0.25
    constexpr double prob_1  = stats::plaplace(1.0,1.0,2.0); // answer = 0.5
    constexpr double quant_1 = stats::qlaplace(0.1,1.0,2.0); // answer = -2.218875...

    return 0;
}
Assembly code generated by Clang without any optimization:
LCPI0_0:
    .quad   -4611193153885729483    ## double -2.2188758248682015
LCPI0_1:
    .quad   4602678819172646912     ## double 0.5
LCPI0_2:
    .quad   4598175219545276417     ## double 0.25000000000000006
    .section    __TEXT,__text,regular,pure_instructions
    .globl  _main
    .p2align    4, 0x90
_main:                                  ## @main
    push    rbp
    mov     rbp, rsp
    xor     eax, eax
    movsd   xmm0, qword ptr [rip + LCPI0_0] ## xmm0 = mem[0],zero
    movsd   xmm1, qword ptr [rip + LCPI0_1] ## xmm1 = mem[0],zero
    movsd   xmm2, qword ptr [rip + LCPI0_2] ## xmm2 = mem[0],zero
    mov     dword ptr [rbp - 4], 0
    movsd   qword ptr [rbp - 16], xmm2
    movsd   qword ptr [rbp - 24], xmm1
    movsd   qword ptr [rbp - 32], xmm0
    pop     rbp
    ret