SSJ
3.3.1
Stochastic Simulation in Java
This class implements random variate generators for distributions obtained via kernel density estimation methods from a set of \(n\) individual observations \(x_1,…,x_n\) [45], [46], [89], [90], [215].
Public Member Functions | |
KernelDensityGen (RandomStream s, EmpiricalDist dist, RandomVariateGen kGen, double h) | |
Creates a new generator for a kernel density estimated from the observations given by the empirical distribution dist, using stream s to select the observations, generator kGen to generate the added noise from the kernel density, and bandwidth h. | |
KernelDensityGen (RandomStream s, EmpiricalDist dist, NormalGen kGen) | |
This constructor uses a Gaussian kernel and the default bandwidth \(h = \alpha_k h_0\) with the \(\alpha_k\) suggested in Table kernels for the Gaussian distribution. | |
Public Member Functions inherited from RandomVariateGen | |
RandomVariateGen (RandomStream s, Distribution dist) | |
Creates a new random variate generator from the distribution dist, using stream s. | |
double | nextDouble () |
Generates a random number from the continuous distribution contained in this object. | |
void | nextArrayOfDouble (double[] v, int start, int n) |
Generates n random numbers from the continuous distribution contained in this object. | |
double [] | nextArrayOfDouble (int n) |
Generates n random numbers from the continuous distribution contained in this object, and returns them in a new array of size n. | |
RandomStream | getStream () |
Returns the umontreal.ssj.rng.RandomStream used by this generator. | |
void | setStream (RandomStream stream) |
Sets the umontreal.ssj.rng.RandomStream used by this generator to stream. | |
Distribution | getDistribution () |
Returns the umontreal.ssj.probdist.Distribution used by this generator. | |
String | toString () |
Returns a String containing information about the current generator. | |
Protected Attributes | |
RandomVariateGen | kernelGen |
double | bandwidth |
boolean | positive |
Protected Attributes inherited from RandomVariateGen | |
RandomStream | stream |
Distribution | dist |
Kernel selection and parameters | |
void | setBandwidth (double h) |
Sets the bandwidth to h. | |
void | setPositiveReflection (boolean reflect) |
After this method is called with true, the generator will produce only positive values, by using the reflection method: replace all negative values by their absolute values. | |
double | nextDouble () |
static double | getBaseBandwidth (EmpiricalDist dist) |
Computes and returns the value of \(h_0\) in equation (bandwidth0). | |
This class implements random variate generators for distributions obtained via kernel density estimation methods from a set of \(n\) individual observations \(x_1,…,x_n\) [45], [46], [89], [90], [215].
The basic idea is to center a copy of the same symmetric density at each observation and take an equally weighted mixture of the \(n\) copies as an estimator of the density from which the observations come. The resulting kernel density has the general form
\[ f_n(x) = \frac{1}{nh} \sum_{i=1}^n k((x-x_i)/h), \]
where \(k\) is a fixed pre-selected density called the kernel and \(h\) is a positive constant called the bandwidth or smoothing factor. A difficult practical issue is the selection of \(k\) and \(h\). Several approaches have been proposed for that; see, e.g., [16], [32], [90], [215].
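To make the formula for \(f_n\) concrete, the following is a minimal sketch that evaluates the kernel density at a point, assuming a Gaussian kernel. The class and method names (`KernelDensitySketch`, `gaussianKernel`, `density`) are hypothetical and are not part of SSJ.

```java
/** Illustrative sketch only: evaluates f_n(x) for a Gaussian kernel (assumption). */
class KernelDensitySketch {

    /** Standard normal density, used here as the kernel k. */
    static double gaussianKernel(double u) {
        return Math.exp(-0.5 * u * u) / Math.sqrt(2.0 * Math.PI);
    }

    /** Returns f_n(x) = (1 / (n h)) * sum_i k((x - x_i) / h). */
    static double density(double[] obs, double h, double x) {
        if (obs.length == 0 || h <= 0.0) {
            throw new IllegalArgumentException("need at least one observation and h > 0");
        }
        double sum = 0.0;
        for (double xi : obs) {
            sum += gaussianKernel((x - xi) / h);  // one copy of the kernel per observation
        }
        return sum / (obs.length * h);
    }
}
```

For instance, `KernelDensitySketch.density(new double[] {1.2, 0.7, 2.5}, 0.4, 1.0)` evaluates the estimate at \(x = 1\) for three observations and bandwidth 0.4.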
The constructor of a generator from a kernel density requires a random stream \(s\), the \(n\) observations in the form of an empirical distribution, a random variate generator for the kernel density \(k\), and the value of the bandwidth \(h\). The random variates are then generated as follows: select an observation \(x_I\) at random, by inversion, using stream \(s\), then generate a random variate \(Y\) with the generator provided for the density \(k\), and return \(x_I + hY\).
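The generation steps above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the SSJ implementation: `java.util.Random` stands in for the random stream and for the kernel generator, the kernel is assumed Gaussian, and the class name `KernelDensitySamplerSketch` is hypothetical.

```java
import java.util.Random;

/** Illustrative sketch only: draws variates x_I + h*Y with a Gaussian kernel (assumption). */
class KernelDensitySamplerSketch {
    private final double[] obs;  // the n observations x_1, ..., x_n
    private final double h;      // bandwidth
    private final Random rng;    // stand-in for the random stream s and the kernel generator

    KernelDensitySamplerSketch(double[] obs, double h, Random rng) {
        if (obs.length == 0 || h <= 0.0) {
            throw new IllegalArgumentException("need at least one observation and h > 0");
        }
        this.obs = obs.clone();
        this.h = h;
        this.rng = rng;
    }

    /** Returns one variate from the kernel density. */
    double nextDouble() {
        // Select the index I by inverting the discrete uniform cdf: U in [0,1) -> floor(n U).
        int i = (int) (rng.nextDouble() * obs.length);
        // Generate the noise Y from the kernel density k (standard normal here).
        double y = rng.nextGaussian();
        return obs[i] + h * y;
    }
}
```

Each call costs one uniform draw for the observation and one draw from the kernel, independently of \(n\); the density \(f_n\) itself is never evaluated.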
A simple formula for the bandwidth, suggested in [215], [90], is \(h = \alpha_k h_0\), where
\[ h_0 = 1.36374 \min(s_n, q / 1.34) n^{-1/5}, \tag{bandwidth0} \]
\(s_n\) and \(q\) are the empirical standard deviation and the interquartile range of the \(n\) observations, and \(\alpha_k\) is a constant that depends on the type of kernel \(k\). It is defined by
\[ \alpha_k = \left(\sigma_k^{-4} \int_{-\infty}^{\infty}k^2(x)dx \right)^{1/5} \]
where \(\sigma_k\) is the standard deviation of the density \(k\). The static method getBaseBandwidth permits one to compute \(h_0\) for a given empirical distribution.
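The computation of \(h_0\) in (bandwidth0) can be sketched as below. This is a stand-alone illustration, not the code of getBaseBandwidth: the class name `BaseBandwidthSketch` is hypothetical, and both the \(n-1\) divisor for \(s_n\) and the linear-interpolation quartiles are assumptions that may differ from the conventions used by EmpiricalDist.

```java
import java.util.Arrays;

/** Illustrative sketch only: computes h_0 = 1.36374 * min(s_n, q / 1.34) * n^(-1/5). */
class BaseBandwidthSketch {

    static double baseBandwidth(double[] obs) {
        int n = obs.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        // Empirical standard deviation s_n (assumption: divisor n - 1).
        double mean = 0.0;
        for (double x : obs) mean += x;
        mean /= n;
        double ss = 0.0;
        for (double x : obs) ss += (x - mean) * (x - mean);
        double sn = Math.sqrt(ss / (n - 1));

        // Interquartile range q (assumption: quartiles by linear interpolation).
        double[] sorted = obs.clone();
        Arrays.sort(sorted);
        double q = quantile(sorted, 0.75) - quantile(sorted, 0.25);

        return 1.36374 * Math.min(sn, q / 1.34) * Math.pow(n, -0.2);
    }

    /** Quantile of sorted data by linear interpolation between order statistics. */
    static double quantile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = Math.min(lo + 1, sorted.length - 1);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }
}
```

The bandwidth for a given kernel is then \(h = \alpha_k h_0\), for example `0.7764 * BaseBandwidthSketch.baseBandwidth(obs)` for a Gaussian kernel.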
name | constructor | \(\alpha_k\) | \(\sigma_k^2\) | efficiency |
Epanechnikov | BetaSymmetricalDist(2, -1, 1) | 1.7188 | 1/5 | 1.000 |
triangular | TriangularDist(-1, 1, 0) | 1.8882 | 1/6 | 0.986 |
Gaussian | NormalDist() | 0.7764 | 1 | 0.951 |
boxcar | UniformDist(-1, 1) | 1.3510 | 1/3 | 0.930 |
logistic | LogisticDist() | 0.4340 | 3.2899 | 0.888 |
Student-t(3) | StudentDist(3) | 0.4802 | 3 | 0.674 |
Table kernels gives the precomputed values of \(\sigma_k^2\) and \(\alpha_k\) for selected (popular) kernels. The values are taken from [90]. The second column gives the constructor call (in this package) that creates the corresponding distribution. The efficiency of a kernel is defined as the ratio of its mean integrated square error over that of the Epanechnikov kernel, which has optimal efficiency and corresponds to the beta distribution with parameters \((2,2)\) over the interval \((-1,1)\).
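As a check on the table, two of the \(\alpha_k\) entries can be reproduced by hand, assuming \(\alpha_k\) is computed from the integral of the squared kernel, \(\alpha_k = (\sigma_k^{-4} \int k^2(x)dx)^{1/5}\). For the Gaussian kernel, \(\sigma_k = 1\) and
\[ \int_{-\infty}^{\infty} k^2(x)dx = \frac{1}{2\sqrt{\pi}} \approx 0.28209, \qquad \alpha_k \approx 0.28209^{1/5} \approx 0.7764. \]
For the Epanechnikov kernel \(k(x) = \frac{3}{4}(1 - x^2)\) on \((-1,1)\), \(\sigma_k^2 = 1/5\) and
\[ \int_{-1}^{1} k^2(x)dx = \frac{9}{16}\left(2 - \frac{4}{3} + \frac{2}{5}\right) = \frac{3}{5}, \qquad \alpha_k = \left(25 \cdot \frac{3}{5}\right)^{1/5} = 15^{1/5} \approx 1.7188. \]
Both values agree with the corresponding rows of the table.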
KernelDensityGen (RandomStream s, EmpiricalDist dist, NormalGen kGen)
This constructor uses a Gaussian kernel and the default bandwidth \(h = \alpha_k h_0\) with the \(\alpha_k\) suggested in Table kernels for the Gaussian distribution.
This kernel has an efficiency of 0.951.
void setPositiveReflection (boolean reflect)
After this method is called with true, the generator will produce only positive values, by using the reflection method: replace all negative values by their absolute values. That is, nextDouble will return \(|x|\) if \(x\) is the generated variate. The mechanism is disabled when the method is called with false.
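The reflection method can be illustrated with a short sketch. It assumes a Gaussian kernel and uses `java.util.Random` in place of an SSJ stream; the class name `PositiveReflectionSketch` and its method are hypothetical and only mirror the behavior described above.

```java
import java.util.Random;

/** Illustrative sketch only: kernel density variate with optional positive reflection. */
class PositiveReflectionSketch {

    /** Generates x = x_I + h*Y and returns |x| when reflect is true, x otherwise. */
    static double nextVariate(double[] obs, double h, boolean reflect, Random rng) {
        int i = (int) (rng.nextDouble() * obs.length);  // pick an observation at random
        double x = obs[i] + h * rng.nextGaussian();     // add kernel noise (Gaussian assumed)
        return reflect ? Math.abs(x) : x;               // reflect negative values
    }
}
```

Reflection is useful when the observations are known to come from a distribution on the positive half-line, since the added noise could otherwise push variates near zero below zero.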