SSJ 3.3.1: Stochastic Simulation in Java

Goodness-of-fit test statistics

Classes:
- FBar: This class is similar to FDist, except that it provides static methods to compute or approximate the complementary distribution function of \(X\), which we define as \(\bar{F}(x) = P[X\ge x]\), instead of \(F(x) = P[X\le x]\).
- FDist: This class provides methods to compute (or approximate) the distribution functions of special types of goodness-of-fit test statistics.
- GofFormat: This class contains methods used to format results of GOF test statistics, or to apply a series of tests simultaneously and format the results.
- GofStat: This class provides methods to compute several types of EDF goodness-of-fit test statistics and to apply certain transformations to a set of observations.
- KernelDensity: This class provides static methods to compute a kernel density estimator from a set of \(n\) individual observations \(x_0, \ldots, x_{n-1}\), which define an empirical distribution.
This package contains tools for performing univariate goodness-of-fit (GOF) statistical tests. Methods for computing (or approximating) the distribution function \(F(x)\) of certain GOF test statistics, as well as their complementary distribution function \(\bar{F}(x) = 1 - F(x)\), are implemented in classes of package umontreal.ssj.probdist. Tools for computing the GOF test statistics and the corresponding \(p\)-values, and for formatting the results, are provided in the classes GofStat and GofFormat.
We are concerned here with GOF test statistics for testing the hypothesis \(\mathcal{H}_0\) that a sample of \(N\) observations \(X_1,\ldots,X_N\) comes from a given univariate probability distribution \(F\). We consider tests such as those of Kolmogorov-Smirnov, Anderson-Darling, Cramér-von Mises, etc. These test statistics generally measure, in different ways, the distance between a continuous cumulative distribution function (cdf) \(F\) and the corresponding empirical distribution function (EDF) \(\hat{F}_N\) of \(X_1,\ldots,X_N\). They are also called EDF test statistics. The observations \(X_i\) are usually transformed into \(U_i = F(X_i)\), which satisfy \(0\le U_i\le 1\) and follow the \(U(0,1)\) distribution under \(\mathcal{H}_0\). (This is called the probability integral transformation.) Methods for applying this transformation, as well as other types of transformations, to the observations \(X_i\) or \(U_i\) are provided in umontreal.ssj.gof.GofStat.
The GOF tests are then applied to the \(U_i\) sorted in increasing order. The corresponding \(p\)-values are easily computed by calling the appropriate methods in the classes of package umontreal.ssj.probdist. If a GOF test statistic \(Y\) has a continuous distribution under \(\mathcal{H}_0\) and takes the value \(y\), its (right) \(p\)-value is defined as \(p = P[Y \ge y \mid \mathcal{H}_0]\). The test usually rejects \(\mathcal{H}_0\) if \(p\) is deemed too close to 0 (for a one-sided test) or too close to 0 or 1 (for a two-sided test).
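As an illustration, here is a minimal sketch (not taken from SSJ itself) of this workflow for a two-sided Kolmogorov-Smirnov test of the hypothesis that the data come from the Exponential(1) distribution. The data are made up, and the static signatures ExponentialDist.cdf(lambda, x) and KolmogorovSmirnovDist.barF(n, x) are assumptions about the umontreal.ssj.probdist API:

```java
import java.util.Arrays;
import umontreal.ssj.probdist.ExponentialDist;
import umontreal.ssj.probdist.KolmogorovSmirnovDist;

public class KsExample {
   public static void main(String[] args) {
      // Made-up sample of N = 8 observations.
      double[] x = { 0.21, 1.97, 0.37, 0.04, 0.91, 1.32, 0.52, 2.40 };
      int n = x.length;

      // Probability integral transformation: U_i = F(X_i),
      // here with F the Exponential(1) cdf under H0.
      double[] u = new double[n];
      for (int i = 0; i < n; i++)
         u[i] = ExponentialDist.cdf(1.0, x[i]);
      Arrays.sort(u);

      // Two-sided KS statistic D_N = max(D_N^+, D_N^-), computed
      // from the sorted U_i.
      double dPlus = 0.0, dMinus = 0.0;
      for (int i = 0; i < n; i++) {
         dPlus  = Math.max(dPlus,  (i + 1.0) / n - u[i]);
         dMinus = Math.max(dMinus, u[i] - (double) i / n);
      }
      double d = Math.max(dPlus, dMinus);

      // Right p-value p = P[D_N >= d | H0]; barF is assumed here to
      // be the complementary cdf of the KS distribution in probdist.
      double p = KolmogorovSmirnovDist.barF(n, d);
      System.out.printf("D_N = %.4f, p-value = %.4f%n", d, p);
   }
}
```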
In the case where \(Y\) has a discrete distribution under \(\mathcal{H}_0\), we distinguish the right \(p\)-value \(p_R = P[Y \ge y \mid\mathcal{H}_0]\) and the left \(p\)-value \(p_L = P[Y \le y \mid\mathcal{H}_0]\). We then define the \(p\)-value for a two-sided test as
\begin{align} p = \left\{ \begin{array}{ll} p_R, & \mbox{if } p_R < p_L, \\ 1 - p_L, & \mbox{if } p_R \ge p_L \mbox{ and } p_L < 0.5, \\ 0.5, & \mbox{otherwise.} \end{array} \right. \tag{pdisc} \end{align}
Why such a definition? Consider for example a Poisson random variable \(Y\) with mean 1 under \(\mathcal{H}_0\). If \(Y\) takes the value 0, the right \(p\)-value is \(p_R = P[Y \ge 0 \mid \mathcal{H}_0] = 1\). For a statistic with a continuous distribution, a \(p\)-value this close to 1 would obviously lead to rejecting \(\mathcal{H}_0\). However, \(P[Y = 0 \mid \mathcal{H}_0] = 1/e \approx 0.368\), so it does not really make sense to reject \(\mathcal{H}_0\) in this case. In fact, the left \(p\)-value here is \(p_L \approx 0.368\), and the \(p\)-value computed with the above definition is \(p = 1 - p_L \approx 0.632\). Note that with this definition, if \(p_L\) is very small, \(p\) becomes close to 1. If the left \(p\)-value were instead defined as \(p_L = 1 - p_R = P[Y < y \mid \mathcal{H}_0]\), this would also lead to problems: in the example, one would then have \(p_L = 0\).
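Rule (pdisc) is straightforward to code. The helper below is a hypothetical plain-Java illustration (not an SSJ method), applied to the Poisson example above:

```java
public class PDiscExample {
   // Hypothetical helper implementing rule (pdisc): the two-sided
   // p-value for a discrete statistic, given left and right p-values.
   static double pDisc(double pL, double pR) {
      if (pR < pL)
         return pR;           // value y in the right tail
      else if (pL < 0.5)
         return 1.0 - pL;     // value y in the left tail; p near 1
      else
         return 0.5;          // y near the center: no evidence either way
   }

   public static void main(String[] args) {
      // Poisson(1) example from the text: y = 0 gives
      // pR = P[Y >= 0] = 1 and pL = P[Y <= 0] = 1/e.
      double pL = Math.exp(-1.0), pR = 1.0;
      System.out.printf("p = %.3f%n", pDisc(pL, pR));  // prints p = 0.632
   }
}
```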
A very common type of test in the discrete case is the chi-square test, which applies when the possible outcomes are partitioned into a finite number of categories. Suppose there are \(k\) categories and that each observation belongs to category \(i\) with probability \(p_i\), for \(0\le i < k\). If there are \(n\) independent observations, the expected number of observations in category \(i\) is \(e_i = n p_i\), and the chi-square test statistic is defined as
\[ X^2 = \sum_{i=0}^{k-1} \frac{(o_i - e_i)^2}{e_i} \tag{chi-square0} \]
where \(o_i\) is the observed number of observations in category \(i\). Assuming that all the \(e_i\)'s are large enough (a popular rule of thumb requires \(e_i \ge 5\) for each \(i\)), \(X^2\) follows approximately the chi-square distribution with \(k-1\) degrees of freedom [207]. The nested class GofStat.OutcomeCategoriesChi2 provides tools to regroup categories automatically when some of the \(e_i\)'s are too small.
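As a concrete check of formula (chi-square0), here is a minimal sketch in plain Java; the counts and probabilities are made up for illustration:

```java
public class ChiSquareExample {
   public static void main(String[] args) {
      int n = 100;                            // number of observations (made up)
      double[] p = { 0.2, 0.3, 0.5 };         // category probabilities p_i (made up)
      int[] o    = { 15, 34, 51 };            // observed counts o_i (made up)

      double x2 = 0.0;
      for (int i = 0; i < p.length; i++) {
         double e = n * p[i];                 // expected count e_i = n p_i
         x2 += (o[i] - e) * (o[i] - e) / e;   // sum of (o_i - e_i)^2 / e_i
      }
      // Under H0, X^2 is approximately chi-square with k - 1 = 2 degrees
      // of freedom; its p-value would be obtained from the chi-square
      // distribution class in umontreal.ssj.probdist.
      System.out.printf("X^2 = %.4f%n", x2);  // prints X^2 = 1.8033
   }
}
```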
The class GofFormat contains methods used to format results of GOF test statistics, or to apply several such tests simultaneously to a given data set and format the results into a report that also contains the \(p\)-values of all these tests. A C version of this class is used extensively in the package TestU01, which applies statistical tests to random number generators [133]. The class also provides tools to plot an empirical or theoretical distribution function, by creating a data file in a format compatible with a given plotting software.