SSJ API Documentation
Stochastic Simulation in Java
umontreal.ssj.gof.GofFormat Class Reference

This class contains methods used to format results of GOF test statistics, or to apply a series of tests simultaneously and format the results.

Plotting distribution functions

static final int GNUPLOT = 0
 Data file format used for plotting functions with Gnuplot.
static final int MATHEMATICA = 1
 Data file format used for creating graphics with Mathematica.
static int graphSoft = GNUPLOT
 Environment variable that selects the type of software to be used for plotting the graphs of functions.
static String drawCdf (ContinuousDistribution dist, double a, double b, int m, String desc)
 Formats data to plot the graph of the distribution function \(F\) over the interval \([a,b]\), and returns the result as a String.
static String drawDensity (ContinuousDistribution dist, double a, double b, int m, String desc)
 Formats data to plot the graph of the density \(f(x)\) over the interval \([a,b]\), and returns the result as a String.
static String graphDistUnif (DoubleArrayList data, String desc)
 Formats data to plot the empirical distribution of \(U_{(1)},…,U_{(N)}\), which are assumed to be in data[0...N-1], and to compare it with the uniform distribution.

Computing and printing \(p\)-values for EDF test statistics

static double EPSILONP = 1.0E-15
 Environment variable used in formatp0 to determine which \(p\)-values are too close to 0 or 1 to be printed explicitly.
static double SUSPECTP = 0.01
 Environment variable used in formatp1 to determine which \(p\)-values should be marked as suspect when printing test results.
static String formatp0 (double p)
 Returns the \(p\)-value \(p\) of a test, in the format "\(1-p\)" if \(p\) is close to 1, and \(p\) otherwise.
static String formatp1 (double p)
 Returns the string "p-value of test : ", then calls formatp0 to print \(p\), and adds the marker "****" if \(p\) is considered suspect.
static String formatp2 (double x, double p)
 Returns x on a single line, then goes to the next line and calls formatp1.
static String formatp3 (String testName, double x, double p)
 Formats the test statistic x for a test named testName with \(p\)-value p.
static String formatChi2 (int k, int d, double chi2)
 Computes the \(p\)-value of the chi-square statistic chi2 for a test with k intervals.
static String formatKS (int n, double dp, double dm, double d)
 Computes the \(p\)-values of the three Kolmogorov-Smirnov statistics \(D_N^+\), \(D_N^-\), and \(D_N\), whose values are in dp, dm, d, respectively, assuming a sample of size n.
static String formatKS (DoubleArrayList data, ContinuousDistribution dist)
 Computes the KS test statistics to compare the empirical distribution of the observations in data with the theoretical distribution dist and formats the results.
static String formatKSJumpOne (int n, double a, double dp)
 Similar to formatKS(int,double,double,double), but for the KS statistic \(D_N^+(a)\) defined in ( KSPlusJumpOne ).
static String formatKSJumpOne (DoubleArrayList data, ContinuousDistribution dist, double a)
 Similar to formatKS(DoubleArrayList,ContinuousDistribution), but for \(D_N^+(a)\) defined in ( KSPlusJumpOne ).

Applying several tests at once and printing results

  Higher-level tools for applying several EDF goodness-of-fit tests
  simultaneously are offered here. The environment variable `activeTests`
  specifies which tests in this list are to be performed when asking for
  several simultaneous tests via the functions `activeTests`,
  `formatActiveTests`, etc.
static final int KSP = 0
 Kolmogorov-Smirnov+ test.
static final int KSM = 1
 Kolmogorov-Smirnov \(-\) test.
static final int KS = 2
 Kolmogorov-Smirnov test.
static final int AD = 3
 Anderson-Darling test.
static final int CM = 4
 Cramér-von Mises test.
static final int WG = 5
 Watson G test.
static final int WU = 6
 Watson U test.
static final int MEAN = 7
 Mean.
static final int COR = 8
 Correlation.
static final int NTESTTYPES = 9
 Total number of test types.
static final String[] TESTNAMES
 Name of each test type.
static boolean[] activeTests = null
 The set of EDF tests that are to be performed when calling the methods activeTests, formatActiveTests, etc.
static void tests (DoubleArrayList sortedData, double[] sVal)
 Computes all EDF test statistics enumerated above (except COR) to compare the empirical distribution of \(U_{(0)},…,U_{(N-1)}\) with the uniform distribution, assuming that these sorted observations are in sortedData.
static void tests (DoubleArrayList data, ContinuousDistribution dist, double[] sVal)
 The observations \(V\) are in data, not necessarily sorted, and their empirical distribution is compared with the continuous distribution dist.
static void activeTests (DoubleArrayList sortedData, double[] sVal, double[] pVal)
 Computes the EDF test statistics by calling tests(DoubleArrayList,double[]), then computes the \(p\)-values of those that currently belong to activeTests, and returns these quantities in sVal and pVal, respectively.
static void activeTests (DoubleArrayList data, ContinuousDistribution dist, double[] sVal, double[] pVal)
 The observations are in data, not necessarily sorted, and we want to compare their empirical distribution with the distribution dist.
static String formatActiveTests (int n, double[] sVal, double[] pVal)
 Gets the \(p\)-values of the active EDF test statistics, which are in activeTests.
static String iterSpacingsTests (DoubleArrayList sortedData, int k, boolean printval, boolean graph, PrintWriter f)
 Repeats the following k times: Applies the GofStat.iterateSpacings transformation to the \(U_{(0)},…,U_{(N-1)}\), assuming that these observations are in sortedData, then computes the EDF test statistics and calls activeTests(DoubleArrayList,double[],double[]) after each transformation.
static String iterPowRatioTests (DoubleArrayList sortedData, int k, boolean printval, boolean graph, PrintWriter f)
 Similar to iterSpacingsTests, but with the GofStat.powerRatios transformation.

Detailed Description

This class contains methods used to format results of GOF test statistics, or to apply a series of tests simultaneously and format the results.

It is in fact a translation from C to Java of a set of functions that were specially written for the implementation of TestU01, a software package for testing uniform random number generators [128].

Strictly speaking, applying several tests simultaneously makes the \(p\)-values "invalid" in the sense that the probability of having at least one \(p\)-value less than 0.01, say, is larger than 0.01. One must therefore be careful with the interpretation of these \(p\)-values (one could use, e.g., the Bonferroni inequality [114]). Applying simultaneous tests is convenient in some situations, such as in screening experiments for detecting statistical deficiencies in random number generators. In that context, rejection of the null hypothesis typically occurs with extremely small \(p\)-values (e.g., less than \(10^{-15}\)), and the interpretation is quite obvious in this case.
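
The Bonferroni remark can be illustrated with a short sketch. Assuming that several tests are applied to the same sample and that an overall level alpha is wanted, each individual \(p\)-value is compared with alpha divided by the number of tests. The class and method names below are hypothetical and are not part of SSJ.

```java
// Hypothetical helper, not part of SSJ: Bonferroni-corrected screening of several p-values.
class BonferroniSketch {

   // Per-test level that keeps the probability of at least one false rejection at most alpha.
   static double perTestLevel(double alpha, int numTests) {
      return alpha / numTests;
   }

   // True if at least one p-value falls below the corrected level.
   static boolean anyRejected(double[] pValues, double alpha) {
      double level = perTestLevel(alpha, pValues.length);
      for (double p : pValues) {
         if (p < level) {
            return true;
         }
      }
      return false;
   }

   public static void main(String[] args) {
      double[] pValues = {0.004, 0.30, 0.75, 0.52};
      // 0.004 < 0.01, but 0.004 > 0.01 / 4 = 0.0025, so nothing is rejected.
      System.out.println(anyRejected(pValues, 0.01)); // prints false
   }
}
```

The correction is conservative: it guards against false alarms at the cost of power, which matters little in screening experiments where failures show up as extremely small \(p\)-values.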

The class also provides tools to plot an empirical or theoretical distribution function, by creating a data file that contains a graphic plot in a format compatible with the software specified by the environment variable graphSoft. NOTE: see also the more recent package umontreal.ssj.charts.

 Note: This class uses the Colt library.

Definition at line 64 of file GofFormat.java.

Member Function Documentation

◆ activeTests() [1/2]

void umontreal.ssj.gof.GofFormat.activeTests ( DoubleArrayList data,
ContinuousDistribution dist,
double[] sVal,
double[] pVal )
static

The observations are in data, not necessarily sorted, and we want to compare their empirical distribution with the distribution dist.

If \(N = 1\), only puts data.get(0) in sVal[MEAN], and \(1 - {}\)dist.cdf(data.get(0)) in sVal[KSP], pVal[KSP], and pVal[MEAN].

Parameters
data: array of observations to test
dist: assumed distribution of the observations
sVal: array that will be filled with the results of the tests
pVal: array that will be filled with the \(p\)-values

Definition at line 809 of file GofFormat.java.

◆ activeTests() [2/2]

void umontreal.ssj.gof.GofFormat.activeTests ( DoubleArrayList sortedData,
double[] sVal,
double[] pVal )
static

Computes the EDF test statistics by calling tests(DoubleArrayList,double[]), then computes the \(p\)-values of those that currently belong to activeTests, and returns these quantities in sVal and pVal, respectively.

Assumes that \(U_{(0)},…,U_{(N-1)}\) are in sortedData and that we want to compare their empirical distribution with the uniform distribution. If \(N = 1\), only puts \(1 - {}\)sortedData.get (0) in sVal[KSP], pVal[KSP], and pVal[MEAN].

Parameters
sortedData: array of sorted observations
sVal: array that will be filled with the results of the tests
pVal: array that will be filled with the \(p\)-values

Definition at line 749 of file GofFormat.java.

◆ drawCdf()

String umontreal.ssj.gof.GofFormat.drawCdf ( ContinuousDistribution dist,
double a,
double b,
int m,
String desc )
static

Formats data to plot the graph of the distribution function \(F\) over the interval \([a,b]\), and returns the result as a String.

The method dist.cdf(x) returns the value of \(F\) at \(x\). The String desc gives a short caption for the graphic plot. The method computes the \(m+1\) points \((x_i, F(x_i))\), where \(x_i = a + i (b-a)/m\) for \(i=0,1,…,m\), and formats these points into a String in a format suitable for the software specified by graphSoft. NOTE: see also the more recent class umontreal.ssj.charts.ContinuousDistChart.

Parameters
dist: continuous distribution function to plot
a: lower bound of the interval to plot
b: upper bound of the interval to plot
m: number of points in the plot minus one
desc: short caption describing the plot
Returns
a string representation of the plot data

Definition at line 210 of file GofFormat.java.
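
The grid described for drawCdf can be sketched as follows. This is a minimal illustration, not the SSJ implementation: a DoubleUnaryOperator stands in for dist.cdf(x), the class name is hypothetical, and the "x y" line layout is only an assumed plain-text format of the kind Gnuplot can read.

```java
import java.util.Locale;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch, not the SSJ implementation: tabulates (x_i, F(x_i)) on [a, b].
class CdfGridSketch {

   // Returns m + 1 lines "x_i F(x_i)" with x_i = a + i (b - a) / m, for i = 0, ..., m.
   static String tabulate(DoubleUnaryOperator cdf, double a, double b, int m) {
      StringBuilder sb = new StringBuilder();
      double h = (b - a) / m;
      for (int i = 0; i <= m; i++) {
         double x = a + i * h;
         sb.append(String.format(Locale.ROOT, "%.6f %.6f%n", x, cdf.applyAsDouble(x)));
      }
      return sb.toString();
   }

   public static void main(String[] args) {
      // CDF of the exponential distribution with rate 1, standing in for dist.cdf(x).
      DoubleUnaryOperator expCdf = x -> 1.0 - Math.exp(-x);
      System.out.print(tabulate(expCdf, 0.0, 2.0, 4));
   }
}
```

The same loop applies to drawDensity if the operator returns the density \(f(x)\) instead of \(F(x)\).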

◆ drawDensity()

String umontreal.ssj.gof.GofFormat.drawDensity ( ContinuousDistribution dist,
double a,
double b,
int m,
String desc )
static

Formats data to plot the graph of the density \(f(x)\) over the interval \([a,b]\), and returns the result as a String.

The method dist.density(x) returns the value of \(f\) at \(x\). The String desc gives a short caption for the graphic plot. The method computes the \(m+1\) points \((x_i, f(x_i))\), where \(x_i = a + i (b-a)/m\) for \(i=0,1,…,m\), and formats these points into a String in a format suitable for the software specified by graphSoft. NOTE: see also the more recent class umontreal.ssj.charts.ContinuousDistChart.

Parameters
dist: continuous density function to plot
a: lower bound of the interval to plot
b: upper bound of the interval to plot
m: number of points in the plot minus one
desc: short caption describing the plot
Returns
a string representation of the plot data

Definition at line 232 of file GofFormat.java.

◆ formatActiveTests()

String umontreal.ssj.gof.GofFormat.formatActiveTests ( int n,
double[] sVal,
double[] pVal )
static

Gets the \(p\)-values of the active EDF test statistics, which are in activeTests.

It is assumed that the values of these statistics and their \(p\)-values are already computed, in sVal and pVal, and that the sample size is n. Each statistic and its \(p\)-value are formatted using formatp2. If n=1, prints only pVal[KSP] using formatp1.

Parameters
n: sample size
sVal: array containing the results of the tests
pVal: array containing the \(p\)-values
Returns
the results formatted as a string

Definition at line 837 of file GofFormat.java.

◆ formatChi2()

String umontreal.ssj.gof.GofFormat.formatChi2 ( int k,
int d,
double chi2 )
static

Computes the \(p\)-value of the chi-square statistic chi2 for a test with k intervals.

Uses \(d\) decimal digits of precision in the calculations. The result of the test is returned as a string. The \(p\)-value is computed using GofStat.pDisc.

Parameters
k: number of subintervals for the chi-square test
d: number of decimal digits of precision used in the calculations
chi2: chi-square statistic
Returns
the string representation of the test result and \(p\)-value

Definition at line 448 of file GofFormat.java.

◆ formatKS() [1/2]

String umontreal.ssj.gof.GofFormat.formatKS ( DoubleArrayList data,
ContinuousDistribution dist )
static

Computes the KS test statistics to compare the empirical distribution of the observations in data with the theoretical distribution dist and formats the results.

See also method kolmogorovSmirnov(double[],ContinuousDistribution,double[],double[]).

Parameters
data: array of observations to be tested
dist: assumed distribution of the observations
Returns
the string representation of the Kolmogorov-Smirnov statistics and their p-values

Definition at line 490 of file GofFormat.java.

◆ formatKS() [2/2]

String umontreal.ssj.gof.GofFormat.formatKS ( int n,
double dp,
double dm,
double d )
static

Computes the \(p\)-values of the three Kolmogorov-Smirnov statistics \(D_N^+\), \(D_N^-\), and \(D_N\), whose values are in dp, dm, d, respectively, assuming a sample of size n.

Then formats these statistics and their \(p\)-values using formatp2 for each one.

Parameters
n: sample size
dp: value of the \(D_N^+\) statistic
dm: value of the \(D_N^-\) statistic
d: value of the \(D_N\) statistic
Returns
the string representation of the Kolmogorov-Smirnov statistics and their p-values

Definition at line 470 of file GofFormat.java.

◆ formatKSJumpOne() [1/2]

String umontreal.ssj.gof.GofFormat.formatKSJumpOne ( DoubleArrayList data,
ContinuousDistribution dist,
double a )
static

Similar to formatKS(DoubleArrayList,ContinuousDistribution), but for \(D_N^+(a)\) defined in ( KSPlusJumpOne ).

Parameters
data: array of observations to be tested
dist: assumed distribution of the data
a: size of the jump
Returns
string representation of the Kolmogorov-Smirnov statistic and its p-value

Definition at line 531 of file GofFormat.java.

◆ formatKSJumpOne() [2/2]

String umontreal.ssj.gof.GofFormat.formatKSJumpOne ( int n,
double a,
double dp )
static

Similar to formatKS(int,double,double,double), but for the KS statistic \(D_N^+(a)\) defined in ( KSPlusJumpOne ).

Writes a header, computes the \(p\)-value and calls formatp2.

Parameters
n: sample size
a: size of the jump
dp: value of \(D_N^+(a)\)
Returns
the string representation of the Kolmogorov-Smirnov statistic and its p-value

Definition at line 513 of file GofFormat.java.

◆ formatp0()

String umontreal.ssj.gof.GofFormat.formatp0 ( double p)
static

Returns the \(p\)-value \(p\) of a test, in the format "\(1-p\)" if \(p\) is close to 1, and \(p\) otherwise.

Uses the environment variable EPSILONP and replaces \(p\) by \(\epsilon\) when it is too small.

Parameters
p: the \(p\)-value to be formatted
Returns
the string representation of the \(p\)-value

Definition at line 368 of file GofFormat.java.
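
The rule applied by formatp0 can be sketched as below. This is a hypothetical illustration, not the SSJ source: the class name, the cut-off NEAR_ONE used for "close to 1", and the number format are assumptions. Only the EPSILONP default and the "eps" and "1-eps" strings come from this page.

```java
import java.util.Locale;

// Hypothetical sketch of the formatting rule, not the SSJ source code.
class FormatP0Sketch {
   static final double EPSILONP = 1.0e-15; // same default as GofFormat.EPSILONP
   static final double NEAR_ONE = 0.99;    // assumed cut-off for "close to 1"

   static String formatP(double p) {
      if (p < EPSILONP) {
         return "eps";                      // too small to be written explicitly
      }
      if (p > 1.0 - EPSILONP) {
         return "1-eps";                    // too close to 1 to be written explicitly
      }
      if (p > NEAR_ONE) {
         // Print the complement so that the digits near 1 are not lost.
         return "1 - " + String.format(Locale.ROOT, "%.2e", 1.0 - p);
      }
      return String.format(Locale.ROOT, "%.2e", p);
   }

   public static void main(String[] args) {
      for (double p : new double[] {1.0e-20, 0.5, 0.995, 1.0}) {
         System.out.println(formatP(p));
      }
   }
}
```

Writing \(1-p\) instead of \(p\) near 1 keeps the informative digits visible, since a value such as 0.999999 would otherwise round to 1 in a short format.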

◆ formatp1()

String umontreal.ssj.gof.GofFormat.formatp1 ( double p)
static

Returns the string "p-value of test : ", then calls formatp0 to print \(p\), and adds the marker "****" if \(p\) is considered suspect (uses the environment variable SUSPECTP for this).

Parameters
p: the \(p\)-value to be formatted
Returns
the string representation of the p-value of test

Definition at line 391 of file GofFormat.java.
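
A minimal sketch of the suspect-marking rule follows, assuming the two-sided criterion given for SUSPECTP (a \(p\)-value below SUSPECTP or above 1 minus SUSPECTP). The class name and the exact spacing of the output are hypothetical; the real method delegates the number formatting to formatp0.

```java
// Hypothetical sketch of the suspect-marking rule, not the SSJ source code.
class SuspectMarkerSketch {
   static final double SUSPECTP = 0.01; // same default as GofFormat.SUSPECTP

   // A p-value is suspect when it is too close to 0 or too close to 1.
   static boolean isSuspect(double p) {
      return p < SUSPECTP || p > 1.0 - SUSPECTP;
   }

   // Builds the result line and appends the marker for suspect p-values.
   static String formatLine(double p) {
      String line = "p-value of test : " + p;
      return isSuspect(p) ? line + "    ****" : line;
   }

   public static void main(String[] args) {
      System.out.println(formatLine(0.5));
      System.out.println(formatLine(0.001));
   }
}
```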

◆ formatp2()

String umontreal.ssj.gof.GofFormat.formatp2 ( double x,
double p )
static

Returns x on a single line, then goes to the next line and calls formatp1.

Parameters
x: value of the statistic for which the p-value is formatted
p: the \(p\)-value to be formatted
Returns
the string representation of the p-value of test

Definition at line 409 of file GofFormat.java.

◆ formatp3()

String umontreal.ssj.gof.GofFormat.formatp3 ( String testName,
double x,
double p )
static

Formats the test statistic x for a test named testName with \(p\)-value p. The first line of the returned string contains the name of the test and the statistic, whereas the second line contains its p-value. The formatted values of x and p are aligned.

Parameters
testName: name of the test that was performed
x: value of the test statistic
p: \(p\)-value of the test
Returns
the string representation of the test result

Definition at line 425 of file GofFormat.java.

◆ graphDistUnif()

String umontreal.ssj.gof.GofFormat.graphDistUnif ( DoubleArrayList data,
String desc )
static

Formats data to plot the empirical distribution of \(U_{(1)},…,U_{(N)}\), which are assumed to be in data[0...N-1], and to compare it with the uniform distribution.

The \(U_{(i)}\) must be sorted. The two endpoints \((0, 0)\) and \((1, 1)\) are always included in the plot. The string desc gives a short caption for the graphic plot. The data is printed in a format suitable for the software specified by graphSoft. NOTE: see also the more recent class umontreal.ssj.charts.EmpiricalChart.

Parameters
data: array of observations to plot
desc: short caption describing the plot
Returns
a string representation of the plot data

Definition at line 294 of file GofFormat.java.
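
The points of such an empirical distribution plot can be sketched as follows, assuming the sorted observations lie in \([0,1]\) and that the \(i\)-th smallest one is paired with the height \(i/N\). The class name and the "x y" line layout are hypothetical; the actual output format of graphDistUnif depends on graphSoft.

```java
import java.util.Locale;

// Hypothetical sketch, not the SSJ implementation: points of an empirical CDF plot.
class EmpiricalPlotSketch {

   // sorted holds U_(1) <= ... <= U_(N) in sorted[0..N-1], all in [0, 1].
   static String plotData(double[] sorted) {
      int n = sorted.length;
      StringBuilder sb = new StringBuilder();
      // The endpoint (0, 0) is always included.
      sb.append(String.format(Locale.ROOT, "%.6f %.6f%n", 0.0, 0.0));
      for (int i = 0; i < n; i++) {
         // The empirical distribution jumps to (i + 1) / N at the (i + 1)-th smallest value.
         sb.append(String.format(Locale.ROOT, "%.6f %.6f%n", sorted[i], (i + 1.0) / n));
      }
      // The endpoint (1, 1) is always included.
      sb.append(String.format(Locale.ROOT, "%.6f %.6f%n", 1.0, 1.0));
      return sb.toString();
   }

   public static void main(String[] args) {
      System.out.print(plotData(new double[] {0.1, 0.4, 0.8}));
   }
}
```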

◆ iterPowRatioTests()

String umontreal.ssj.gof.GofFormat.iterPowRatioTests ( DoubleArrayList sortedData,
int k,
boolean printval,
boolean graph,
PrintWriter f )
static

Similar to iterSpacingsTests, but with the GofStat.powerRatios transformation.

Parameters
sortedData: array containing the sorted observations
k: number of times the tests are applied
printval: if true, stores all the values of the observations at each iteration
graph: if true, the distribution of the \(U_i\) will be plotted after each iteration
f: stream where the plots are written to
Returns
a string representation of the test results

Definition at line 937 of file GofFormat.java.

◆ iterSpacingsTests()

String umontreal.ssj.gof.GofFormat.iterSpacingsTests ( DoubleArrayList sortedData,
int k,
boolean printval,
boolean graph,
PrintWriter f )
static

Repeats the following k times: Applies the GofStat.iterateSpacings transformation to the \(U_{(0)},…,U_{(N-1)}\), assuming that these observations are in sortedData, then computes the EDF test statistics and calls activeTests(DoubleArrayList,double[],double[]) after each transformation.

The original array sortedData is left unchanged (the transformations are applied on a copy of sortedData). If printval = true, stores all the values into the returned String after each iteration. If graph = true, calls graphDistUnif after each iteration to print to stream f the data for plotting the distribution function of the \(U_i\).

Parameters
sortedData: array containing the sorted observations
k: number of times the tests are applied
printval: if true, stores all the values of the observations at each iteration
graph: if true, the distribution of the \(U_i\) will be plotted after each iteration
f: stream where the plots are written to
Returns
a string representation of the test results

Definition at line 887 of file GofFormat.java.

◆ tests() [1/2]

void umontreal.ssj.gof.GofFormat.tests ( DoubleArrayList data,
ContinuousDistribution dist,
double[] sVal )
static

The observations \(V\) are in data, not necessarily sorted, and their empirical distribution is compared with the continuous distribution dist.

If \(N = 1\), only puts data.get (0) in sVal[MEAN], and \(1 - {}\)dist.cdf (data.get (0)) in sVal[KSP].

Parameters
data: array of observations to test
dist: assumed distribution of the observations
sVal: array that will be filled with the results of the tests

Definition at line 720 of file GofFormat.java.

◆ tests() [2/2]

void umontreal.ssj.gof.GofFormat.tests ( DoubleArrayList sortedData,
double[] sVal )
static

Computes all EDF test statistics enumerated above (except COR) to compare the empirical distribution of \(U_{(0)},…,U_{(N-1)}\) with the uniform distribution, assuming that these sorted observations are in sortedData.

If \(N > 1\), returns sVal with the values of the KS statistics \(D_N^+\), \(D_N^-\) and \(D_N\), of the Cramér-von Mises statistic \(W_N^2\), Watson’s \(G_N\) and \(U_N^2\), Anderson-Darling’s \(A_N^2\), and the average of the \(U_i\)’s, stored at the indices KSP, KSM, KS, CM, WG, WU, AD, and MEAN, respectively. If \(N = 1\), only puts \(1 - {}\)sortedData.get (0) in sVal[KSP]. Calling this method is more efficient than computing these statistics separately by calling the corresponding methods in GofStat.

Parameters
sortedData: array of sorted observations
sVal: array that will be filled with the results of the tests

Definition at line 652 of file GofFormat.java.
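
The efficiency remark for tests can be illustrated with a one-pass sketch that accumulates several statistics in the same loop. It uses the textbook definitions \(D_N^+ = \max_j (j/N - U_{(j)})\), \(D_N^- = \max_j (U_{(j)} - (j-1)/N)\) and \(D_N = \max(D_N^+, D_N^-)\) for sorted observations in \([0,1]\), and covers only the KS statistics and the mean. The class name and the layout of the returned array are hypothetical and do not follow the sVal indexing of SSJ.

```java
// Hypothetical sketch, not the SSJ implementation: several EDF statistics in one pass.
class EdfStatsSketch {

   // Returns {D+, D-, D, mean} for sorted observations u[0] <= ... <= u[N-1] in [0, 1].
   static double[] ksAndMean(double[] u) {
      int n = u.length;
      double dPlus = 0.0;
      double dMinus = 0.0;
      double sum = 0.0;
      for (int i = 0; i < n; i++) {
         // With 0-based i, the empirical CDF is (i + 1) / N just after u[i] and i / N just before.
         dPlus = Math.max(dPlus, (i + 1.0) / n - u[i]);
         dMinus = Math.max(dMinus, u[i] - (double) i / n);
         sum += u[i];
      }
      return new double[] {dPlus, dMinus, Math.max(dPlus, dMinus), sum / n};
   }

   public static void main(String[] args) {
      double[] stats = ksAndMean(new double[] {0.25, 0.5, 0.75});
      System.out.println(java.util.Arrays.toString(stats));
   }
}
```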

Member Data Documentation

◆ activeTests

boolean [] umontreal.ssj.gof.GofFormat.activeTests = null
static

The set of EDF tests that are to be performed when calling the methods activeTests, formatActiveTests, etc.

By default, this set contains KSP, KSM, and AD. Note: MEAN and COR are always excluded from this set of active tests. The valid indices for this array are KSP, KSM, KS, AD, CM, WG, WU, MEAN, and COR.
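
The indexing scheme can be sketched with a stand-alone class that mirrors the constants listed on this page. The class and its helper methods are hypothetical; with the real class one would presumably read or set entries of GofFormat.activeTests using the same constants.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone mirror of the GofFormat constants, for illustration only.
class ActiveTestsSketch {
   static final int KSP = 0, KSM = 1, KS = 2, AD = 3, CM = 4, WG = 5, WU = 6, MEAN = 7, COR = 8;
   static final int NTESTTYPES = 9;
   static final String[] TESTNAMES = {
      "KolmogorovSmirnovPlus", "KolmogorovSmirnovMinus", "KolmogorovSmirnov",
      "Anderson-Darling", "CramerVon-Mises", "Watson G", "Watson U", "Mean", "Correlation" };

   // Flags matching the documented default set: KSP, KSM, and AD.
   static boolean[] defaultFlags() {
      boolean[] active = new boolean[NTESTTYPES];
      active[KSP] = true;
      active[KSM] = true;
      active[AD] = true;
      return active;
   }

   // Names of the tests whose flag is set, in index order.
   static List<String> activeNames(boolean[] active) {
      List<String> names = new ArrayList<>();
      for (int i = 0; i < NTESTTYPES; i++) {
         if (active[i]) {
            names.add(TESTNAMES[i]);
         }
      }
      return names;
   }

   public static void main(String[] args) {
      System.out.println(activeNames(defaultFlags()));
   }
}
```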

Definition at line 621 of file GofFormat.java.

◆ AD

final int umontreal.ssj.gof.GofFormat.AD = 3
static

Anderson-Darling test.

Definition at line 575 of file GofFormat.java.

◆ CM

final int umontreal.ssj.gof.GofFormat.CM = 4
static

Cramér-von Mises test.

Definition at line 580 of file GofFormat.java.

◆ COR

final int umontreal.ssj.gof.GofFormat.COR = 8
static

Correlation.

Definition at line 600 of file GofFormat.java.

◆ EPSILONP

double umontreal.ssj.gof.GofFormat.EPSILONP = 1.0E-15
static

Environment variable used in formatp0 to determine which \(p\)-values are too close to 0 or 1 to be printed explicitly. If EPSILONP \(= \epsilon\), then any \(p\)-value less than \(\epsilon\) or larger than \(1-\epsilon\) is not written explicitly; the program simply writes "eps" or "1-eps". The default value is \(10^{-15}\).

Definition at line 346 of file GofFormat.java.

◆ GNUPLOT

final int umontreal.ssj.gof.GofFormat.GNUPLOT = 0
static

Data file format used for plotting functions with Gnuplot.

Definition at line 75 of file GofFormat.java.

◆ graphSoft

int umontreal.ssj.gof.GofFormat.graphSoft = GNUPLOT
static

Environment variable that selects the type of software to be used for plotting the graphs of functions.

The data files produced by graphFunc and graphDistUnif will be in a format suitable for this selected software. The default value is GNUPLOT. To display a graphic in file f using gnuplot, for example, one can use the command "plot f with steps, x with lines" in gnuplot. graphSoft can take the values GNUPLOT or MATHEMATICA.

Definition at line 91 of file GofFormat.java.

◆ KS

final int umontreal.ssj.gof.GofFormat.KS = 2
static

Kolmogorov-Smirnov test.

Definition at line 570 of file GofFormat.java.

◆ KSM

final int umontreal.ssj.gof.GofFormat.KSM = 1
static

Kolmogorov-Smirnov \(-\) test.

Definition at line 565 of file GofFormat.java.

◆ KSP

final int umontreal.ssj.gof.GofFormat.KSP = 0
static

Kolmogorov-Smirnov+ test.

Definition at line 560 of file GofFormat.java.

◆ MATHEMATICA

final int umontreal.ssj.gof.GofFormat.MATHEMATICA = 1
static

Data file format used for creating graphics with Mathematica.

Definition at line 80 of file GofFormat.java.

◆ MEAN

final int umontreal.ssj.gof.GofFormat.MEAN = 7
static

Mean.

Definition at line 595 of file GofFormat.java.

◆ NTESTTYPES

final int umontreal.ssj.gof.GofFormat.NTESTTYPES = 9
static

Total number of test types.

Definition at line 605 of file GofFormat.java.

◆ SUSPECTP

double umontreal.ssj.gof.GofFormat.SUSPECTP = 0.01
static

Environment variable used in formatp1 to determine which \(p\)-values should be marked as suspect when printing test results. If SUSPECTP \(= \alpha\), then any \(p\)-value less than \(\alpha\) or larger than \(1-\alpha\) is considered suspect and is "singled out" by formatp1. The default value is 0.01.

Definition at line 357 of file GofFormat.java.

◆ TESTNAMES

final String [] umontreal.ssj.gof.GofFormat.TESTNAMES
static
Initial value:
= { "KolmogorovSmirnovPlus", "KolmogorovSmirnovMinus", "KolmogorovSmirnov",
"Anderson-Darling", "CramerVon-Mises", "Watson G", "Watson U", "Mean", "Correlation" }

Name of each test type.

Could be used for printing the test results, for example.

Definition at line 611 of file GofFormat.java.

◆ WG

final int umontreal.ssj.gof.GofFormat.WG = 5
static

Watson G test.

Definition at line 585 of file GofFormat.java.

◆ WU

final int umontreal.ssj.gof.GofFormat.WU = 6
static

Watson U test.

Definition at line 590 of file GofFormat.java.

