SSJ
3.3.1
Stochastic Simulation in Java
This package implements the correlation matching algorithms proposed in [13] for the situation where one wants to use the NORTA method to fit a multivariate distribution with discrete marginals.
Classes
| Class | Description |
|---|---|
| NI1 | Extends the class NortaInitDisc and implements the algorithm NI1. |
| NI2a | Extends the class NortaInitDisc and implements the algorithm NI2a. |
| NI2b | Extends the class NortaInitDisc and implements the algorithm NI2b. |
| NI3 | Extends the class NortaInitDisc and implements the algorithm NI3. |
| NortaInitDisc | This abstract class defines the algorithms used for NORTA initialization when the marginal distributions are discrete. |
The four algorithms discussed in [13] are implemented in four subclasses of the abstract class NortaInitDisc. This software makes use of the SSJ library [155]. An example of how to use it is given at the end of this document.
The NORTA method is an approach for modeling dependence in a finite-dimensional random vector \(X=(X_1,…,X_d)\) with given univariate marginals via a normal copula that fits the rank or the linear correlation between each pair of coordinates of \(X\). The standard normal distribution function is applied to each coordinate of a vector \(Z=(Z_1,…,Z_d)\) of correlated standard normals to produce a vector \(U=(U_1,…,U_d)\) of correlated uniforms over \([0,1]\). Then \(X\) is obtained by applying the inverse of each marginal distribution function to the corresponding coordinate of \(U\). The fitting requires finding, for each pair of coordinates of \(Z\), the correlation that yields the target correlation between the corresponding pair of coordinates of \(X\). This step of finding the correlation matrix of \(Z\), given the correlation matrix of \(X\) and the marginal distributions, constitutes the NORTA initialization step. In [13], we present a detailed analysis of the NORTA method and of the root-finding problem when the marginal distributions are discrete.
With the NORTA method, we have the following representation:
\[ X_l=F_l^{-1}(\Phi(Z_l)), \quad l=1,…,d, \]
where \(\Phi\) is the standard normal distribution function and \(F_l^{-1}(u) = \inf\{x: F_l(x) \ge u\}\) for \(0\le u \le1\), which is the quantile function of the marginal distribution \(F_l, l=1,…,d\).
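The generalized inverse \(F_l^{-1}(u) = \inf\{x: F_l(x) \ge u\}\) can be illustrated for a discrete marginal with finite support \(\{0,…,m-1\}\). The following is a minimal sketch in plain Java; the class and method names are hypothetical and are not part of SSJ, and the computation of \(\Phi(Z_l)\) is left out.

```java
/** Sketch: F^{-1}(u) = inf{x : F(x) >= u} for a discrete distribution on 0..m-1. */
class DiscreteQuantile {

    /** Cumulative sums of the probability mass function pmf[0..m-1]. */
    static double[] cdf(double[] pmf) {
        double[] c = new double[pmf.length];
        double sum = 0.0;
        for (int i = 0; i < pmf.length; i++) {
            sum += pmf[i];
            c[i] = sum;
        }
        return c;
    }

    /** Smallest index x with cdf[x] >= u; clamps to the last support point. */
    static int quantile(double[] cdf, double u) {
        int lo = 0, hi = cdf.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] >= u) {
                hi = mid;       // mid satisfies F(mid) >= u, keep it as a candidate
            } else {
                lo = mid + 1;   // F(mid) < u, the quantile lies to the right
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        double[] c = cdf(new double[] {0.25, 0.5, 0.25});
        for (double u : new double[] {0.0, 0.25, 0.26, 0.75, 0.9, 1.0}) {
            System.out.println("u = " + u + "  ->  x = " + quantile(c, u));
        }
    }
}
```

Because the quantile function of a discrete distribution is a step function, many values of \(u\) map to the same \(x\).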
For the bivariate case (\(d=2\)), we have a vector \(X=(X_1, X_2)\) with marginal distributions \(F_1\) and \(F_2\). Let \(\mu_{F_1}=E[F_1(X_1)]\) and \(\mu_{F_2}=E[F_2(X_2)]\) denote the means, and \(\sigma_{F_1}=\mbox{Var}(F_1(X_1))^{1/2}\) and \(\sigma_{F_2}=\mbox{Var}(F_2(X_2))^{1/2}\) the standard deviations, of \(F_1(X_1)\) and \(F_2(X_2)\), respectively. In this case, NORTA initialization reduces to the problem of finding the correlation \(\rho_Z=\mbox{Corr}(Z_1,Z_2)\).
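For a discrete marginal with finite support, the quantities \(\mu_{F_l}\) and \(\sigma_{F_l}\) are finite sums over the support points. The sketch below assumes a support \(\{0,…,m-1\}\) given as an array of probabilities; the class and method names are illustrative only and do not come from SSJ.

```java
/** Sketch: mean and standard deviation of F(X) for a discrete X on 0..m-1. */
class RankMoments {

    /** Returns {mean, standard deviation} of F(X), where pmf[i] = P(X = i). */
    static double[] meanAndStdOfF(double[] pmf) {
        double cdf = 0.0;  // running value of F(i)
        double m1 = 0.0;   // accumulates E[F(X)]
        double m2 = 0.0;   // accumulates E[F(X)^2]
        for (double p : pmf) {
            cdf += p;
            m1 += p * cdf;
            m2 += p * cdf * cdf;
        }
        return new double[] { m1, Math.sqrt(m2 - m1 * m1) };
    }

    public static void main(String[] args) {
        double[] r = meanAndStdOfF(new double[] {0.25, 0.5, 0.25});
        System.out.println("mu_F    = " + r[0]);
        System.out.println("sigma_F = " + r[1]);
    }
}
```

For a continuous marginal, \(F(X)\) is uniform over \([0,1]\), so its mean is \(1/2\) and its variance is \(1/12\); for a discrete marginal these values depend on the probability masses, which is why they must be computed explicitly.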
In this document, we present a set of Java classes for NORTA initialization in the bivariate case given the rank correlation and two discrete marginal distributions. We have:
\begin{align} \tag{r} r_X(\rho)=\mbox{Corr}(F_1(X_1),F_2(X_2))=\frac{g_r(\rho) -\mu_{F_1}\mu_{F_2} }{\sigma_{F_1}\sigma_{F_2}}, \end{align}
\begin{align} g_r(\rho) & = E \left[ F_1 (X_1) F_2 (X_2) \right]\nonumber \\ & = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F_1 \{ F_1^{-1} [ \Phi(x_1)]\} F_2\{ F_2^{-1}[\Phi(x_2)]\} \phi_{\rho}(x_1,x_2) dx_1 dx_2, \tag{gr} \end{align}
where \(\phi_{\rho}\) is the bivariate standard normal density with correlation \(\rho\). Then, for a given target correlation \(r_X\), we use a root-finding algorithm to find the corresponding correlation \(\rho_Z\) that satisfies
\begin{align} \tag{fr} f_r(\rho_Z)=g_r(\rho_Z)-r_X\sigma_{F_1}\sigma_{F_2}-\mu_{F_1}\mu_{F_2}=0. \end{align}
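To make the root-finding step concrete, here is a minimal bisection sketch in plain Java. It is not one of the algorithms NI1, NI2a, NI2b or NI3 of [13]; bisection is swapped in only because it is the simplest bracketing method. The function passed in `main` is a stand-in: it uses the known continuous-marginal relation \(r_X(\rho)=(6/\pi)\arcsin(\rho/2)\) instead of the integral \(g_r\), which would require a bivariate normal distribution function. All names are hypothetical.

```java
import java.util.function.DoubleUnaryOperator;

/** Sketch: bisection for f(rho) = 0 on [a, b], assuming f changes sign on the interval. */
class BisectionSketch {

    static double findRoot(DoubleUnaryOperator f, double a, double b, double tol) {
        double fa = f.applyAsDouble(a);
        double fb = f.applyAsDouble(b);
        if (fa * fb > 0) {
            throw new IllegalArgumentException("root not bracketed by [a, b]");
        }
        while (b - a > tol) {
            double mid = 0.5 * (a + b);
            double fm = f.applyAsDouble(mid);
            if (fa * fm <= 0) {
                b = mid;        // sign change in [a, mid]
            } else {
                a = mid;        // sign change in [mid, b]
                fa = fm;
            }
        }
        return 0.5 * (a + b);
    }

    public static void main(String[] args) {
        double target = 0.43;  // target rank correlation r_X
        // Stand-in for f_r: continuous-marginal relation minus the target (assumption).
        DoubleUnaryOperator f = rho -> (6.0 / Math.PI) * Math.asin(rho / 2.0) - target;
        double rhoZ = findRoot(f, -1.0, 1.0, 1e-10);
        System.out.println("rho_Z (continuous stand-in) = " + rhoZ);
    }
}
```

With discrete marginals, the same driver would apply once `f` evaluates \(f_r\) as defined above, at the cost of one two-dimensional integration per function evaluation.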
When the marginal distributions are continuous and we use the rank correlation, the root-finding problem is easy to solve: (gr) has an analytic solution, and the relation in (r) becomes:
\begin{align} r_X(\rho)=(6/\pi) \arcsin(\rho/2).\nonumber \end{align}
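Inverting this relation gives the normal correlation in closed form for the continuous case. This worked step is added here for illustration and is not taken from [13]:
\begin{align} \rho_Z = 2\sin(\pi r_X/6).\nonumber \end{align}
For instance, a target rank correlation \(r_X=0.43\) with continuous marginals gives \(\rho_Z = 2\sin(0.43\,\pi/6) \approx 0.4465\). This closed form no longer applies when the marginals are discrete, which is why a numerical root-finder is needed in that case.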
In this example, we consider two random variables \(X_1\) and \(X_2\) with negative binomial marginals, denoted NegBin\((s,p)\). The parameters \((s,p)\) for \(X_1\) and \(X_2\) are \(s_1=15.68\), \(p_1=0.3861\) and \(s_2=60.21\), \(p_2=0.6211\), respectively. We want to compute the correlation \(\rho_Z\) for a target rank correlation \(r_X=0.43\). Since the negative binomial distribution has unbounded support, we truncate each support at the quantile of order \(tr=1-10^{-6}\), so that the number of support points is \(m_l=F_l^{-1}(1-10^{-6})+1\), for \(l=1,2\).
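The truncation rule can be sketched in plain Java without SSJ. The code below assumes the parameterization \(p(x) = \frac{\Gamma(s+x)}{\Gamma(s)\,x!}\, p^s (1-p)^x\) for \(x=0,1,…\), which may differ from the convention used elsewhere; the class name, method names and the iteration cap are hypothetical.

```java
/** Sketch: number of support points after truncating a NegBin(s, p) at the quantile of order tr. */
class NegBinSupport {

    /** Smallest x with F(x) >= tr, using p(x+1) = p(x) * (s + x) / (x + 1) * (1 - p). */
    static int quantile(double s, double p, double tr) {
        double mass = Math.pow(p, s);  // p(0) = p^s
        double cdf = mass;
        int x = 0;
        // The cap on x is only a safeguard against a non-terminating loop.
        while (cdf < tr && x < 10_000_000) {
            mass *= (s + x) / (x + 1.0) * (1.0 - p);
            x++;
            cdf += mass;
        }
        return x;
    }

    /** Number of support points m = F^{-1}(tr) + 1 (support starts at 0). */
    static int supportSize(double s, double p, double tr) {
        return quantile(s, p, tr) + 1;
    }

    public static void main(String[] args) {
        double tr = 1.0 - 1e-6;
        System.out.println("m_1 = " + supportSize(15.68, 0.3861, tr));
        System.out.println("m_2 = " + supportSize(60.21, 0.6211, tr));
    }
}
```

The recurrence avoids evaluating the gamma function directly, which keeps the sketch free of external dependencies.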
The Java program uses the class umontreal.ssj.probdist.DiscreteDistributionInt of package probdist from SSJ to specify the two discrete marginal distributions. Each of the four subclasses NI1, NI2a, NI2b and NI3 is used in turn to compute the correlation \(\rho_Z\), so that the results of the four algorithms can be compared.
Example with correlated negative binomial distributions.
Results of the program ExampleNortaInitDisc.java