SSJ  3.3.1 Stochastic Simulation in Java
Package umontreal.ssj.simexp

Tools for Simulation Experiments. More...

## Classes

class  BatchMeansSim
Performs a simulation experiment on an infinite horizon, for estimating steady-state performance measures, using batch means. More...

class  RepSim
Performs a simulation experiment on a finite horizon, using a certain number of independent runs or replications. More...

class  SimExp
Represents a framework for performing experiments using simulation. More...

## Detailed Description

Tools for Simulation Experiments.

Provides some classes to manage simulation experiments with many replications, batch means, etc. Let $$\bar{\mathbf{X}}_n$$ be an average of random vectors:

$\bar{\mathbf{X}}_n=\frac{1}{n}\sum_{r=0}^{n-1}\mathbf{X}_r,$

where $$\mathbf{X}_r$$ is the $$r$$th observation obtained during an experiment. Assuming that the $$\mathbf{X}_r$$’s are i.i.d., $$\{\bar{\mathbf{X}}_n, n\ge0\}$$ is a sequence of vectors in $$\mathbb{R}^d$$ converging to a vector $$\boldsymbol{\mu}=E[\mathbf{X}_r]$$ when $$n\to\infty$$. We use simulation to estimate this $$\boldsymbol{\mu}$$.

The simplest way to generate the sample $$(\mathbf{X}_0, …, \mathbf{X}_{n-1})$$ is by simulating the same system $$n$$ times, independently. In that setting, $$r$$ becomes the index of a replication. In general, a simulation generates $$n$$ copies of $$\mathbf{X}_r$$ to compute $$\bar{\mathbf{X}}_n$$, in order to estimate $$\boldsymbol{\mu}$$. We may also be interested in some sample covariances of components of $$\mathbf{X}_r$$, for computing confidence intervals on functions of $$\boldsymbol{\mu}$$. The most common functions return a single component of $$\boldsymbol{\mu}$$, or a ratio of two components.

For example, when simulating a $$M/M/1$$ queue, we may be able to get the total waiting time $$W$$ for all customers, the number $$N$$ of served customers, and the integral of the queue size over simulation time $$Q(T)=\int_0^Tq(t) dt$$, where $$q(t)$$ is the queue size at time $$t$$. In this case, $$\mathbf{X}_r=(W_r, N_r, Q_r(T))$$, and $$\boldsymbol{\mu}=(E[W], E[N], E[Q(T)])$$. Two interesting functions of $$\boldsymbol{\mu}$$ are the expected waiting time per customer $$E[W]/E[N]$$, and the time-average queue size $$E[Q(T)]/T$$. The functions are simply evaluated at $$\bar{\mathbf{X}}_n$$ to estimate the latter quantities.

The number of observations $$n$$ is usually constant, but it may also be random if sequential sampling is used. With this scheme, after $$n_0$$ observations are available, an error check is performed to determine if simulation should continue. For example, this check may evaluate the relative error of an estimated performance measure by dividing the half-width of a computed confidence interval by the point estimator, and require additional observations if this error is too high. The procedure is repeated until the stopping conditions are verified, or a maximal number of observations is obtained. However, because the sample size $$n$$ is random, the estimator $$\bar{\mathbf{X}}_n$$ is biased when using sequential sampling.

The vector $$\mathbf{X}_r$$ is usually computed by summing costs incurred for various events during a part of the experiment. These costs can be waiting times, number of items, etc., not necessarily money. The computed sums may then be processed in a way depending of the simulated horizon and model to get the required values. If the horizon is finite, simulation stops after a finite time $$T$$, or a finite number $$N$$ of events, and is usually repeated $$n$$ times independently. Then, for replication  $$r$$,

$\mathbf{X}_r=\sum_{k=0}^{N-1}\mathbf{C}_{k, r}$

or

$\mathbf{X}_r=\sum_{k=0}^{N_r(T)-1}\mathbf{C}_{k, r}$

where $$\mathbf{C}_{k, r}$$ is the cost of the $$k$$th event during replication  $$r$$, and $$N_r(T)$$ is the total number of events occurring during the time interval $$[0, T]$$. This estimates the total expected cost over the horizon.

When the horizon is infinite, a single replication is usually simulated, and the cost per time unit or per event is computed in the long run in order to estimate

$\boldsymbol{\mu}=\lim_{N\to\infty} \left(\frac{1}{N}\; E\!\left[\sum_{k=0}^{N-1} \mathbf{C}_k\right]\right) =\lim_{N\to\infty} \left(\frac{1}{N}\sum_{k=0}^{N-1} \mathbf{C}_k\right)$

or

$\boldsymbol{\mu}=\lim_{T\to\infty} \left(\frac{1}{T}\; E\!\left[\sum_{k=0}^{N(T) - 1} \mathbf{C}_k\right]\right) =\lim_{T\to\infty} \left(\frac{1}{T}\sum_{k=0}^{N(T) - 1} \mathbf{C}_k\right).$

Any estimator of these previous quantities is biased, because the horizon must be truncated, and the model does not necessarily start in steady state. A single long replication is simulated to reduce the bias, and the first events are dropped (warmup).

However, with a single long replication, computing sample covariances and confidence intervals is more difficult. A simple technique to overcome this problem is batch means  , which divides the truncated horizon into successive intervals called batches. More specifically, let $$T_0$$ be the time at which the warmup ends. We divide the horizon $$[T_0, T]$$ in $$n$$ batches starting at times $$T_0 < \cdots< T_{n-1}$$, and the last batch ends at $$T_n=T>T_{n-1}$$. Then, for batch  $$r$$,

$\mathbf{X}_r=\sum_{k=N(T_r)}^{N(T_{r+1})-1}\mathbf{C}_k,$

where $$r=0,…, n-1$$. Batches may have a fixed duration in simulation time units, contain a fixed number of events, etc. The beginning of each batch can even correspond to a regeneration point, i.e., a simulation time at which the state and behavior of the model does not depend on the past. In the latter case, each batch corresponds to a regenerative cycle. In all these cases, $$\boldsymbol{\mu}$$ is estimated by

$\frac{\sum\limits_{r=0}^{n-1}\mathbf{X}_r}{\sum\limits_{r=0}^{n-1} (T_{r+1} - T_r)} = \frac{\sum\limits_{r=0}^{n-1}\mathbf{X}_r}{T-T_0}.$

The simplest way for estimating covariances when all batches have the same length is to consider the $$\mathbf{X}_r$$’s as i.i.d. random vectors, and use the same techniques as with independent replications. For regenerative cycles, confidence intervals must be computed on ratios of means. The class umontreal.ssj.stat.FunctionOfMultipleMeansTally can be used for this.

This package provides helper classes to facilitate management of a complex simulation experiment. A base class called SimExp contains methods to initialize lists of statistical probes and to help in sequential sampling. The subclass RepSim is used for simulating independent replications of a given model on a finite horizon. The subclass BatchMeansSim can be used for simulating a stationary model using the batch means technique.