14.2 Model Independent Tests for I(0) against I(d)
A stochastic process is I(d) if it needs to be differenced d times
in order to become I(0). We shall test for I(0) against fractional I(d)
alternatives by using more formal definitions.
In a first approach, we define a stochastic process
as I(0)
if its normalized partial sums converge in
distribution to a Brownian motion. All we require is the existence of a consistent
estimator of the variance for normalizing the partial sums. The
tests presented here make use of the Newey and West (1987)
heteroskedasticity and autocorrelation consistent (HAC) estimator of the
variance, defined as
$ \hat{\sigma}^2_T(q) = \hat{\gamma}_0 + 2\sum_{j=1}^{q}\left(1-\frac{j}{q+1}\right)\hat{\gamma}_j $
(14.3)
where
$ \hat{\gamma}_0 $
is the variance of the process, and the
sequence
$ \hat{\gamma}_1, \ldots, \hat{\gamma}_q $
denotes the autocovariances of
the process up to the order q. This spectral-based HAC variance
estimator depends on the user-chosen truncation lag q.
Andrews (1991) has proposed a selection rule for the order q.
The quantlet
neweywest
computes the Newey and West (1987)
estimator of the variance of a unidimensional process. Its syntax is:
sigma = neweywest(y{, q})
where the input parameters are:
- y
- the series of observations
- q
- optional parameter, which can be either a
vector of truncation lags or a single scalar
The HAC estimator is calculated for all the orders included in the parameter q.
If no optional parameter is provided, the HAC estimator is evaluated for the
default orders q = 5, 10, 25, 50.
The estimated HAC variances are stored in the vector sigma.
In the following example, the HAC variance of the first 2000 observations of the
20-minute spaced sample of Deutschmark-Dollar FX rates is computed.
library("times")
y = read("dmus58.dat")
y = y[1:2000]
q = 5|10|25|50
sigma = neweywest(y,q)
q~sigma
As an output we get
Contents of _tmp
[1,] 5 0.0047841
[2,] 10 0.008743
[3,] 25 0.020468
[4,] 50 0.039466
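For readers working outside XploRe, the estimator (14.3) is straightforward to reproduce. The following Python sketch (the function name and interface are illustrative, not part of any quantlet library) computes the Newey-West variance with Bartlett weights:

```python
import numpy as np

def newey_west(y, q):
    """Newey-West (1987) HAC variance estimator, eq. (14.3):
    gamma_0 + 2 * sum_{j=1}^q (1 - j/(q+1)) * gamma_j."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    # sample autocovariances gamma_j, j = 0..q
    gamma = np.array([d[j:] @ d[:T - j] / T for j in range(q + 1)])
    # Bartlett kernel weights 1 - j/(q+1), which guarantee positivity
    w = 1.0 - np.arange(1, q + 1) / (q + 1.0)
    return gamma[0] + 2.0 * w @ gamma[1:]

# mimic the XploRe call for the default truncation lags
y = np.sin(np.arange(2000) / 5.0)   # any stand-in series for dmus58.dat
for q in (5, 10, 25, 50):
    print(q, newey_west(y, q))
```

For q = 0 the estimator reduces to the ordinary sample variance, since the weighted sum of autocovariances is empty.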
14.2.1 Robust Rescaled Range Statistic
The first test for long-memory was devised by the hydrologist Hurst (1951) for the design of an optimal reservoir for the Nile
river, whose flow regimes were persistent. Although Mandelbrot (1975) gave
a formal justification for the use of this test, Lo (1991)
demonstrated that this statistic was not robust to short-range
dependence, and proposed the following one:
$ Q_T = \frac{1}{\hat{\sigma}_T(q)} \left[\max_{1\le k\le T}\sum_{j=1}^k (X_j- \overline{X}_T) - \min_{1\le k\le T} \sum_{j=1}^k(X_j- \overline{X}_T)\right]$
(14.4)
which consists of replacing the variance by the HAC variance estimator
$ \hat{\sigma}^2_T(q) $
in the denominator of the statistic. If q = 0, Lo's statistic reduces
to Hurst's R/S statistic. Unlike spectral analysis, which detects
periodic cycles in a series, the R/S analysis has been advocated by
Mandelbrot for detecting nonperiodic cycles.
Under the null hypothesis of no
long-memory, the statistic
$ T^{-1/2} Q_T $
converges to a
distribution equal to the range of a Brownian bridge on the unit
interval:
$ T^{-1/2} Q_T \xrightarrow{d} \max_{0\le t\le 1} W^0(t) - \min_{0\le t\le 1} W^0(t) $
where
$ W^0(t) $
is a Brownian bridge defined as
$ W^0(t) = W(t) - t\,W(1) $,
$ W(t) $
being the standard Brownian motion. The distribution function of this range
is given in Siddiqui (1976), and is tabulated in Lo (1991).
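The range of a Brownian bridge is easy to simulate, which gives a quick sanity check on the tabulated fractiles. The sketch below (my own discretization, not from the source) approximates the two-sided 95% band for the normalized statistic under the null:

```python
import numpy as np

# simulate the range max W0 - min W0 of a Brownian bridge on [0, 1]
rng = np.random.default_rng(0)
n, reps = 1000, 5000
t = np.arange(1, n + 1) / n
ranges = np.empty(reps)
for i in range(reps):
    w = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)  # Brownian motion on a grid
    w0 = w - t * w[-1]                                  # bridge: W0(t) = W(t) - t W(1)
    ranges[i] = w0.max() - w0.min()

# simulated 95% band for T^{-1/2} Q_T under the null of no long-memory
print(np.quantile(ranges, [0.025, 0.975]))
```

The simulated fractiles come out close to the interval [0.809, 1.862] tabulated in Lo (1991), up to discretization and Monte Carlo error.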
This statistic is extremely sensitive to the order of truncation q,
but there is no statistical criterion for
choosing q in the framework of this statistic. Andrews' (1991) rule
gives mixed results. If q is too small, this estimator does not
account for the autocorrelation of the process, while if q is too
large, it accounts for any form of autocorrelation and the power of
this test tends to its size. Given that the power of a useful test
should be greater than its size, this statistic is not very helpful
on its own. For that reason, Teverovsky, Taqqu and Willinger (1999) suggest
using this statistic together with other tests.
Since there is no data-driven guidance for the choice of this parameter, we
consider the default values q = 5, 10, 25, 50. XploRe
users have the option to provide their own vector of truncation lags.
Let us consider again the series of absolute returns on the 20-minute spaced
Deutschmark-Dollar FX rates.
library("times")
y = read("dmus58.dat")
ar = abs(tdiff(y[1:2000]))
lostat = lo(ar)
lostat
Given that we do not provide a vector of truncation lags, Lo's
statistic is computed for the default truncation lags. The results are
displayed in the form of a table: the first column contains the
truncation orders, the second column contains the computed statistic.
If the computed statistic is outside the 95% confidence interval for
no long-memory, a star (*) is displayed
after that statistic.
Contents of lostat
[1,] " Order Statistic"
[2,] "__________________ "
[3,] ""
[4,] " 5 2.0012 *"
[5,] " 10 1.8741 *"
[6,] " 25 1.7490 "
[7,] " 50 1.6839 "
This result illustrates the issue of the choice of the bandwidth
parameter q. For q = 5 and 10, we reject the null hypothesis of no
long-memory. However, when q = 25 or 50, this null hypothesis is
accepted, as the power of this test is too low at these
truncation orders.
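Lo's statistic (14.4), divided by the square root of T so that it can be compared with the Brownian bridge fractiles, can be sketched as follows (illustrative function names; the HAC helper implements (14.3)):

```python
import numpy as np

def hac_variance(y, q):
    """Newey-West HAC variance (14.3) with Bartlett weights."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    gamma = np.array([d[j:] @ d[:T - j] / T for j in range(q + 1)])
    w = 1.0 - np.arange(1, q + 1) / (q + 1.0)
    return gamma[0] + 2.0 * w @ gamma[1:]

def lo_statistic(x, q):
    """Lo's modified R/S statistic (14.4), normalized by sqrt(T);
    for q = 0 it reduces to the classical Hurst R/S statistic."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    s = np.cumsum(x - x.mean())       # partial sums of deviations
    rng_s = s.max() - s.min()         # range term of (14.4)
    return rng_s / (np.sqrt(hac_variance(x, q)) * np.sqrt(T))
```

Values of this normalized statistic outside roughly [0.809, 1.862] reject the null of no long-memory at the 5% level, matching the decision rule used in the lostat table above.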
14.2.2 The KPSS Statistic
Equivalently, we can test for I(0) against fractional alternatives
by using the KPSS test of Kwiatkowski, Phillips, Schmidt, and Shin (1992), as Lee and Schmidt (1996)
have shown that this test has a power equivalent to Lo's statistic
against long-memory processes. The two KPSS statistics, denoted by
$ \hat{\eta}_\tau $
and
$ \hat{\eta}_\mu $, are respectively based on the residuals of
two regression models: on an intercept and a trend for
$ \hat{\eta}_\tau $, and on a
constant for
$ \hat{\eta}_\mu $. If we denote by
$ S_t $
the partial sums
$ S_t = \sum_{j=1}^{t} e_j $, where
$ e_j $
are the residuals of these
regressions, the KPSS statistic is defined by:
$ \eta = \frac{1}{T^2 \hat{\sigma}^2_T(q)} \sum_{t=1}^{T} S_t^2 $
(14.5)
where
$ \hat{\sigma}^2_T(q) $
is the HAC estimator of the variance of the
residuals defined in equation (14.3). The statistic
$ \hat{\eta}_\mu $
tests for stationarity against a long-memory alternative, while the
statistic
$ \hat{\eta}_\tau $
tests for trend-stationarity against a long-memory
alternative.
The quantlet
kpss
computes both statistics. The default
bandwidths, denoted by
L0, L4
and L12,
are the ones given in
Kwiatkowski, Phillips, Schmidt, and Shin (1992). We evaluate both tests on the series of
absolute returns ar as follows:
library("times")
y = read("dmus58.dat")
ar = abs(tdiff(y[1:2000]))
kpsstest = kpss(ar)
kpsstest
The quantlet
kpss
returns the results in the form of a table.
The first column contains the truncation order, the second column
contains the type of the test: const means the test for a
stationary sequence, while trend means the test for trend-stationarity.
The third column contains the computed statistic. If
this statistic exceeds the 95% critical value, a star (*)
symbol is displayed. The last column contains this
critical value.
Thus, XploRe returns the following table:
Contents of kpsstest
[1,] " Order Test Statistic Crit. Value "
[2,] "_________________________________________ "
[3,] ""
[4,] " L0 = 0 const 1.8259 * 0.4630"
[5,] " L4 = 8 const 1.2637 * 0.4630"
[6,] " L12= 25 const 1.0483 * 0.4630"
[7,] " L0 = 0 trend 0.0882 0.1460"
[8,] " L4 = 8 trend 0.0641 0.1460"
[9,] " L12= 25 trend 0.0577 0.1460"
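Both KPSS variants of (14.5) can be sketched directly; the code below uses illustrative function names (not the internals of the kpss quantlet), with residuals from a constant for the stationarity test and from a fitted linear trend for the trend-stationarity test:

```python
import numpy as np

def hac_variance(e, q):
    """Newey-West HAC variance (14.3) of the residuals."""
    e = np.asarray(e, dtype=float)
    T = len(e)
    d = e - e.mean()
    gamma = np.array([d[j:] @ d[:T - j] / T for j in range(q + 1)])
    w = 1.0 - np.arange(1, q + 1) / (q + 1.0)
    return gamma[0] + 2.0 * w @ gamma[1:]

def kpss_const(y, q):
    """eta_mu of (14.5): residuals from a regression on a constant."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    e = y - y.mean()                  # residuals
    S = np.cumsum(e)                  # partial sums S_t
    return (S @ S) / (T ** 2 * hac_variance(e, q))

def kpss_trend(y, q):
    """eta_tau of (14.5): residuals from intercept-and-trend regression."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    t = np.arange(1, T + 1)
    slope, intercept = np.polyfit(t, y, 1)
    e = y - (intercept + slope * t)   # detrended residuals
    S = np.cumsum(e)
    return (S @ S) / (T ** 2 * hac_variance(e, q))
```

The two functions differ only in the regression whose residuals feed the partial sums, exactly as in the const and trend rows of the kpsstest table.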
14.2.3 The Rescaled Variance V/S Statistic
Giraitis, Kokoszka and Leipus (1998) have proposed a centering of the
KPSS statistic based on the partial sums of the deviations from the
mean. They called it a rescaled variance test V/S,
as its expression,
given by
$ V/S = \frac{1}{T^2\hat{\sigma}^2_T(q)}\left[ \sum_{k=1}^T \left(\sum_{j=1}^k (Y_j - \overline{Y}_T)\right)^2 - \frac{1}{T} \left( \sum_{k=1}^T \sum_{j=1}^k (Y_j - \overline{Y}_T) \right)^2 \right]$
(14.6)
can equivalently be rewritten as
$ V/S = \frac{1}{T\,\hat{\sigma}^2_T(q)}\,\widehat{\mathrm{Var}}(S_1,\ldots,S_T) $
(14.7)
where
$ S_k $
are the partial sums
$ S_k = \sum_{j=1}^{k} (Y_j - \overline{Y}_T) $
of the observations. The V/S
statistic is thus based on the sample variance of the
series of partial sums
$ S_1, \ldots, S_T $. The limiting distribution of
this statistic is a functional of a Brownian bridge, whose distribution is linked to
the Kolmogorov statistic. This statistic has uniformly higher power
than the KPSS statistic, and is less sensitive than the Lo statistic to the choice
of the order q. The V/S statistic can
appropriately detect the presence of long-memory in the levels series,
although, like most tests and estimators, this test may wrongly detect
the presence of long-memory in series with shifts in the levels.
Giraitis, Kokoszka and Leipus (1998) have shown that this statistic
can be used for the detection of long-memory in the volatility for the
class of ARCH($\infty$) processes.
We evaluate the V/S statistic with the quantlet
rvlm, which has
the following syntax:
vstest = rvlm(ary{, q})
where
- ary
- is the series
- q
- is a vector of truncation lags. If this optional argument is
not provided, then the default vector of truncation lags is used, with
q = 0, 8, 25.
This quantlet returns the results in the form of a table: the first
column contains the order of truncation q, the second column
contains the estimated V/S statistic. If this statistic is outside
the 95% confidence interval for no long-memory, a star (*) symbol is
displayed. The last column displays the 95% critical value.
Thus the instruction
library("times")
y = read("dmus58.dat")
ar = abs(tdiff(y[1:2000]))
vstest = rvlm(ar)
vstest
returns
Contents of vstest
[1,] " Order Statistic Crit. Value "
[2,] "_________________________________"
[3,] ""
[4,] " 0 0.3305 * 0.1869"
[5,] " 8 0.2287 * 0.1869"
[6,] " 25 0.1897 * 0.1869"
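Written from (14.7), the V/S statistic is simply the sample variance of the partial sums divided by T times the HAC variance. A sketch (again with an illustrative helper for (14.3), not the rvlm quantlet itself):

```python
import numpy as np

def hac_variance(y, q):
    """Newey-West HAC variance (14.3) with Bartlett weights."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    gamma = np.array([d[j:] @ d[:T - j] / T for j in range(q + 1)])
    w = 1.0 - np.arange(1, q + 1) / (q + 1.0)
    return gamma[0] + 2.0 * w @ gamma[1:]

def vs_statistic(y, q):
    """Rescaled variance V/S of (14.6)-(14.7)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    S = np.cumsum(y - y.mean())       # partial sums S_k
    # sample variance of S_1..S_T, divided by T times the HAC variance
    return (S @ S / T - S.mean() ** 2) / (T * hac_variance(y, q))
```

Expanding the sample variance of the partial sums recovers the bracketed difference of sums in (14.6), so the two forms are computationally identical.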
14.2.4 Nonparametric Test for I(0)
Lobato and Robinson's (1998) nonparametric test for I(0)
against I(d)
is also based on the
approximation (14.2) of the spectrum of a long-memory process.
In the univariate case, the LM statistic is equal to:
$ t = \sqrt{m}\,\frac{\hat{C}_1}{\hat{C}_0}, \qquad \hat{C}_k = \frac{1}{m}\sum_{j=1}^{m} \nu_j^{k}\, I(\lambda_j), \quad \nu_j = \log j - \frac{1}{m}\sum_{i=1}^{m}\log i $
(14.8)
where
$ I(\lambda_j) $
is the periodogram estimated for a degenerate band of Fourier
frequencies
$ \lambda_j = 2\pi j/T,\; j = 1, \ldots, m $, where
m is a bandwidth parameter. Under the null hypothesis of an I(0)
time series, the LM statistic is asymptotically normally distributed. This
two-sided test is of interest as it makes it possible to discriminate between
d > 0 and d < 0: if the LM statistic is in the lower fractile of the
standardized normal distribution, the series exhibits long-memory,
whilst if it is in the upper fractile of that distribution,
the series is antipersistent.
The quantlet
lobrob
evaluates the Lobato-Robinson test. Its
syntax is as follows:
l = lobrob(ary{, m})
where
- ary
- is the series,
- m
- is the vector of bandwidth parameters. If this optional
argument is missing, the default bandwidth suggested by Lobato and
Robinson is used.
The results are displayed in the form of a table: the first column
contains the value of the bandwidth parameter while the second column
displays the corresponding statistic. In the following example, the
Lobato-Robinson statistic is evaluated by using this default
bandwidth:
library("times")
y = read("dmus58.dat")
ar = abs(tdiff(y[1:2000]))
l = lobrob(ar)
l
which yields
Contents of l
[1,] "Bandwidth Statistic "
[2,] "_____________________ "
[3,] ""
[4,] " 334 -4.4571"
In the next case, we provide a vector of bandwidths m, and evaluate
this statistic for all the elements of m. The sequence of
instructions:
library("times")
y = read("dmus58.dat")
ar = abs(tdiff(y[1:2000]))
m = #(100,150,200)
l = lobrob(ar,m)
l
returns the following table:
Contents of l
[1,] "Bandwidth Statistic "
[2,] "_____________________ "
[3,] ""
[4,] " 100 -1.7989"
[5,] " 150 -2.9072"
[6,] " 200 -3.3308"
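A sketch of the Lobato-Robinson statistic as reconstructed in (14.8) follows; the periodogram normalization and the function name are my own choices, so treat this as illustrative rather than a replica of the lobrob quantlet:

```python
import numpy as np

def lobato_robinson(y, m):
    """LM-type statistic t = sqrt(m) * C1 / C0 of (14.8)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    # periodogram at the Fourier frequencies lambda_j = 2*pi*j/T, j = 1..m
    I = np.abs(np.fft.fft(d)[1:m + 1]) ** 2 / (2.0 * np.pi * T)
    nu = np.log(np.arange(1, m + 1))
    nu -= nu.mean()                   # centered log frequencies
    C0 = I.mean()
    C1 = (nu * I).mean()
    return np.sqrt(m) * C1 / C0

# long-memory inflates the low-frequency periodogram ordinates, where
# nu_j is negative, driving the statistic into the lower normal fractile
rng = np.random.default_rng(1)
print(lobato_robinson(rng.standard_normal(2000), 100))
```

For white noise the statistic stays within a few units of zero, while a strongly persistent series pushes it far into the negative tail, consistent with the large negative values reported for the absolute-return series above.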