|
Spatial statistics for large spatial datasets is challenging. The
size of the dataset, n, causes problems in computing optimal spatial
predictors such as the kriging predictor, whose computational cost is on
the order of the cube of n. In addition, a large dataset is often
defined on a large spatial domain, so that the spatial process of
interest typically exhibits nonstationary behavior over that
domain. In this research, a family of nonstationary covariance
functions is defined using a set of basis functions that is fixed in
number, which is motivated by a spatial random effects (SRE)
model. This leads to a spatial prediction method we call Fixed Rank
Kriging (FRK). FRK relies on computational simplifications when n is
large, for obtaining the spatial best linear unbiased predictor
(BLUP) and its mean squared prediction error for a hidden spatial process. A
weighted-least-squares method is derived to estimate the
covariance-function parameters, and these are substituted into the
FRK equations. The article "Fixed rank kriging for very large spatial
datasets" by Noel Cressie and Gardar Johannesson appeared in 2008 in the
Journal of the Royal Statistical Society, Series B.
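To illustrate how a fixed number of basis functions can remove the n-cubed cost, here is a minimal sketch in Python. It assumes a data covariance of the form Sigma = S K S' + sigma2 * I, where S is an n x r matrix of basis-function values, K is an r x r covariance matrix of the random effects, and sigma2 is a noise variance. The identity-scaled noise term and all names (solve, woodbury_solve, S, K, sigma2, z) are assumptions made for illustration, not notation taken from the article. The sketch uses the standard Sherman-Morrison-Woodbury identity, so only an r x r system is solved.

```python
# Illustrative sketch only; function and variable names are hypothetical.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy so the inputs are not modified.
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            f = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= f * M[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def woodbury_solve(S, K, sigma2, z):
    """Return Sigma^{-1} z for Sigma = S K S' + sigma2 * I.

    Uses Sigma^{-1} = (I - S (I + K M)^{-1} K S' / sigma2) / sigma2,
    with M = S' S / sigma2, so the only linear solve is r x r.
    """
    n, r = len(S), len(K)
    # u = S' z / sigma2 (length r)
    u = [sum(S[i][a] * z[i] for i in range(n)) / sigma2 for a in range(r)]
    # M = S' S / sigma2 (r x r)
    M = [[sum(S[i][a] * S[i][b] for i in range(n)) / sigma2
          for b in range(r)] for a in range(r)]
    # A = I + K M (r x r)
    A = [[(1.0 if a == b else 0.0) + sum(K[a][c] * M[c][b] for c in range(r))
          for b in range(r)] for a in range(r)]
    # w = (I + K M)^{-1} K u
    Ku = [sum(K[a][c] * u[c] for c in range(r)) for a in range(r)]
    w = solve(A, Ku)
    # Sigma^{-1} z = (z - S w) / sigma2
    return [(z[i] - sum(S[i][a] * w[a] for a in range(r))) / sigma2
            for i in range(n)]
```

With r fixed, the work in woodbury_solve grows linearly in n, whereas calling solve on the full n x n matrix Sigma grows with the cube of n. The published FRK equations should be consulted for the exact predictor and its mean squared prediction error.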
A related approach can be taken for spatio-temporal datasets, where
the SRE model becomes a spatio-temporal random effects (STRE) model.
Here, the hidden random effects are assumed to evolve dynamically.
This results in a filtering methodology for massive data, which we
call Fixed Rank Filtering (FRF); Fixed Rank Smoothing and Fixed Rank
Forecasting can also be derived. The article, "Using temporal
variability to improve spatial mapping with application to satellite
data" by E.L. Kang, N. Cressie, and T. Shi, will appear in
the Canadian Journal of Statistics in 2010.
The dataset analyzed in the FRF paper referred to above is
available for download. Level-2 aerosol data are collected at high
spatial resolution, 17.6 km x 17.6 km. Level-3 data products are
generated from level-2 data at a much lower spatial resolution (0.5
deg. x 0.5 deg.), by averaging level-2 observations falling in the
level-3 pixels in a given time period. The dataset analyzed consists
of MISR daily level-3 AOD data from July 1, 2001 through August 9,
2001. There are 720 x 360 = 259,200 level-3 pixels, but only pixels
where retrievals are obtained and which are on the satellite's orbit
have AOD data. In the FRF paper, a rectangular region D between
longitudes -125 deg. and +3 deg., and between latitudes -20 deg. and
+44 deg., is chosen. This covers North and South America,
the western part of the Sahara Desert in Africa, the Iberian
Peninsula in Europe, and parts of the Atlantic and Pacific Oceans;
it was chosen because of expected aerosol activity coming
from the Sahara Desert.
Click here for the MISR AOD data used in the FRF paper.
Click here for more complete MISR AOD data.
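The averaging of level-2 observations into 0.5 deg. x 0.5 deg. level-3 pixels, and the restriction to the region D, can be sketched as follows. This is a simplified illustration in Python, not the procedure used to produce the MISR level-3 product: it ignores the time period, assumes observations arrive as (longitude, latitude, value) triples in degrees, and uses hypothetical names (pixel_index, average_to_level3, in_region_d).

```python
from collections import defaultdict

CELL = 0.5  # level-3 pixel size in degrees (from the text)
N_LON = int(360 / CELL)  # 720 columns around the globe
N_LAT = int(180 / CELL)  # 360 rows from pole to pole


def pixel_index(lon, lat):
    """Map (lon, lat) in degrees to (column, row) on the global 0.5-degree grid."""
    col = min(int((lon + 180.0) / CELL), N_LON - 1)
    row = min(int((lat + 90.0) / CELL), N_LAT - 1)
    return col, row


def average_to_level3(obs):
    """Average (lon, lat, value) observations falling in each pixel.

    Pixels with no observations are simply absent from the result,
    mirroring the fact that only some level-3 pixels have AOD data.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lon, lat, value in obs:
        cell = sums[pixel_index(lon, lat)]
        cell[0] += value
        cell[1] += 1
    return {idx: total / count for idx, (total, count) in sums.items()}


def in_region_d(lon, lat):
    """Region D from the text: longitudes -125 to +3, latitudes -20 to +44."""
    return -125.0 <= lon <= 3.0 and -20.0 <= lat <= 44.0
```

For example, average_to_level3 applied to the observations that satisfy in_region_d yields one averaged value per occupied pixel inside D.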
In the technical report "Fixed-rank filtering for spatio-temporal
data" by N. Cressie, T. Shi, and E.L. Kang, a simulation
experiment is given that compares FRF to FRK. The code used to
perform the experiment is available:
Click here for simulation-experiment code to compare FRF with
FRK.
|