Volume 70, Issue 1

Fixed rank kriging for very large spatial data sets

Noel Cressie

The Ohio State University, Columbus, USA

Gardar Johannesson

Lawrence Livermore National Laboratory, Livermore, USA

First published: 04 January 2008
Citations: 366
Noel Cressie, Department of Statistics, Ohio State University, 1958 Neil Avenue, Columbus, OH 43210‐1247, USA.
E‐mail: ncressie@stat.osu.edu

Abstract

Summary. Spatial statistics for very large spatial data sets is challenging. The size of the data set, n, causes problems in computing optimal spatial predictors such as kriging, since its computational cost is of order $n^3$. In addition, a large data set is often defined on a large spatial domain, so the spatial process of interest typically exhibits non‐stationary behaviour over that domain. A flexible family of non‐stationary covariance functions is defined by using a set of basis functions that is fixed in number, which leads to a spatial prediction method that we call fixed rank kriging. Specifically, fixed rank kriging is kriging within this class of non‐stationary covariance functions. It relies on computational simplifications when n is very large, for obtaining the spatial best linear unbiased predictor and its mean‐squared prediction error for a hidden spatial process. A method based on minimizing a weighted Frobenius norm yields best estimators of the covariance function parameters, which are then substituted into the fixed rank kriging equations. The new methodology is applied to a very large data set of total column ozone data, observed over the entire globe, where n is of the order of hundreds of thousands.
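
The abstract does not spell out the computational simplification. One standard route for a covariance built from a fixed number r of basis functions is the Sherman-Morrison-Woodbury identity, which replaces the n x n inverse with an r x r one. The sketch below is illustrative only and rests on assumptions not stated in the abstract: a zero-mean process, a diagonal noise variance, 1-D locations, and a hypothetical Gaussian-bump basis; the names `basis`, `woodbury_solve` and `frk_predict` are invented for this example. The matrix K is taken as given here rather than estimated by the weighted Frobenius norm method.

```python
import numpy as np

def basis(s, centers, scale):
    """Hypothetical Gaussian-bump basis at 1-D locations s; returns an n x r matrix."""
    return np.exp(-0.5 * ((s[:, None] - centers[None, :]) / scale) ** 2)

def woodbury_solve(S, K, d, b):
    """Solve (S K S^T + diag(d)) x = b using only an r x r linear system.

    Sherman-Morrison-Woodbury:
    Sigma^{-1} = D^{-1} - D^{-1} S (K^{-1} + S^T D^{-1} S)^{-1} S^T D^{-1}
    """
    Dinv_b = b / d                           # D^{-1} b, cost O(n)
    Dinv_S = S / d[:, None]                  # D^{-1} S, cost O(n r)
    inner = np.linalg.inv(K) + S.T @ Dinv_S  # r x r matrix
    return Dinv_b - Dinv_S @ np.linalg.solve(inner, S.T @ Dinv_b)

def frk_predict(s_obs, z, s_new, centers, scale, K, sigma2):
    """Zero-mean (simple) kriging predictor under the assumed low-rank model."""
    S = basis(s_obs, centers, scale)
    S0 = basis(s_new, centers, scale)
    d = np.full(len(s_obs), sigma2)          # assumed constant noise variance
    # Predictor: S0 K S^T Sigma^{-1} z
    return S0 @ (K @ (S.T @ woodbury_solve(S, K, d, z)))

# Synthetic check: the low-rank solve should agree with a direct O(n^3) solve.
rng = np.random.default_rng(0)
n, r, sigma2 = 500, 10, 0.1
s_obs = rng.uniform(0.0, 1.0, n)
centers = np.linspace(0.0, 1.0, r)
K = np.eye(r)                                # assumed known, for illustration
S = basis(s_obs, centers, 0.1)
z = S @ rng.standard_normal(r) + np.sqrt(sigma2) * rng.standard_normal(n)

x_fast = woodbury_solve(S, K, np.full(n, sigma2), z)
x_direct = np.linalg.solve(S @ K @ S.T + sigma2 * np.eye(n), z)
pred = frk_predict(s_obs, z, np.linspace(0.0, 1.0, 5), centers, 0.1, K, sigma2)
```

With r held fixed, the cost of `woodbury_solve` grows linearly in n, whereas the direct solve used for comparison grows as $n^3$; the direct solve appears here only to check the result on a small n.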
