Volume 75, Issue 1

Dimension reduction and alleviation of confounding for spatial generalized linear mixed models

John Hughes

University of Minnesota, Minneapolis, USA

Murali Haran

Pennsylvania State University, University Park, USA

First published: 09 October 2012
Citations: 111
Address for correspondence: John Hughes, Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA.
E‐mail: hughes@umn.edu

Abstract

Summary. Non‐Gaussian spatial data are very common in many disciplines. For instance, count data are common in disease mapping, and binary data are common in ecology. When fitting spatial regressions for such data, one needs to account for dependence to ensure reliable inference for the regression coefficients. The spatial generalized linear mixed model offers a very popular and flexible approach to modelling such data, but this model suffers from two major shortcomings: variance inflation due to spatial confounding and high dimensional spatial random effects that make fully Bayesian inference for such models computationally challenging. We propose a new parameterization of the spatial generalized linear mixed model that alleviates spatial confounding and speeds computation by greatly reducing the dimension of the spatial random effects. We illustrate the application of our approach to simulated binary, count and Gaussian spatial data sets, and to a large infant mortality data set.
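The summary does not spell out the new parameterization, so the following is only a minimal sketch of one way the dimension reduction could look. It assumes the construction described in the full paper: the spatial random effects are expressed in a basis made of the leading eigenvectors of P⊥AP⊥, where A is the adjacency matrix of the areal units and P⊥ projects onto the orthogonal complement of the design matrix X. The function name `moran_basis`, the toy 5 × 5 lattice, the simulated covariate and the choice q = 6 are all illustrative assumptions, not taken from the article.

```python
import numpy as np


def moran_basis(X, A, q):
    """Hypothetical helper: q leading eigenvectors of P_perp @ A @ P_perp.

    X is an n x p design matrix (assumed full column rank) and A is a
    symmetric n x n adjacency matrix.
    """
    n = X.shape[0]
    # Orthonormal basis for the column space of X, then the projector
    # onto its orthogonal complement.
    Q, _ = np.linalg.qr(X)
    P_perp = np.eye(n) - Q @ Q.T
    op = P_perp @ A @ P_perp
    op = (op + op.T) / 2.0  # remove round-off asymmetry before eigh
    vals, vecs = np.linalg.eigh(op)  # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:q]  # indices of the q largest
    return vecs[:, order], vals[order]


# Toy data (assumed): a 5 x 5 lattice with rook adjacency.
side = 5
n = side * side
A = np.zeros((n, n))
for i in range(side):
    for j in range(side):
        k = i * side + j
        if i + 1 < side:
            A[k, k + side] = A[k + side, k] = 1.0
        if j + 1 < side:
            A[k, k + 1] = A[k + 1, k] = 1.0

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# n = 25 random effects are replaced by q = 6 basis coefficients.
M, eigvals = moran_basis(X, A, q=6)
print(M.shape)
```

Under these assumptions the columns of `M` are orthogonal to the columns of `X`, which is what removes the collinearity between fixed effects and random effects, and the linear predictor needs only q coefficients rather than n.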
