A novel bootstrap procedure for assessing the relationship between class size and achievement
Abstract
Summary. There is ongoing concern about the relationship between class size and achievement for children in their first years of schooling. The Institute of Education's class size project was set up to address this issue and began recruiting in the autumn of 1996. However, because achievement measures are non‐normal, especially in mathematics, the results have hitherto been presented by using transformed achievement measures, which makes interpretation difficult for non‐statisticians. Ideally, the data would be modelled on the original scale and a bootstrap procedure used to ensure that inferences are robust to non‐normality. However, the data are multilevel, so a bootstrap must respect their hierarchical structure. In this paper we therefore propose a nonparametric residual bootstrap procedure that is suitable for multilevel models and show that it is consistent. A simulation study demonstrates its potential to yield substantial reductions, compared with a parametric bootstrap, in the difference between nominal and actual confidence interval coverage when the underlying distribution of the data is non‐normal. We then apply our approach to estimate the relationship between class size and achievement for children in their reception year, after adjusting for other possible determinants.
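To make the idea of a nonparametric residual bootstrap for two‐level data concrete, the following is a minimal sketch and not the procedure developed in the paper. It assumes a random‐intercept structure (pupils within classes), uses ordinary least squares as a stand‐in for a multilevel fit, and omits any adjustment of the residuals that a full procedure may require. All function names, the simulated data and the parameter values are hypothetical and chosen for illustration only.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients; a stand-in for a multilevel model fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def split_residuals(X, y, groups, beta):
    """Split raw residuals into centred class-level and pupil-level parts.

    Assumes `groups` holds integer codes 0..J-1 and every class is non-empty.
    """
    raw = y - X @ beta
    n_groups = groups.max() + 1
    counts = np.bincount(groups, minlength=n_groups)
    u = np.bincount(groups, weights=raw, minlength=n_groups) / counts
    e = raw - u[groups]
    return u - u.mean(), e - e.mean()


def residual_bootstrap(X, y, groups, n_boot=500, seed=0):
    """Percentile 95% intervals from resampling residuals at both levels."""
    rng = np.random.default_rng(seed)
    beta_hat = fit_ols(X, y)
    u, e = split_residuals(X, y, groups, beta_hat)
    fitted = X @ beta_hat
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        # Resample class-level and pupil-level residuals independently,
        # with replacement, then rebuild a response on the original scale.
        u_star = rng.choice(u, size=u.size, replace=True)
        e_star = rng.choice(e, size=e.size, replace=True)
        y_star = fitted + u_star[groups] + e_star
        draws[b] = fit_ols(X, y_star)
    lower, upper = np.percentile(draws, [2.5, 97.5], axis=0)
    return beta_hat, lower, upper


def simulate_skewed_data(seed=1, n_classes=40, n_pupils=20):
    """Hypothetical two-level data with skewed (non-normal) errors."""
    rng = np.random.default_rng(seed)
    groups = np.repeat(np.arange(n_classes), n_pupils)
    class_size = rng.integers(15, 35, size=n_classes)[groups].astype(float)
    X = np.column_stack([np.ones(groups.size), class_size])
    u_true = rng.exponential(1.0, n_classes) - 1.0    # class effects
    e_true = rng.exponential(2.0, groups.size) - 2.0  # pupil errors
    y = 10.0 - 0.1 * class_size + u_true[groups] + e_true
    return X, y, groups


if __name__ == "__main__":
    X, y, groups = simulate_skewed_data()
    beta_hat, lower, upper = residual_bootstrap(X, y, groups)
    print("class-size coefficient:", beta_hat[1])
    print("95% percentile interval:", lower[1], upper[1])
```

Because the residuals are resampled rather than drawn from a fitted normal distribution, the bootstrap responses inherit the skewness of the observed data, which is the property that distinguishes this kind of scheme from a parametric bootstrap.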