Variable selection with error control: another look at stability selection
Abstract
Summary. Stability selection was recently introduced by Meinshausen and Bühlmann as a very general technique designed to improve the performance of a variable selection algorithm. It is based on aggregating the results of applying a selection procedure to subsamples of the data. We introduce a variant, called complementary pairs stability selection, and derive bounds both on the expected number of variables included by complementary pairs stability selection that have low selection probability under the original procedure, and on the expected number of high selection probability variables that are excluded. These results require no assumptions (e.g. exchangeability) on the underlying model or on the quality of the original selection procedure. Under reasonable shape restrictions, the bounds can be further tightened, yielding improved error control and therefore increasing the applicability of the methodology.
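The subsample-and-aggregate idea described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a complementary pair means two disjoint subsamples, each of size ⌊n/2⌋, and that a variable is kept when its selection frequency across all subsamples reaches a threshold. The base selector `top_k_by_correlation`, the function `cpss_frequencies`, the number of pairs, the threshold 0.6 and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def top_k_by_correlation(X, y, k):
    """Hypothetical base selector: the k columns of X whose score,
    proportional to the absolute sample correlation with y, is largest."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    score = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) + 1e-12)
    return np.argsort(score)[-k:]


def cpss_frequencies(X, y, select, n_pairs=50, seed=0):
    """Selection frequency of each variable over complementary pairs.

    Each pair consists of two disjoint subsamples of size n // 2
    (an assumption of this sketch); `select` is applied to both halves
    and the selections are aggregated over all 2 * n_pairs subsamples.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    half = n // 2
    counts = np.zeros(p)
    for _ in range(n_pairs):
        perm = rng.permutation(n)
        for idx in (perm[:half], perm[half:2 * half]):
            counts[select(X[idx], y[idx])] += 1
    return counts / (2 * n_pairs)


# Simulated example (assumed data): only columns 0 and 1 carry signal.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
y = 3 * X[:, 0] + 3 * X[:, 1] + rng.standard_normal(200)

freq = cpss_frequencies(X, y, lambda A, b: top_k_by_correlation(A, b, k=2))
selected = np.flatnonzero(freq >= 0.6)  # 0.6 is an illustrative threshold
```

Any selection procedure that returns a set of column indices could be passed as `select`. The error bounds mentioned in the summary depend on the threshold and on the average number of variables the base procedure selects, and they are not computed in this sketch.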




