The neglog transformation and quantile regression for the analysis of a large credit scoring database
Abstract
Summary. A statistical analysis of a bank's credit card database is presented. The database is a snapshot of accounts whose holders have missed a payment in a given month but who do not subsequently default. The variables on which there is information are observable measures on the account (such as profit and activity), and whether actions that are available to the bank (such as letters and telephone calls) have been taken. A primary objective for the bank is to gain insight into the effect that collections activity has on on-going account usage. A neglog transformation is introduced that highlights features that are hidden on the original scale and improves the joint distribution of the covariates. Quantile regression, a methodology that is novel to the credit scoring industry, is used because it is relatively assumption-free and because it is suspected that different relationships may be manifest in different parts of the response distribution. The large size of the database is handled by selecting relatively small subsamples for training and then building empirical distributions from repeated samples for validation. In the application to the database of clients who have missed a single payment, a substantive finding is that the predictor of the median of the target variable contains different variables from those of the predictor of the 30% quantile. This suggests that different mechanisms may be at play in different parts of the distribution.
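The abstract does not state the formulas behind its two main tools, so the following is only a minimal sketch under stated assumptions. It takes the neglog transformation to be the commonly used form sign(x) log(1 + |x|), and it uses the standard check (pinball) loss that quantile regression minimizes. The function names `neglog`, `neglog_inverse`, `check_loss` and `best_constant_quantile` are hypothetical, chosen for illustration, and the toy data are not from the bank's database.

```python
import math


def neglog(x):
    """Assumed neglog form: sign(x) * log(1 + |x|).

    Defined for negative, zero and positive values, so it can be applied
    to quantities such as profit that take either sign.
    """
    return math.copysign(math.log1p(abs(x)), x)


def neglog_inverse(y):
    """Inverse of the assumed neglog form: sign(y) * (exp(|y|) - 1)."""
    return math.copysign(math.expm1(abs(y)), y)


def check_loss(residual, tau):
    """Check (pinball) loss for quantile level tau, with 0 < tau < 1.

    Positive residuals are weighted by tau and negative residuals by
    (1 - tau), so minimizing the total loss targets the tau-quantile.
    """
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def best_constant_quantile(values, tau):
    """Return the observed value that minimizes the total check loss.

    This is the intercept-only case of quantile regression: with no
    covariates, the minimizer is a sample tau-quantile.
    """
    def total_loss(q):
        return sum(check_loss(v - q, tau) for v in values)

    return min(values, key=total_loss)


if __name__ == "__main__":
    # Toy values only; they are illustrative and not taken from the paper.
    toy_profit = [-250.0, -3.0, 0.0, 4.0, 1200.0]
    print([round(neglog(v), 3) for v in toy_profit])
    print(best_constant_quantile(list(range(11)), 0.3))
```

The asymmetry of the loss is what lets the 30% quantile and the median be modelled separately, which is the setting in which the abstract reports that different variables enter the two predictors.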