Needles and Haystacks

Kimberly West, FLOCK Notes

January 9, 2024

One experimental design concept that no one in financial modeling is using – but should be.

Anyone who has engaged in statistical modeling – particularly in consumer finance – knows that identifying and pricing the risk associated with individual accounts is analogous to finding a needle in a haystack.

Traditional modeling applications that have a specified outcome (like probability of repayment) and a long list of potential predictors (like age, income, state) will generate results with some predictive accuracy – and a great deal of error.  As the famous statistician George Box said, “All models are wrong, but some are useful.”[1]

A standard toolbox for any financial analyst will include various forms of traditional modeling options like logistic regression, which is the basic math behind standard applications like FICO scores and has been around for the better part of half a century.  We can think of logistic regression as directing us to the general area of the haystack rather than pinpointing a needle.

Another concept that has been around for decades is propensity score matching – an experimental design concept used heavily in medical research[2] and in social science research[3].  The direct application of an experimental design concept to consumer financial modeling may not be immediately apparent…but stay with me…

Propensity score matching (PSM) has historically been used to “backward engineer” randomization when results have already occurred and random assignment is not possible – for example, matching the medical records of patients across two groups (e.g., a control group and a treatment group) to create matched pairs and assess the efficacy of a treatment post hoc.  In risk analysis and account pricing, we can use the same approach to “match” a potential customer, across dozens of characteristics, with existing or past customers in a database whose financial performance is already known.  In other words, rather than developing a logistic model to predict the probability of payment within error bands, we can use PSM to pinpoint a specific customer in our database who has the same characteristics as the customer under consideration – the proverbial needle in the haystack.  We can then apply the known customer's observed performance directly as the expectation for the potential customer.  This matching can be 1:1 or 1:n.  In this context, PSM is 100% empirical, with no parametric expectations or error bands.
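To make the matching step concrete, here is a minimal sketch in Python.  It is not the method from the published paper.  It assumes every past customer already carries a single precomputed score summarizing their characteristics, and how that score is estimated is left open.  The names `Customer`, `match_applicant`, `expected_performance` and `repaid_share` are hypothetical, the records are made up, and the optional `caliper` (a maximum allowed score distance) is a common PSM convention rather than something described above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Customer:
    """A past customer with a precomputed score and a known outcome (hypothetical fields)."""
    customer_id: str
    score: float         # propensity-style score in [0, 1], assumed precomputed
    repaid_share: float  # observed share of balance repaid, known post hoc


def match_applicant(applicant_score: float,
                    customers: List[Customer],
                    n_matches: int = 1,
                    caliper: Optional[float] = None) -> List[Customer]:
    """Return up to n_matches customers whose scores are closest to the applicant's.

    If a caliper is given, candidates farther away than that distance are
    discarded, so the result may hold fewer than n_matches customers (or none).
    """
    ranked = sorted(customers, key=lambda c: abs(c.score - applicant_score))
    if caliper is not None:
        ranked = [c for c in ranked if abs(c.score - applicant_score) <= caliper]
    return ranked[:n_matches]


def expected_performance(matches: List[Customer]) -> Optional[float]:
    """Average the observed outcome of the matched customers; None if nothing matched."""
    if not matches:
        return None
    return mean(c.repaid_share for c in matches)


# Made-up illustrative records, not real data.
history = [
    Customer("A", 0.30, 0.35),
    Customer("B", 0.44, 0.60),
    Customer("C", 0.52, 0.70),
    Customer("D", 0.90, 0.95),
]

# 1:1 match: the single closest past customer to an applicant scored 0.50.
one_to_one = match_applicant(0.50, history, n_matches=1)

# 1:n match: the two closest past customers within a score distance of 0.10.
one_to_n = match_applicant(0.50, history, n_matches=2, caliper=0.10)
```

In the 1:1 case the applicant simply inherits the matched customer's observed outcome; in the 1:n case `expected_performance` averages the outcomes of the matched group.  A tight caliper can leave an applicant with no match at all, which is itself informative: the database contains no comparable customer.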

Our work using PSM as an alternative to traditional modeling to assess and price risk was recently published in the Journal of Applied Statistics[4].

 

For more information about Flock Specialty Finance, contact Jennifer Lewis Priestley, CDO (jpriestley@flockfinance.com).

[1] https://en.wikipedia.org/wiki/All_models_are_wrong

[2] J. Peterson, N. Paranjape, N. Grundlingh, and J. Priestley, Outcomes and Adverse Effects of Baricitinib Versus Tocilizumab in the Management of Severe COVID-19, Critical Care Medicine 51(3) (2023), pp. 337-346.

[3] M. G. Powell, D. M. Hull, and A. A. Beaujean, Propensity Score Matching for Education Data: Worked Examples, The Journal of Experimental Education 88 (2020), pp. 145-164.

[4] J. L. Priestley and E. VonDohlen, Propensity score matching: a tool for consumer risk modeling and portfolio underwriting, Journal of Applied Statistics (2024), DOI: 10.1080/02664763.2024.2302058.