Interpretable Ensembles: Enhancing Claim Frequency Modeling with External Socioeconomic Factors


UlmUniversityifa
uploaded 2 hours ago

The modeling of claim frequencies with interpretable risk factors is central to tariffing and risk classification in non‑life insurance. Advanced machine learning (ML) methods can estimate the proven, interpretable (sparse) risk factors of a generalized linear model (GLM) in an automated, data‑driven way [1]. Alongside these methodological advances, insurers recognize the value of high‑quality internal data histories. Additional external features, e.g. climate, spatial or socioeconomic information, can deliver further valuable insights into the underlying risks [2,3,4]. Pooling diverse external information with detailed internal claims data creates a comprehensive database for a more detailed claims analysis. However, most of the innovative methods are not designed to exploit the advantages of large datasets, and some of them have convergence difficulties with very large datasets in practice.

In this work, we present a novel method to investigate the value of combining high‑quality data histories with a wide range of socioeconomic information, using a real dataset. Our approach, inspired by ensembles, enables efficient modeling while the forecasts remain fully interpretable. In addition, the uncertainty and stability of the effects of individual risk factors become visible. The results show quantitatively that both the addition of socioeconomic information and the use of concepts for large datasets significantly improve the forecasting quality of both established and innovative actuarial models.

References
[1] Devriendt, S., Antonio, K., Reynkens, T., & Verbelen, R. (2021). Sparse regression with multi‑type regularized feature modeling. Insurance: Mathematics and Economics 96, 248‑261.
[2] Tufvesson, O., Lindström, J., & Lindström, E. (2019). Spatial statistical modelling of insurance risk: a spatial epidemiological approach to car insurance. Scandinavian Actuarial Journal 2019(6), 508‑522.
[3] Knighton, J., Buchanan, B., Guzman, C., Elliott, R., White, E., & Rahm, B. (2020). Predicting flood insurance claims with hydrologic and socioeconomic demographics via machine learning: Exploring the roles of topography, minority populations, and political dissimilarity. Journal of Environmental Management 272, 111051.
[4] NAIC (2025). 2021/2022 Auto Insurance Database Report. https://content.naic.org/sites/default/files/publication-aut-pb-auto-ins.... (Downloaded on 11.08.2025)
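
The abstract does not spell out the ensemble construction, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a single categorical risk factor (a hypothetical `region` with levels `urban` and `rural`), simulated data, and a Poisson GLM with log link and exposure offset. For such a one-factor model the maximum likelihood estimate has a closed form: the fitted claim frequency per level equals claims divided by exposure. The sketch fits this model on disjoint chunks of a large portfolio, averages the resulting log-relativities, and reports their spread as a rough indication of how stable the effect is. All function names and parameter values are made up for illustration.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_portfolio(n, seed=0):
    """Simulate hypothetical policies as (region, exposure, claim_count)."""
    rng = random.Random(seed)
    true_rate = {"urban": 0.12, "rural": 0.08}  # assumed annual frequencies
    policies = []
    for _ in range(n):
        region = rng.choice(["urban", "rural"])
        exposure = rng.uniform(0.25, 1.0)  # policy years in force
        claims = poisson_draw(rng, true_rate[region] * exposure)
        policies.append((region, exposure, claims))
    return policies


def log_relativity(chunk, level="urban", base="rural"):
    """Closed-form Poisson GLM coefficient for one categorical factor.

    With log link and exposure offset, the MLE of the coefficient for
    `level` against `base` is the log of the ratio of empirical frequencies.
    """
    claims = {level: 0, base: 0}
    exposure = {level: 0.0, base: 0.0}
    for region, expo, n_claims in chunk:
        claims[region] += n_claims
        exposure[region] += expo
    if claims[level] == 0 or claims[base] == 0:
        raise ValueError("chunk too small: a level has no claims")
    return math.log((claims[level] / exposure[level])
                    / (claims[base] / exposure[base]))


def ensemble_effect(policies, n_members=10):
    """Fit one model per disjoint chunk and summarize the coefficient."""
    chunks = [policies[i::n_members] for i in range(n_members)]
    estimates = [log_relativity(chunk) for chunk in chunks]
    return statistics.mean(estimates), statistics.stdev(estimates), estimates


policies = simulate_portfolio(100_000, seed=42)
mean_beta, sd_beta, betas = ensemble_effect(policies)
print(f"ensemble log-relativity urban vs rural: {mean_beta:.3f}")
print(f"spread across ensemble members (std):   {sd_beta:.3f}")
print(f"implied frequency relativity:           {math.exp(mean_beta):.3f}")
```

Because every ensemble member is itself a GLM, the averaged coefficient can still be read as a multiplicative tariff relativity, and the spread across members shows how much the effect varies between data subsets. A realistic setting with several risk factors would need an iterative GLM fit per chunk instead of the closed form used here.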