
Predictive Modeling Applications in Actuarial Science
Chapter 7 - Application of Two Unsupervised Learning Techniques to Questionable Claims: PRIDIT and Random Forest
Authors
Louise A. Francis | Francis Analytics and Actuarial Data Mining Inc.
louise_francis@msn.com
Chapter Preview
Predictive modeling can be divided into two major kinds of modeling, referred to as supervised and unsupervised learning, distinguished primarily by the presence or absence of dependent (target) variable data in the data used for modeling. Supervised learning approaches probably account for the majority of modeling analyses. The topic of unsupervised learning was introduced in Chapter 12 of Volume I of this book. This chapter follows up with an introduction to two advanced unsupervised learning techniques: PRIDIT (Principal Components of RIDITS) and Random Forest (a tree-based data-mining method that is most commonly used in supervised learning applications). The methods will be applied to an automobile insurance database to model questionable claims. A couple of additional unsupervised learning methods used for visualization, including multidimensional scaling, will also be briefly introduced.
Databases used for detecting questionable claims often do not contain a questionable claims indicator as a dependent variable. Unsupervised learning methods are often used to address this limitation. A simulated database, built from actual data and containing features observed in actual questionable claims data, was developed for this research. The methods in this chapter will be applied to this database, which is available online at the book’s website.
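As a rough illustration of the PRIDIT idea named in the preview (principal components of RIDIT scores), the sketch below scores a small table of ordinal indicators and takes the first principal component. It is not the chapter's R code. The toy data, the function names `ridit_scores` and `pridit`, the RIDIT form used (proportion in lower categories minus proportion in higher categories), and the use of an SVD on the centered score matrix are all assumptions made for illustration.

```python
import numpy as np

def ridit_scores(column):
    """RIDIT-style score for each value of one ordinal column.

    Assumed form: (share of records in lower categories)
                  minus (share of records in higher categories).
    """
    column = np.asarray(column)
    _, inverse, counts = np.unique(column, return_inverse=True, return_counts=True)
    p = counts / counts.sum()        # category proportions, in sorted order
    below = np.cumsum(p) - p         # proportion strictly below each category
    above = 1.0 - np.cumsum(p)       # proportion strictly above each category
    return (below - above)[inverse]  # map category scores back to the records

def pridit(data):
    """First principal component of the RIDIT-scored indicator matrix."""
    data = np.asarray(data)
    ridits = np.column_stack(
        [ridit_scores(data[:, j]) for j in range(data.shape[1])]
    )
    # RIDIT scores already average to zero per column; centering is a safeguard.
    centered = ridits - ridits.mean(axis=0)
    # Rows of vt are the principal directions; the first row is the leading one.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    weights = vt[0]
    return centered @ weights, weights

# Hypothetical toy data: 6 claims, 3 ordinal indicators (made up, not from the book).
toy_claims = np.array([
    [0, 1, 0],
    [0, 0, 0],
    [1, 2, 1],
    [1, 2, 1],
    [0, 1, 0],
    [1, 0, 1],
])
claim_scores, indicator_weights = pridit(toy_claims)
```

The sign of a principal component is arbitrary, so whether high or low scores point toward questionable claims has to be judged from the indicator weights. No dependent variable is used anywhere in the sketch, which is what makes the approach unsupervised.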
Data | R Code |
Data | PRIDIT - Book |
Data(.xlsx) | RFClustering Book2 |