
Abstract

In insurance analytics, it is now common to utilize generalized linear models (GLMs) of outcomes such as automobile and homeowners claims for pricing and other purposes. More recently, analysts have explored time-to-event tools, such as logistic regression modeling of lapse, to understand the duration that a customer will stay with a company, a key driver of long-run profitability.

This tutorial and the companion paper consider a joint model of insurance claims and lapsation. For example, if a policyholder is aggressive or a risk seeker (or just careless), then we would expect that customer to have both large auto and large homeowners claims. As another example, if a policyholder has an auto claim during the year, then we might think that this outcome is related to the decision to lapse (or its converse, renew) an insurance contract.

Using simulated data, this tutorial shows how to estimate two types of claims (auto and homeowners) using familiar generalized linear models that employ the Tweedie distribution. Logistic regression is used for the lapse model. The novel aspect is that we specify their joint behavior through a copula. Estimation is done using both an established composite likelihood approach and a new (in this context) generalized method of moments technique. The models and estimation techniques are built into this demonstration; they are not required knowledge for reviewing and interacting with this tutorial.

1 Background

Consider the case where we follow policyholders over time. During the year, there are three outcome variables of interest. The claims outcomes are Y_1, the amount of auto claims, and Y_2, the amount of homeowners claims.

Naturally, these can be claims from any coverage - we use auto and homeowners for illustration purposes. As claims outcomes, these variables may take on a value of zero (representing no claim) and are otherwise positive continuous outcomes (representing the claim amount). We use subscripts i to distinguish among policyholders and t to distinguish observations over time. Thus, Y_{1,it} and Y_{2,it} represent auto and homeowners claims for the ith policyholder at time t.

The third random variable, L, is a binary variable that represents a policyholder’s decision to lapse the policy. Specifically, L_{it} = 1 if the ith policyholder lapses at time t, and L_{it} = 0 otherwise.

Note that if L_{it}=1 then we do not observe the policy at time t. In the same way, if L_{it}=0, then we observe the policy at time t, subject to limitations on the number of time periods available. We use m to represent the maximal number of observations over time.

Associated with each policyholder is a set of (possibly time varying) rating variables \mathbf{x}_{it} for the ith policyholder at time t that is described in Section 2.1.2. We represent the marginal distribution of each outcome variable in terms of a generalized linear model. Specifically, following standard insurance industry practice, we represent the marginal distributions of the claims random variables using a Tweedie distribution so that the distributions have a mass at zero and are otherwise positive. The marginal distribution of the lapse variable is modeled using a logit function. Marginal distributions may use common rating variables and so are naturally related in this sense.

The dependence among lapse and claims outcomes is captured using a copula function. That is, we allow outcomes from the same time period (and the same policyholder) to depend on one another. This specification permits, for example, large claims to influence (directly) the tendency to lapse a policy or (indirectly) a latent variable to simultaneously influence both lapse and claims outcomes. Lapsation dictates the availability of data in a way that may be related to the outcomes, a violation of the statistical assumption known as missing at random. This means that analyzing claims while ignoring lapse can lead to biased estimation. Thus, joint modeling of lapse and claims is critical because the claims model depends on the data observed through the lapsation/renewal process.

This tutorial is interactive in two ways. First, if you are viewing the .html file in a web browser, you will be able to reveal R code and output by clicking on the text. For example, here is a list of the R packages needed to run this tutorial.

Click Here to Show R Code for a List of Packages
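The package list itself is hidden behind the link above. As a rough sketch, a tutorial of this type relies on packages along the following lines; the exact set loaded in the .Rmd may differ.

```r
# Illustrative package list -- the actual .Rmd may load additional packages.
library(tweedie)   # dtweedie/ptweedie/qtweedie for the claims margins
library(statmod)   # tweedie() family object for glm()
library(mvtnorm)   # multivariate normal draws for the Gaussian copula
library(copula)    # normalCopula() and dCopula() for dependence modeling
```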

Second, you may change selected input parameters so that the tutorial reflects a business situation of interest to you. This is accomplished by making changes to the companion .Rmd file that is written in R’s version of markdown, known as R Markdown. The code that includes the label Settings contains input parameters that you can readily change. For example, the code for specifying the number of policyholders is given as follows.

R Code for Data Structure Parameter Settings
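As a sketch, the settings block might resemble the following; the variable names here are hypothetical and the labels in the actual .Rmd may differ.

```r
# Hypothetical settings block -- names and values are illustrative.
n.policy <- 10000   # number of policyholders
n.years  <- 5       # maximal number of years (m) each policyholder is followed
```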

From this input, for this demonstration we have 10^{4} policyholders followed for a maximum of m = 5 years.

In the following, this tutorial is split into five additional sections.

  1. Data Generation Process. Claims follow a Tweedie model with two explanatory/rating variables. Retention/Lapse follows a logistic model with a covariate (time). The user can set the sample size, correlation among outcome variables, mean claim size, the proportion of zero claims (no claim), and mean retention.
  2. Summarize the Data. Ignoring that the data are simulated, this section provides basic summary statistics to understand data features.
  3. Fit the Marginal Distributions. This section provides the usual regression model fitting using Tweedie distributions for claims and a logistic model for renewal. Residuals from the model fits are calculated and used to display patterns among outcomes that are not accounted for in the marginal models.
  4. Joint Model Specification. A copula (Gaussian) model is specified to accommodate dependencies. This model is fit using composite maximum likelihood, as is available from the literature.
  5. Estimation Using Generalized Method of Moments. This procedure is novel in the copula context.

In addition, two appendices provide an overview of the theory that underpins the joint model and estimation procedures.

2 Data Generation Process

Because the data for this tutorial are simulated, users can conduct sensitivity tests to see how the estimation techniques behave as the data varies. For example, how large a data set is needed for reliable inference? How does the required sample size depend on the strength of the dependencies or what if they follow a different pattern? Users can interactively address these and other questions with access to this tutorial. Moreover, simulated data allows us to readily share the tutorial. In contrast, for real data, restrictions often apply.

2.1 Marginal Models

2.1.1 Marginal Outcome Models

As outlined in Section 1 (Background), we represent the marginal distributions of the claims random variables using a Tweedie distribution so that the distributions have a mass at zero and are otherwise positive. For each claim variable, we use a logarithmic link to form the mean claims of the form \mu_{j,it} = \exp\left(\mathbf{x}_{it}^{\prime} \boldsymbol \beta_j\right), j=1,2 . Thus, the parameters are allowed to differ between auto and homeowners claims. As a consequence, you need not use the same variables for each claim type (a zero beta means that the variable is not part of the mean). Each claim is simulated using the Tweedie distribution, a mean, and two other parameters, \phi (for dispersion) and P (the power parameter).

Recall, for a Tweedie distribution, that the variance is \phi_j \mu^{P}. For this tutorial, we use P=1.67 for both auto and home based on our experiences analyzing real insurance data sets. In the Tweedie model, the probability of a zero claim is e^{-\lambda}, where \lambda = \mu^{2-P} /\left(\phi(2-P)\right). So, if we use \mu = 1000, then the probability of a zero claim is \exp\left[-1000^{0.33}/(0.33\,\phi)\right]. For example, by selecting \phi=42, the probability of a zero claim is 49.4%.
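This arithmetic is easy to verify directly in R; here is a minimal check of the zero-claim probability quoted above.

```r
# Probability of a zero claim under the Tweedie model.
P      <- 1.67                          # power parameter
mu     <- 1000                          # mean claim
phi    <- 42                            # dispersion
lambda <- mu^(2 - P) / (phi * (2 - P))  # Poisson rate in the Tweedie mixture
exp(-lambda)                            # approx 0.494, as quoted above
```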

Interactive users have an opportunity to change the overall mean as well as dispersion parameters for each claim type.

R Code for Marginal Claim Parameter Settings

For the lapse variable, the expected value is of the form \pi_{it} = \frac{\exp\left(\mathbf{x}_{it}^{\prime} \boldsymbol \beta_L\right)}{1+\exp\left(\mathbf{x}_{it}^{\prime} \boldsymbol \beta_L\right)}, a common form for the logit model. Interactive users have an opportunity to change the overall mean lapse parameter.

R Code for Marginal Lapse Parameter Settings

2.1.2 Rating Variables

For this tutorial, we have five rating (explanatory) variables:

  • x_1, a binary variable that takes on values 1 or 2 depending on whether or not an attribute holds

  • x_2, x_3, x_4, generic continuous explanatory variables

  • x_5, time trend (x_{5,it} = t)

Interactive users have an opportunity to change the covariate parameters.

R Code for Covariate Parameters Settings

With these values of covariate parameters, the systematic components are

  • auto: \mathbf{x}_{it}^{\prime} \boldsymbol \beta_1 = \beta_{0,1} + 0.2 x_1 + 2 x_2

  • home: \mathbf{x}_{it}^{\prime} \boldsymbol \beta_2 = \beta_{0,2} + 0.3 x_3 + 3 x_4

  • lapse: \mathbf{x}_{it}^{\prime} \boldsymbol \beta_L = \beta_{0,L} + 2 x_2 - 0.25 x_5 .

Intercept parameters are determined using the overall mean terms specified above.

We specify a negative coefficient (-0.25) associated with the time trend variable x_5 to reflect the fact that lapse probabilities tend to decrease with policyholder duration.

The following gives the code to generate the rating variables (covariates).

R Code for Generating Covariates
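The generating code is hidden above. As a minimal sketch, covariate generation might look like the following; the distributions for x_2, x_3, and x_4, the seed, and the object names are assumptions for illustration only.

```r
set.seed(2019)                          # assumed seed, for reproducibility
n <- 10000; m <- 5
x1 <- sample(1:2, n, replace = TRUE)    # binary attribute taking values 1 or 2
x2 <- runif(n); x3 <- runif(n); x4 <- runif(n)  # generic continuous covariates
dat <- expand.grid(id = 1:n, year = 1:m)        # long-format panel
dat$x1 <- x1[dat$id]; dat$x2 <- x2[dat$id]
dat$x3 <- x3[dat$id]; dat$x4 <- x4[dat$id]
dat$x5 <- dat$year                              # time trend, x_{5,it} = t
```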

2.2 Dependence Model

Dependence among outcome variables is taken to be a Gaussian copula with the following structure \boldsymbol \Sigma = \left( \begin{array}{ccc} 1 & \rho_{LA} & \rho_{LH} \\ \rho_{LA} & 1 & \rho_{AH} \\ \rho_{LH} & \rho_{AH} & 1 \\ \end{array} \right) . Interactive users have an opportunity to change the dependence parameters.

R Code for Dependence Model Parameter Settings

For this tutorial, we have the values \rho_{LA}= 0.4, \rho_{LH}= 0.4, and \rho_{AH}= 0.1. Here, we use positive values for the association between lapse and claims; large claims are associated with a higher tendency to lapse (equivalently, lower renewal). We also use a positive value for the association between the two claim types.

The following R code shows how to simulate dependent outcomes.

R Code for Simulating Dependent Outcomes
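The essential idea is to draw correlated normal variates, transform them to uniforms, and then apply the marginal quantile (inverse distribution) functions. Here is a minimal sketch for a single period; the marginal means and dispersions below are illustrative placeholders, as in practice they come from \mathbf{x}_{it}^{\prime} \boldsymbol \beta_j.

```r
library(mvtnorm)   # rmvnorm
library(tweedie)   # qtweedie

set.seed(2019)
n <- 10000
pi.it  <- 0.10                 # illustrative lapse probability
mu1.it <- 1000; phi1 <- 42     # illustrative auto mean and dispersion
mu2.it <- 1000; phi2 <- 42     # illustrative home mean and dispersion

Sigma <- matrix(c(1.0, 0.4, 0.4,     # (lapse, auto, home) copula correlation
                  0.4, 1.0, 0.1,
                  0.4, 0.1, 1.0), 3, 3)
z <- rmvnorm(n, sigma = Sigma)  # correlated standard normals
u <- pnorm(z)                   # Gaussian copula: uniform(0,1) margins

lapse <- as.numeric(u[, 1] > 1 - pi.it)  # binary lapse indicator
y1 <- qtweedie(u[, 2], power = 1.67, mu = mu1.it, phi = phi1)  # auto claims
y2 <- qtweedie(u[, 3], power = 1.67, mu = mu2.it, phi = phi2)  # home claims
```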

3 Summarize the Data

This section demonstrates some basic techniques to look at the data. Although the data are simulated, we use the same techniques to understand this sample as if it represented real data.

3.1 Basic Summary Statistics

From the panel of 10^{4} policyholders observed over 5 years, there are N = 40776 potential observations in the data set. Remember that a policyholder who lapses is not observed in subsequent periods, so we have fewer than the 5 \times 10^{4} = 50{,}000 observations that one would expect with a balanced panel. Of these potential observations, there were 3491 lapses, for a lapse rate of 8.56%.

For type 1 (auto) claims, we have 3517 positive claims out of 5 \times 10^{4}, or about 7.03%. For type 2 (home) claims, we have 4366 positive claims out of 5 \times 10^{4}, or about 8.73%.

As is common, we begin by examining basic measures that summarize the distribution of each variable, initially focusing on continuous outcomes.

R Code for Summarizing Continuous Variables
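For example, assuming the simulated outcomes are stored in a data frame dat (a hypothetical name), a quick look at the auto claims might be:

```r
summary(dat$y1)               # all auto claims, including the zeros
summary(dat$y1[dat$y1 > 0])   # positive auto claims only
sd(dat$y1)                    # overall standard deviation
```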

Tables can be used to summarize discrete variables.

R Code for Summarizing Discrete Variables

Graphical summaries, such as scatter plots, can be used to demonstrate relationships among variables.

R Code for Basic Scatter Plots

The structure of our problem set-up is complicated. We have three outcome variables, several rating variables, and observe a cross-section of policyholders over time. Analysts encountering data with this structure typically do a far more complete analysis of the basic summary statistics than presented here. The purpose of this section is just to provide a taste of the type of analyses needed at this step. We assume that readers are familiar with these tasks and so proceed to more interesting steps.

3.2 Outcomes by Year

You should examine the distribution of outcomes, auto and home claims, as well as lapse, over time. After all, the whole point is to think about how the availability of observations, as dictated by lapse/renewal, impacts the claims distribution.

For this tutorial, by design the distribution of the claims frequency and severity is fairly stable over time. The largest type 1 (auto) claim is 888.27 and the largest type 2 (home) claim is 2366.46, both in thousands.

Claims Summary by Year

                            2010     2011     2012     2013     2014
Number of Potential Obs    10000     8789     7904     7264     6819
Number of Lapses            1211      885      640      445      310
Number of Claims 1           845      799      686      605      582
Average Claim 1          4448.52  5462.49  4849.93  4585.87  4430.28
Number of Claims 2          1064      962      828      785      727
Average Claim 2          9487.35  9593.35  9886.32  9234.29 10644.43
R Code for Summarizing Outcomes by Year
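One way to construct such a summary, assuming a data frame dat with columns year, lapse, y1, and y2 (hypothetical names), is sketched below.

```r
# By-year summary of lapses and claims.
by.year <- function(v, f) tapply(v, dat$year, f)
rbind(
  "Number of Potential Obs" = by.year(dat$y1, length),
  "Number of Lapses"        = by.year(dat$lapse, sum),
  "Number of Claims 1"      = by.year(dat$y1 > 0, sum),
  "Average Claim 1"         = by.year(dat$y1, function(y) mean(y[y > 0])),
  "Number of Claims 2"      = by.year(dat$y2 > 0, sum),
  "Average Claim 2"         = by.year(dat$y2, function(y) mean(y[y > 0]))
)
```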

3.3 Dependence Summary Statistics

Of particular interest is the relationship among outcome variables. First, we take a look at the number of observations where a policy has:

  • Type 1: neither an auto nor a home claim

  • Type 2: an auto but not a home claim

  • Type 3: not an auto but a home claim

  • Type 4: both an auto and a home claim

This frequency distribution is given for our simulated data in the table below, followed by the R code that generated the table.

Types
    1     2     3     4 
33371  3039  3888   478 
R Code for Summarizing Dependent Claims Numbers
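A compact way to build this four-category indicator, under the same hypothetical data frame dat, is:

```r
# 1 = neither, 2 = auto only, 3 = home only, 4 = both
type <- 1 + (dat$y1 > 0) + 2 * (dat$y2 > 0)
table(type)
```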

We can also summarize relationships among outcome variables using association measures such as correlations. However, our claims variables are a hybrid combination of zeros (for no claims) and long-tailed continuous variables (for positive claim amounts). Although this feature is captured by the Tweedie distribution, it can sometimes be difficult to establish dependence with basic summary statistics. Depending on parameter values, there can be many zeros (74.52% for this data set) and when positive, claims distributions tend to be right skewed. Here is some code that provides Spearman correlations, a nonparametric correlation coefficient.

As you experiment with different parameter values, you will find that the more zeros in the data, the more difficult it is to establish dependence with basic techniques. This is interesting because we know, from generating the data, that important dependencies exist. Recall that the known theoretical association measures are \rho_{LA}= 0.4, \rho_{LH}= 0.4, and \rho_{AH}= 0.1.

Spearman Outcome Correlations

          Lapse  Claims 1  Claims 2
Lapse     1.000     0.209     0.146
Claims 1  0.209     1.000     0.029
Claims 2  0.146     0.029     1.000
R Code for Summarizing Dependent Claims Associations
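A sketch of the correlation calculation, again assuming the hypothetical data frame dat:

```r
# Spearman (rank) correlations among the three outcome variables.
round(cor(dat[, c("lapse", "y1", "y2")], method = "spearman"), 3)
```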

When summarizing the data, it is sometimes convenient to work in terms of lapse, as this is the event that is of interest to insurers. However, going forward, we work with its complement, renewal ( = one minus lapse). This is slightly more convenient mathematically in that we condition on a policy renewing to examine the claims distribution in subsequent periods.

4 Fit the Marginal Distributions

After careful work to summarize the data (only a small portion is shown here), the next step is to fit marginal models. In this context, the descriptor marginal means analyzing each outcome without reference to the others. In subsequent sections, we join the marginal models via the copula.

Marginal model estimation is typically done assuming that each year has the same set of parameters and that observations from different years are independent. This is not necessary but provides a convenient starting point.

4.1 Logistic (Lapse) Regression Results

To model lapse, we employ a simple logistic regression (marginal) model.

Logistic Lapse Model Summary

             True Value  Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)      -6.193    -5.893       0.104  -56.426         0
x2                2.000     1.896       0.041   46.736         0
year             -0.250    -0.277       0.014  -19.991         0

An advantage of using simulated data is that the underlying theoretical relationships are known. For example, the true coefficient associated with x_2 is \beta_{L1} = 2, which can be readily compared to that estimated from the simulated data, \hat{\beta}_{L1} = 1.896. When examining the proximity of these two quantities, we can also look to the standard error, 0.041.

R Code for Estimating the Logistic Renewal Marginal Model
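A minimal sketch of the marginal lapse fit, assuming the hypothetical data frame dat:

```r
# Logistic regression of lapse on x2 and the time trend.
fit.lapse <- glm(lapse ~ x2 + year, family = binomial(link = "logit"),
                 data = dat)
summary(fit.lapse)
```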

4.2 Tweedie (Claims) Regression Results

The Tweedie distribution is commonly used in insurance applications for claims. In part, this is because it can be expressed as a generalized linear model. In the following illustrative code, we have skipped the determination of the power parameter (P = 1.67 for us).

For type 1 (auto) claims model, we have the following:

Tweedie Claims 1 (Auto) Model Summary

             True Value  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)       3.64      3.807       0.144   26.522     0.000
x1                0.20      0.199       0.058    3.409     0.001
x2                2.00      1.918       0.052   37.209     0.000

The externally specified dispersion parameter \phi is 500. The estimated value of \phi is 488.09.

As noted above, with simulated data, we know the theoretical relationships. For example, the coefficient associated with x_2 is \beta_2 = 2. From the table, the estimated value is 1.918 with a standard error of 0.0515. If the difference between the true parameter and its estimated value is large (relative to the standard error), then one easy solution is to increase the sample size n.

R Code for Estimating the Tweedie Marginal Auto Model
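A minimal sketch of the Tweedie fit using the statmod family object, with the power parameter fixed at 1.67 (data frame and variable names assumed):

```r
library(statmod)   # tweedie() family for glm()

# Tweedie GLM with log link (link.power = 0) and variance power 1.67.
fit.auto <- glm(y1 ~ x1 + x2,
                family = tweedie(var.power = 1.67, link.power = 0),
                data = dat)
summary(fit.auto)

# Moment-type estimate of the dispersion phi.
sum(residuals(fit.auto, type = "pearson")^2) / df.residual(fit.auto)
```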

The type 2 (home) claims model estimation is similar:

Tweedie Claims 2 (Home) Model Summary

             True Value  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)      4.364      4.194       0.159   26.293         0
x3               0.300      0.346       0.082    4.212         0
x4               3.000      3.052       0.078   39.094         0
R Code for Estimating the Tweedie Marginal Home Model

4.3 Residual Checking

As with all model estimation procedures, it is good standard practice to check model assumptions via an examination of the residuals. For generalized linear models, one typically examines deviance residuals. The following provides an example of a standard set of diagnostic plots based on the deviance residuals.

R Code for Diagnostic Plots for the Tweedie Marginal Auto Model
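For instance, a basic plot of deviance residuals against fitted values for the auto model (using the hypothetical fit.auto object from the sketch above) could be:

```r
# Deviance residuals versus fitted values on the log scale.
dev.res <- residuals(fit.auto, type = "deviance")
plot(log(fitted(fit.auto)), dev.res,
     xlab = "Log fitted values", ylab = "Deviance residuals")
abline(h = 0, lty = 2)
```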

Standard residual plots from the Tweedie model can be difficult to assess due to the mass at zero. Another type of residual can be calculated via the probability integral transform. That is, if Y is a continuous random variable with distribution function F, then F(Y) has a uniform distribution. We can use this relationship to assess the quality of our identification of the distribution. Like the deviance residuals, this relationship breaks down in the presence of mass points, e.g., zeros, but can still be used to supplement the usual diagnostic model checking. We refer to these as Cox-Snell residuals.

R Code for Cox-Snell Residuals from the Probability Integral Transform
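A sketch of the Cox-Snell computation for the auto model, applying the fitted Tweedie distribution function to each outcome (object names and the plugged-in dispersion are assumptions):

```r
library(tweedie)   # ptweedie

# Probability integral transform of the observed claims; with a mass at
# zero these are not exactly uniform but remain useful diagnostics.
cs.auto <- ptweedie(dat$y1, power = 1.67, mu = fitted(fit.auto), phi = 488.09)
hist(cs.auto, breaks = 20, main = "Cox-Snell residuals, auto claims")
```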

Dependence among residuals suggests patterns not accounted for in the marginal modeling. The first matrix shows correlations among residuals computed via the probability integral transform (Cox-Snell); the second shows correlations among deviance residuals. These results are qualitatively similar, which is reassuring: it means that the analyst can use either definition and may select the measure that best fits the audience of the work.

Spearman Correlations of Cox-Snell (CS) Residuals

                       Lapse Resids  Claims 1 Resids  Claims 2 Resids
Lapse Resids (CS)             1.000            0.643            0.041
Claims 1 Resids (CS)          0.643            1.000            0.003
Claims 2 Resids (CS)          0.041            0.003            1.000

Spearman Correlations of Deviance (Dev) Residuals

                       Lapse Resids  Claims 1 Resids  Claims 2 Resids
Lapse Resids (Dev)            1.000            0.612            0.051
Claims 1 Resids (Dev)         0.612            1.000            0.006
Claims 2 Resids (Dev)         0.051            0.006            1.000
R Code for Residual Correlations

5 Joint Model Specification

Estimation procedures are based on a likelihood analysis. With the assumption of conditional independence over time, the joint distribution function is \begin{array}{cl} & \Pr \left(L_{i1} \le r_{1}, \ldots, L_{im} \le r_{m}, Y_{1,i1} \le y_{11}, \ldots, Y_{1,im} \le y_{1m} , Y_{2,i1} \le y_{21}, \ldots, Y_{2,im} \le y_{2m} \right) \\ & \ \ \ \ = \prod_{t=1}^m C\left(F_{Lit}(r_{t}), F_{1,it}(y_{1t}), F_{2,it}(y_{2t}) \right) . \end{array} Here, F_{Lit}, F_{1,it}, and F_{2,it} represent the marginal distributions of L_{it}, Y_{1,it}, and Y_{2,it}, respectively.

Using the so-called inference for margins technique, we first estimate the parameters for the marginal distributions as described in Section 4. With this, the marginal distributions F_{Lit}, F_{1,it}, and F_{2,it} can be taken as given and we only need to estimate the parameters of the copula C.

With the joint distribution given above, we can determine the likelihood. This is a straightforward exercise when all of the outcomes are continuous (you only need to take derivatives to determine the likelihood). For our situation, it is more complex because the lapse variable L is discrete (binary) and the claims outcomes Y_1 and Y_2 are hybrid combinations of discrete (mass at zero) and continuous (for positive claims) components. In Appendix A, you will find a short development of the likelihood function. A more detailed treatment is available in the paper.

It is well known that if the observation process depends on the outcomes then parameter estimation may be biased and inconsistent, and hence unreliable. Depending on the parameter settings, you may observe this in the Tweedie coefficient estimates in Sections 4.2 and 4.3. Note, however, that the lapse coefficients are reliable because their estimation does not depend on other outcomes (auto and home claims).

In order to demonstrate dependence estimation, we assume that reliable estimates of the marginals are available from some source. For example, insurers typically have external data for estimating coefficients of rating variables. Alternatively, one can use first-year claims that are not affected by a lapse/renewal decision. Here is the R code needed to use the true marginal coefficients as inputs for our dependence estimation.

R Code for Setting Marginals to Given Parameters

5.1 R Code for Trivariate Likelihoods

Here is code that shows practical details for evaluating the joint likelihood.

R Code for Trivariate Likelihood Calculation
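The full likelihood must handle the binary lapse margin and the mass at zero, as developed in Appendix A. As a sketch of just the all-continuous building block, the trivariate Gaussian copula density can be evaluated with the copula package:

```r
library(copula)

# Gaussian copula density at probability-integral-transformed outcomes.
# Only the all-continuous contribution is shown; the full likelihood also
# sums/integrates over the binary lapse and the Tweedie mass at zero.
rho <- c(0.4, 0.4, 0.1)                          # (LA, LH, AH)
cop <- normalCopula(rho, dim = 3, dispstr = "un")
u   <- cbind(0.7, 0.6, 0.9)                      # illustrative uniforms
dCopula(u, cop)                                  # copula density contribution
```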

5.2 Visualizing Trivariate Likelihoods

With the margins fixed, the likelihood is a function of three association parameters, \rho_{LA}, \rho_{LH}, and \rho_{AH}. Although it is possible to plot the likelihood as a function of these parameters, we find it more helpful to plot the likelihood as a function of a single parameter, holding the other two fixed.

Logarithmic Likelihood

To interpret these plots, recall that we specified the values to be \rho_{LA}= 0.4, \rho_{LH}= 0.4, and \rho_{AH}= 0.1. When creating these plots, we use the (known) specified values to help understand the likelihood curvature for each parameter.

R Code for Visualizing Trivariate Likelihoods

5.3 Estimation Results Based on Trivariate Likelihoods

We maximized this likelihood and determined standard errors in the usual fashion (by using a numerical approximation of the second derivative). The following provides the results, followed by the code.

Likelihood Estimation Results

             True Value  Estimate  Std Error
Lapse-Auto          0.4    0.4219     0.0191
Lapse-Home          0.4    0.3983     0.0097
Auto-Home           0.1    0.0779     0.0137
R Code for Trivariate Likelihood Estimation
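A sketch of the maximization step, assuming a hypothetical function negloglik that returns the negative composite log-likelihood for rho = (rho_LA, rho_LH, rho_AH):

```r
# Maximize the composite likelihood; standard errors come from the
# numerically approximated Hessian of the negative log-likelihood.
fit <- optim(c(0, 0, 0), negloglik, hessian = TRUE)
est <- fit$par
se  <- sqrt(diag(solve(fit$hessian)))
cbind(Estimate = est, "Std Error" = se)
```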

6 Estimation Using Generalized Method of Moments

In Appendix B, you will find a short development of the details needed for estimation using generalized method of moments, GMM. A more detailed treatment is available in the paper.

6.1 GMM Scores

Here is some code for functions needed later.

R Code for GMM Scores
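As a generic sketch of how such scores feed into a GMM objective (the score function here is a hypothetical placeholder returning one moment vector per observation):

```r
# Quadratic-form GMM objective with the efficient weight matrix.
gmm.obj <- function(rho, score.fn) {
  S    <- score.fn(rho)                    # N x k matrix of per-obs scores
  gbar <- colMeans(S)                      # sample moment conditions
  W    <- solve(crossprod(S) / nrow(S))    # inverse of score covariance
  nrow(S) * drop(t(gbar) %*% W %*% gbar)   # objective to minimize
}
```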

6.2 Visualize the Scores

First, let us see how the sum of scores behaves. Recall that for a score function, the objective is to find those parameters so that the score equals zero. The plots that follow mark the line where the score equals zero.

Sum of Scores

R Code for Visualizing GMM Sum of Scores

For an iterative method such as GMM, one needs starting values. It is possible to use the sum of squared scores for this purpose. If you wish to explore this, click on the link below, which shows the plots and code. However, we will use the likelihood estimators developed in Section 5.

Visualizing GMM Sum of Scores

6.3 Visualize the GMM Score Function

With the initial estimator, we can now calculate the GMM score function. Minimizing this function yields the GMM estimator, together with its asymptotic variance.

GMM Scores

R Code for Visualizing GMM Score Function

6.4 GMM Estimation Results

The following table summarizes the GMM estimation results and compares these results to true parameter values and estimation results from the Section 5 likelihood estimation. The code follows.

GMM Estimation Results

             True Value  Like Est  Like Std Error  GMM Est  GMM Std Error
Lapse-Auto          0.4    0.4219          0.0191   0.4185         0.0191
Lapse-Home          0.4    0.3983          0.0097   0.3976         0.0098
Auto-Home           0.1    0.0779          0.0137   0.0761         0.0136
R Code for GMM Score Function Estimation

7 Appendix A – Joint Model Specification

To keep this tutorial self-contained, here is a short development of the likelihood function. A more detailed treatment is available in the paper where you can also find references to supporting work in the academic literature.

Show Likelihood Appendix

Here is some R code for functions used for the likelihood and GMM scores.

Show R Code for Likelihood

8 Appendix B – Estimation Using Generalized Method of Moments

To keep this tutorial self-contained, here is a short development of the details needed for estimation using generalized method of moments, GMM. A more detailed treatment is available in the paper where you can also find references to supporting work in the academic literature.

Show GMM Appendix


Run Time

Time taken for this tutorial to compile:

Time difference of 2.079783 hours