There are a staggering number of different names for these models, with different disciplines using different terminology. In the language used in this course, fixed effects are varying coefficients (which can be slopes or intercepts) that are implemented by creating group dummies; random effects are varying coefficients (again, slopes or intercepts) that are drawn from a common distribution; and multilevel models are models that include either kind of effect.
Unfortunately, one of the most widely used packages in R for multilevel models uses some confusing terminology. In the lme4 package, fixed effects are the regular, non-varying regression coefficients, and random effects1 are the coefficients that vary by group (the varying intercepts and slopes).
Multilevel models allow us to do two very powerful things: control for unobserved variation across groups in grouped data, and account for the fact that group-level variables don't actually introduce as many independent observations as we think they do. Load the radon data from http://www.stat.columbia.edu/~gelman/arm/examples/radon/srrs2.dat, and log transform the radon activity variable (replace any values of 0 with .1 to avoid problems in the transformed variable). Drop any observations with a value of 9 for floor, as this is a missing data code and treating it as a value will bias our estimates. Use state2 as the state variable for all subsequent examples.
sr <- read.delim('http://www.stat.columbia.edu/~gelman/arm/examples/radon/srrs2.dat', sep = ',')
sr$radon <- log(ifelse (sr$activity == 0, .1, sr$activity))
sr <- sr[which(sr$floor < 9), ]
Run a standard linear model (the complete pooling model) explaining the level of radon as a function of floor, and a model with fixed effects, using states as the group. Remember to generate group dummies for the fixed effects model.
pooled <- lm(radon ~ floor, data = sr)
fix_int <- lm(radon ~ floor + as.factor(state2), data = sr)
fix_slope <- lm(radon ~ floor * as.factor(state2), data = sr)
We can also estimate a fixed effects model with varying slopes instead of intercepts, although again, interpreting and presenting these results is not the easiest task. Create group dummies like before, except this time interact them with floor. This produces a model with varying slope and intercept fixed effects; it is closest to the completely separate regression approach we saw last week with plyr. We could also omit the group dummies as separate predictors, yielding a model with varying slope fixed effects only (see the sketch below). The figure below shows the regression lines for each model. In the next section we'll see how to create these plots for random effects models.
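A minimal sketch of that slope-only variant (the object name fix_slope_only is mine, not from the original): using : instead of * keeps the state-specific slopes for floor but drops the separate group dummies, so every state shares a single intercept.
# Varying slope fixed effects only: a state-specific slope for floor,
# but one shared intercept (no separate group dummies)
fix_slope_only <- lm(radon ~ floor:as.factor(state2), data = sr)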
If we want to estimate a varying intercept model by county, we run into some difficulties. There are 386 counties in the data, so this model would be very difficult to read and work with. As an alternative, we might want to estimate a random effects model. Doing so would let us describe the distribution that the varying intercepts are drawn from, allowing us to better understand what a random intercept of -.72 means relative to the mean of the distribution the intercepts are drawn from, and letting us leverage similarities across the 386 groups. We can use the lmer() function in the lme4 package to accomplish this. The function produces a model object of class lmerMod. Estimate a random effects model with radon levels explained by floor and a varying intercept by state (it's much easier to understand how these models work with 9 groups instead of 386).
library(lme4)
no_pooling_lme <- lmer(radon ~ floor + (1 | state2), data = sr)
This model assumes that there is a different baseline level of radon based on differences in state2. When you call the model object, or use summary() on it, you'll get the usual estimates and standard errors for your fixed effects (your regular coefficients), but only the variance and standard deviation for your random effects (the varying intercepts). Since we assume that these intercepts are all drawn from the same distribution, we can describe the set of intercepts using a standard deviation or variance. If we want to examine what these intercepts actually are, we can use the coef() function. Using coef() on an lmerMod object is convenient because it automatically combines any fixed and random effects (remember what these mean in lme4…) and then reports the resulting coefficients by group membership.
coef(no_pooling_lme)$state2
## (Intercept) floor
## AZ 0.654 -0.68
## IN 1.061 -0.68
## MA 0.732 -0.68
## MI 0.083 -0.68
## MN 1.251 -0.68
## MO 0.776 -0.68
## ND 1.715 -0.68
## PA 1.232 -0.68
## WI 1.117 -0.68
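As an aside not in the original walkthrough, the variance components that summary() reports for the varying intercepts can also be pulled out on their own with lme4's VarCorr():
# Standard deviation/variance of the varying intercepts and the residual
VarCorr(no_pooling_lme)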
If we want to extract the individual fixed and random effects, we can do so using fixef() and ranef(), respectively.
head(ranef(no_pooling_lme)$state2)
## (Intercept)
## AZ -0.30
## IN 0.10
## MA -0.23
## MI -0.87
## MN 0.29
## MO -0.18
fixef(no_pooling_lme)
## (Intercept) floor
## 0.96 -0.68
We can also access the standard errors of the fixed and random effects using se.fixef() and se.ranef() from the arm package. The REextract() function in merTools allows us to easily extract both the estimates and standard errors of the random effects.
library(arm)
se.ranef(no_pooling_lme)$state2
## (Intercept)
## AZ 0.027
## IN 0.023
## MA 0.025
## MI 0.069
## MN 0.029
## MO 0.023
## ND 0.025
## PA 0.021
## WI 0.047
se.fixef(no_pooling_lme)
## (Intercept) floor
## 0.155 0.023
library(merTools)
REextract(no_pooling_lme)
## groupFctr groupID (Intercept) (Intercept)_se
## 1 state2 AZ -0.30 0.027
## 2 state2 IN 0.10 0.023
## 3 state2 MA -0.23 0.025
## 4 state2 MI -0.87 0.069
## 5 state2 MN 0.29 0.029
## 6 state2 MO -0.18 0.023
## 7 state2 ND 0.76 0.025
## 8 state2 PA 0.27 0.021
## 9 state2 WI 0.16 0.047
We can even combine the fixed and random effects to recover the reported effects returned by coef(). Let's try this for Missouri, comparing the effect of both the intercept and floor.
fixef(no_pooling_lme)[1] == coef(no_pooling_lme)$state2[6,1]
fixef(no_pooling_lme)[1] + ranef(no_pooling_lme)$state2[6,] == coef(no_pooling_lme)$state2[6, 1]
fixef(no_pooling_lme)[2] == coef(no_pooling_lme)$state2[6, 2]
## (Intercept)
## FALSE
## (Intercept)
## TRUE
## floor
## TRUE
An important thing to note here is that the intercept is affected by the random effect. In this case, the random effect term (1 | state2) captures deviations from a baseline shared by all the states included in the data, so we need to combine the fixed and random effect of the intercept to get its total effect. In other words, the fixed-effect intercept is a baseline that no individual state exactly matches; each state's intercept is that baseline plus its random deviation. However, this model doesn't have a varying slope term in the random effects, so the fixed effect of floor is its total effect.
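As a quick sketch of that point (the all.equal() comparison is mine, not from the original), we can recover every state's total intercept by adding the fixed-effect intercept to each state's random deviation and comparing against coef():
# Fixed-effect intercept plus each state's random deviation should reproduce
# the (Intercept) column that coef() reports for every state
all.equal(
  unname(fixef(no_pooling_lme)[1] + ranef(no_pooling_lme)$state2[, '(Intercept)']),
  coef(no_pooling_lme)$state2[, '(Intercept)']
)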
This is a lot of words to describe what's happening in a random effects varying intercept model, but sometimes it's easier to understand with a picture. Use the predict() function on our lmerMod object to get fitted values, and then plot the radon and floor values with the regression line for each state.
library(ggplot2)
sr$fit <- predict(no_pooling_lme)
ggplot(data = sr, aes(x = floor, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit)) +
theme_bw() +
theme(legend.position = 'right',
plot.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank())
Now let's add some county-level data so that we can include effects as a function of county-level differences. Load the data file of county-level info from http://www.stat.columbia.edu/~gelman/arm/examples/radon/cty.dat.
Before we can merge the county-level data onto the individual-level data, we need to have common identifiers. These data are from the US, so they use FIPS codes, but the data only have separate state and county FIPS codes. The formula for a full FIPS code is (state FIPS × 1000) + county FIPS. Create full FIPS codes in both the individual-level and county-level data, then merge them together.
cty <- read.delim('http://www.stat.columbia.edu/~gelman/arm/examples/radon/cty.dat', sep = ',')
sr$fips <- sr$stfips * 1000 + sr$cntyfips
cty$fips <- cty$stfips * 1000 + cty$ctfips
ml_data <- merge(sr, cty)
Write a model where radon is explained by floor and typebldg, with a varying intercept by Uppm, which is a continuous predictor measured at the county level. There are 467 counties in the data but only 463 different values of Uppm, indicating that some counties have the same value. This means that Uppm is an actual, substantively interesting measurement and not just a group ID variable.
mlm <- lmer(radon ~ floor + typebldg + (1 | Uppm), data = ml_data)
Note that in this case the baseline case where Uppm = 0 makes sense, because Uppm is a measurement and we have 165 cases in our data where it actually does equal zero. Thus, coef(mlm)$Uppm[1, 1] represents the actual intercept in these cases, which is a combination of fixef(mlm)[1] and ranef(mlm)$Uppm[1, ].
coef(mlm)$Uppm[1, 1] == fixef(mlm)[1] + ranef(mlm)$Uppm[1,]
## (Intercept)
## TRUE
The real advantage of lme4, however, is its ability to specify varying slope random effects models. In these models, the effect of a coefficient depends on group membership. Write a model where radon is explained by typebldg, but with a coefficient that varies by state2. We can think of this model as an interaction, since the coefficient for the effect of typebldg on radon is a function of the value of state2.
mlm <- lmer(radon ~ (typebldg | state2), data = ml_data)
As with the fixed effects model above, we can access the fixed, random, and total effects separately. Compare the typebldg coefficient for Indiana in the random effects component to the total typebldg coefficient for Indiana returned by coef().
coef(mlm)$state2
## typebldg (Intercept)
## AZ -0.029 0.022
## IN -0.158 0.889
## MA -0.094 0.765
## MI 0.043 -0.369
## MN -0.225 1.342
## MO -0.192 0.710
## ND -0.202 1.829
## PA -0.160 1.328
## WI -0.156 0.828
fixef(mlm)
## (Intercept)
## 0.14
ranef(mlm)$state2
## (Intercept) typebldg
## AZ -0.11 -0.029
## IN 0.75 -0.158
## MA 0.63 -0.094
## MI -0.51 0.043
## MN 1.21 -0.225
## MO 0.57 -0.192
## ND 1.69 -0.202
## PA 1.19 -0.160
## WI 0.69 -0.156
ranef(mlm)$state2[2, 2] == coef(mlm)$state2[2, 1]
## [1] TRUE
Just like we can use plots to visually explore the effect of random intercepts, we can do the same with random slopes. Use the predict() command again to create a scatter plot of radon levels against building type, and include the linear regression line for each state.
ml_data$fit <- predict(mlm)
slope_plot <- ggplot(data = ml_data, aes(x = typebldg, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit)) +
theme_bw() +
theme(legend.position = 'right',
plot.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank())
slope_plot
In this case, the coefficient on typebldg is equal to the random effect because we did not specify a fixed effect for it in the model formula. If we had, then we would have to combine the appropriate random effect with the fixed effect to obtain the total effect. Try this with a model that explains radon as a function of floor and typebldg, with a coefficient on typebldg that varies by Uppm.
mlm <- lmer(radon ~ floor + typebldg + (typebldg | Uppm), data = ml_data)
summary(mlm)
## Linear mixed model fit by REML ['lmerMod']
## Formula: radon ~ floor + typebldg + (typebldg | Uppm)
## Data: ml_data
##
## REML criterion at convergence: 35058
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.594 -0.610 -0.001 0.605 5.176
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Uppm (Intercept) 0.30489 0.5522
## typebldg 0.00225 0.0474 -0.64
## Residual 0.88247 0.9394
## Number of obs: 12631, groups: Uppm, 463
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.2100 0.0356 34.02
## floor -0.7049 0.0222 -31.73
## typebldg -0.1142 0.0177 -6.45
##
## Correlation of Fixed Effects:
## (Intr) floor
## floor -0.204
## typebldg -0.562 -0.061
ranef(mlm)$Uppm[2, 2] == coef(mlm)$Uppm[2, 3]
## [1] FALSE
fixef(mlm)[3] + ranef(mlm)$Uppm[2, 2] == coef(mlm)$Uppm[2, 3]
## typebldg
## TRUE
We can also specify a random effects model without a varying intercept term. Try writing a model that explains the radon level as a function of typebldg with a varying slope by state2. Look at the summary of the model object and notice that there is no longer an intercept listed under the random effects. This means that observations in each group have their own slope but use the overall intercept. Plotting this model clearly illustrates the difference from a model with an intercept in the random effect term.
mlm <- lmer(radon ~ (typebldg - 1 | state2), data = ml_data)
ml_data$fit <- predict(mlm)
slope_plot_noint <- ggplot(data = ml_data, aes(x = typebldg, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit)) +
labs(title = 'Shared Intercept, Random Slopes', x = 'building type') +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5),
legend.position = 'right',
plot.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank())
slope_plot <- slope_plot +
labs(title = 'Random Intercepts, Random Slopes', x = 'building type') +
theme(plot.title = element_text(hjust = 0.5))
ml_data$fit <- predict(lm(radon ~ typebldg, data = ml_data))
slope_plot_pool <- ggplot(data = ml_data, aes(x = typebldg, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit), color = 'darkgrey') +
labs(title = 'Shared Intercept, Shared Slope', x = 'building type') +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5),
legend.position = 'right',
plot.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank())
## arrange in order of increasing similarity
gridExtra::grid.arrange(slope_plot, slope_plot_noint, slope_plot_pool, nrow = 3)
By default, lme4 assumes that random effects sharing the same grouping factor are correlated. This is based on the assumption of unit heterogeneity, i.e. that there are unobserved factors at the group level that lead to differences in the effect of certain variables. If we think that there are group-level effects, but that these effects are independent of one another, we can model them independently by using two pipes instead of one in the random effect term of the formula. This is similar to, but not quite the same as, specifying two different random effect terms. If we suppressed the intercept in all random effect terms, then they would be equivalent, e.g. (typebldg + stratum - 1 || Uppm) = (typebldg - 1 | Uppm) + (stratum - 1 | Uppm); this equivalence is checked in the sketch after the output below.
mlm <- lmer(radon ~ floor + (typebldg + stratum | Uppm), data = ml_data)
summary(mlm)$varcor
## Groups Name Std.Dev. Corr
## Uppm (Intercept) 0.5376
## typebldg 0.0858 -0.25
## stratum 0.0172 -0.09 -0.20
## Residual 0.9400
mlm <- lmer(radon ~ floor + (typebldg + stratum || Uppm), data = ml_data)
summary(mlm)$varcor
## Groups Name Std.Dev.
## Uppm (Intercept) 0.5116
## Uppm.1 typebldg 0.0709
## Uppm.2 stratum 0.0153
## Residual 0.9403
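Here is a sketch of that equivalence check (the model objects m_dbl and m_sep are my names, not from the original): fit the suppressed-intercept double-bar form alongside the two separate single-bar terms and compare their variance components.
# Double-bar form with the intercept suppressed in the random-effect term
m_dbl <- lmer(radon ~ floor + (typebldg + stratum - 1 || Uppm), data = ml_data)
# Two separate single-bar terms, each with the intercept suppressed
m_sep <- lmer(radon ~ floor + (typebldg - 1 | Uppm) + (stratum - 1 | Uppm), data = ml_data)
summary(m_dbl)$varcor
summary(m_sep)$varcor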
Use the WDI package to download GDP per capita (NY.GDP.PCAP.KD) and gross domestic savings (NY.GNS.ICTR.ZS) data for the Nordic countries from 1960 to 2016. Fit varying intercept, varying slope, and varying intercept and slope random effects models that explain GDP as a function of savings, with country as the group. Produce plots illustrating the regression lines for each country in each model.
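One possible starting point for this exercise (the country codes, object names, and model specifications below are suggestions of mine, not part of the original):
library(WDI)
library(lme4)
# Download both indicators for the Nordic countries, 1960-2016
nordic <- WDI(country = c('DK', 'FI', 'IS', 'NO', 'SE'),
              indicator = c('NY.GDP.PCAP.KD', 'NY.GNS.ICTR.ZS'),
              start = 1960, end = 2016)
# Varying intercept; varying slope (shared intercept); varying intercept and slope
gdp_int   <- lmer(NY.GDP.PCAP.KD ~ NY.GNS.ICTR.ZS + (1 | country), data = nordic)
gdp_slope <- lmer(NY.GDP.PCAP.KD ~ (NY.GNS.ICTR.ZS - 1 | country), data = nordic)
gdp_both  <- lmer(NY.GDP.PCAP.KD ~ NY.GNS.ICTR.ZS + (NY.GNS.ICTR.ZS | country), data = nordic)
# The plots follow the same predict() + ggplot() pattern used for the radon models above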
1. Metaphor borrowed from Johannes Karreth.