class: bottom, left, title-slide # A .fancy[fun] intro to Multilevel Models in R ## ’
What’s so random about these effects anyway?’
### Fabio Votta ### 2022-02-22 --- layout: true <style> .onehundredtwenty { font-size: 120%; } <style> .ninety { font-size: 90%; } .eightyfive { font-size: 85%; } .eighty { font-size: 80%; } .seventyfive { font-size: 75%; } .seventy { font-size: 70%; } .fifty { font-size: 50%; } .forty { font-size: 40%; } .pull-left80 { float: left; width: 80%; } .pull-right20 { float: right; width: 20%; } </style> <div class="logo"></div> --- class: white #### It might be hard at first but it gets better! <img src="https://raw.githubusercontent.com/favstats/ds3_r_intro/main/images/r_first_then_new.png" width="90%" style="display: block; margin: auto;" /> <!-- {width=50%} --> .fifty[Illustration adapted from [Allison Horst](https://twitter.com/allison_horst)] -- + My experience is that this stuff isn't super easy... but it gets better! -- + Awesome inclusive community that is always ready to help + Active blogosphere with use cases and examples --- ### Overview + *Multilevel* data structures -- + Why *multilevel models*? -- + What are *multilevel models*? -- + Estimate *multilevel models* in R -- + Some neat R functions to deal with *multilevel models* -- + Your opportunity to ask questions! > Note: Mixed-effects models, hierarchical (linear) regression, variance component model, all refer to the same thing: multilevel modelling --- class: white ### Data Structures .eightyfive[You might be used to your data looking like this: An independent variable `\(x\)` and a dependent variable `\(y\)`.] .leftcol80[ <img src="index_files/figure-html/unnamed-chunk-2-1.png" width="720" /> ] -- .rightcol20[ <br> <br> <br> .eightyfive[From this data we may conclude that *x is **positively** correlated with y*.] ] --- class: white ### Data Structures .eightyfive[However, if we introduce *groupings* of the data, a different picture emerges.] .pull-left80[ <img src="index_files/figure-html/unnamed-chunk-3-1.png" width="720" /> ] -- .pull-left20[ <br> <br> <br> .eightyfive[In each of the groupings we find a ***negative** correlation between x and y*.] ] --- class: black ### So what is going on? -- .pull-left[ A phenomena called: # .green[.onehundredtwenty[.creepy[Simpson's Paradox]]]  ] -- .pull-right[ .eighty[+ Has nothing to do with the Simpsons] .eighty[+ Is not all that scary] .eighty["**Simpson’s Paradox** is a statistical phenomenon where an association between two variables in a population ***emerges***, ***disappears*** or ***reverses*** when the population is divided into subpopulations."] .fifty[(Stanford Encyclopedia of Philosophy 2021)] <img src="index_files/figure-html/unnamed-chunk-4-1.png" width="432" /> ] --- class: white ### Data Structures Simpson's paradox teaches us that we have to be mindful of *structures* within our data. .pull-left[ We might have students **nested** in schools  .eighty[Students on the **first level**, schools on the **second level**] ] -- .pull-right[ Or timepoints **nested** in people  .eighty[Timepoints on the **first level**, people on the **second level**] ] -- We can also easily imagine a **three-level data structure**: > timepoints nested in people nested in schools --- class: white ### Data Structures .eighty[There are two types of variance important to consider in our modeling:] + .eighty[**within-group variance** (i.e. differences *within a group*)] + .eighty[**between-group variance** (i.e. differences *between groups*)] .leftcol65[  ] .rightcol35[ <br><br><br> > Data points within a group are often more similar than between groups. <br><br><br> .fifty[Elaborate sketch of a multilevel data structure] ] <!-- <img src="img/elaborate.png" width="60%"> --> --- ### Why Multilevel Models? -- + **Explicitly model multilevel structure**: your data may have + different average levels of your dependent and independent variables + different effects varying by group -- + Needed for **correct inference** + Nested data violate key assumptions of OLS + Independent observations + Independent error terms + e.g. cases within a country more similar than between + Multilevel model accounts for this -- + Answer **unique research questions** + Estimate group-level predictors and cross-level interactions + for example, how is the corruption of a country associated with individual-level variables like attitudes + how does your effect of interest vary by country/neighborhood/family/over time? --- ### Random vs. Fixed Effects **Fixed effects** are effects assumed to be constant, i.e. they **do not vary**. **Random effects** are assumed to vary by some context, i.e. they **do vary** -- <center> <img src="https://media3.giphy.com/media/7A4KXR7DeBjfsG67SH/giphy.gif?cid=ecf05e47c59k3fuqrjtp0cqg7c0ya9fvsrmqr5jvrizzidhx&rid=giphy.gif&ct=g"> </center> --- ### Short note Multilevel models are based on *Maximum Likelihood estimation* (like logistic regression) > Maximum likelihood estimation is a method that determines values for the parameters of a model. The parameter values are found such that they **maximise the likelihood** that the process described by the model produced the data that were actually observed. .fifty[Jonny Brooks-Bartlett 2018] -- For our purposes this means: **No direct estimation of `\(R^2\)`.** **Models may not converge.** --- ### What is a model? -- > A simplified but informative representation of a phenomena. -- This is a simple linear regressions: `$$y = \color{blue}{\alpha} + \color{lightgreen}{\beta} \times X + \color{purple}{\epsilon_i}$$` -- And this is a Null Model: `$$y = \color{blue}{\alpha} + \color{purple}{\epsilon_i}$$` The null model basically models the mean of a dependent variable. --- class: white, middle, center ### .fancy[Short excourse] Let's start simple by estimating the null model of a linear regression (i.e. the mean) --- class: white ### Linear Regression I .leftcol20[ <br> <br> <br> .eightyfive[Imagine we have a dependent variable `\(y\)`. It is distributed like this:] ] .rightcol80[ <img src="index_files/figure-html/unnamed-chunk-5-1.png" width="720" /> ] --- class: white ### Linear Regression II - Null Model .leftcol20[ <br> <br> <br> **$$y = \color{blue}{\alpha} + \color{purple}{\epsilon_i}$$** We estimate a linear regression with no predictors and plot the intercept. ] .rightcol80[ <img src="index_files/figure-html/unnamed-chunk-6-1.png" width="720" /> ] --- class: white ### Linear Regression III - Null Model .leftcol20[ <br> <br> <br> **$$y = \color{blue}{\alpha} + \color{purple}{\epsilon_i}$$** Purple indicates the error term for each observation. ] .rightcol80[ <img src="index_files/figure-html/unnamed-chunk-7-1.png" width="720" /> ] --- class: white ### Linear Regression IV - Slope .leftcol20[ <br> <br> <br> **$$y = \color{blue}{\alpha} + \color{green}{\beta} \times X + \color{purple}{\epsilon_i}$$** We now also add a predictor. ] .rightcol80[ <img src="index_files/figure-html/unnamed-chunk-8-1.png" width="720" /> ] --- class: white ### Linear Regression V - Slope .leftcol20[ <br> <br> <br> **$$y = \color{blue}{\alpha} + \color{lightgreen}{\beta} \times X + \color{purple}{\epsilon_i}$$** ] .rightcol80[ <img src="index_files/figure-html/unnamed-chunk-9-1.png" width="720" /> ] --- class: black, middle, center ## .onehundredtwenty[.green[.matrix[Enter the M U L T I L E V E L]]]  --- class: white ### Multilevel Data I .leftcol30[ <br> <br> <br> Now, we have the same dataset as before. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-10-1.png" width="720" /> ] --- class: white ### Multilevel Data II .leftcol30[ <br> <br> <br> Now, we have the same dataset as before. But this time we introduce a multilevel data structure, i.e. our data points are **nested** in *six other groups*. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-11-1.png" width="720" /> ] --- class: white ### Multilevel - Random Intercepts I - Null Model .leftcol30[ First, we estimate a **null model** again. In a multilevel context this means estimating a **random intercept** only model, without any predictors. Notice how each group gets their own intercept. The intercept in thick blue is called the *grand mean* and shows the average intercept across all groups. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-12-1.png" width="720" /> ] --- class: white ### Multilevel - Random Intercepts II - Null Model .leftcol30[ <br> <br> <br> The null model introduces an error term relating to each intercept. This shows the **within-group** variation. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-13-1.png" width="720" /> ] --- class: white ### Multilevel - Random Intercepts III - Null Model .leftcol30[ <br> <br> <br> We've also introduced an error term for each group intercept, i.e. differences between the grand mean and other group-specific intercepts. This shows the **between-group** variation. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-14-1.png" width="720" /> ] --- class: white, middle, center ### Multilevel - Random Intercepts - Fixed Slope Varying starting point (intercept), same slope for each group. --- class: white ### Multilevel - Random Intercepts - Fixed Slope I .leftcol30[ <br> <br> <br> We now introduce a dependent variable `\(x\)` to our multilevel model but keep it fixed, i.e. it is the same across all groups. The thick blue line shows the group-adjusted slope. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-15-1.png" width="720" /> ] --- class: white ### Multilevel - Random Intercepts - Fixed Slope II .leftcol30[ <br> <br> <br> The slope is held constant across all of our groups. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-16-1.png" width="720" /> ] --- class: white ### Multilevel - Random Intercepts - Fixed Slope III .leftcol30[ <br> <br> <br> Again, in a multilevel model, error terms for individual data points are estimated by group. This shows the **within-group** variation. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-17-1.png" width="720" /> ] --- class: white ### Multilevel - Random Intercepts - Fixed Slope IV .leftcol30[ <br> <br> <br> And similar as before, we have an error term for the random intercept. This shows the **between-group** variation. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-18-1.png" width="720" /> ] --- class: white, middle, center ### Multilevel - Random Intercepts - Random Slope Varying starting point (intercept) **and** slope per group.  --- class: white ### Multilevel - Random Intercepts - Fixed Slope V .leftcol30[ <br> <br> <br> First, let's consider the random-intercept, fixed slope model again. All slopes are held constant, i.e. in each group **we assume the same effect of x on y**. *Is this really a reasonable assumption?* ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-19-1.png" width="720" /> ] --- class: white ### Multilevel - Random Intercepts - Random Slope I .leftcol30[ <br> <br> <br> Instead we may introduce **random slopes** and let the slope vary by group. We can see that it some groups the relationship is *differs in strength* or even becomes *negative* when allowing the slopes to vary. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-20-1.png" width="720" /> ] --- class: white ### Multilevel - Random Intercepts - Random Slope II .leftcol30[ <br> <br> <br> That is because in addition to all previous error terms, we are also adding an error term for each slope. The dotted lines are fixed slopes. The arrows show the added error term for each random slope. ] .rightcol70[ <img src="index_files/figure-html/unnamed-chunk-21-1.png" width="720" /> ] --- ### Modelling Strategy Approach to multilevel model building based on Hox (2010) 1. Null Model (Random Intercept only) 2. Add independent Level 1 variables 3. Add independent Level 2 variables 4. Add random slopes 5. (Cross-level) interactions Each step, check whether your model is significantly improved compared to the previous one. --- ### Some real data to play with For this section, we will use some **real data**: The European Social Survey Round 9 (2018) <img src="https://sshopencloud.eu/sites/default/files/ESS-logo.png" width="25%"> The dependent variable is `income` (Household's total net income, all sources). The independent level 1 variable is `education` (Highest level of education). The independent level 2 variable is `net_migration` (net migration 2018) --- class: center ### Let's get to some coding!  For that, we will be using the `lme4` package in R. --- class: white, middle, center ### Estimating Multilevel Models - Null Model Varying intercept for each country --- ### Estimating Multilevel Models - Null Model ```r library(lme4) null_model <- `lmer`(income ~ (1|cntry), data = ess) ``` 1. We use the function `lmer` from the `lme4` package. --- ### Null Model ```r library(lme4) null_model <- lmer(`income` ~ (1|cntry), data = ess) ``` 1. We use the function `lmer` from the `lme4` package. 2. We specify the dependent variable `income`. --- ### Null Model ```r library(lme4) null_model <- lmer(income ~ `(1|cntry)`, data = ess) ``` 1. We use the function `lmer` from the `lme4` package. 2. We specify the dependent variable `income`. 3. We specify the *random* part of our equation within the brackets. --- ### Null Model ```r library(lme4) null_model <- lmer(income ~ (`1`|cntry), data = ess) ``` 1. We use the function `lmer` from the `lme4` package. 2. We specify the dependent variable `income`. 3. We specify the *random* part of our equation within the brackets. + The first part in the brackets (`1`) refers to the random slope. `1` just says: hold the slope constant, i.e. no random slope. --- <!-- Next we specify our independent variables: in our case that is only one, `education`. --> ### Null Model ```r library(lme4) null_model <- lmer(income ~ (1|`cntry`), data = ess) ``` 1. We use the function `lmer` from the `lme4` package. 2. We specify the dependent variable `income`. 3. We specify the *random* part of our equation within the brackets. + The first part in the brackets (`1`) refers to the random slope. `1` just says: hold the slope constant, i.e. no random slope. + The second part in the brackets (`cntry`) specifies the random intercepts, i.e. by which group we want the data to vary. --- ### Null Model ```r library(lme4) null_model <- lmer(income ~ (1|cntry), `data = ess`) ``` 1. We use the function `lmer` from the `lme4` package. 2. We specify the dependent variable `income`. 3. We specify the *random* part of our equation within the brackets. + The first part in the brackets (`1`) refers to the random slope. `1` just says: hold the slope constant, i.e. no random slope. + The second part in the brackets (`cntry`) specifies the random intercepts, i.e. by which group we want the data to vary. 4. We specify the dataset we use, here: `data = ess`. --- ### Null Model ```r library(lme4) null_model <- lmer(income ~ (1|cntry), data = ess) ``` 1. We use the function `lmer` from the `lme4` package. 2. We specify the dependent variable `income`. 3. We specify the *random* part of our equation within the brackets. + The first part in the brackets (`1`) refers to the random slope. `1` just says: hold the slope constant, i.e. no random slope. + The second part in the brackets (`cntry`) specifies the random intercepts, i.e. by which group we want the data to vary. 4. We specify the dataset we use, here: `data = ess`. 5. We assign the model to an object `null_model <-`. --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r *## Linear mixed model fit by REML ['lmerMod'] *## Formula: income ~ (1 | cntry) *## Data: ess ## *## REML criterion at convergence: 192022.4 ## *## Scaled residuals: *## Min 1Q Median 3Q Max *## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## ## Random effects: ## Groups Name Variance Std.Dev. ## cntry (Intercept) 0.4094 0.6398 ## Residual 7.3472 2.7106 ## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] .pull-right[ .eightyfive[ There are three parts in this output **1.) Some meta info about your model** ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r ## Linear mixed model fit by REML ['lmerMod'] ## Formula: income ~ (1 | cntry) ## Data: ess ## ## REML criterion at convergence: 192022.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## *## Random effects: *## Groups Name Variance Std.Dev. *## cntry (Intercept) 0.4094 0.6398 *## Residual 7.3472 2.7106 *## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] .pull-right[ .eightyfive[ There are three parts in this output **1.) Some meta info about your model** **2.) Group-Level statistics (random effects)** ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r ## Linear mixed model fit by REML ['lmerMod'] ## Formula: income ~ (1 | cntry) ## Data: ess ## ## REML criterion at convergence: 192022.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## *## Random effects: *## Groups Name Variance Std.Dev. *## cntry (Intercept) 0.4094 0.6398 *## Residual 7.3472 2.7106 *## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] .pull-right[ .eightyfive[ There are three parts in this output **1.) Some meta info about your model** **2.) Group-Level statistics (random effects)** **3.) Fixed effects** ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r *## Linear mixed model fit by REML ['lmerMod'] *## Formula: income ~ (1 | cntry) *## Data: ess ## *## REML criterion at convergence: 192022.4 ## *## Scaled residuals: *## Min 1Q Median 3Q Max *## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## ## Random effects: ## Groups Name Variance Std.Dev. ## cntry (Intercept) 0.4094 0.6398 ## Residual 7.3472 2.7106 ## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] -- .pull-right[ .eightyfive[ **1.) Meta Info** Formula and data used for estimation. This also states whether we are using Maximum Likelihood (ML) or Restricted Maximum Likelihood (REML) estimation. In our case, it is the latter. It also states the REML value at convergence (if it converges), value is not particularly important. Finally, we can also see some distribution metrics for the residuals of the metrics (also typically less important). ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r ## Linear mixed model fit by REML ['lmerMod'] ## Formula: income ~ (1 | cntry) ## Data: ess ## ## REML criterion at convergence: 192022.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## *## Random effects: *## Groups Name Variance Std.Dev. *## cntry (Intercept) 0.4094 0.6398 *## Residual 7.3472 2.7106 *## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] -- .pull-right[ .eightyfive[ **2.) Group-Level statistics** This part states the random effects and its distribution. This is interpreted in the following way: ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r ## Linear mixed model fit by REML ['lmerMod'] ## Formula: income ~ (1 | cntry) ## Data: ess ## ## REML criterion at convergence: 192022.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## *## Random effects: ## `Groups` `Name` `Variance` Std.Dev. ## `cntry` `(Intercept)` `0.4094` 0.6398 ## Residual 7.3472 2.7106 ## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] .pull-right[ .eightyfive[ **2.) Group-Level statistics** This part states the random effects and its distribution. This is interpreted in the following way: 1. Under `(Intercept)` we find the variance for the random intercept (per country: `cntry`), i.e. **between-group variance**. + The variance on the second level (country-level) is 0.41. ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r ## Linear mixed model fit by REML ['lmerMod'] ## Formula: income ~ (1 | cntry) ## Data: ess ## ## REML criterion at convergence: 192022.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## *## Random effects: ## `Groups` Name `Variance` Std.Dev. ## cntry (Intercept) 0.4094 0.6398 ## `Residual` `7.3472` 2.7106 ## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] .pull-right[ .eightyfive[ **2.) Group-Level statistics** This part states the random effects and its distribution. This is interpreted in the following way: 1. Under `(Intercept)` we find the variance for the random intercept (per country: `cntry`), i.e. **between-group variance**. + The variance on the second level (country-level) is 0.41. 2. Under `Residual` we find the variance on level 1 (here: individuals), i.e. **within-group variance**. + The variance on the first level is 7.35 This is in units of the *dependent variables*. ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r ## Linear mixed model fit by REML ['lmerMod'] ## Formula: income ~ (1 | cntry) ## Data: ess ## ## REML criterion at convergence: 192022.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## *## Random effects: ## Groups Name `Variance` Std.Dev. ## cntry (Intercept) `0.4094` 0.6398 ## Residual `7.3472` 2.7106 ## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] .pull-right[ .eightyfive[ **2.) Group-Level statistics** `$$Variance_{Total} = 7.75$$` We can now calculate the Intra-Class Coefficient (ICC): `$$ICC = \frac{Variance_{L2}}{ Variance_{Total}}$$` `$$ICC = \frac{0.41_{L2}}{7.75} = 0.053$$` > *5.2% of the variance of household income is between countries (on the second level).* .seventy[ Typically, we say 10% of the variance should be on the second level to justify multilevel model but for countries 5% can also considered a lot. ] ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r ## Linear mixed model fit by REML ['lmerMod'] ## Formula: income ~ (1 | cntry) ## Data: ess ## ## REML criterion at convergence: 192022.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## *## Random effects: ## Groups Name Variance Std.Dev. ## cntry (Intercept) 0.4094 0.6398 ## Residual 7.3472 2.7106 *## Number of obs: 39712, groups: cntry, 29 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 5.2377 0.1197 43.77 ``` ] ] .pull-right[ .eightyfive[ **2.) Group-Level statistics** This part shows the number of observations on Level 1: `$$individuals_n = 39712$$` and Level 2: `$$countries_n = 29$$` ] ] --- ### Null Model - `summary()` .pull-left[.seventy[ Estimate a Null Model ```r summary(null_model) ``` ] .code70[ ```r ## Linear mixed model fit by REML ['lmerMod'] ## Formula: income ~ (1 | cntry) ## Data: ess ## ## REML criterion at convergence: 192022.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.04751 -0.83510 -0.04548 0.84470 2.24208 ## ## Random effects: ## Groups Name Variance Std.Dev. ## cntry (Intercept) 0.4094 0.6398 ## Residual 7.3472 2.7106 ## Number of obs: 39712, groups: cntry, 29 ## *## Fixed effects: *## Estimate Std. Error t value *## (Intercept) `5.2377` 0.1197 43.77 ``` ] ] .pull-right[ .eightyfive[ **3.) Fixed effects** This part tells us about the fixed effects. The estimate for the intercept tells us that the grand mean of income (across all countries) is *5.24*. ] ] --- class: white ### Null Model - Retrieve random intercepts .pull-left[ ```r coef(null_model) ``` ``` ## $cntry ## (Intercept) ## AT 4.955683 ## BE 5.703166 ## BG 4.263608 ## CH 5.469860 ## CY 4.618415 ## CZ 5.306594 ## DE 6.064316 ## DK 5.548790 ## EE 5.598878 ## ES 5.123275 ## FI 6.062619 ## FR 4.996828 ## GB 5.218492 ## HR 4.710382 ## HU 5.158844 ## IE 4.607740 ## IS 6.254706 ## IT 4.777290 ## LT 4.035245 ## LV 3.922685 ## ME 5.296826 ## NL 6.549926 ## NO 5.520132 ## PL 5.380241 ## PT 5.259325 ## RS 4.685073 ## SE 6.059498 ## SI 5.328434 ## SK 5.415541 ## ## attr(,"class") ## [1] "coef.mer" ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-51-1.png" width="504" /> ] --- class: white ### Null Model - Retrieve random effects .pull-left[ ```r ranef(null_model) ``` ``` ## $cntry ## (Intercept) ## AT -0.28198657 ## BE 0.46549685 ## BG -0.97406142 ## CH 0.23219108 ## CY -0.61925436 ## CZ 0.06892486 ## DE 0.82664659 ## DK 0.31112107 ## EE 0.36120827 ## ES -0.11439448 ## FI 0.82494924 ## FR -0.24084109 ## GB -0.01917756 ## HR -0.52728733 ## HU -0.07882503 ## IE -0.62992969 ## IS 1.01703661 ## IT -0.46037906 ## LT -1.20242437 ## LV -1.31498445 ## ME 0.05915616 ## NL 1.31225624 ## NO 0.28246281 ## PL 0.14257119 ## PT 0.02165600 ## RS -0.55259655 ## SE 0.82182892 ## SI 0.09076483 ## SK 0.17787125 ## ## with conditional variances for "cntry" ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-53-1.png" width="504" /> ] --- class: white, middle, center ### Estimating Multilevel Models - Level 1 Predictors Varying intercept for each country and adding a fixed slope for education --- ### Estimating Multilevel Models - Level 1 Predictors ```r lvl1_preds_model <- lmer(income `~ education` + (1|cntry), data = ess) ``` Next we specify our independent variables: in our case that is only one, `education`. --- ### Estimating Multilevel Models - Level 1 Predictors With the `texreg` package we can get a nice regression table output. .font60[ .pull-left[ ```r htmlreg(list(null_model, lvl1_preds_model)) ``` <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Model 1</th> <th style="padding-left: 5px;padding-right: 5px;">Model 2</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">(Intercept)</td> <td style="padding-left: 5px;padding-right: 5px;">5.24<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">2.57<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.12)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.11)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">education</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.78<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">AIC</td> <td style="padding-left: 5px;padding-right: 5px;">192028.43</td> <td style="padding-left: 5px;padding-right: 5px;">186538.97</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">BIC</td> <td style="padding-left: 5px;padding-right: 5px;">192054.19</td> <td style="padding-left: 5px;padding-right: 5px;">186573.33</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Log Likelihood</td> <td style="padding-left: 5px;padding-right: 5px;">-96011.21</td> <td style="padding-left: 5px;padding-right: 5px;">-93265.48</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Num. obs.</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Num. groups: cntry</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Var: cntry (Intercept)</td> <td style="padding-left: 5px;padding-right: 5px;">0.41</td> <td style="padding-left: 5px;padding-right: 5px;">0.34</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Var: Residual</td> <td style="padding-left: 5px;padding-right: 5px;">7.35</td> <td style="padding-left: 5px;padding-right: 5px;">6.40</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="3"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> ] ] .pull-right[ We now get a fixed effect for `education`. We can interpret this just like in a typical linear regression: > *With every level of education, household income increases by 0.78 income levels.* ] --- class: white ### Level 1 Predictors - Retrieve random intercepts .pull-left[ ```r coef(lvl1_preds_model) ``` ``` ## $cntry ## (Intercept) education ## AT 2.420451 0.7820857 ## BE 2.988959 0.7820857 ## BG 1.861890 0.7820857 ## CH 2.599999 0.7820857 ## CY 2.190022 0.7820857 ## CZ 2.630532 0.7820857 ## DE 3.135506 0.7820857 ## DK 2.534496 0.7820857 ## EE 2.733608 0.7820857 ## ES 2.840055 0.7820857 ## FI 3.180624 0.7820857 ## FR 2.356386 0.7820857 ## GB 2.411531 0.7820857 ## HR 2.178313 0.7820857 ## HU 2.701310 0.7820857 ## IE 1.820178 0.7820857 ## IS 3.308479 0.7820857 ## IT 2.648041 0.7820857 ## LT 1.300346 0.7820857 ## LV 1.005078 0.7820857 ## ME 2.751026 0.7820857 ## NL 3.765897 0.7820857 ## NO 2.399466 0.7820857 ## PL 2.762165 0.7820857 ## PT 3.030233 0.7820857 ## RS 2.311929 0.7820857 ## SE 3.110493 0.7820857 ## SI 2.696305 0.7820857 ## SK 2.863656 0.7820857 ## ## attr(,"class") ## [1] "coef.mer" ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-58-1.png" width="504" /> ] --- class: white ### Level 1 Predictors - Retrieve random effects .pull-left[ ```r ranef(lvl1_preds_model) ``` ``` ## $cntry ## (Intercept) ## AT -0.14978930 ## BE 0.41871867 ## BG -0.70835013 ## CH 0.02975875 ## CY -0.38021817 ## CZ 0.06029190 ## DE 0.56526567 ## DK -0.03574461 ## EE 0.16336729 ## ES 0.26981398 ## FI 0.61038357 ## FR -0.21385477 ## GB -0.15870940 ## HR -0.39192805 ## HU 0.13106948 ## IE -0.75006271 ## IS 0.73823799 ## IT 0.07780037 ## LT -1.26989420 ## LV -1.56516249 ## ME 0.18078564 ## NL 1.19565647 ## NO -0.17077448 ## PL 0.19192451 ## PT 0.45999284 ## RS -0.25831126 ## SE 0.54025270 ## SI 0.12606468 ## SK 0.29341506 ## ## with conditional variances for "cntry" ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-60-1.png" width="504" /> ] --- ## Level 2 Predictors ```r lvl2_preds_model <- lmer(income ~ education `+ net_migration +` (1|cntry), data = ess) ``` We add a level 2 predictor: `net_migration`. --- ### Estimating Multilevel Models - Level 2 Predictors .font60[ .pull-left[ ```r htmlreg(list(null_model, lvl1_preds_model, lvl2_preds_model)) ``` <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Model 1</th> <th style="padding-left: 5px;padding-right: 5px;">Model 2</th> <th style="padding-left: 5px;padding-right: 5px;">Model 3</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">(Intercept)</td> <td style="padding-left: 5px;padding-right: 5px;">5.24<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">2.57<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">2.38<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.12)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.11)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.13)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">education</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.78<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.78<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">net_migration</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.06<sup>*</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">AIC</td> <td style="padding-left: 5px;padding-right: 5px;">192028.43</td> <td style="padding-left: 5px;padding-right: 5px;">186538.97</td> <td style="padding-left: 5px;padding-right: 5px;">186541.21</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">BIC</td> <td style="padding-left: 5px;padding-right: 5px;">192054.19</td> <td style="padding-left: 5px;padding-right: 5px;">186573.33</td> <td style="padding-left: 5px;padding-right: 5px;">186584.16</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Log Likelihood</td> <td style="padding-left: 5px;padding-right: 5px;">-96011.21</td> <td style="padding-left: 5px;padding-right: 5px;">-93265.48</td> <td style="padding-left: 5px;padding-right: 5px;">-93265.61</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Num. obs.</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Num. groups: cntry</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Var: cntry (Intercept)</td> <td style="padding-left: 5px;padding-right: 5px;">0.41</td> <td style="padding-left: 5px;padding-right: 5px;">0.34</td> <td style="padding-left: 5px;padding-right: 5px;">0.29</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Var: Residual</td> <td style="padding-left: 5px;padding-right: 5px;">7.35</td> <td style="padding-left: 5px;padding-right: 5px;">6.40</td> <td style="padding-left: 5px;padding-right: 5px;">6.40</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="4"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> ] ] .pull-right[ We now get a fixed effect for `net_migration`. We can interpret this just like in a typical linear regression: > *With every unit increase of net migration, household income increases by 0.056 income levels.* ] --- ### Estimating Multilevel Models - Random Slope ```r rs_preds_model <- lmer(income ~ education + net_migration + (`education`|cntry), data = ess) ``` Next we specify our random slope for `education`. --- ### Estimating Multilevel Models - Random Slope .font50[ .pull-left[ ```r htmlreg(list(null_model, lvl1_preds_model, lvl2_preds_model, rs_preds_model)) ``` <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Model 1</th> <th style="padding-left: 5px;padding-right: 5px;">Model 2</th> <th style="padding-left: 5px;padding-right: 5px;">Model 3</th> <th style="padding-left: 5px;padding-right: 5px;">Model 4</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">(Intercept)</td> <td style="padding-left: 5px;padding-right: 5px;">5.24<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">2.57<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">2.38<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">2.34<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.12)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.11)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.13)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.15)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">education</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.78<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.78<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.79<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">net_migration</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.06<sup>*</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.06<sup>*</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">AIC</td> <td style="padding-left: 5px;padding-right: 5px;">192028.43</td> <td style="padding-left: 5px;padding-right: 5px;">186538.97</td> <td style="padding-left: 5px;padding-right: 5px;">186541.21</td> <td style="padding-left: 5px;padding-right: 5px;">186497.59</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">BIC</td> <td style="padding-left: 5px;padding-right: 5px;">192054.19</td> <td style="padding-left: 5px;padding-right: 5px;">186573.33</td> <td style="padding-left: 5px;padding-right: 5px;">186584.16</td> <td style="padding-left: 5px;padding-right: 5px;">186557.71</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Log Likelihood</td> <td style="padding-left: 5px;padding-right: 5px;">-96011.21</td> <td style="padding-left: 5px;padding-right: 5px;">-93265.48</td> <td style="padding-left: 5px;padding-right: 5px;">-93265.61</td> <td style="padding-left: 5px;padding-right: 5px;">-93241.79</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Num. obs.</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> <td style="padding-left: 5px;padding-right: 5px;">39712</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Num. groups: cntry</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> <td style="padding-left: 5px;padding-right: 5px;">29</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Var: cntry (Intercept)</td> <td style="padding-left: 5px;padding-right: 5px;">0.41</td> <td style="padding-left: 5px;padding-right: 5px;">0.34</td> <td style="padding-left: 5px;padding-right: 5px;">0.29</td> <td style="padding-left: 5px;padding-right: 5px;">0.39</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Var: Residual</td> <td style="padding-left: 5px;padding-right: 5px;">7.35</td> <td style="padding-left: 5px;padding-right: 5px;">6.40</td> <td style="padding-left: 5px;padding-right: 5px;">6.40</td> <td style="padding-left: 5px;padding-right: 5px;">6.38</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Var: cntry education</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.01</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Cov: cntry (Intercept) education</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">-0.03</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="5"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> ] ] .pull-right[ We now also have the random slope for `education` in our random effects report. > The variance for the slope is `\(0.01\)`. Remember: this is in units of the dependent variable. So this means the effect doesn't differ by much. ] --- class: white ### Retrieve random intercepts and slopes .pull-left[ ```r coef(rs_preds_model) ``` ``` ## $cntry ## (Intercept) education net_migration ## AT 2.3823695 0.7209433 0.05916834 ## BE 3.0107437 0.7023391 0.05916834 ## BG 1.7821846 0.8172314 0.05916834 ## CH 2.0330948 0.8613404 0.05916834 ## CY 1.4081236 0.8612987 0.05916834 ## CZ 2.6861779 0.7034622 0.05916834 ## DE 2.8912208 0.7714723 0.05916834 ## DK 2.5196858 0.7366879 0.05916834 ## EE 2.7765936 0.6849031 0.05916834 ## ES 2.6995968 0.6872384 0.05916834 ## FI 2.9610223 0.8074112 0.05916834 ## FR 1.9600204 0.8990370 0.05916834 ## GB 2.3596950 0.7323443 0.05916834 ## HR 1.9270830 0.9185291 0.05916834 ## HU 2.3069951 0.8454327 0.05916834 ## IE 1.3379183 0.7696738 0.05916834 ## IS 2.5878995 0.6866045 0.05916834 ## IT 2.6766741 0.7468343 0.05916834 ## LT 1.4244129 0.7665395 0.05916834 ## LV 0.9840822 0.8266132 0.05916834 ## ME 2.3836665 0.9203399 0.05916834 ## NL 3.5086715 0.7708756 0.05916834 ## NO 3.0529471 0.5675452 0.05916834 ## PL 2.5426111 0.8358675 0.05916834 ## PT 2.5686918 0.9197795 0.05916834 ## RS 2.1494740 0.8349456 0.05916834 ## SE 2.8038365 0.7308521 0.05916834 ## SI 1.7171374 0.9480001 0.05916834 ## SK 2.3066903 0.9389902 0.05916834 ## ## attr(,"class") ## [1] "coef.mer" ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-68-1.png" width="504" /> ] --- class: white ### Retrieve random effects .pull-left[ ```r ranef(rs_preds_model) ``` ``` ## $cntry ## (Intercept) education ## AT 0.04618600 -0.07261300 ## BE 0.67456022 -0.09121716 ## BG -0.55399880 0.02367509 ## CH -0.30308861 0.06778409 ## CY -0.92805981 0.06774242 ## CZ 0.34999448 -0.09009411 ## DE 0.55503740 -0.02208394 ## DK 0.18350236 -0.05686837 ## EE 0.44041018 -0.10865315 ## ES 0.36341330 -0.10631785 ## FI 0.62483884 0.01385489 ## FR -0.37616310 0.10548070 ## GB 0.02351152 -0.06121196 ## HR -0.40910047 0.12497279 ## HU -0.02918836 0.05187646 ## IE -0.99826516 -0.02388248 ## IS 0.25171604 -0.10695181 ## IT 0.34049061 -0.04672197 ## LT -0.91177054 -0.02701675 ## LV -1.35210129 0.03305691 ## ME 0.04748303 0.12678363 ## NL 1.17248806 -0.02268072 ## NO 0.71676365 -0.22601103 ## PL 0.20642767 0.04231125 ## PT 0.23250830 0.12622323 ## RS -0.18670944 0.04138928 ## SE 0.46765303 -0.06270416 ## SI -0.61904600 0.15444386 ## SK -0.02949310 0.14543389 ## ## with conditional variances for "cntry" ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-70-1.png" width="504" /> ] --- class: white ### Correlation between random intercepts and slopes The lower the average income-level, the greater the impact of education on income. <img src="index_files/figure-html/unnamed-chunk-71-1.png" width="792" /> --- ### Model Comparisons .eightyfive[ When models are nested (i.e. they only differ by additional parameters), we can use a likelihood ratio test (implemented via `anova()`). ] -- .eightyfive[ Luckily, `anova()` automatically refits the model with ML (instead of REML), which is the recommended estimation for model comparison. ```r anova(null_model, lvl1_preds_model, lvl2_preds_model, rs_preds_model) ``` ``` ## Data: ess ## Models: ## null_model: income ~ (1 | cntry) ## lvl1_preds_model: income ~ education + (1 | cntry) ## lvl2_preds_model: income ~ education + net_migration + (1 | cntry) ## rs_preds_model: income ~ education + net_migration + (education | cntry) ## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq) ## null_model 3 192026 192052 -96010 192020 ## lvl1_preds_model 4 186529 186563 -93261 186521 5498.9893 1 < 2.2e-16 *** ## lvl2_preds_model 5 186525 186568 -93258 186515 5.6777 1 0.01718 * ## rs_preds_model 7 186483 186543 -93235 186469 46.0279 2 1.012e-10 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` We see a significant better model fit for each addition of variables. ] --- ### Model Comparisons .eightyfive[ We can always use the AIC and BIC for model comparisons (again with ML estimation). ```r anova(null_model, lvl1_preds_model, lvl2_preds_model, rs_preds_model) ``` ``` ## Data: ess ## Models: ## null_model: income ~ (1 | cntry) ## lvl1_preds_model: income ~ education + (1 | cntry) ## lvl2_preds_model: income ~ education + net_migration + (1 | cntry) ## rs_preds_model: income ~ education + net_migration + (education | cntry) ## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq) ## null_model 3 192026 192052 -96010 192020 ## lvl1_preds_model 4 186529 186563 -93261 186521 5498.9893 1 < 2.2e-16 *** ## lvl2_preds_model 5 186525 186568 -93258 186515 5.6777 1 0.01718 * ## rs_preds_model 7 186483 186543 -93235 186469 46.0279 2 1.012e-10 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` In this case, AIC and BIC values closer to zero indicate better fit. ] --- class: white, middle, center ### Cool R packages/functions  ### that help with multilevel modelling --- class: white ### Check Model Assumptions with `performance` package -- ```r performance::check_model(rs_preds_model) ``` <img src="index_files/figure-html/unnamed-chunk-75-1.png" width="936" /> --- class: white ### Extract model statistics with `broom.mixed` The function `tidy()` gives us all model parameters (random and fixed). -- ```r broom.mixed::tidy(rs_preds_model) ``` ``` ## # A tibble: 7 × 6 ## effect group term estimate std.error statistic ## <chr> <chr> <chr> <dbl> <dbl> <dbl> ## 1 fixed <NA> (Intercept) 2.34 0.147 15.9 ## 2 fixed <NA> education 0.794 0.0226 35.1 ## 3 fixed <NA> net_migration 0.0592 0.0231 2.56 ## 4 ran_pars cntry sd__(Intercept) 0.624 NA NA ## 5 ran_pars cntry cor__(Intercept).education -0.513 NA NA ## 6 ran_pars cntry sd__education 0.107 NA NA ## 7 ran_pars Residual sd__Observation 2.53 NA NA ``` --- class: white ### Extract model statistics with `broom.mixed` The function `augment()` gives us fitted values (incl. random intercepts and slopes) for every single data point. -- ```r broom.mixed::augment(rs_preds_model) ``` ``` ## # A tibble: 39,712 × 15 ## income education net_migration cntry .fitted .resid .hat .cooksd .fixed ## <dbl+lbl> <dbl> <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 3 [C - 3… 3 4 AT 4.78 -1.78 5.05e-4 8.38e-5 4.95 ## 2 5 [F - 5… 3 4 AT 4.78 0.218 5.05e-4 1.26e-6 4.95 ## 3 9 [D - 9… 2 4 AT 4.06 4.94 1.09e-3 1.39e-3 4.16 ## 4 6 [S - 6… 3 4 AT 4.78 1.22 5.05e-4 3.92e-5 4.95 ## 5 1 [J - 1… 2 4 AT 4.06 -3.06 1.09e-3 5.35e-4 4.16 ## 6 5 [F - 5… 3 4 AT 4.78 0.218 5.05e-4 1.26e-6 4.95 ## 7 7 [K - 7… 3 4 AT 4.78 2.22 5.05e-4 1.30e-4 4.95 ## 8 2 [R - 2… 3 4 AT 4.78 -2.78 5.05e-4 2.04e-4 4.95 ## 9 3 [C - 3… 3 4 AT 4.78 -1.78 5.05e-4 8.38e-5 4.95 ## 10 4 [M - 4… 3 4 AT 4.78 -0.782 5.05e-4 1.61e-5 4.95 ## # … with 39,702 more rows, and 6 more variables: .mu <dbl>, .offset <dbl>, ## # .sqrtXwt <dbl>, .sqrtrwt <dbl>, .weights <dbl>, .wtres <dbl> ``` --- ### Plot random intercepts and slopes with `broom.mixed` .pull-left[ Output from `augment` allows us to plot random slopes and intercepts. ```r broom.mixed::augment(rs_preds_model) %>% filter(cntry %in% cntries) %>% ggplot(aes(x=education, y=income)) + geom_line(aes(x = education, y=.fitted, color = cntry), inherit.aes=FALSE, size = 1) + theme_minimal() + theme(legend.position="none") + ggthemes::scale_color_gdocs() ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-79-1.png" width="504" /> ] --- class: white ## Plot fixed effects with `sjPlot` This is a great function that plots coefficients: `plot_model()` ```r sjPlot::plot_model(rs_preds_model, show.p = T, show.values = T) ``` <img src="index_files/figure-html/unnamed-chunk-80-1.png" width="720" /> Also works on various other models (OLS, logistic regression). --- class: white ## Plot random effects with `sjPlot` With `plot_model()` you can also graph random effects ```r sjPlot::plot_model(rs_preds_model, show.values = T, type = "re", value.offset = 0.5) ``` <img src="index_files/figure-html/unnamed-chunk-81-1.png" width="720" /> --- ## Things we didn't learn about today **Cross-Level Interactions** How to do multilevel analyses with cross-level interaction by Utrecht University + [https://multilevel-analysis.sites.uu.nl/tutorials/](https://multilevel-analysis.sites.uu.nl/tutorials/) --- ## Things we didn't learn about today **Centering of Variables in Multilevel models** Centering or Not Centering in Multilevel Models? The Role of the Group Mean and the Assessment of Group Effects (Paccagnella 2006) + [https://www.ncbi.nlm.nih.gov/pubmed/16394187](https://www.ncbi.nlm.nih.gov/pubmed/16394187) Group-mean-centering independent variables in multi-level models is dangerous (Kelley et al. 2016) + [https://link.springer.com/article/10.1007/s11135-015-0304-z](https://link.springer.com/article/10.1007/s11135-015-0304-z) Understanding and misunderstanding group mean centering: A commentary on Kelley et al.'s dangerous practice (Bell et al. 2017) + [https://link.springer.com/article/10.1007/s11135-017-0593-5](https://link.springer.com/article/10.1007/s11135-017-0593-5) --- ## Things we didn't learn about today **Model diagnostics in multilevel models** HLMdiag: A Suite of Diagnostics for Hierarchical Linear Models in R: + [https://www.jstatsoft.org/article/view/v056i05](https://www.jstatsoft.org/article/view/v056i05) DHARMa - Residual Diagnostics for HierArchical (Multi-level / Mixed) Regression Models + [https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html](https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html) --- ## Further resources Under the following link you will find a great tutorial for multilevel models including applications in R, STATA, SPSS and Mlwin. LEMMA (Learning Environment for Multilevel Methods and Applications: + [https://www.cmm.bris.ac.uk/lemma/login/index.php](https://www.cmm.bris.ac.uk/lemma/login/index.php) Great blog post about linear and logistic multilevel models in R + [https://rpubs.com/corey_sparks/70812](https://rpubs.com/corey_sparks/70812) Another introduction to multilevel models in R + [https://glennwilliams.me/r4psych/mixed-effects-models.html#getting-started-6](https://glennwilliams.me/r4psych/mixed-effects-models.html#getting-started-6) --- # Time for your questions .pull-left[     ] .pull-right[     ] --- # Thank you for attending and hope you found this useful! .pull-left[       ] .pull-right[       ]