This is a simple walkthrough on how to specify a piecewise linear growth model in R using the lme4 package. Just as a quick outline, I will start with a simple linear growth pattern, then extend the logic to look at piecewise growth.
Plain old linear growth
Here is an example just simulating linear growth. We will assume that everyone follows the same linear pattern over time, with some random variability around both the intercept and slope parameters.
Step 1: Simulate data
For this example, I’m simulating 100 participants with 6 timepoints each. The intercept is 10, while the slope is -2. The intercept and slope will also be positively correlated. This indicates that participants who start higher (i.e., a higher intercept) tend to have more positive change, compared to those who start lower (i.e., a lower intercept).
require(MASS)
require(lme4)
require(ggplot2)
set.seed(42)
nPart = 100
nTime = 6
intSlope <- mvrnorm(nPart, mu = c(10, -2),
Sigma = matrix(c(4, 1, 1, 2), nrow = 2))
myData <- expand.grid(ID = 1:nPart, Time = 0:(nTime - 1))
myData <- myData[order(myData$ID, myData$Time), ]
score <- c(NA)
for(i in 1:nrow(myData)){
df <- myData[i,]
score[i] <- intSlope[df$ID, 1] + df$Time * intSlope[df$ID, 2] + rnorm(1)
}
myData$Score <- score
head(myData)
## ID Time Score
## 1 1 0 5.91669
## 101 1 1 3.75189
## 201 1 2 0.08993
## 301 1 3 -3.52136
## 401 1 4 -11.45727
## 501 1 5 -15.73077
Step 2: Plot the data
You can see that I simulated the dataset to be in long format. I also started Time at 0, so the intercept would be interpreted as the mean of the first measure. Using spaghetti plots, we can visualize what the simulated data look like.
ggplot(aes(x = Time, y = Score), data = myData) +
geom_line(aes(group = ID), color = "gray") +
geom_smooth(aes(group = 1), method = "lm", size = 3, color = "red", se = FALSE) +
theme_bw()
We see that the intercept is around 10, with a decreasing slope around 2. This should not be a surprise, given those are the parameters we simulated.
Step 3: Build the model
For this example, we are going to use lme4 to create a mixed-effects model. Here we have observations nested within individual. In this case, time will be a level-1 predictor. We will also allow both intercept and time to vary systematically around our intercept and slope estimates (and this variation between intercept and slope also has a covariance).
In this output, we are first presented with random effects. This tells us that the intercept has a variance of around 4 and the slope for Time has a variance of around 2. Again, this is similar to what we simulated. Next are the fixed effects, and we capture the original parameters quite accurately.
linearModel <- lmer(Score ~ 1 + Time + (1 + Time | ID),
data = myData)
summary(linearModel)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Score ~ 1 + Time + (1 + Time | ID)
## Data: myData
##
## REML criterion at convergence: 2373
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.6307 -0.5577 -0.0233 0.5739 2.1918
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## ID (Intercept) 4.03 2.01
## Time 1.81 1.35 0.48
## Residual 1.01 1.01
## Number of obs: 600, groups: ID, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 9.931 0.214 46.5
## Time -1.962 0.137 -14.3
##
## Correlation of Fixed Effects:
## (Intr)
## Time 0.398
Piecewise linear growth
Now, here is an example simulating two growth patterns.
Step 1: Simulate data
In this case, I’m still simulating 100 participants with 6 timepoints. However, this time I will assume no growth for the first 3 time points, followed by a decline in the next 3 time points. The intercept will still be 10, and the slope for the second piece will still be -2. To make things a little easier, no correlation between intercept and either slope will be simulated.
intSlopeNew <- mvrnorm(nPart, mu = c(10, 0, -2),
Sigma = matrix(c(4, 0, 0, 0, 1, 0, 0, 0, 2),byrow = TRUE, nrow = 3))
myDataNew <- data.frame(ID = rep(1:nPart, each = 6),
Time = 0:(nTime - 1),
Time1 = c(-2, -1, 0, 0, 0, 0),
Time2 = c(0, 0, 0, 1, 2, 3))
scoreNew <- c(NA)
for(i in 1:nrow(myData)){
df <- myDataNew[i,]
score[i] <- intSlopeNew[df$ID, 1] + df$Time1 * intSlopeNew[df$ID, 2] +
df$Time2 * intSlopeNew[df$ID, 3] + rnorm(1)
}
myDataNew$Score <- score
head(myDataNew, n = 12)
## ID Time Time1 Time2 Score
## 1 1 0 -2 0 6.077
## 2 1 1 -1 0 8.049
## 3 1 2 0 0 10.843
## 4 1 3 0 1 10.599
## 5 1 4 0 2 10.642
## 6 1 5 0 3 9.790
## 7 2 0 -2 0 10.296
## 8 2 1 -1 0 10.069
## 9 2 2 0 0 12.577
## 10 2 3 0 1 10.015
## 11 2 4 0 2 6.023
## 12 2 5 0 3 5.082
Step 2: Plot the data
This time, I have two time variables: Time1 and Time2. As you can imagine, Time1 represents the first growth pattern, while Time2 represents the second. The intercept is the score when both time variables are 0, which is the third datapoint for each person.
ggplot(aes(x = Time, y = Score), data = myDataNew) +
geom_line(aes(group = ID), color = "gray") +
geom_smooth(aes(group = 1), size = 3, color = "red", se = FALSE) +
theme_bw()
Step 3: Build the model
Model building for this model seems like it should be complex, but it really is just as simple as including the second time variable as both a fixed and random effect. From the output, we see the variance components for each random effect. These are interpreted the same: they tell us how variable our sample is with regard to each component. The fixed-effects are about 10 for our intercept, around 0 for the first slope, and -2 for the second slope. Just as was simulated, this model tells us the data seem to be relatively stable throughout the baseline, then declining after baseline. Including additional predictors/interactions with time follow logically from the above example.
plgModel <- lmer(Score ~ 1 + Time1 + Time2 + (1 + Time1 + Time2 | ID),
data = myDataNew)
summary(plgModel)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Score ~ 1 + Time1 + Time2 + (1 + Time1 + Time2 | ID)
## Data: myDataNew
##
## REML criterion at convergence: 2492
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.6278 -0.4664 0.0103 0.5049 2.9113
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## ID (Intercept) 4.529 2.128
## Time1 0.854 0.924 -0.05
## Time2 2.805 1.675 0.07 0.16
## Residual 1.050 1.025
## Number of obs: 600, groups: ID, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 10.2517 0.2275 45.1
## Time1 -0.0663 0.1139 -0.6
## Time2 -1.9445 0.1732 -11.2
##
## Correlation of Fixed Effects:
## (Intr) Time1
## Time1 0.115
## Time2 -0.005 0.042
Reference: http://dpmartin42.github.io/posts/Piecewise-growth