Growth Curve Analysis

Dan Mirman

13 March 2018

Workshop Schedule

Day 1: GCA Basics
Time Content
9am - 10am Part 1: Introduction, challenges of time course data, GCA basics
10am - 10:30am Break, hands-on exercise
10:30 - Noon Part 2: Modeling nonlinear change, within-subject effects
Noon - 1:30pm Lunch break, Q&A, hands-on exercise
1:30pm - 4:30pm Hands-on analysis time
Day 2: Advanced Topics
Time Content
9am - 10am Part 3: Visualizing higher-order polynomial effects, contrast coding, multiple comparisons
10am - 10:45am Hands-on analysis time
10:45am - 11:15am Part 4: Logistic GCA
11:15 - Noon Hands-on analysis time
Noon - 1:30pm Lunch break, Q&A
1:30pm - 2:30pm Part 5: Quantifying and analyzing individual differences
2:30pm - 4:30pm Hands-on analysis time

What are time course data?

Two key features

Nested data are not independent

Two key features

Nested data are not independent

Related by continuous variable (i.e., time, but could be [letter] size, number of distractors, etc.)

Gradual change

Replication crisis

Visualizing time course data: ggplot2

summary(Orange)
##  Tree       age       circumference  
##  3:7   Min.   : 118   Min.   : 30.0  
##  1:7   1st Qu.: 484   1st Qu.: 65.5  
##  5:7   Median :1004   Median :115.0  
##  2:7   Mean   : 922   Mean   :115.9  
##  4:7   3rd Qu.:1372   3rd Qu.:161.5  
##        Max.   :1582   Max.   :214.0

Basic line + points plot

ggplot(Orange, aes(x=age, y=circumference, color=Tree)) + 
  geom_point() + geom_line()

Black & White version

ggplot(Orange, aes(x=age, y=circumference, shape=Tree, linetype=Tree)) + 
  geom_point() + geom_line()

Compute summary on-the-fly

ggplot(Orange, aes(age, circumference)) + 
  geom_point() + stat_summary(fun.y=mean, geom="line")

Exclude cases on-the-fly

ggplot(subset(Orange, Tree!="5"), aes(age, circumference)) + 
  geom_point() + stat_summary(fun.y=mean, geom="line")

Other summary statistics

ggplot(Orange, aes(age, circumference)) + 
  stat_summary(fun.data=mean_se, geom="pointrange")

Publication-ready graphs

ggplot(Orange, aes(age, circumference)) + 
  stat_summary(fun.y=mean, geom="line") +
  stat_summary(fun.data=mean_se, geom="pointrange") +
  theme_bw(base_size=12) + labs(x="Age in days", y="Size")

ggsave("fig-Orange.pdf", width=3, height=3)

“Small Multiples” aka facets

ggplot(Orange, aes(age, circumference)) + facet_wrap(~ Tree) + 
  geom_line()

Linear regression: A brief review

Multilevel regression: Fixed effects

Level 1: \(Y_{ij} = \beta_{0i} + \beta_{1i} \cdot Time_{j} + \epsilon_{ij}\)

Level 2: model of the Level 1 parameter(s)

Multilevel regression: Random effects

Level 1: \(Y_{ij} = \beta_{0i} + \beta_{1i} \cdot Time_{j} + \epsilon_{ij}\)

Level 2:

\(\beta_{0i} = \gamma_{00} + \gamma_{0C} \cdot C + \zeta_{0i}\)

\(\beta_{1i} = \gamma_{10} + \gamma_{1C} \cdot C + \zeta_{1i}\)

Residual errors

Fixed vs. Random effects

Fixed effects

Random effects

Maximum Likelihood Estimation

A simple (linear) GCA example

# the psy811 package includes helper functions and example data sets
# to install: devtools::install_github("dmirman/psy811")
library(psy811)
summary(VisualSearchEx)
##   Participant        Dx        Set.Size          RT       
##  0042   :  4   Aphasic:60   Min.   : 1.0   Min.   :  414  
##  0044   :  4   Control:72   1st Qu.: 4.0   1st Qu.: 1132  
##  0083   :  4                Median :10.0   Median : 1814  
##  0166   :  4                Mean   :12.8   Mean   : 2261  
##  0186   :  4                3rd Qu.:18.8   3rd Qu.: 2808  
##  0190   :  4                Max.   :30.0   Max.   :12201  
##  (Other):108
ggplot(VisualSearchEx, aes(Set.Size, RT, color=Dx)) + stat_summary(fun.data=mean_se, geom="pointrange")

A simple (linear) GCA example: Fit the models

library(lme4)
# a null, intercept-only model
vs.null <- lmer(RT ~ 1 + (Set.Size | Participant), data=VisualSearchEx, REML=FALSE)
# add effect of set size
vs <- lmer(RT ~ Set.Size + (Set.Size | Participant), data=VisualSearchEx, REML=F)
# add effect of diagnosis
vs.0 <- lmer(RT ~ Set.Size + Dx + (Set.Size | Participant), data=VisualSearchEx, REML=F)
# add set size by diagnosis interaction
vs.1 <- lmer(RT ~ Set.Size * Dx + (Set.Size | Participant), data=VisualSearchEx, REML=F)
# compare model fits
anova(vs.null, vs, vs.0, vs.1)
## Data: VisualSearchEx
## Models:
## vs.null: RT ~ 1 + (Set.Size | Participant)
## vs: RT ~ Set.Size + (Set.Size | Participant)
## vs.0: RT ~ Set.Size + Dx + (Set.Size | Participant)
## vs.1: RT ~ Set.Size * Dx + (Set.Size | Participant)
##         Df  AIC  BIC logLik deviance Chisq Chi Df Pr(>Chisq)    
## vs.null  5 2283 2297  -1136     2273                            
## vs       6 2248 2265  -1118     2236 36.90      1    1.2e-09 ***
## vs.0     7 2241 2261  -1114     2227  8.58      1     0.0034 ** 
## vs.1     8 2241 2264  -1113     2225  2.01      1     0.1567    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

A simple (linear) GCA example: Interpet results

Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
vs.null 5 2283 2297 -1136 2273 NA NA NA
vs 6 2248 2265 -1118 2236 36.902 1 0.0000
vs.0 7 2241 2261 -1114 2227 8.585 1 0.0034
vs.1 8 2241 2264 -1113 2225 2.006 1 0.1567

A simple (linear) GCA example: Inspect model

summary(vs.1)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: RT ~ Set.Size * Dx + (Set.Size | Participant)
##    Data: VisualSearchEx
## 
##      AIC      BIC   logLik deviance df.resid 
##     2241     2264    -1113     2225      124 
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -3.759 -0.317 -0.079  0.317  6.229 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev. Corr
##  Participant (Intercept) 613393   783.2        
##              Set.Size       380    19.5    1.00
##  Residual                756830   870.0        
## Number of obs: 132, groups:  Participant, 33
## 
## Fixed effects:
##                    Estimate Std. Error t value
## (Intercept)          2078.7      264.4    7.86
## Set.Size               73.5       11.2    6.54
## DxControl           -1106.1      357.9   -3.09
## Set.Size:DxControl    -21.7       15.2   -1.43
## 
## Correlation of Fixed Effects:
##             (Intr) Set.Sz DxCntr
## Set.Size    -0.090              
## DxControl   -0.739  0.066       
## St.Sz:DxCnt  0.066 -0.739 -0.090

A simple (linear) GCA example: Parameter p-values

get_pvalues(vs.1) #default: normal approximation
##                    Estimate Std..Error t.value  p.normal p.normal.star
## (Intercept)         2078.75     264.36   7.863 3.775e-15           ***
## Set.Size              73.49      11.23   6.545 5.955e-11           ***
## DxControl          -1106.05     357.95  -3.090 2.002e-03            **
## Set.Size:DxControl   -21.74      15.20  -1.430 1.528e-01
get_pvalues(vs.1, method = "KR") # alternative: Kenward-Roger approximation for DF
##                    Estimate Std..Error t.value df.KR      p.KR p.KR.star
## (Intercept)         2078.75     264.36   7.863    31 7.103e-09       ***
## Set.Size              73.49      11.23   6.545    31 2.630e-07       ***
## DxControl          -1106.05     357.95  -3.090    31 4.204e-03        **
## Set.Size:DxControl   -21.74      15.20  -1.430    31 1.628e-01

A simple (linear) GCA example: Plot model fit

ggplot(VisualSearchEx, aes(Set.Size, RT, color=Dx)) + 
  stat_summary(fun.data=mean_se, geom="pointrange") + 
  stat_summary(aes(y=fitted(vs.0)), fun.y=mean, geom="line")

A simple (linear) GCA example: Compare model fits

ggplot(VisualSearchEx, aes(Set.Size, RT, color=Dx)) + 
  stat_summary(fun.data=mean_se, geom="pointrange") + 
  stat_summary(aes(y=fitted(vs.0)), fun.y=mean, geom="line") +
  stat_summary(aes(y=fitted(vs.1)), fun.y=mean, geom="line", linetype="dashed")

BREAK

Exercise 1A: Analyze the state-level suicide rate data from the WISQARS (wisqars.suicide)

Exercise 1B: Analyze the recovery of object naming ability in aphasia (NamingRecovery)