Dan Mirman
29 November 2019
Time | Topic | Data set(s) |
---|---|---|
10-10:30 | Introduction | Visual Search |
10:30-11 | Exercise | WISQARS, Naming Recovery |
11-11:45 | Non-linear change | Word Learning |
Lunch | Exercise | CP |
13-13:45 | Within-subject effects | Target Fixation |
13:45-14:30 | Exercise | Az |
14:30-15:15 | Contrast coding, Multiple comparisons | Motor Learning, Naming Recovery |
15:15-16 | Exercise | WISQARS |

Time | Topic | Data set(s) |
---|---|---|
12-12:30 | Logistic GCA | Target Fixation |
12:30-13 | Exercise | Word Learning |
13-14 | Individual Differences | Deviant Behavior, School Mental Health |
14-17 | Hands-on analysis time | Own data |
Aggregated binary outcomes (e.g., accuracy, fixation proportion) can look approximately continuous, but they are bounded (proportions cannot go below 0 or above 1) and their variance is not constant (it depends on the mean). These properties can produce incorrect results in linear regression.
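To see the variance problem concretely: the variance of a binomial proportion is \(p(1-p)/n\), which peaks at \(p = 0.5\) and vanishes at the boundaries. A minimal base-R sketch of that curve (illustrative only, not part of the workshop data):

```r
# Variance of a binary outcome as a function of p: p*(1-p).
# It is maximal at p = 0.5 and 0 at p = 0 or 1, so aggregated
# binary data violate the constant-variance assumption of
# linear regression.
curve(x * (1 - x), from = 0, to = 1,
      xlab = "p", ylab = "Variance, p(1-p)")
```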
Logistic regression avoids these problems. The outcome is the log-odds (also called the “logit”): \(\text{logit}(Yes, No) = \log \left( \frac{Yes}{No} \right)\)
Compare to proportions: \(p(Yes, No) = \frac{Yes}{Yes + No}\)
Note: the logit is undefined (\(\pm\infty\)) when \(p = 0\) or \(p = 1\), which makes it hard to fit logistic models to data with such extreme values (e.g., fixations on objects that are very rarely fixated).
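In base R, `qlogis()` and `plogis()` compute the logit and its inverse; a quick demonstration of the transformation and the boundary problem just noted (these are standard base-R functions, not workshop-specific code):

```r
qlogis(0.5)    # logit(0.5) = log(0.5/0.5) = 0
qlogis(0.9)    # log(0.9/0.1) = 2.197
plogis(2.197)  # inverse logit: back to (approximately) 0.9
qlogis(1)      # Inf -- the boundary problem noted above
```

The running example uses the `TargetFix` eye-tracking data; the summary below (presumably `summary(TargetFix)`) shows the key variables: `meanFix` (fixation proportion), `sumFix` (number of target fixations), and `N` (number of trials):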
## Subject Time timeBin Condition meanFix
## 708 : 30 Min. : 300 Min. : 1 High:150 Min. :0.0286
## 712 : 30 1st Qu.: 450 1st Qu.: 4 Low :150 1st Qu.:0.2778
## 715 : 30 Median : 650 Median : 8 Median :0.4558
## 720 : 30 Mean : 650 Mean : 8 Mean :0.4483
## 722 : 30 3rd Qu.: 850 3rd Qu.:12 3rd Qu.:0.6111
## 725 : 30 Max. :1000 Max. :15 Max. :0.8286
## (Other):120
## sumFix N
## Min. : 1.0 Min. :33.0
## 1st Qu.:10.0 1st Qu.:35.8
## Median :16.0 Median :36.0
## Mean :15.9 Mean :35.5
## 3rd Qu.:21.2 3rd Qu.:36.0
## Max. :29.0 Max. :36.0
##
Logistic model code is almost the same as linear model code. Three differences:

* `glmer()` instead of `lmer()`
* the outcome is a pair of counts of successes and failures, e.g., `cbind(sumFix, N-sumFix)`, instead of a proportion
* `family=binomial`
# make 3rd-order orthogonal polynomial
TargetFix <- code_poly(TargetFix, predictor="timeBin", poly.order=3, draw.poly=F)
# fit logistic GCA model
m.log <- glmer(cbind(sumFix, N-sumFix) ~ (poly1+poly2+poly3)*Condition +
(poly1+poly2+poly3 | Subject) +
(poly1+poly2 | Subject:Condition),
data=TargetFix, family=binomial)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00859796
## (tol = 0.001, component 1)
Logistic GCA models are much slower to fit and are prone to convergence failures, like the warning above.
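When a fit fails to converge, one standard lme4 troubleshooting step is to switch optimizers or raise the iteration limit via `glmerControl()`; a sketch (the model name `m.log_b` is made up here, and this is a generic remedy rather than a guaranteed fix):

```r
# Refit with the bobyqa optimizer and a larger evaluation budget.
# Estimates should be compared against the original fit.
m.log_b <- glmer(cbind(sumFix, N-sumFix) ~ (poly1+poly2+poly3)*Condition +
                   (poly1+poly2+poly3 | Subject) +
                   (poly1+poly2 | Subject:Condition),
                 data=TargetFix, family=binomial,
                 control=glmerControl(optimizer="bobyqa",
                                      optCtrl=list(maxfun=2e5)))
```

For now, here is the summary of the original fit (`summary(m.log)`):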
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## cbind(sumFix, N - sumFix) ~ (poly1 + poly2 + poly3) * Condition +
## (poly1 + poly2 + poly3 | Subject) + (poly1 + poly2 | Subject:Condition)
## Data: TargetFix
##
## AIC BIC logLik deviance df.resid
## 1419.1 1508.0 -685.6 1371.1 276
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.7544 -0.4098 -0.0031 0.3786 2.0623
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject:Condition (Intercept) 0.03233 0.1798
## poly1 0.40183 0.6339 -0.68
## poly2 0.14804 0.3848 -0.23 0.73
## Subject (Intercept) 0.00175 0.0419
## poly1 0.34347 0.5861 1.00
## poly2 0.00200 0.0447 -1.00 -1.00
## poly3 0.02747 0.1657 -1.00 -1.00 1.00
## Number of obs: 300, groups: Subject:Condition, 20; Subject, 10
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.1168 0.0655 -1.78 0.07449 .
## poly1 2.8186 0.2983 9.45 < 2e-16 ***
## poly2 -0.5591 0.1695 -3.30 0.00097 ***
## poly3 -0.3208 0.1277 -2.51 0.01200 *
## ConditionLow -0.2615 0.0909 -2.88 0.00404 **
## poly1:ConditionLow 0.0638 0.3313 0.19 0.84723
## poly2:ConditionLow 0.6951 0.2398 2.90 0.00375 **
## poly3:ConditionLow -0.0707 0.1662 -0.43 0.67060
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) poly1 poly2 poly3 CndtnL pl1:CL pl2:CL
## poly1 -0.288
## poly2 -0.128 0.272
## poly3 -0.100 -0.228 -0.015
## ConditionLw -0.690 0.297 0.081 0.012
## ply1:CndtnL 0.372 -0.552 -0.292 -0.024 -0.541
## ply2:CndtnL 0.080 -0.230 -0.701 0.034 -0.116 0.415
## ply3:CndtnL 0.013 -0.020 0.037 -0.637 -0.003 0.031 -0.056
## convergence code: 0
## Model failed to converge with max|grad| = 0.00859796 (tol = 0.001, component 1)
In the original model, the `Subject` random effect correlations were all 1.0 or -1.0. That suggests that the random effects structure is over-parameterized, which might be causing the convergence warning. Try removing those correlations by using `||` in the random effects specification, which tells `glmer()` not to estimate the correlations (NB: this only works if there are no factors on the left side of the double-pipe):
m.log_zc <- glmer(cbind(sumFix, N-sumFix) ~ (poly1+poly2+poly3)*Condition +
(poly1+poly2+poly3 || Subject) +
(poly1+poly2 | Subject:Condition),
data=TargetFix, family=binomial)
## boundary (singular) fit: see ?isSingular
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## cbind(sumFix, N - sumFix) ~ (poly1 + poly2 + poly3) * Condition +
## (poly1 + poly2 + poly3 || Subject) + (poly1 + poly2 | Subject:Condition)
## Data: TargetFix
##
## AIC BIC logLik deviance df.resid
## 1411.6 1478.3 -687.8 1375.6 282
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.6960 -0.4150 -0.0014 0.3369 2.0756
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject.Condition (Intercept) 3.40e-02 1.84e-01
## poly1 4.23e-01 6.51e-01 -0.63
## poly2 1.53e-01 3.91e-01 -0.25 0.70
## Subject poly3 5.54e-09 7.44e-05
## Subject.1 poly2 8.25e-09 9.08e-05
## Subject.2 poly1 4.45e-01 6.67e-01
## Subject.3 (Intercept) 2.08e-07 4.56e-04
## Number of obs: 300, groups: Subject:Condition, 20; Subject, 10
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.1177 0.0654 -1.80 0.0721 .
## poly1 2.8216 0.3182 8.87 <2e-16 ***
## poly2 -0.5589 0.1705 -3.28 0.0010 **
## poly3 -0.3134 0.1165 -2.69 0.0071 **
## ConditionLow -0.2606 0.0928 -2.81 0.0050 **
## poly1:ConditionLow 0.0659 0.3379 0.19 0.8455
## poly2:ConditionLow 0.6905 0.2421 2.85 0.0043 **
## poly3:ConditionLow -0.0666 0.1663 -0.40 0.6890
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) poly1 poly2 poly3 CndtnL pl1:CL pl2:CL
## poly1 -0.379
## poly2 -0.129 0.301
## poly3 -0.018 0.029 -0.054
## ConditionLw -0.705 0.267 0.092 0.012
## ply1:CndtnL 0.357 -0.528 -0.284 -0.027 -0.509
## ply2:CndtnL 0.092 -0.212 -0.703 0.038 -0.131 0.402
## ply3:CndtnL 0.012 -0.020 0.037 -0.699 -0.003 0.033 -0.056
## convergence code: 0
## boundary (singular) fit: see ?isSingular
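The singular-fit message corresponds to the `Subject` variance components that were estimated at essentially zero above. Since the zero-correlation model is nested in the original one (the six `Subject` correlations are constrained to zero), the two fits can be compared with a likelihood-ratio test; this comparison is an added illustration, with the usual caveat that testing parameters on a boundary makes the test conservative:

```r
# Likelihood-ratio test of the Subject random-effect correlations:
# m.log_zc is m.log with those correlations fixed to zero.
anova(m.log_zc, m.log)
```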
The `fitted()` function conveniently returns proportions from a logistic model, so plotting the model fit is easy:
ggplot(TargetFix, aes(Time, meanFix, color=Condition)) +
  stat_summary(fun.data=mean_se, geom="pointrange") +  # observed means +/- SE
  stat_summary(aes(y=fitted(m.log)), fun.y=mean, geom="line") +  # full model fit
  stat_summary(aes(y=fitted(m.log_zc)), fun.y=mean, geom="line", linetype="dashed") +  # zero-correlation fit
  theme_bw() + expand_limits(y=c(0,1)) +
  labs(y="Fixation Proportion", x="Time since word onset (ms)")
The word learning accuracy data in `WordLearnEx` are proportions computed from a binary response variable (correct/incorrect). Re-analyze these data using logistic GCA and compare with the linear GCA we used before.
You’ll need to convert the accuracy proportions to counts of correct and incorrect responses; to do that, you need to know that there were 6 trials per block. (A starter sketch follows the data summary below.)
## Subject TP Block Accuracy
## 244 : 10 Low :280 Min. : 1.0 Min. :0.000
## 253 : 10 High:280 1st Qu.: 3.0 1st Qu.:0.667
## 302 : 10 Median : 5.5 Median :0.833
## 303 : 10 Mean : 5.5 Mean :0.805
## 305 : 10 3rd Qu.: 8.0 3rd Qu.:1.000
## 306 : 10 Max. :10.0 Max. :1.000
## (Other):500
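A possible starting point for the exercise, re-using `code_poly()` from above and assuming, as in the earlier linear GCA, a second-order polynomial over `Block` (the variable `Correct` and model name `m.wl_log` are illustrative choices, not the official solution):

```r
# Convert accuracy proportions to counts: 6 trials per block
WordLearnEx$Correct <- round(WordLearnEx$Accuracy * 6)

# make 2nd-order orthogonal polynomial over Block
WordLearnEx <- code_poly(WordLearnEx, predictor="Block", poly.order=2, draw.poly=F)

# logistic GCA: correct vs. incorrect responses out of 6 trials
m.wl_log <- glmer(cbind(Correct, 6-Correct) ~ (poly1+poly2)*TP +
                    (poly1+poly2 | Subject),
                  data=WordLearnEx, family=binomial)
summary(m.wl_log)
```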