Growth Curve Analysis, Part 4

Dan Mirman

22 May 2020

Schedule
| Time          | Topic                  | Data set(s)                            |
|---------------|------------------------|----------------------------------------|
| 9-9:30am      | Introduction           | Visual Search                          |
| 9:30-10am     | Exercise               | WISQARS, Naming Recovery               |
| 10-10:45am    | Non-linear change      | Word Learning                          |
| 10:45-11:30am | Exercise               | CP                                     |
| 11:30-noon    | Within-subject effects | Target Fixation                        |
| 12-1pm        | Lunch break; Exercise  | Az                                     |
| 1-1:45pm      | Logistic GCA; Exercise | Target Fixation; Word Learning         |
| 1:45-2:30pm   | Individual Differences | Deviant Behavior; School Mental Health |
| 2:30-3pm      | Exercise; Open Q&A     | NA                                     |

Why logistic regression? (A brief review)

Aggregated binary outcomes (e.g., accuracy, fixation proportion) can look approximately continuous, but they are bounded between 0 and 1 and their variance depends on their mean. These properties violate linear regression's assumptions of an unbounded outcome with constant error variance, and can produce incorrect results in linear regression.
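To see the non-constant variance concretely: a binomial proportion with true probability \(p\) has variance proportional to \(p(1-p)\), which is largest at \(p = 0.5\) and shrinks toward 0 at the boundaries. A quick illustration in R:

p <- seq(0.1, 0.9, by=0.2)
p * (1 - p)  # 0.09 0.21 0.25 0.21 0.09 -- variance peaks at p=0.5, vanishes near 0 and 1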

Logistic regression (A brief review)

Outcome is log-odds (also called “logit”): \(\text{logit}(Yes, No) = \log \left(\frac{Yes}{No}\right)\)

Compare to proportions: \(p(Yes, No) = \frac{Yes}{Yes+No}\)

Note: the logit is undefined (\(\pm\infty\)) when \(p=0\) or \(p=1\), which makes it hard to fit logistic models to data with such extreme values (e.g., fixations on objects that are very rarely fixated).
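A quick illustration using base R's qlogis() (logit) and plogis() (inverse logit) to convert between proportions and log-odds:

qlogis(0.5)    # 0: even odds, log(0.5/0.5)
qlogis(0.9)    # 2.197: log(0.9/0.1)
qlogis(1.0)    # Inf -- why p=1 is a problem for logistic models
plogis(2.197)  # back to (approximately) 0.9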

Logistic GCA: Data
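The TargetFix data contain, for each subject, condition, and time bin, the number of target fixations (sumFix) out of the number of trials (N), along with the fixation proportion (meanFix). The output below is from a call like summary(TargetFix):

summary(TargetFix)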

##     Subject         Time         timeBin   Condition     meanFix      
##  708    : 30   Min.   : 300   Min.   : 1   High:150   Min.   :0.0286  
##  712    : 30   1st Qu.: 450   1st Qu.: 4   Low :150   1st Qu.:0.2778  
##  715    : 30   Median : 650   Median : 8              Median :0.4558  
##  720    : 30   Mean   : 650   Mean   : 8              Mean   :0.4483  
##  722    : 30   3rd Qu.: 850   3rd Qu.:12              3rd Qu.:0.6111  
##  725    : 30   Max.   :1000   Max.   :15              Max.   :0.8286  
##  (Other):120                                                          
##      sumFix           N       
##  Min.   : 1.0   Min.   :33.0  
##  1st Qu.:10.0   1st Qu.:35.8  
##  Median :16.0   Median :36.0  
##  Mean   :15.9   Mean   :35.5  
##  3rd Qu.:21.2   3rd Qu.:36.0  
##  Max.   :29.0   Max.   :36.0  
## 

Logistic GCA: Fit the model

Logistic model code is almost the same as linear model code. Three differences:

  1. glmer() instead of lmer()
  2. the outcome is a binary (1/0) variable or, for aggregated data, a two-column matrix of the numbers of 1s and 0s built with cbind()
  3. add family=binomial
library(lme4)  # for glmer()
# make 3rd-order orthogonal polynomial predictors from timeBin
# (code_poly() is a helper function from the workshop materials)
TargetFix <- code_poly(TargetFix, predictor="timeBin", poly.order=3, draw.poly=FALSE)
# fit logistic GCA model
m.log <- glmer(cbind(sumFix, N-sumFix) ~ (poly1+poly2+poly3)*Condition +
                 (poly1+poly2+poly3 | Subject) +
                 (poly1+poly2 | Subject:Condition),
               data=TargetFix, family=binomial)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00859796 (tol = 0.002, component 1)

Logistic GCA models are much slower to fit and are prone to convergence failures, like the warning above.
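A common first remedy (a sketch, with a hypothetical model name m.log_b; not part of the analysis shown here) is to refit with a different optimizer and a higher iteration limit via glmerControl():

# refit with the bobyqa optimizer and more iterations
m.log_b <- update(m.log, control=glmerControl(optimizer="bobyqa",
                                              optCtrl=list(maxfun=2e5)))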

Logistic GCA: Model summary

summary(m.log)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: cbind(sumFix, N - sumFix) ~ (poly1 + poly2 + poly3) * Condition +  
##     (poly1 + poly2 + poly3 | Subject) + (poly1 + poly2 | Subject:Condition)
##    Data: TargetFix
## 
##      AIC      BIC   logLik deviance df.resid 
##   1419.1   1508.0   -685.6   1371.1      276 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.7544 -0.4098 -0.0031  0.3786  2.0623 
## 
## Random effects:
##  Groups            Name        Variance Std.Dev. Corr             
##  Subject:Condition (Intercept) 0.03233  0.1798                    
##                    poly1       0.40183  0.6339   -0.68            
##                    poly2       0.14804  0.3848   -0.23  0.73      
##  Subject           (Intercept) 0.00175  0.0419                    
##                    poly1       0.34347  0.5861    1.00            
##                    poly2       0.00200  0.0447   -1.00 -1.00      
##                    poly3       0.02747  0.1657   -1.00 -1.00  1.00
## Number of obs: 300, groups:  Subject:Condition, 20; Subject, 10
## 
## Fixed effects:
##                    Estimate Std. Error z value Pr(>|z|)    
## (Intercept)         -0.1168     0.0655   -1.78  0.07449 .  
## poly1                2.8186     0.2983    9.45  < 2e-16 ***
## poly2               -0.5591     0.1695   -3.30  0.00097 ***
## poly3               -0.3208     0.1277   -2.51  0.01200 *  
## ConditionLow        -0.2615     0.0909   -2.88  0.00404 ** 
## poly1:ConditionLow   0.0638     0.3313    0.19  0.84723    
## poly2:ConditionLow   0.6951     0.2398    2.90  0.00375 ** 
## poly3:ConditionLow  -0.0707     0.1662   -0.43  0.67060    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) poly1  poly2  poly3  CndtnL pl1:CL pl2:CL
## poly1       -0.288                                          
## poly2       -0.128  0.272                                   
## poly3       -0.100 -0.228 -0.015                            
## ConditionLw -0.690  0.297  0.081  0.012                     
## ply1:CndtnL  0.372 -0.552 -0.292 -0.024 -0.541              
## ply2:CndtnL  0.080 -0.230 -0.701  0.034 -0.116  0.415       
## ply3:CndtnL  0.013 -0.020  0.037 -0.637 -0.003  0.031 -0.056
## convergence code: 0
## Model failed to converge with max|grad| = 0.00859796 (tol = 0.002, component 1)

Logistic GCA: Simplify random effects

In the original model, the Subject random effects correlations were all 1.0 or -1.0. That suggests that the random effects structure is over-parameterized, which might be causing the convergence warning.

Try removing those correlations by using || in the random-effects specification; this tells glmer() not to estimate the correlations (NB: this only works if there are no factors on the left side of the double-pipe):

m.log_zc <- glmer(cbind(sumFix, N-sumFix) ~ (poly1+poly2+poly3)*Condition +
                    (poly1+poly2+poly3 || Subject) +
                    (poly1+poly2 | Subject:Condition),
                  data=TargetFix, family=binomial)
## boundary (singular) fit: see ?isSingular
summary(m.log_zc)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: cbind(sumFix, N - sumFix) ~ (poly1 + poly2 + poly3) * Condition +  
##     (poly1 + poly2 + poly3 || Subject) + (poly1 + poly2 | Subject:Condition)
##    Data: TargetFix
## 
##      AIC      BIC   logLik deviance df.resid 
##   1411.6   1478.3   -687.8   1375.6      282 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.6960 -0.4150 -0.0014  0.3369  2.0756 
## 
## Random effects:
##  Groups            Name        Variance Std.Dev. Corr       
##  Subject.Condition (Intercept) 3.40e-02 1.84e-01            
##                    poly1       4.23e-01 6.51e-01 -0.63      
##                    poly2       1.53e-01 3.91e-01 -0.25  0.70
##  Subject           poly3       5.54e-09 7.44e-05            
##  Subject.1         poly2       8.25e-09 9.08e-05            
##  Subject.2         poly1       4.45e-01 6.67e-01            
##  Subject.3         (Intercept) 2.08e-07 4.56e-04            
## Number of obs: 300, groups:  Subject:Condition, 20; Subject, 10
## 
## Fixed effects:
##                    Estimate Std. Error z value Pr(>|z|)    
## (Intercept)         -0.1177     0.0654   -1.80   0.0721 .  
## poly1                2.8216     0.3182    8.87   <2e-16 ***
## poly2               -0.5589     0.1705   -3.28   0.0010 ** 
## poly3               -0.3134     0.1165   -2.69   0.0071 ** 
## ConditionLow        -0.2606     0.0928   -2.81   0.0050 ** 
## poly1:ConditionLow   0.0659     0.3379    0.19   0.8455    
## poly2:ConditionLow   0.6905     0.2421    2.85   0.0043 ** 
## poly3:ConditionLow  -0.0666     0.1663   -0.40   0.6890    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) poly1  poly2  poly3  CndtnL pl1:CL pl2:CL
## poly1       -0.379                                          
## poly2       -0.129  0.301                                   
## poly3       -0.018  0.029 -0.054                            
## ConditionLw -0.705  0.267  0.092  0.012                     
## ply1:CndtnL  0.357 -0.528 -0.284 -0.027 -0.509              
## ply2:CndtnL  0.092 -0.212 -0.703  0.038 -0.131  0.402       
## ply3:CndtnL  0.012 -0.020  0.037 -0.699 -0.003  0.033 -0.056
## convergence code: 0
## boundary (singular) fit: see ?isSingular
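The singular-fit message comes from the near-zero Subject-level variance estimates, and the fixed-effect estimates are nearly identical to the original model's. One way to confirm that removing the correlations did not meaningfully change the fit is a direct model comparison:

# both models were fit by ML, so a likelihood-ratio comparison applies directly
anova(m.log, m.log_zc)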

Logistic GCA: Plot model fit

The fitted() function conveniently returns predicted proportions (response-scale values) from a logistic model, so plotting the model fit is easy (solid lines: full model; dashed lines: zero-correlation model):

ggplot(TargetFix, aes(Time, meanFix, color=Condition)) +
  stat_summary(fun.data=mean_se, geom="pointrange") +
  stat_summary(aes(y=fitted(m.log)), fun=mean, geom="line") +
  stat_summary(aes(y=fitted(m.log_zc)), fun=mean, geom="line", linetype="dashed") +
  theme_bw() + expand_limits(y=c(0,1)) + 
  labs(y="Fixation Proportion", x="Time since word onset (ms)")

Exercise 4

The word learning accuracy data in WordLearnEx are proportions computed from a binary response variable (correct/incorrect). Re-analyze these data using logistic GCA and compare the results with the linear GCA we used before.

You’ll need to convert the accuracy proportions to counts of correct and incorrect responses. To do that you need to know that there were 6 trials per block.
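A first look at the data (the output below is from a call like summary(WordLearnEx)):

summary(WordLearnEx)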

##     Subject       TP          Block         Accuracy    
##  244    : 10   Low :280   Min.   : 1.0   Min.   :0.000  
##  253    : 10   High:280   1st Qu.: 3.0   1st Qu.:0.667  
##  302    : 10              Median : 5.5   Median :0.833  
##  303    : 10              Mean   : 5.5   Mean   :0.805  
##  305    : 10              3rd Qu.: 8.0   3rd Qu.:1.000  
##  306    : 10              Max.   :10.0   Max.   :1.000  
##  (Other):500

Calculate number correct and incorrect

library(dplyr)  # for mutate()
# base R equivalent:
# WordLearnEx$Correct <- round(WordLearnEx$Accuracy * 6)
# WordLearnEx$Error <- 6 - WordLearnEx$Correct
WordLearnEx <- mutate(WordLearnEx,
                      Correct = round(Accuracy*6),  # 6 trials per block
                      Error = 6 - Correct)
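From here, the model follows the same pattern as the TargetFix example. A minimal sketch of one possible starting point, assuming a second-order orthogonal polynomial over Block as in the earlier linear GCA (the model name m.wl_log is hypothetical):

# make 2nd-order orthogonal polynomial predictors from Block
WordLearnEx <- code_poly(WordLearnEx, predictor="Block", poly.order=2, draw.poly=FALSE)
# logistic GCA with counts of correct vs. incorrect responses
m.wl_log <- glmer(cbind(Correct, Error) ~ (poly1+poly2)*TP +
                    (poly1+poly2 | Subject),
                  data=WordLearnEx, family=binomial)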