# Cognitive Models

In analyzing response time data with two choices, researchers would usually examine average response times (RTs) and response proportions. Depending on the model a researcher wishes to presume, the response proportions can simply be correct and error rates, or, if using SDT model, hits, correct rejections, false alarms and misses. Here I used Pleskac, Cesario, and Johnson’s (2017) data in the first-person shooter task (FPST; Correll et al., 2002) to illustrate one method to calculate average RTs and response proportions across participants.

Firstly, I use fread function to load the data file, which is in csv format. The data set provides clear and good column names. That is, the column names have informed the coding method. I simply just followed column names to code the factor levels and later checked against the data in the paper. Of course, I had also checked against the figures of behaviour analyses in the paper to make sure I did correctly identify the dependent and independent variables, which are listed in the following.

• S: stimulus factor, gun vs. non-gun objects.
• BC: blurry or clear object
• CT: context, a safe or dangerous neighborhood
• RACE: race, a black or white target
• R: response factor, shoot or not to shoot
• RT: response times
• s: subject / participant nominal labels
library(data.table);
study3$S <- factor(ifelse(study3$Object0NG1G == 0, "non", "gun"))
study3$BC <- factor(ifelse(study3$Blurry0Clear1Blur == 0, "clear", "blur"))
study3$CT <- factor(ifelse(study3$Context1Safe2Danger == 0, "safe", "danger"))
study3$RACE <- factor(ifelse(study3$Race012B == 0, "white", "black"))
study3$R <- factor(ifelse(study3$Resp0NS1Sh == 0, "not", "shoot"))
study3$RT <- study3$RT / 1e3
study3$s <- factor(study3$Subject)


factor is a R function converting variables, numeric or character, to categorical (i.e., nominal) variable. After reorganizing the columns, I removed the replicated columns by assigning them as NULL. This is a data.table specific syntax.

study3[, c("Subject", "NewSubject", "conditionRaceDangerBlurbject",
"conditionRaceDangerBlur",  "Object0NG1G", "Blurry0Clear1Blur",
"Context1Safe2Danger", "Race012B", "Resp0NS1Sh", "DiffusionRT") := NULL]


There are NaN response times in this data set. One method is to replace them with random RTs drawn from uniform distribution, with the range of valid RTs. This was achieved by using the data.table internal function .I. I firstly found the (row) index of these NaN RTs, and then replaced them. Of course, we can simply just remove them.

## save organized data to a temporary object, so I can roll back.
dtmp <- data.table(study3)
minmax <- range(study3$RT, na.rm = TRUE); minmax idx <- dtmp[, .I[is.nan(RT)]]; idx dtmp[idx, RT := runif(1, minmax[1], minmax[2])] d <- dtmp ## scoring a correctness column d$C <- ifelse(d$S == "gun" & d$R == "shoot", TRUE,
ifelse(d$S == "non" & d$R == "not",   TRUE,
ifelse(d$S == "gun" & d$R == "not",   FALSE,
ifelse(d$S == "non" & d$R == "shoot", FALSE, NA))))


Now the data table looks like:

dplyr::tbl_df(d)
## # A tibble: 12,033 x 8
##       RT S     BC    CT    RACE  R     s     C
##    <dbl> <fct> <fct> <fct> <fct> <fct> <fct> <lgl>
##  1 0.753 gun   blur  safe  black shoot 11    TRUE
##  2 0.851 non   blur  safe  white not   11    TRUE
##  3 0.742 gun   clear safe  black shoot 11    TRUE
##  4 0.636 non   clear safe  white not   11    TRUE
##  5 0.644 gun   blur  safe  black not   11    FALSE
##  6 0.625 non   clear safe  black shoot 11    FALSE
##  7 0.889 non   clear safe  white not   11    TRUE
##  8 0.597 gun   blur  safe  black shoot 11    TRUE
##  9 0.724 gun   clear safe  white shoot 11    TRUE
## 10 0.656 non   blur  safe  white not   11    TRUE
## # ... with 12,023 more rows


### Censoring RT data

Censoring outliers is a difficult task (Ratcliff, 1993). Here I illustrated one way to do it via Heathcote’s rc, a collection of very useful R functions and my summarise, also a collection of useful R functions. First, I used R’s source function to load this large collection of R functions.

source("~/rc/data.analysis.R")
source("~/rc/utils.R")
source("~/functions/summarise.R")

## Scoring ------------
se3 <- score.rc(data.frame(d), S = "s", R = "R", RT = "RT", SC = "C",
F = c("BC", "CT", "RACE", "S"))
## Spreading 11851 of 12033 RTs that are ties given preceision 0.001 .
##    497 have ties out of 679 unique values
##
## Added the following manifest design
##      S  RACE     CT    BC     R rcell
## 1  gun black danger  blur   not     1
## 2  gun black danger  blur shoot     1
## 3  gun black danger clear   not     2
## ...
## 30 non white   safe  blur shoot    15
## 31 non white   safe clear   not    16
## 32 non white   safe clear shoot    16



score.rc function takes first argument data.frame, which is the data as seen previously. Because I stored it as data.table, I needed to convert it back to data.frame. Just a note. Although data.table may accommodate many functions operating in data.frame, there are some operations in rc functions, which cannot work in data.table.

Note the second argument, uppercase S, which takes the subject column, instead of the column of stimulus factor. The R and RT arguments take response column and the response time column. SC takes the column of score correctness, which is purely my guess. I cannot be sure why it is called SC. The last useful argument is F, which takes user-defined factors, including the stimulus factor.

score.rc detects the identical (ties) RTs and spread them into finer scale. For example, in this data set, there are 31 trials with 60y ms.

table(d$RT) ## 0.01 0.015 0.019 0.025 ## 1 1 1 1 ## 0.027 0.03 0.035 0.036 ## 1 1 1 1 ## ... ## 0.386 0.387 0.388 0.389 ## 7 3 1 3 ## 0.39 0.391 0.392 0.393 ## 1 4 3 3 ## 0.394 0.395 0.396 0.397 ## 2 6 7 3 ## ... ## 0.606 0.607 0.608 0.609 ## 50 31 40 39 ## 0.61 0.611 0.612 0.613 ## 42 49 31 41 ## 0.614 0.615 0.616 0.617 ## 49 55 42 39 ## ...  If I printed them all out, the data set after scoring spreads these RT to a se3[se3$RT >= .607 & se3$RT < .608,]  cell rcell s BC CT RACE S C R RT 539 31 16 19 clear safe white non TRUE not 0.6070000 838 31 16 24 clear safe white non TRUE not 0.6075488 1909 26 13 39 blur danger white non FALSE shoot 0.6070313 2108 6 3 44 blur safe black gun TRUE shoot 0.6072812 2321 4 2 50 clear danger black gun TRUE shoot 0.6079634 2354 19 10 50 clear danger black non TRUE not 0.6071250 2939 15 8 62 clear safe white gun FALSE not 0.6072188 3083 32 16 62 clear safe white non FALSE shoot 0.6077683 3319 12 6 72 clear danger white gun TRUE shoot 0.6075244 3762 19 10 82 clear danger black non TRUE not 0.6079146 4802 23 12 120 clear safe black non TRUE not 0.6076951 5211 17 9 129 blur danger black non TRUE not 0.6075976 5391 19 10 129 clear danger black non TRUE not 0.6074063 6095 1 1 184 blur danger black gun FALSE not 0.6073125 7057 10 5 201 blur danger white gun TRUE shoot 0.6077195 7201 17 9 201 blur danger black non TRUE not 0.6078659 7288 6 3 214 blur safe black gun TRUE shoot 0.6073438 7345 25 13 214 blur danger white non TRUE not 0.6071875 7747 23 12 218 clear safe black non TRUE not 0.6077439 8259 8 4 231 clear safe black gun TRUE shoot 0.6076707 8305 19 10 231 clear danger black non TRUE not 0.6079390 8671 12 6 235 clear danger white gun TRUE shoot 0.6071563 8923 19 10 247 clear danger black non TRUE not 0.6073750 9002 31 16 247 clear safe white non TRUE not 0.6074375 9045 31 16 247 clear safe white non TRUE not 0.6070625 9509 31 16 286 clear safe white non TRUE not 0.6076220 9551 31 16 286 clear safe white non TRUE not 0.6078415 9692 6 3 286 blur safe black gun TRUE shoot 0.6075732 9903 27 14 288 clear danger white non TRUE not 0.6079878 10432 25 13 307 blur danger white non TRUE not 0.6077927 10534 10 5 308 blur danger white gun TRUE shoot 0.6078902 10666 21 11 308 blur safe black non TRUE not 0.6074688 10979 16 8 325 clear safe white gun TRUE shoot 0.6078171 11201 9 5 326 blur danger white gun FALSE not 0.6070937 11642 19 10 344 clear danger black non TRUE not 0.6072500 11866 25 13 348 blur danger white non TRUE not 0.6076463  The original data set is to the millisecond scale. d[RT == .607]  RT S BC CT RACE R s C 1: 0.607 gun blur danger black not 11 FALSE 2: 0.607 non blur danger black not 19 TRUE 3: 0.607 non clear safe white not 19 TRUE 4: 0.607 gun blur safe black shoot 24 TRUE 5: 0.607 non clear danger white not 28 TRUE 6: 0.607 non clear safe white not 37 TRUE 7: 0.607 non blur danger white shoot 39 FALSE 8: 0.607 non blur danger black not 44 TRUE 9: 0.607 gun blur safe black shoot 44 TRUE 10: 0.607 non clear danger black not 50 TRUE 11: 0.607 gun clear danger white shoot 50 TRUE 12: 0.607 gun clear safe white not 62 FALSE 13: 0.607 non clear danger black not 129 TRUE 14: 0.607 gun blur danger black not 184 FALSE 15: 0.607 non clear safe white not 184 TRUE 16: 0.607 non blur danger black not 184 TRUE 17: 0.607 gun blur safe black shoot 214 TRUE 18: 0.607 non blur danger white not 214 TRUE 19: 0.607 non clear safe white not 218 TRUE 20: 0.607 gun clear danger white shoot 235 TRUE 21: 0.607 non clear danger black not 247 TRUE 22: 0.607 non clear safe white not 247 TRUE 23: 0.607 non clear safe white not 247 TRUE 24: 0.607 gun clear danger white shoot 288 TRUE 25: 0.607 non blur safe black not 308 TRUE 26: 0.607 non clear safe white not 325 TRUE 27: 0.607 gun blur safe white shoot 326 TRUE 28: 0.607 gun blur danger black shoot 326 TRUE 29: 0.607 gun blur danger white not 326 FALSE 30: 0.607 non blur danger black not 344 TRUE 31: 0.607 non clear danger black not 344 TRUE RT S BC CT RACE R s C  The scored data set, se3, will also attach two new columns, cell and rcell, indicating the experimental design. In this example, it has 32 cell, so cell is from 1 to 32 and rcell is from 1 to 16, because this is a two-choice experiment, response, shoot and not to shoot in cell 1 and cell 2, belong to the same experimental design, but with different response types. ## 1 gun black danger blur not 1 ## 2 gun black danger blur shoot 1 ## 3 gun black danger clear not 2 ## 4 gun black danger clear shoot 2  A usual practice is to take 3 times the standard deviation, respectively in each participants. This can be achieved via tapply function. If the data set is large, one can use data.table to achieve the same aim, which I will demonstrate in a later tutorial. sd3 <- tapply(se3$RT, se3$s, mean) + tapply(se3$RT, se3\$s, sd) * 3;


A second useful function in rc collection is the make.rc, which does the censoring work. It takes a first argument of the scored data set, from score.rc and a second argument, correct.name, indicating the character string for the correctness column, and the last two arguments, for the lower and upper bounds of the censoring.

me3 <- make.rc(se3, correct.name = "C", minrt = .2, maxrt = sd3)


## How to average across trials

• mv: measurement / dependent variable
• gvs: grouping variables
• wvs: within variables
acc0 <- summarySE(d, mv = "error", gvs = c("s", "BC", "CT", "RACE", "S"))
mrt0 <- summarySE(d[C == TRUE], mv = "RT",    gvs = c("s", "BC", "CT", "RACE", "S"))
## Within se average across subjects for pc and nt
figA <- summarySEwithin(acc0, wvs = c("BC","CT", "RACE", "S"), mv = "error")
figB <- summarySEwithin(mrt0, wvs = c("BC","CT", "RACE", "S"), mv = "RT")
names(figA) <- c("BC", "CT", "RACE", "S", "N", "y", "sd", "se", "ci")
names(figB) <- c("BC", "CT", "RACE", "S", "N", "y", "sd", "se", "ci")


## Reference

Ratcliff, R. (1993). Methods for dealing with reaction time outliers. Psychological bulletin, 114(3), 510.