Cognitive Models

Fixed-effects models assume that each participant has his or her own specific mechanism of parameter generation. This stands in contrast to random-effects models, which assume that one common mechanism generates the parameters for all participants. The latter are sometimes called hierarchical or multi-level models, although the three terms can carry subtly different ideas.

In this tutorial, I illustrate how to conduct fixed-effects modelling. Given many observations of response times (RTs) and choices, one modelling aim is to estimate the parameters that generated the observations.

A typical scenario is that we collect data by inviting participants to our laboratory, having them perform some cognitive tasks, and recording their RTs and choices.

Often, we would use an RT model, for example the diffusion decision model (DDM) (Ratcliff & McKoon, 2008)1, to estimate latent variables. I first set up a model object. type = "rd" refers to Ratcliff's diffusion model.

require(ggdmc)
model <- BuildModel(
  p.map     = list(a = "1", v = "1", z = "1", d = "1", sz = "1", sv = "1",
                   t0 = "1", st0 = "1"),
  match.map = list(M = list(s1 = "r1", s2 = "r2")),
  factors   = list(S = c("s1", "s2")),
  responses = c("r1", "r2"),
  constants = c(st0 = 0, d = 0),
  type      = "rd")

p.vector <- c(a = 1, v = 1.2, z = .38, sz = .25, sv = .2, t0 = .15)
ntrial <- 1e2
dat <- simulate(model, nsim = ntrial, ps = p.vector)
dmi <- BuildDMI(dat, model)
## A tibble: 200 x 3     ## use tibble::as_tibble(dat) to print this
##    S     R        RT
##    <fct> <fct> <dbl>
##  1 s1    r1    0.249
##  2 s1    r1    0.246
##  3 s1    r2    0.262
##  4 s1    r1    0.519
##  5 s1    r1    0.205
##  6 s1    r1    0.177
##  7 s1    r1    0.174
##  8 s1    r1    0.378
##  9 s1    r1    0.197
## 10 s1    r1    0.224
##  ... with 190 more rows

Because the data were simulated from one set of presumed true values, p.vector, I can use them later to verify whether the sampling process does recover the parameters. In Bayesian inference, we also need prior distributions, so let's build a prior distribution for each DDM parameter.

A beta distribution with shape1 = 1 and shape2 = 1 is equal to a uniform distribution on [0, 1] (beta(1, 1)). I use it to regularize three parameters: (1) the start point, z, (2) its variability, sz, and (3) the non-decision time, t0. All three are bounded by 0 and 1. The other parameters use truncated normal distributions, bounded via the lower and upper arguments.
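As a quick sanity check, the density of beta(1, 1) is constant at 1 on [0, 1], matching the standard uniform:

## beta(1, 1) and uniform(0, 1) have identical densities on [0, 1]
x <- seq(0, 1, by = 0.1)
all.equal(dbeta(x, shape1 = 1, shape2 = 1), dunif(x, min = 0, max = 1))
## [1] TRUE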

plot draws the prior distributions, providing a visual check. In a parameter-recovery study, this check ensures that the prior distributions cover the true values.

p.prior  <- BuildPrior(
  dists = c(rep("tnorm", 2), "beta", "beta", "tnorm", "beta"),
  p1    = c(a = 1, v = 0, z = 1, sz = 1, sv = 1, t0 = 1),
  p2    = c(a = 1, v = 2, z = 1, sz = 1, sv = 1, t0 = 1),
  lower = c(0, -5, NA, NA, 0, NA),
  upper = c(5,  5, NA, NA, 5, NA))
plot(p.prior, ps = p.vector)

prior

By default, StartNewsamples uses p.prior to randomly draw start points and then draws 200 MCMC samples, using a mixture of the crossover and migration operators. The run function by default draws 500 more MCMC samples, using only the crossover operator. The gelman function reports a PSRF value of 1.06 in this case. A potential scale reduction factor (PSRF2) less than 1.1 suggests the chains have converged.

fit0 <- StartNewsamples(dmi, p.prior)
fit  <- run(fit0)
rhat <- gelman(fit, verbose = TRUE)
es   <- effectiveSize(fit)
## Diagnosing a single participant, theta. Rhat = 1.06
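For reference, the basic PSRF can be computed by hand. The sketch below, which omits the Brooks-Gelman degrees-of-freedom correction applied by coda, applies the Gelman-Rubin formula to a toy matrix of draws (one parameter, two chains):

## chains: an n x m matrix of posterior draws (n iterations, m chains)
set.seed(1)
chains <- matrix(rnorm(2000), nrow = 1000, ncol = 2)  ## two toy, converged chains
n <- nrow(chains)
B <- n * var(colMeans(chains))        ## between-chain variance
W <- mean(apply(chains, 2, var))      ## within-chain variance
sqrt(((n - 1) / n * W + B / n) / W)   ## PSRF; close to 1 here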

plot by default draws the posterior log-likelihood; the option start changes the iteration at which the plot begins.

p0 <- plot(fit0)
## p0 <- plot(fit0, start = 101)
p1 <- plot(fit)

png("pll.png", 800, 600)
gridExtra::grid.arrange(p0, p1, ncol = 1)
dev.off()

pll

The upper panel shows that the chains quickly converged to stable posterior log-likelihoods near the 100th iteration, and the lower panel confirms the rhat value (< 1.1).

p2 <- plot(fit, pll = FALSE, den= FALSE)
p3 <- plot(fit, pll = FALSE, den= TRUE)
png("den.png", 800, 600)
gridExtra::grid.arrange(p2, p3, ncol = 1)
dev.off()

den

In a simulation study, we can check whether the sampling process recovered the parameters, using summary.

est <- summary(fit, recover = TRUE, ps = p.vector, verbose = TRUE)
## Recovery summarises only default quantiles: 2.5% 25% 50% 75% 97.5% 
##                     a     sv      sz      t0      v     z
## True           1.0000 0.2000  0.2500  0.1500 1.2000 0.3800
## 2.5% Estimate  0.9656 0.0401  0.0112  0.1338 1.1463 0.3504
## 50% Estimate   1.0419 0.6010  0.2174  0.1444 1.4983 0.3867
## 97.5% Estimate 1.1509 1.7128  0.4781  0.1522 2.0005 0.4273
## Median-True    0.0419 0.4010 -0.0326 -0.0056 0.2983 0.0067

Finally, we may want to check whether the model fits the data well. There are many methods to quantify goodness of fit; here, I illustrate two. The first is to calculate the DIC and BPIC. These information criteria are useful for model selection (requires ggdmc > 2.5.5).

DIC(fit)
DIC(fit, BPIC=TRUE)
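For intuition, the two criteria differ only in how heavily they penalize the effective number of parameters, pD (Spiegelhalter et al., 2002; Ando, 2007). A toy sketch of the definitions, with hypothetical numbers, where dev holds deviance samples (-2 x log-likelihood, one per posterior draw) and dev_at_mean is the deviance evaluated at the posterior means:

dev <- c(210, 205, 208, 212, 207)  ## hypothetical deviance samples
dev_at_mean <- 204                 ## hypothetical deviance at the posterior means
pD <- mean(dev) - dev_at_mean      ## effective number of parameters
mean(dev) + pD                     ## DIC
mean(dev) + 2 * pD                 ## BPIC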

Second, I simulate posterior-predictive data, based on the estimated parameters. xlim trims off outlying values in the simulated data.


predict_one <- function(object, npost = 100, rand = TRUE, factors = NA,
                        xlim = NA, seed = NULL)
{
  ## Updated for the S4 class
  model <- object@dmi@model
  facs  <- names(attr(model, "factors"))

  if (!is.null(factors)) {
    if (any(is.na(factors))) factors <- facs
    if (!all(factors %in% facs))
      stop(paste("Factors argument must contain one or more of:",
                 paste(facs, collapse = ",")))
  }

  resp   <- names(attr(model, "responses"))
  ns     <- table(object@dmi@data[, facs], dnn = facs)
  npar   <- object@npar
  nchain <- object@nchain
  nmc    <- object@nmc
  ntsample <- nchain * nmc
  pnames   <- object@pnames

  ## Stack all chains into one (nchain * nmc) x npar matrix
  thetas <- matrix(aperm(object@theta, c(3, 2, 1)), ncol = npar)
  colnames(thetas) <- pnames

  if (is.na(npost)) {
    use <- 1:ntsample
  } else if (rand) {
    use <- sample(1:ntsample, npost, replace = FALSE)
  } else {
    ## Evenly spaced draws; useful for debugging
    use <- round(seq(1, ntsample, length.out = npost))
  }

  npost   <- length(use)
  posts   <- thetas[use, ]
  nttrial <- sum(ns)  ## total number of trials

  ## Simulate one data set per retained posterior draw
  v <- lapply(1:npost, function(i) {
    ggdmc:::simulate_one(model, n = ns, ps = posts[i, ], seed = seed)
  })

  out  <- data.table::rbindlist(v)
  reps <- rep(1:npost, each = nttrial)
  out  <- cbind(reps, out)

  ## Trim outlying RTs, if requested
  if (!any(is.na(xlim))) out <- out[RT > xlim[1] & RT < xlim[2]]

  return(out)
}


pp  <- predict_one(fit, xlim = c(0, 5))
dat <- fit@dmi@data

dat$C <- ifelse(dat$S == "s1"  & dat$R == "r1",  TRUE,
         ifelse(dat$S == "s2" & dat$R == "r2", TRUE,
         ifelse(dat$S == "s1"  & dat$R == "r2", FALSE,
         ifelse(dat$S == "s2" & dat$R == "r1",  FALSE, NA))))
pp$C <- ifelse(pp$S == "s1"  & pp$R == "r1",  TRUE,
        ifelse(pp$S == "s2" & pp$R == "r2", TRUE,
        ifelse(pp$S == "s1"  & pp$R == "r2", FALSE,
        ifelse(pp$S == "s2" & pp$R == "r1",  FALSE, NA))))
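The nested ifelse calls above can also be written more compactly, using the stimulus-response mapping defined in match.map earlier:

## s1 matches r1; s2 matches r2 (see match.map in BuildModel)
map   <- c(s1 = "r1", s2 = "r2")
dat$C <- dat$R == map[as.character(dat$S)]
pp$C  <- pp$R  == map[as.character(pp$S)]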

dat$reps <- NA
dat$type <- "Data"
pp$reps <- factor(pp$reps)
pp$type <- "Simulation"

DT <- rbind(dat, pp)

require(ggplot2)
p1 <- ggplot(DT, aes(RT, color = reps, size = type)) +
  geom_freqpoly(binwidth = .05) +
  scale_size_manual(values = c(1, .3)) +
  scale_color_grey(na.value = "black") +
  theme(legend.position = "none") +
  facet_grid(S ~ C)


d <- data.table::data.table(dat)
d[, .N, .(S, R)]
##     S  R  N
## 1: s1 r1 87
## 2: s1 r2 13
## 3: s2 r1 34
## 4: s2 r2 66

post-predictive

The grey lines are model predictions. By default, predict_one randomly draws 100 parameter estimates and simulates data based on them; hence there are 100 lines, showing the prediction variability. The solid dark line shows the data. In this case, the dark line lies within the range covered by the grey lines. Note that the error responses (FALSE) are not predicted as well as the correct responses. This is fairly common when the number of trials is small; here, the s1 condition has only 13 error trials.

  1. This model is often dubbed the drift-diffusion model, but in Ratcliff and McKoon's work, they called it the diffusion decision model. 

  2. Brooks, S. P., & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7(4), 434-455.