DHARMa - Residual Diagnostics for HierArchical (Multi-level / Mixed) Regression Models

Florian Hartig, University of Freiburg / Regensburg

2016-12-11

Abstract

The DHARMa package uses a simulation-based approach to create readily interpretable scaled residuals from fitted generalized linear mixed models. Currently supported are generalized linear mixed models from ‘lme4’ (classes ‘lmerMod’, ‘glmerMod’), generalized additive models (‘gam’ from ‘mgcv’), ‘glm’ (including ‘negbin’ from ‘MASS’, but excluding quasi-distributions) and ‘lm’ model classes. Alternatively, externally created simulations, e.g. posterior predictive simulations from Bayesian software such as ‘JAGS’, ‘STAN’, or ‘BUGS’ can be processed as well. The resulting residuals are standardized to values between 0 and 1 and can be interpreted as intuitively as residuals from a linear regression. The package also provides a number of plot and test functions for typical model misspecification problems, such as over/underdispersion, zero-inflation, and spatial / temporal autocorrelation.

Motivation

Residual interpretation for generalized linear mixed models (GLMMs) is often problematic. As an example, here are two Poisson GLMMs, one that lacks a quadratic effect and one that fits the data perfectly, each shown with three standard residual diagnostics. Which is the misspecified model?

Just for completeness - it was the first one. But don’t get too excited if you got it right. Either you were lucky, or you noted that the first model seems a bit overdispersed (range of the Pearson residuals). But even when noting that, would you have added a quadratic effect rather than an overdispersion correction? The point that I want to make is that misspecifications in GL(M)Ms cannot reliably be diagnosed with standard residual plots, and GLMMs are therefore often not as thoroughly checked as LMs.

The reason why GL(M)M residuals are harder to interpret is that the expected distribution of the data changes with the fitted values. Reweighting with the expected variance, as done in Pearson residuals, or using deviance residuals, helps a bit, but does not lead to visually homogeneous residuals even if the model is correctly specified. As a result, standard residual plots, when interpreted in the same way as for linear models, seem to show all kinds of problems, such as non-normality or heteroscedasticity, even if the model is correctly specified. Questions on the R mailing lists and forums show that practitioners are regularly confused about whether such patterns in GL(M)Ms indicate a problem or not.

But even experienced statistical analysts currently have few options to diagnose specification problems in GLMMs. In my experience, the current standard practice is to eyeball the residual plots for major misspecifications, potentially have a look at the random effect distribution, and then run a test for overdispersion, which is usually positive for count data, after which the model is modified towards an overdispersed / zero-inflated distribution. This approach, however, has a number of problems. Just a few examples:

DHARMa aims at solving these problems by creating readily interpretable residuals for generalized linear (mixed) models that are standardized to values between 0 and 1, and that can be interpreted as intuitively as residuals for the linear model. This is achieved by a simulation-based approach, similar to the Bayesian p-value or the parametric bootstrap, that transforms the residuals to a standardized scale. The basic steps are:

  1. Simulate new data from the fitted model for the predictor variable combination of each observation.

  2. For each observation, calculate the empirical cumulative distribution function, which describes the expected spread for an observation at the respective point in predictor space, conditional on the fitted model.

  3. The residual is defined as the value of the empirical cumulative distribution function at the value of the observed data.

These steps are visualized in the following figure.

The key idea for this definition is that, if the model is correctly specified, then the observed data should look as if it was created from the fitted model. Hence, for a correctly specified model, all values of the cumulative distribution should appear with equal probability. What that means is that we expect the distribution of the residuals to be flat, regardless of the model structure (Poisson, binomial, random effects and so on).
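To make this concrete, here is a minimal sketch (simplified for illustration, not the actual DHARMa code; it ignores the special treatment of integer responses discussed later) of how such a scaled residual could be computed for a single observation:

set.seed(123)
observed  <- 8                        # observed value for one data point
simulated <- rpois(250, lambda = 10)  # simulations from the fitted model for that data point

# scaled residual = value of the empirical CDF of the simulations at the observed value
scaledResidual <- ecdf(simulated)(observed)
scaledResidual                        # close to 0.5 if the observation is typical for the model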

I currently prepare a more exact statistical justification for the approach in an accompanying paper, but if you must provide a reference in the meantime, I would suggest citing the package itself (see the citation below).

Workflow in DHARMa

Installing, loading and citing the package

If you haven’t installed the package yet, either run

install.packages("DHARMa")

or follow the instructions on https://github.com/florianhartig/DHARMa to install the development version.

Loading and citation

library(DHARMa)
citation("DHARMa")
## 
## To cite package 'DHARMa' in publications use:
## 
##   Florian Hartig (2016). DHARMa: Residual Diagnostics for
##   Hierarchical (Multi-Level / Mixed) Regression Models. R package
##   version 0.1.3. http://florianhartig.github.io/DHARMa/
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {DHARMa: Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models},
##     author = {Florian Hartig},
##     year = {2016},
##     note = {R package version 0.1.3},
##     url = {http://florianhartig.github.io/DHARMa/},
##   }

Calculating scaled residuals

The scaled (quantile) residuals are calculated with the simulateResiduals() function. The default number of simulations is 250, which has proved to be a reasonable compromise between computation time and precision, but if high precision is desired, n should be raised to at least 1000.

simulationOutput <- simulateResiduals(fittedModel = fittedModel, n = 250)

What the function does is a) create n new synthetic datasets by simulating from the fitted model, b) calculate the empirical cumulative distribution of the simulated values for each observation, and c) return the quantile value at which the observed value falls in this distribution.

For example, a scaled residual of 0.5 means that half of the simulated data are higher than the observed value, and half of them are lower. A value of 0.99 would mean that nearly all simulated data are lower than the observed value. The minimum and maximum values for the residuals are 0 and 1.

The calculated residuals are stored in

simulationOutput$scaledResiduals

As discussed above, for a correctly specified model we would expect

  • a uniform (flat) distribution of the overall residuals

  • uniformity in the y-direction when plotting the residuals against any predictor

Note: the expected uniform distribution is the only difference from linear regression that one has to keep in mind when interpreting DHARMa residuals. If you cannot get used to this and you must have residuals that behave exactly like those of a linear regression, you can access a normal transformation of the residuals via

simulationOutput$scaledResidualsNormal

These normal residuals will behave exactly like the residuals of a linear regression. However, for reasons of a) numerical stability with a low number of simulations and b) my conviction that it is much easier to visually detect deviations from uniformity than from normality, I would STRONGLY advise against using this transformation.
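For orientation, and under the assumption that the implementation essentially applies the standard normal quantile function to the uniform residuals (with extreme values clipped to avoid infinite values), the transformation corresponds to something like:

# sketch of the uniform-to-normal transformation (assumption, not the exact DHARMa code)
u <- simulationOutput$scaledResiduals
normalResiduals <- qnorm(pmin(pmax(u, 0.001), 0.999))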

Plotting the scaled residuals

We can get a visual impression of these properties with the plotSimulatedResiduals() function

plotSimulatedResiduals(simulationOutput = simulationOutput)

which creates a qq-plot to detect overall deviations from the expected distribution, and a plot of the residuals against the predicted value.

To provide a visual aid in detecting deviations from uniformity in y-direction, the plot of the residuals against the predicted values also performs an (optional) quantile regression, which provides 0.25, 0.5 and 0.75 quantile lines across the plots. These lines should be straight, horizontal, and at y-values of 0.25, 0.5 and 0.75. Note, however, that some deviations from this are to be expected by chance, even for a perfect model, especially if the sample size is small.

If you want to plot the residuals against other predictors (highly recommended), you can use the function

plotResiduals(YOURPREDICTOR, simulationOutput$scaledResiduals)

which does the same quantile plot as the main plotting function.

Formal goodness-of-fit tests on the scaled residuals

To support the visual inspection of the residuals, the DHARMa package provides a number of specialized goodness-of-fit tests on the simulated residuals. For example, the function

testUniformity(simulationOutput = simulationOutput)
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.04, p-value = 0.8186
## alternative hypothesis: two-sided

runs a KS test for overall uniformity of the residuals. There are a number of further tests, such as testOverdispersion(), testZeroInflation(), testTemporalAutocorrelation() and testSpatialAutocorrelation() (all discussed below), that basically do what they say. See the help of the functions for a more detailed description.

Simulation options

There are a few important technical details regarding how the simulations are performed, in particular regarding the treatment of random effects and integer responses. I would therefore strongly recommend reading the help of

?simulateResiduals

The short summary is this: apart from the number of simulations, there are three important options in the simulateResiduals function

Refit

simulationOutput <- simulateResiduals(fittedModel = fittedModel, refit = T)

  • if refit = F (default), new data is simulated from the fitted model, and residuals are calculated by comparing the observed data to the new data

  • if refit = T, a parametric bootstrap is performed, meaning that the model is refit on the new data, and residuals are created by comparing observed residuals against refitted residuals

The second option is much slower, and is only needed for tests that rely on comparing observed to refitted residuals, e.g. the testOverdispersion function (see below), or if one suspects that the fitted model is biased and wants to calculate the expected residuals conditional on this bias (this can make sense in particular for shrinkage estimators that include a purposeful bias, such as random effects or the splines in GAMs).

Random effect simulations

The second option is the treatment of the stochastic hierarchy. In a hierarchical model, several layers of stochasticity are placed on top of each other. Specifically, in a GLMM, we have a lower-level stochastic process (random effect), whose result enters into a higher level (e.g. Poisson distribution). For other hierarchical models such as state-space models, similar considerations apply. When simulating, we have to decide if we want to re-simulate all stochastic levels, or only a subset of them. For example, in a GLMM, it is common to only simulate the last stochastic level (e.g. Poisson) conditional on the fitted random effects, meaning that the random effects are fixed at their fitted values.

For controlling how many levels should be re-simulated, the simulateResiduals function allows passing parameters on to the simulate function of the fitted model object. Please refer to the help of the different simulate functions (e.g. ?simulate.merMod) for details. For merMod (lme4) model objects, the relevant parameters are “use.u” and “re.form”, as, e.g., in

simulationOutput <- simulateResiduals(fittedModel = fittedModel, n = 250, use.u = T)

If the model is correctly specified, the simulated residuals should be flat regardless of how many hierarchical levels we re-simulate. The most thorough procedure would therefore be to test all possible options. If testing only one option, I would recommend re-simulating all levels, because this essentially tests the model structure as a whole. This is the default setting in the DHARMa package. A potential drawback is that re-simulating the lower-level random effects creates more variability, which may reduce power for detecting problems in the upper-level stochastic processes.
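For an lme4 model, a minimal sketch of testing both options could look as follows (parameter names as in ?simulate.merMod; fittedModel is assumed to be a glmerMod object):

# unconditional simulations (DHARMa default): random effects are re-simulated
simUnconditional <- simulateResiduals(fittedModel = fittedModel, n = 250)

# conditional simulations: random effects fixed at their fitted values
simConditional <- simulateResiduals(fittedModel = fittedModel, n = 250, use.u = T)

# for a correctly specified model, both should pass the uniformity test
testUniformity(simUnconditional)
testUniformity(simConditional)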

Integer treatment

A third option is the treatment of integer responses. The background here is that, for integer-valued responses, some additional randomization is necessary to make sure that the residual distribution becomes flat (essentially, we have to smooth out the integer nature of the data). The idea is explained in

  • Dunn, P. K., and Smyth, G. K. (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics 5, 236-244.

The simulateResiduals function will automatically check if the family is integer valued. The parameter should therefore usually not be changed.
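To illustrate the idea (a simplified sketch of the randomization, not the exact DHARMa implementation): for integer responses, random noise is added to both the simulated and the observed values before evaluating the empirical CDF, which smooths out the discreteness of the residual distribution.

set.seed(123)
observed  <- 3
simulated <- rpois(250, lambda = 3)

# without randomization, ties at integer values make the residual distribution lumpy;
# adding uniform noise to simulated and observed values smooths this out
scaledResidual <- ecdf(simulated + runif(250, -0.5, 0.5))(observed + runif(1, -0.5, 0.5))
scaledResidual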

Visual diagnostics and tests of common misspecification problems

So far, all the plots / tests that were shown were from a correctly specified model. In this section, we discuss how model misspecification will show up in the scaled residuals.

Using external simulations

As mentioned earlier, the quantile residuals defined in DHARMa are the frequentist equivalent of the so-called “Bayesian p-values”, i.e. residuals created from posterior predictive simulations in a Bayesian analysis.

To make the plots and tests in DHARMa also available for Bayesian analysis, DHARMa provides the option to convert externally created posterior predictive simulations into a DHARMa object

res = createDHARMa(scaledResiduals = posteriorPredictiveSimulations, simulatedResponse = medianPosteriorPredictions, observedResponse = observations, integerResponse = ?)

What is provided as simulatedResponse is up to the user, but median posterior predictions seem most sensible. Note: as DHARMa doesn’t know the fitted model, it is important in this function to specify the integerResponse option by hand (see simulateResiduals for details). After the conversion, all DHARMa plots can be used; note, however, that Bayesian p-values != DHARMa residuals, because in a Bayesian analysis the parameters are varied as well.

Overdispersion / underdispersion

The most common concerns for GLMMs are overdispersion, underdispersion and zero-inflation.

Over/underdispersion refers to the phenomenon that the residual variance is larger/smaller than expected under the fitted model. Over/underdispersion can appear for any distributional family without a free dispersion parameter, in particular for Poisson and binomial models.

A few general rules of thumb

An example of overdispersion

This is what overdispersion looks like in the DHARMa residuals

testData = createData(sampleSize = 500, overdispersion = 2, family = poisson())
fittedModel <- glmer(observedResponse ~ Environment1 + (1|group) , family = "poisson", data = testData)

simulationOutput <- simulateResiduals(fittedModel = fittedModel)
plotSimulatedResiduals(simulationOutput = simulationOutput)

Note that we get more residuals around 0 and 1, which means that more residuals are in the tails of the distribution than would be expected under the fitted model.

An example of underdispersion

This is an example of underdispersion

testData = createData(sampleSize = 500, intercept=0, fixedEffects = 2, overdispersion = 0, family = poisson(), roundPoissonVariance = 0.001, randomEffectVariance = 0)
fittedModel <- glmer(observedResponse ~ Environment1 + (1|group) , family = "poisson", data = testData)

summary(fittedModel)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: observedResponse ~ Environment1 + (1 | group)
##    Data: testData
## 
##      AIC      BIC   logLik deviance df.resid 
##   1003.2   1015.9   -498.6    997.2      497 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.6245 -0.3439 -0.0974  0.1872  0.9627 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  group  (Intercept) 0        0       
## Number of obs: 500, groups:  group, 10
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -0.16072    0.06031  -2.665  0.00771 ** 
## Environment1  2.24239    0.09032  24.828  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## Environmnt1 -0.824
# plotConventionalResiduals(fittedModel)

simulationOutput <- simulateResiduals(fittedModel = fittedModel)
plotSimulatedResiduals(simulationOutput = simulationOutput)

testUniformity(simulationOutput = simulationOutput)
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.18, p-value = 1.699e-14
## alternative hypothesis: two-sided

Here, we get too many residuals around 0.5, which means that we see fewer residuals in the tails of the distribution than expected under the fitted model.

Testing for over/underdispersion

Although, as discussed above, over/underdispersion will show up in the residuals, and it’s possible to detect it with the testUniformity function, simulations show that this test is considerably less powerful than more targeted tests.

DHARMa therefore contains a specific nonparametric overdispersion test that compares the dispersion of the simulated residuals to the dispersion of the observed residuals. Note that, in this case, refit = T is necessary.

simulationOutput2 <- simulateResiduals(fittedModel = fittedModel, n = 250, refit = T)
testOverdispersion(simulationOutput2)
## 
##  Overdispersion test via comparison to simulation under H0
## 
## data:  simulationOutput2
## dispersion = 0.14978, p-value = 1
## alternative hypothesis: overdispersion

Power simulations (2nd figure below) show that this diagnostic has similar power to the standard parametric overdispersion test for GLMMs, which is implemented in

testOverdispersionParametric(fittedModel)
## [1] "Parametric overdispersion test not implemented for this model type"

while having all the advantages of a non-parametric test, i.e. it should still be reliable in situations where distributional assumptions are violated.

Comparison of power from simulation studies

Comparison of power from simulation studies

Zero-inflation

A common special case of overdispersion is zero-inflation, which is the situation when more zeros appear in the observations than expected under the fitted model. Zero-inflation requires special correction steps.

An example of zero-inflation

Here is an example of a typical zero-inflated count dataset, plotted against the environmental predictor

testData = createData(sampleSize = 500, intercept = 2, fixedEffects = c(1), overdispersion = 0, family = poisson(), quadraticFixedEffects = c(-3), randomEffectVariance = 0, pZeroInflation = 0.6)

par(mfrow = c(1,2))
plot(testData$Environment1, testData$observedResponse, xlab = "Environmental Predictor", ylab = "Response")
hist(testData$observedResponse, xlab = "Response", main = "")

We see a hump-shaped dependence on the environment, but with too many zeros.

Zero-inflation in the scaled residuals

In the normal residual plots, zero-inflation will look pretty much like overdispersion

fittedModel <- glmer(observedResponse ~ Environment1 + I(Environment1^2) + (1|group) , family = "poisson", data = testData)

simulationOutput <- simulateResiduals(fittedModel = fittedModel)
plotSimulatedResiduals(simulationOutput = simulationOutput)

The reason is that the model will usually try to find a compromise between the zeros and the other values, which will lead to excess variance in the residuals.

Test for zero-inflation

DHARMa has a special test for zero-inflation, which compares the distribution of expected zeros in the data against the observed zeros

testZeroInflation(simulationOutput)

## 
##  Zero-inflation test via comparison to expected zeros with
##  simulation under H0
## 
## data:  simulationOutput
## ratioObsExp = 1.8786, p-value < 2.2e-16
## alternative hypothesis: more

This test is likely better suited for detecting zero-inflation than the standard plot, but note that overdispersion will also lead to excess zeros, so seeing too many zeros alone is not a reliable diagnostic for moving towards a zero-inflated model. A reliable differentiation between overdispersion and zero-inflation will usually only be possible when directly comparing alternative models, e.g. through residual comparison / model selection of a model with / without zero-inflation, or by simply fitting a model with zero-inflation and looking at the parameter estimate for the zero-inflation term.
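To illustrate the last option, here is a sketch using the glmmTMB package (not part of this vignette; the formula mirrors the GLMM fitted above): fit the same model with an explicit zero-inflation term and inspect the zero-inflation estimate and the AIC.

# sketch: fitting the same model with an explicit zero-inflation term (assumes glmmTMB is installed)
library(glmmTMB)

fittedModelZIP <- glmmTMB(observedResponse ~ Environment1 + I(Environment1^2) + (1|group),
                          ziformula = ~1, family = poisson, data = testData)

summary(fittedModelZIP)  # the zero-inflation intercept estimates the amount of zero-inflation
AIC(fittedModel, fittedModelZIP)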

Heteroscedasticity

So far, most of the things that we have tested could also have been detected with parametric tests. Here, we come to the first issue that is difficult to detect with current tests, and that is usually neglected.

Heteroscedasticity means that there is a systematic dependency of the dispersion / variance on another variable in the model. It is not sufficiently appreciated that binomial or Poisson models can also show heteroscedasticity; in that case, the level of over/underdispersion depends on another predictor. Here is an example where we create such data

testData = createData(sampleSize = 500, intercept = 0, overdispersion = function(x){return(rnorm(length(x), sd = 2*abs(x)))}, family = poisson(), randomEffectVariance = 0)
fittedModel <- glmer(observedResponse ~ Environment1 + (1|group), family = "poisson", data = testData)

simulationOutput <- simulateResiduals(fittedModel = fittedModel)
plotSimulatedResiduals(simulationOutput = simulationOutput)

testUniformity(simulationOutput = simulationOutput)
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.26, p-value < 2.2e-16
## alternative hypothesis: two-sided

Adding a simple overdispersion correction will try to find a compromise between the different levels of dispersion in the model. The qq plot looks better now, but there is still a pattern in the residuals

testData = createData(sampleSize = 500, intercept = 0, overdispersion = function(x){return(rnorm(length(x), sd = 2*abs(x)))}, family = poisson(), randomEffectVariance = 0)
fittedModel <- glmer(observedResponse ~ Environment1 + (1|group) + (1|ID), family = "poisson", data = testData)

# plotConventionalResiduals(fittedModel)

simulationOutput <- simulateResiduals(fittedModel = fittedModel)
plotSimulatedResiduals(simulationOutput = simulationOutput)

testUniformity(simulationOutput = simulationOutput)
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.046, p-value = 0.2406
## alternative hypothesis: two-sided

To remove this pattern, you would need to make the dispersion parameter dependent on a predictor (e.g. in JAGS), or apply a transformation to the data.

Missing predictors or quadratic effects

A second test that is typically run for LMs, but not for GL(M)Ms, is to plot residuals against the predictors in the model (or potentially predictors that were not in the model) to detect possible misspecifications. Doing this is highly recommended. For that purpose, you can retrieve the residuals via

simulationOutput$scaledResiduals

Note again that the residual values are scaled between 0 and 1. If you plot the residuals against predictors, space or time, the resulting plots should not only show no systematic dependency of the residuals on those covariates, but the residuals should also again be uniform within each fixed situation. That means that if you have, for example, a categorical predictor with levels treatment / control, the distribution of residuals within each level should be flat as well.
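For example, a minimal sketch of such a check (assuming a hypothetical two-level factor YOURDATA$treatment, which is not part of the simulated data in this vignette):

# check residual uniformity within each level of a categorical predictor (hypothetical example)
by(simulationOutput$scaledResiduals, YOURDATA$treatment,
   function(res) ks.test(res, "punif"))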

Here is an example with a missing quadratic effect in the model and two predictors

testData = createData(sampleSize = 200, intercept = 1, fixedEffects = c(1,2), overdispersion = 0, family = poisson(), quadraticFixedEffects = c(-3,0))
fittedModel <- glmer(observedResponse ~ Environment1 + Environment2 + (1|group) , family = "poisson", data = testData)
simulationOutput <- simulateResiduals(fittedModel = fittedModel)
# plotConventionalResiduals(fittedModel)
plotSimulatedResiduals(simulationOutput = simulationOutput, quantreg = T)

testUniformity(simulationOutput = simulationOutput)
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.073, p-value = 0.2369
## alternative hypothesis: two-sided

It is difficult to see that there is a problem at all in the general plot, but it becomes clear if we plot the residuals against the environmental predictors

par(mfrow = c(1,2))
plotResiduals(testData$Environment1,  simulationOutput$scaledResiduals)
plotResiduals(testData$Environment2,  simulationOutput$scaledResiduals)

Temporal autocorrelation

A special case of plotting residuals against predictors is the plot against time and space, which should always be performed if those variables are present in the model. Let’s create some temporally autocorrelated data

testData = createData(sampleSize = 100, family = poisson(), temporalAutocorrelation = 5)

fittedModel <- glmer(observedResponse ~ Environment1 + (1|group), data = testData, family = poisson() )

simulationOutput <- simulateResiduals(fittedModel = fittedModel)

Test and plot for temporal autocorrelation

The function testTemporalAutocorrelation performs a Durbin-Watson test from the package lmtest on the uniform residuals to test for temporal autocorrelation in the residuals, and additionally plots the residuals against time.

The function also has an option to perform the test against randomized time (H0); the purpose of this is to allow running simulations to check whether the test has correct error rates in the respective situation, i.e. is not oversensitive (excessive sensitivity has sometimes been reported for the Durbin-Watson test).

testTemporalAutocorrelation(simulationOutput = simulationOutput, time = testData$time)

## 
##  Durbin-Watson test
## 
## data:  simulationOutput$scaledResiduals ~ 1
## DW = 1.6395, p-value = 0.03432
## alternative hypothesis: true autocorrelation is greater than 0
testTemporalAutocorrelation(simulationOutput = simulationOutput, time = "random")

## 
##  Durbin-Watson test
## 
## data:  simulationOutput$scaledResiduals ~ 1
## DW = 1.7698, p-value = 0.1225
## alternative hypothesis: true autocorrelation is greater than 0

Note other caveats mentioned about the test in the help of testTemporalAutocorrelation().

Spatial autocorrelation

testData = createData(sampleSize = 100, family = poisson(), spatialAutocorrelation = 5)

fittedModel <- glmer(observedResponse ~ Environment1 + (1|group), data = testData, family = poisson() )

simulationOutput <- simulateResiduals(fittedModel = fittedModel)

Test and plot for spatial autocorrelation

The spatial autocorrelation test performs the Moran.I test from the package ape and plots the residuals against space.

An additional test against randomized space (H0) can be performed, for the same reasons as explained above.

testSpatialAutocorrelation(simulationOutput = simulationOutput, x = testData$x, y= testData$y)

## 
##  Moran's I
## 
## data:  simulationOutput
## observed = 0.149650, expected = -0.010101, sd = 0.019442, p-value
## = 2.22e-16
## alternative hypothesis: Spatial autocorrelation
testSpatialAutocorrelation(simulationOutput = simulationOutput, x = "random", y= "random")

## 
##  Moran's I
## 
## data:  simulationOutput
## observed = -0.010861, expected = -0.010101, sd = 0.016926, p-value
## = 0.9642
## alternative hypothesis: Spatial autocorrelation

The usual caveats for Moran.I apply, in particular that it may miss non-local and heterogeneous (non-stationary) spatial autocorrelation. The former should be better detectable visually in the spatial plot, or via regressions on the pattern.

Custom tests

A big advantage of the simulations is that you can test any problem that you think you may have. For example, you think you have an excess of tens in your count data? Maybe a faulty measurement instrument that returns too many tens? Just compare the number of observed tens with the number expected from the simulations.
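For example, a minimal sketch of such a check for an excess of tens (this assumes, as in current versions of DHARMa, that the observed and simulated responses are stored in simulationOutput$observedResponse and simulationOutput$simulatedResponse; check str(simulationOutput) for your version):

# how many tens are observed, and how many are expected under the fitted model?
observedTens  <- sum(simulationOutput$observedResponse == 10)
simulatedTens <- apply(simulationOutput$simulatedResponse == 10, 2, sum)  # tens per simulation

hist(simulatedTens, xlab = "Number of tens per simulation", main = "")
abline(v = observedTens, col = "red")

# simulation-based p-value for an excess of tens
mean(simulatedTens >= observedTens)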

You think your random effect estimates look weird? Run the model with the refit = T option and see how typical random effect estimates look for your problem.

Real-world examples

Budworm example (count-proportion n/k binomial)

This example comes from Jochen Fründ. Measured are the numbers of parasitized and non-parasitized individuals, with population density as a covariate

plot(N_parasitized / (N_adult + N_parasitized ) ~ logDensity, xlab = "Density", ylab = "Proportion infected", data = data)

Let’s fit the data with a regular binomial n/k glm

mod1 <- glm(cbind(N_parasitized, N_adult) ~ logDensity, data = data, family=binomial)
simulationOutput <- simulateResiduals(fittedModel = mod1)
plotSimulatedResiduals(simulationOutput = simulationOutput)

The residuals look clearly overdispersed. We can confirm that with the omnibus test

testUniformity(simulationOutput = simulationOutput)
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.36509, p-value = 0.005675
## alternative hypothesis: two-sided

Or with the more powerful overdispersion test

simulationOutput2 <- simulateResiduals(fittedModel = mod1, refit = T) # remember for this test we need the refit option
testOverdispersion(simulationOutput = simulationOutput2)
## 
##  Overdispersion test via comparison to simulation under H0
## 
## data:  simulationOutput2
## dispersion = 53.862, p-value < 2.2e-16
## alternative hypothesis: overdispersion

OK, so let’s add overdispersion through an individual-level random effect

mod2 <- glmer(cbind(N_parasitized, N_adult) ~ logDensity + (1|ID), data = data, family=binomial)
simulationOutput <- simulateResiduals(fittedModel = mod2)
plotSimulatedResiduals(simulationOutput = simulationOutput)

The overdispersion looks better, but you can see that the residuals look a bit irregular.

Likely, the reason is the steep increase in the beginning that one can see in the raw data plot. One would probably need to apply another transformation or a nonlinear function to completely fit this away.
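One option (a sketch, not part of the original analysis) would be to add a quadratic term for logDensity and check whether the pattern improves:

# sketch: adding a quadratic density term to the individual-level random effect model
mod3 <- glmer(cbind(N_parasitized, N_adult) ~ logDensity + I(logDensity^2) + (1|ID),
              data = data, family = binomial)
simulationOutput3 <- simulateResiduals(fittedModel = mod3)
plotSimulatedResiduals(simulationOutput = simulationOutput3)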

Beetlecount / Poisson example

Dataset

This example is a synthetic dataset of beetle counts measured on 50 plots across an altitudinal gradient, sampled yearly over 20 years. The following plot shows the observed number of beetles (log10) vs. altitude. Additional variables in the data are soil moisture and the amount of deadwood on the plots.

par(mfrow = c(1,3))
plot(log10(beetles) ~ altitude + I(altitude) + moisture, data = data, main = "Beetle counts", xlab = "Altitude")

Our question is: what is the effect of altitude on the abundance of the beetles? Let’s start with a linear and a quadratic term for altitude, a linear effect of soil moisture, and random intercepts for plot and year

mod <- glmer(beetles ~ altitude + I(altitude^2) + moisture + (1|plot) + (1|year), data = data, family=poisson, control = glmerControl(optCtrl = list(maxfun = 10000)))
simulationOutput <- simulateResiduals(fittedModel = mod)
plotSimulatedResiduals(simulationOutput = simulationOutput)

summary(mod)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: beetles ~ altitude + I(altitude^2) + moisture + (1 | plot) +  
##     (1 | year)
##    Data: data
## Control: glmerControl(optCtrl = list(maxfun = 10000))
## 
##      AIC      BIC   logLik deviance df.resid 
##  17748.9  17778.4  -8868.5  17736.9      994 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -10.9720  -2.1048  -0.7817   1.7226  15.5766 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  plot   (Intercept) 0.1318   0.3631  
##  year   (Intercept) 1.0295   1.0147  
## Number of obs: 1000, groups:  plot, 50; year, 20
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)    -0.36922    0.27858  -1.325    0.185    
## altitude       12.79345    0.73672  17.365   <2e-16 ***
## I(altitude^2) -13.01184    0.71319 -18.245   <2e-16 ***
## moisture       -0.15123    0.01615  -9.366   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) altitd I(l^2)
## altitude    -0.502              
## I(altitd^2)  0.431 -0.967       
## moisture    -0.002  0.013 -0.012

We see that we have a problem when we plot residuals against deadwood

plotResiduals(data$deadwood, simulationOutput$scaledResiduals)

so let’s add this term as well

mod <- glmer(beetles ~ altitude + I(altitude^2) + moisture + deadwood + (1|plot) + (1|year) , data = data, family=poisson, control = glmerControl(optCtrl = list(maxfun = 10000)))
simulationOutput <- simulateResiduals(fittedModel = mod)
plotSimulatedResiduals(simulationOutput = simulationOutput)

summary(mod)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: beetles ~ altitude + I(altitude^2) + moisture + deadwood + (1 |  
##     plot) + (1 | year)
##    Data: data
## Control: glmerControl(optCtrl = list(maxfun = 10000))
## 
##      AIC      BIC   logLik deviance df.resid 
##  14092.9  14127.3  -7039.4  14078.9      993 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -10.9061  -1.6364  -0.2825   2.0134  10.1203 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  plot   (Intercept) 0.07114  0.2667  
##  year   (Intercept) 0.93253  0.9657  
## Number of obs: 1000, groups:  plot, 50; year, 20
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)    -0.51331    0.25028   -2.05   0.0403 *  
## altitude       12.80693    0.57119   22.42   <2e-16 ***
## I(altitude^2) -13.04320    0.55327  -23.57   <2e-16 ***
## moisture       -0.14977    0.01675   -8.94   <2e-16 ***
## deadwood        1.06191    0.01870   56.79   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) altitd I(l^2) moistr
## altitude    -0.441                     
## I(altitd^2)  0.380 -0.967              
## moisture    -0.002  0.014 -0.012       
## deadwood    -0.024  0.004 -0.005 -0.013

Still, there is obviously overdispersion in the data, so we’ll add an individual-level random effect to account for the overdispersion

mod <- glmer(beetles ~ altitude + I(altitude^2) + moisture + deadwood + (1|plot) + (1|year) + (1|dataID) , data = data, family=poisson, control = glmerControl(optCtrl = list(maxfun = 10000)))
simulationOutput <- simulateResiduals(fittedModel = mod)
plotSimulatedResiduals(simulationOutput = simulationOutput)

The data still looks overdispersed. The reason is that there is in fact no standard overdispersion, but zero-inflation in the data. We can look at the excess zeros via

testZeroInflation(simulationOutput)

## 
##  Zero-inflation test via comparison to expected zeros with
##  simulation under H0
## 
## data:  simulationOutput
## ratioObsExp = 1.2184, p-value < 2.2e-16
## alternative hypothesis: more

which shows that we have too many zeros. We need a GLMM with zero-inflation. The easiest option is to do this in a Bayesian framework, e.g. in JAGS.

To be honest, however, if I didn’t know, it would be hard to tell that zero-inflation is the problem here. The reason is that, if you have zero-inflation, the model will be drawn towards the zeros, which also creates an excess of residuals that are too large. Hence, it is hard to distinguish zero-inflation from the case where the individual-level random effect is simply not successful at removing the overdispersion, e.g. because the functional form of the added noise is incorrect. The best way to test this is probably to run model selections (e.g. simulated LRTs) between a number of alternative models, e.g. a zero-inflated GLMM vs. a number of different overdispersed GLMMs.