How does covr work anyway?

Jim Hester

2016-12-05

Introduction

This vignette walks you through the design of covr and breaks down the process of tracking code execution piece by piece.

Other coverage tools

Prior to writing covr, there were a handful of coverage tools for R code. R-coverage by Karl Forner and testCoverage by Tom Taverner, Chris Campbell, Suchen Jin were the two I was most aware of.

R-coverage

R-coverage provided a very robust solution by modifying the R source code to instrument the code for each call. Unfortunately this requires you to patch the R source and getting the changes upstreamed into the base R distribution would likely be challenging.

Test Coverage

testCoverage uses an alternate parser of R-3.0 to instrument R code and record whether the code is run by tests. The package replaces symbols in the code to be tested with a unique identifier. This is then injected into a tracing function that will report each time the symbol is called. The first symbol at each level of the expression tree is traced, allowing the coverage of code branches to be checked. This is a complicated implementation I do not fully understand, which is one of the reasons I decided to write covr.

Covr

Covr takes an approach in-between the two previous tools, modifying the function definitions by parsing the abstract syntax tree and inserting trace statements. These modified definitions are then transparently replaced in-place using C. This allows us to correctly instrument every call and function in a package without having to resort to alternate parsing or changes to the R source.

Modifying the call tree

The core function in covr is trace_calls(), located in R/trace_calls.R. This function was adapted from ideas in Advanced R - Walking the AST with recursive functions. This recursive function modifies each of the leaves (atomic or name objects) of an R expression by applying a given function to them. For non-leaves we simply call modify_lang recursively in various ways. The logic behind modify_lang and similar functions to parse and modify R’s AST is explained in more detail at Walking the AST with recursive functions.

modify_lang <- function(x, f, ...) {
 recurse <- function(y) {
  # if (!is.null(names(y))) names(y) <- f2(names(y))
  lapply(y, modify_lang, f = f, ...)
 }

 if (is.atomic(x) || is.name(x)) {
  # Leaf
  f(x, ...)
 } else if (is.call(x)) {
  as.call(recurse(x))
 } else if (is.function(x)) {
  formals(x) <- modify_lang(formals(x), f, ...)
  body(x) <- modify_lang(body(x), f, ...)
  x
 } else if (is.pairlist(x)) {
  # Formal argument lists (when creating functions)
  as.pairlist(recurse(x))
 } else if (is.expression(x)) {
  # shouldn't occur inside tree, but might be useful top-level
  as.expression(recurse(x))
 } else if (is.list(x)) {
  # shouldn't occur inside tree, but might be useful top-level
  recurse(x)
 } else {
  stop("Unknown language class: ", paste(class(x), collapse = "/"),
   call. = FALSE)
 }
}

We can use this same framework to instead insert a trace statement before each call by replacing each call with a call to a counting function followed by the previous call. Braces ({) in R may seem like language syntax, but they are actually a Primitive function and you can call them like any other function.

identical({ 1 + 2; 3 + 4 }, `{`(1 + 2, 3 + 4))
## [1] TRUE

Remembering that braces always return the value of the last evaluated expression, we can call a counting function followed by the previous function substituting as.call(recurse(x)) in our function above with.

`{`(count(), as.call(recurse(x)))

Source References

Now that we have a way to add a counting function to any call in the AST without changing the output we need a way to determine where in the code source that function came from. Luckily for us R has a built-in method to provide this information in the form of source references. When option(keep.source = TRUE) (the default for interactive sessions), a reference to the source code for functions is stored along with the function definition. This reference is used to provide the original formatting and comments for the given function source. In particular each call in a function contains a srcref attribute, which can then be used as a key to count just that call.

The actual source for trace_calls is slightly more complicated because we want to initialize the counter for each call while we are walking the AST and there are a few non-calls we also want to count.

Refining Source References

Each statement comes with a source reference. Unfortunately, the following is counted as one statement:

if (x)
 y()

To work around this, detailed parse data (obtained from a refined version of getParseData()) is analyzed to impute source references at sub-statement level for if, for and while constructs.

Replacing

After we have our modified function definition, how do we re-define the function to use the updated definition, and ensure that all other functions which call the old function also use the new definition? You might try redefining the function directly.

f1 <- function() 1

f1 <- function() 2
f1() == 2
## [1] TRUE

While this does work for the simple case of calling the new function in the same environment, it fails if another function calls a function in a different environment.

env <- new.env()
f1 <- function() 1
env$f2 <- function() f1() + 1

env$f1 <- function() 2

env$f2() == 3
## [1] FALSE

As modifying external environments and correctly restoring them can be tricky to get correct, we use the C function reassign_function, which is used in testthat::with_mock(). This function takes a function name, environment, old definition, new definition and copies the formals, body, attributes and environment from the old function to the new function. This allows you to do an in-place replacement of a given function with a new function and ensure that all references to the old function will use the new definition.

S4 classes

R’s S3 object oriented classes simply define functions directly in the packages namespace, so they can be treated the same as any other function. S4 methods have a more complicated implementation where the function definitions are placed in an enclosing environment based on the generic method they implement. This makes getting the function definition more complicated.

replacements_S4 <- function(env) {
 generics <- getGenerics(env)

 unlist(recursive = FALSE,
  Map(generics@.Data, generics@package, USE.NAMES = FALSE,
   f = function(name, package) {
   what <- methodsPackageMetaName("T", paste(name, package, sep = ":"))

   table <- get(what, envir = env)

   lapply(ls(table, all.names = TRUE), replacement, env = table)
  })
 )
}

replacements_S4 first gets all the generic functions for the package environment. Then for each generic function if finds the mangled meta package name and gets the corresponding environment from the base environment. All of the functions within this environment are then traced.

RC classes

Similarly to S4 classes reference class (RC) classes define their methods in a special environment. A similar method is used to add the tracing calls to the class definition. These calls are then copied to the object methods when the generator function is run.

Compiled code

Gcov

Test coverage of compiled code uses a completely different mechanism than that of R code. Fortunately we can take advantage of Gcov, the built-in coverage tool for gcc and compatible reports from clang versions 3.5 and greater.

Both of these compilers track execution coverage when given the --coverage flag. In addition it is necessary to turn off compiler optimization -O0, otherwise the coverage output is difficult or impossible to interpret as multiple lines can be optimized into one, functions can be inlined, etc.

Makevars

R passes flags defined in PKG_CFLAGS to the compiler, however it also has default flags including -02 (defined in $R_HOME/etc/Makeconf), which need to be overridden. Unfortunately it is not possible to override the default flags with environment variables (as the new flags are added to the left of the defaults rather than the right). However if Make variables are defined in ~/.R/Makevars they are used in place of the defaults.

Therefore, we need to temporarily add -O0 --coverage to the Makevars file, then restore the previous state after the coverage is run. The implementation of this is in R/makevars.R.

Subprocess

The last hurdle to getting compiled code coverage working properly is that the coverage output is only produced when the running process ends. Therefore you cannot run the tests and get the results in the same R process. tests::testPackages() runs a separate R process when running tests, however we need to modify the package code first before running the tests.

The current behavior in covr is to install the package to be tested in a temporary directory, then add calls to the lazy loading code to install a user hook which modifies the code when it is loaded. We also register a finalizer which prints the coverage couts when the namespace is unloaded or the R process exists. These output files are then aggregated together to determine the coverage.

This procedure works regardless of the number of child R processes used, so therefore also works with parallel code.

Output Formats

The output format returned by covr is a very simple one. It consists of a named character vector, where the names are colon-delimited information from the source references. Namely, the file, line and columns the traced call is from. The value is simply the number of times that given call was called and the source ref of the original call. There is also an as.data.frame method to make subsetting by various features easy to do

While covr tracks coverage by expression, typically users expect coverage to be reported by line, so there are functions to convert to line oriented coverage.

Codecov.io and Coveralls.io

Codecov and Coveralls are a web services to help you track your code coverage over time, and ensure that all your new code is fully covered.

They both have simple JSON based APIs to submit and report on coverage.

Conclusion

Covr aims to be a simple and easy to understand implementation, hopefully this vignette helps to explain the rationale behind the package and explain how and why it works.