Overview of trialr

Kristian Brock

2019-04-21

trialr is a collection of Bayesian clinical trial designs implemented in Stan and R.

Many Bayesian experimental designs for clinical trials have been published. However, one of the factors that has constrained their adoption is the limited availability of software. We present here some of the most notable designs, implemented and demonstrated in a consistent style, leveraging the powerful Stan environment.

Often authors of trial designs make available code with their publication. There are also some fantastic packages that aid the use of certain designs. However, challenges to use still persist. The disparate methods are naturally presented in a style that appeals to the particular author. Features implemented in one package for one design may be missing in another. Sometimes the technology chosen may only be available on one particular operating system, or the chosen technology may have fallen into disuse.

trialr seeks to address these problems. Models are specified in Stan, a state-of-the-art environment for Bayesian analysis. Stan uses Hamiltonian Monte Carlo to draw samples from the posterior distribution. This method is generally more efficient than Gibbs sampling, for instance, and reliable inference can usually be performed with a few thousand posterior samples. R, Stan and trialr are each available on Mac, Linux, and Windows, so all of the examples presented here should work on each operating system. Furthermore, Stan offers a very simple method to split the sampling across n cores, taking advantage of modern multicore processors.
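As a minimal sketch of the multicore point, assuming that the fitting functions sample through rstan (an assumption, not something stated above): rstan reads its default number of cores from the mc.cores option, so setting that option once per session is usually enough.

```r
# Detect the available cores; detectCores() can return NA on some platforms.
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L

# rstan's sampling functions use getOption("mc.cores", 1L) as their default
# number of cores, so chains are then run in parallel where possible.
options(mc.cores = n_cores)
```

Chains, not individual iterations, are what get distributed, so the benefit is capped by the number of chains requested.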

The designs implemented in trialr are introduced briefly below, and developed more fully in vignettes. We focus on practical usage.

Examples

In all examples, we will need to load trialr:

library(trialr)

CRM

The Continual Reassessment Method (CRM) was first published by O’Quigley et al. (1990). It assumes a smooth mathematical form for the dose-toxicity curve to conduct a dose-finding trial seeking a maximum tolerable dose. There are many variations to suit different clinical scenarios and the design has enjoyed relatively common use (although nowhere near as common as the ubiquitous and inferior 3+3 design).

We will demonstrate the method using a notional trial example. In a scenario of five potential doses, let us assume that we seek the dose with probability of toxicity closest to 25% where our prior guesses of the rates of toxicity can be represented:

Let us assume that we have already treated 2 patients each at doses 2, 3 and 4, having only seen toxicity at dose-level 4. What dose should we give to the next patient or cohort? We can fit the data to the popular empiric model:
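The empiric model is simple enough to check by hand. The sketch below is not trialr's Stan implementation: it approximates the posterior on a grid in base R. The skeleton, the prior standard deviation, and the assumption that both patients at dose-level 4 experienced toxicity are illustrative choices, because the text above does not state them.

```r
# Hypothetical inputs, chosen for illustration only.
skeleton <- c(0.05, 0.15, 0.25, 0.40, 0.60)  # assumed prior guesses of Pr(toxicity)
target   <- 0.25
beta_sd  <- sqrt(1.34)                       # assumed prior SD for beta

# Two patients each at doses 2, 3 and 4.
doses_given <- c(2, 2, 3, 3, 4, 4)
tox         <- c(0, 0, 0, 0, 1, 1)           # assumes both dose-4 patients had toxicity

# Empiric model: Pr(toxicity at dose i | beta) = skeleton[i]^exp(beta)
prob_tox <- function(beta) skeleton^exp(beta)

# Unnormalised log-posterior on a grid: Bernoulli likelihood + normal prior.
beta_grid <- seq(-6, 6, length.out = 2001)
log_post <- sapply(beta_grid, function(b) {
  p <- prob_tox(b)[doses_given]
  sum(dbinom(tox, size = 1, prob = p, log = TRUE)) +
    dnorm(b, mean = 0, sd = beta_sd, log = TRUE)
})

# Normalise to grid weights, subtracting the maximum for numerical stability.
w <- exp(log_post - max(log_post))
w <- w / sum(w)

# Posterior mean toxicity probability at each dose (5 x grid matrix times weights).
post_mean <- as.vector(sapply(beta_grid, prob_tox) %*% w)

# Recommend the dose whose posterior mean is closest to the target.
recommended <- which.min(abs(post_mean - target))
```

Here `recommended` holds the dose-level whose posterior mean toxicity probability is closest to the target. In trialr itself the analogous fit is requested through `stan_crm()`, which samples the same kind of posterior with Stan rather than evaluating it on a grid.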

The fitted model contains lots of useful information:

We see that dose-level 2 is the dose that the model recommends to be given to the next cohort.

All manner of interesting visualisations are facilitated by the MCMC samples. The following plot simply shows the posterior expected probability of toxicity at each dose.

This confirms that dose-level 2 is predicted to be closest to the target toxicity rate, although dose-level 3 is also close.

Several variants of the CRM are implemented in trialr. Further visualisation techniques are demonstrated in the Visualisation in CRM vignette.

EffTox

EffTox by Thall and Cook (2004) is a dose-finding design that uses binary efficacy and toxicity outcomes to select a dose with a high utility score. We present it briefly here, but there is a much more thorough examination in the EffTox vignette.

For demonstration, we fit the model parameterisation introduced by Thall et al. (2014) to the following notional outcomes:

Patient Dose-level Toxicity Efficacy
1 1 0 0
2 1 0 0
3 1 0 1
4 2 0 1
5 2 0 1
6 2 1 1
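Outcomes like these are typically passed to trialr as a compact string, in which each cohort is a dose-level followed by one letter per patient. The sketch below builds such a string in base R, assuming the coding E = efficacy only, T = toxicity only, B = both, N = neither. The data frame and helper names are illustrative, not part of the package.

```r
# The six notional patients from the table above.
outcomes <- data.frame(
  patient = 1:6,
  dose    = c(1, 1, 1, 2, 2, 2),
  tox     = c(0, 0, 0, 0, 0, 1),
  eff     = c(0, 0, 1, 1, 1, 1)
)

# Map each (efficacy, toxicity) pair to a single outcome letter.
code <- with(outcomes,
  ifelse(eff == 1 & tox == 1, "B",
    ifelse(eff == 1, "E",
      ifelse(tox == 1, "T", "N"))))

# Concatenate the letters within each dose-level, then join the dose groups.
# This groups patients by dose-level, which matches cohorts for these data.
by_dose <- tapply(code, outcomes$dose, paste, collapse = "")
outcome_str <- paste0(names(by_dose), by_dose, collapse = " ")
```

The resulting `outcome_str` can then be supplied to a fitting function such as `stan_efftox_demo()`.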

In this instance, after evaluation of our six patients, the dose advocated for the next group is dose-level 3. This is contained in the fitted object:

This is not surprising because dose 3 has the highest utility score:

Sometimes, doses other than the maximal-utility dose will be recommended because of the dose-admissibility rules. See the papers or the EffTox vignette for more details.

Functions are provided to create useful plots. For instance, it is illuminating to plot the posterior means of the probabilities of efficacy and toxicity at each of the doses on the trade-off contours. The five doses are shown in red. Doses closer to the lower-right corner have higher utility.
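To make the idea of a utility contour concrete, here is a minimal base R sketch of one common formulation, the scaled L^p norm used by Thall et al. (2014). The three elicited points below are hypothetical values chosen for illustration; they are not taken from the text, and the function names are our own.

```r
# Hypothetical elicited points (illustrative assumptions):
eff0     <- 0.50  # efficacy required when toxicity is impossible
tox1     <- 0.65  # toxicity tolerated when efficacy is guaranteed
eff_star <- 0.70  # an intermediate (efficacy, toxicity) pair judged
tox_star <- 0.25  # equally desirable to the two points above

# Scaled L^p distance of (eff, tox) from the ideal corner (eff = 1, tox = 0).
lp_distance <- function(eff, tox, p) {
  (((1 - eff) / (1 - eff0))^p + (tox / tox1)^p)^(1 / p)
}

# Choose p so that the intermediate point lies on the zero-utility contour.
p <- uniroot(function(p) lp_distance(eff_star, tox_star, p) - 1,
             interval = c(0.1, 10), tol = 1e-10)$root

# Utility is 1 at the ideal corner and 0 on the neutral contour.
utility <- function(eff, tox) 1 - lp_distance(eff, tox, p)
```

With efficacy on the horizontal axis and toxicity on the vertical axis, utility increases towards the lower-right corner, which is why doses plotted there are preferred.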

This example continues in the EffTox vignette.

There are many publications related to EffTox but the two most important are Thall and Cook (2004) and Thall et al. (2014).

BEBOP in PePS2

Thall, Nguyen, and Estey (2008) introduced an extension of EffTox that allows dose-finding by efficacy and toxicity outcomes and adjusts for covariate information. Brock et al. simplified the method by removing the dose-finding components to leave a design that studies associated co-primary efficacy and toxicity outcomes in an arbitrary number of cohorts determined by the baseline covariates. They originally referred to the simplified design as BEBOP, for Bayesian Evaluation of Bivariate binary Outcomes with Predictive variables. This name was later changed to P2TNE, for Phase 2 Thall, Nguyen & Estey, to reflect its pedigree.

The design was implemented in a phase II trial of pembrolizumab in non-small-cell lung cancer called PePS2. A distinctive feature of the trial is the availability of predictive baseline covariates, the most noteworthy of which is the PD-L1 tumour proportion score, shown by Garon et al. (2015) to be a predictive biomarker.

This example is demonstrated in the BEBOP vignette.

References

Garon, Edward B, Naiyer A Rizvi, Rina Hui, Natasha Leighl, Ani S Balmanoukian, Joseph Paul Eder, Amita Patnaik, et al. 2015. “Pembrolizumab for the treatment of non-small-cell lung cancer.” The New England Journal of Medicine 372 (21): 2018–28. https://doi.org/10.1056/NEJMoa1501824.

Thall, Peter F., Hoang Q. Nguyen, and Elihu H. Estey. 2008. “Patient-specific dose finding based on bivariate outcomes and covariates.” Biometrics 64 (4): 1126–36. https://doi.org/10.1111/j.1541-0420.2008.01009.x.

Thall, Peter F., J. Kyle Wathen, B. Nebiyou Bekele, Richard E. Champlin, Laurence H. Baker, and Robert S. Benjamin. 2003. “Hierarchical Bayesian approaches to phase II trials in diseases with multiple subtypes.” Statistics in Medicine 22 (5): 763–80. https://doi.org/10.1002/sim.1399.

Thall, PF, and JD Cook. 2004. “Dose-Finding Based on Efficacy-Toxicity Trade-Offs.” Biometrics 60 (3): 684–93.

Thall, PF, RC Herrick, HQ Nguyen, JJ Venier, and JC Norris. 2014. “Effective sample size for computing prior hyperparameters in Bayesian phase I-II dose-finding.” Clinical Trials 11 (6): 657–66. https://doi.org/10.1177/1740774514547397.