Package ctrdata for clinical trial protocol-related information
Installing package ctrdata on an R system
The R Project website (https://www.r-project.org/) provides installers for the R system.
Alternatively, R can be used from within software products such as RStudio (https://www.rstudio.com/products/RStudio/), which includes an open-source integrated development environment (IDE), or Microsoft R Open (https://mran.microsoft.com/open/).
General information on the ctrdata package is available here: https://github.com/rfhb/ctrdata.
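A minimal sketch of the installation step, assuming that the released version of the package is available from CRAN and that a CRAN mirror can be reached (the mirror URL is just one common choice and is not taken from this document):

```r
# Install the released version of package ctrdata into the user's library.
# Assumption: the package is on CRAN; the mirror URL is only an example.
install.packages("ctrdata", repos = "https://cloud.r-project.org")

# Optional check: returns TRUE if the package can now be found
requireNamespace("ctrdata", quietly = TRUE)
```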
The above should install package ctrdata into the user’s library. If this installation does not succeed, the following sections offer potential solutions.
A proxy server may need to be specified, for example as follows:
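One way to do this from within R is to set the proxy environment variables before installing. The host name and port below are placeholders, not values from this document:

```r
# Hypothetical proxy address; replace with the proxy of the local network
proxy <- "http://proxy.example.org:8080"

# R's libcurl-based download methods honour these environment variables
Sys.setenv(http_proxy = proxy, https_proxy = proxy)

# Confirm the setting for the current R session
Sys.getenv(c("http_proxy", "https_proxy"))
```

The setting only applies to the running R session; it would have to be repeated, or placed in a startup file, for later sessions.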
On MS Windows, it appears advisable not to use UNC notation (such as \\server\directory) when specifying the user’s library location:
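A sketch of how a library location on a drive letter might be set, and how library locations still given in UNC notation could be spotted. The directory name is hypothetical and `is_unc()` is a helper defined here for illustration only:

```r
# Hypothetical library location on a mapped drive letter instead of a UNC path
lib_dir <- "D:/R/library"

# .libPaths() silently drops directories that do not exist, so check first
if (dir.exists(lib_dir)) {
  .libPaths(c(lib_dir, .libPaths()))
}

# Illustrative helper: TRUE for paths that start with \\ or //
is_unc <- function(path) startsWith(path, "\\\\") | startsWith(path, "//")

# List the library locations of this session and flag any in UNC notation
data.frame(library = .libPaths(), uses_unc = is_unc(.libPaths()))
```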
To use the development version of package ctrdata, install it from GitHub:
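A possible way to do this, assuming that package devtools is used for installing from GitHub (the repository name is taken from the address given earlier; the CRAN mirror URL is only an example):

```r
# Assumption: package devtools provides the GitHub installer
install.packages("devtools", repos = "https://cloud.r-project.org")

# Install the development version from the repository named above
devtools::install_github("rfhb/ctrdata")
```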
As noted in the README for package ctrdata, on MS Windows the cygwin environment has to be installed into the local directory c:\cygwin. The applications php, bash, perl, cat and sed in the cygwin environment are required for function ctrLoadQueryIntoDb() of package ctrdata (other functions in the package do not have this requirement). A minimal cygwin environment can be installed on MS Windows from within package ctrdata as follows:
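A sketch of this step, assuming that the installed version of ctrdata exports the helper functions `installCygwinWindowsDoInstall()` and `installCygwinWindowsTest()`. These names do not appear in the text above and should be checked against the help index of the installed package:

```r
# Only relevant on MS Windows; on other systems this block does nothing
if (.Platform$OS.type == "windows") {
  # Assumed helper: installs a minimal cygwin environment into c:\cygwin
  ctrdata::installCygwinWindowsDoInstall()
  # Assumed helper: checks whether the cygwin environment can be used
  ctrdata::installCygwinWindowsTest()
}
```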
If need be, a proxy can be specified:
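For example, under the assumption that the installation helper accepts an argument named `proxy` (both the argument name and the proxy address are assumptions to be verified in the package documentation):

```r
if (.Platform$OS.type == "windows") {
  # Hypothetical proxy host and port; the argument name `proxy` is assumed
  ctrdata::installCygwinWindowsDoInstall(proxy = "proxy.example.org:8080")
}
```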
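Users who want or need to install cygwin manually can download the setup executable from the Cygwin website (https://cygwin.com/). In an MS Windows command prompt or PowerShell window, use the following command line. The parameters are explained in the Cygwin documentation of the setup program's command-line options.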
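Such a command line could be assembled as sketched below, using options of the cygwin setup program (`--no-admin`, `--quiet-mode`, `--root`, `--packages`). The installer file name and the package list are assumptions derived from the applications named above, and a download site (`--site`) may also need to be given; the README of ctrdata remains the reference for the exact command:

```r
# Assumed file name of the downloaded cygwin installer
setup_exe <- "setup-x86_64.exe"

# Assumed cygwin packages providing php, bash, perl, cat (coreutils) and sed
packages <- c("php", "bash", "perl", "coreutils", "sed")

args <- c(
  "--no-admin",                 # run without administrator rights
  "--quiet-mode",               # unattended installation
  "--root", "c:/cygwin",        # target directory named in the text
  "--packages", paste(packages, collapse = ",")
)

# Print the command line for pasting into a command or PowerShell window
cmd <- paste(c(setup_exe, args), collapse = " ")
cat(cmd, "\n")
```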
ctrdata
These functions open the browser, where the user can start searching for trials of interest.
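A sketch of such a call, assuming that ctrdata exports `ctrOpenSearchPagesInBrowser()` and that it can be called without arguments. The guard only runs it in an interactive session in which the package is installed:

```r
# Assumption: opens the registers' search pages in the default browser
if (interactive() && requireNamespace("ctrdata", quietly = TRUE)) {
  ctrdata::ctrOpenSearchPagesInBrowser()
}
```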
Refine the search until the trials of interest are listed in the browser. Currently, the total number of trials that can be retrieved with package ctrdata is intentionally limited to 5000 (CTGOV).
Copy the address of the search from the browser's address bar to the clipboard, using operating system functions.
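The object `q` used in the next step could then be obtained as sketched here, assuming that ctrdata provides `ctrGetQueryUrlFromBrowser()` for reading the copied address from the clipboard. The base R lines merely illustrate, with an assumed example address, which part of such an address corresponds to a query term:

```r
# Example address as copied from the browser (assumed for illustration)
url <- "https://clinicaltrials.gov/ct2/results?type=Intr&cond=cancer&age=0"

# The search parameters are the part after the question mark
query_term <- sub("^[^?]*\\?", "", url)
query_term

# With ctrdata installed, the query can be read from the clipboard instead
if (interactive() && requireNamespace("ctrdata", quietly = TRUE)) {
  q <- ctrdata::ctrGetQueryUrlFromBrowser()
}
```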
The next steps are executed in the R environment:
# Use search q that was defined in the previous step:
ctrLoadQueryIntoDb(queryterm = q)

# Alternatively, use the following to retrieve a couple of trial records:
ctrLoadQueryIntoDb(queryterm = "2010-024264-18",
                   register = "EUCTR")

# If no parameters are given for a database connection: uses mongodb
# on localhost, port 27017, database "users", collection "ctrdata"

# Show which queries have been downloaded into the database so far
dbQueryHistory()
#       query-timestamp query-register query-records                  query-term
# 1 2016-01-13-10-51-56          CTGOV          5233 type=Intr&cond=cancer&age=0
# 2 2016-01-13-10-40-16          EUCTR           910         cancer&age=under-18

# Find names of fields of interest in database:
dbFindFields(namepart = "status",
             allmatches = TRUE)
# [1] "overall_status"        "b1_sponsor.b31_and_b32_status_of_the_sponsor"
# [3] "p_end_of_trial_status" "location.status"

# Get all records that have values in all specified fields.
# Note that b31_... is a field within the array b1_...
result <- dbGetFieldsIntoDf(fields = c("b1_sponsor.b31_and_b32_status_of_the_sponsor",
                                       "p_end_of_trial_status"))

# Tabulate the status of the clinical trial on the date of information retrieval
with(result,
     table("Status"       = p_end_of_trial_status,
           "Sponsor type" = b1_sponsor.b31_and_b32_status_of_the_sponsor))
#                     Sponsor type
# Status               Commercial Non-Commercial
#   Completed                 138             30
#   Not Authorised              3              0
#   Ongoing                   339            290
#   Prematurely Ended          35              4
#   Restarted                   8              0
#   Temporarily Halted         14              4