--- title: "An introduction to biodbChebi" author: "Pierrick Roger" date: "`r BiocStyle::doc_date()`" package: "`r BiocStyle::pkg_ver('biodbChebi')`" abstract: | How to use the biodbChebi connector and its methods. vignette: | %\VignetteIndexEntry{Introduction to the biodbChebi package.} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} output: BiocStyle::html_document: toc: yes toc_depth: 4 toc_float: collapsed: false BiocStyle::pdf_document: default bibliography: references.bib --- # Introduction biodbChebi is a *biodb* extension package that implements a connector to the [ChEBI](https://www.ebi.ac.uk/chebi/) database [@degtyarenko2007_ChEBI; @hastings2012_chebi]. # Installation Install using Bioconductor: ```{r, eval=FALSE} if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") BiocManager::install('biodbChebi') ``` # Initialization The first step in using *biodbChebi*, is to create an instance of the biodb class `BiodbMain` from the main *biodb* package. This is done by calling the constructor of the class: ```{r, results='hide'} mybiodb <- biodb::newInst() ``` During this step the configuration is set up, the cache system is initialized and extension packages are loaded. We will see at the end of this vignette that the *biodb* instance needs to be terminated with a call to the `terminate()` method. # Creating a connector to ChEBI database In *biodb* the connection to a database is handled by a connector instance that you can get from the factory. biodbChebi implements a connector to a local database, thus when creating an instance you must provide a URL that points to your database: ```{r} chebi <- mybiodb$getFactory()$createConn('chebi') ``` # Requesting entries Using accession numbers, we request entries from ChEBI: ```{r} entries <- chebi$getEntry(c('2528', '17053', '15440')) ``` # Get entry fields Getting the values of entry fields are done using `getFieldValue()` method. Here we retrieve the SMILE field of the first entry from the ChEBI entries obtained previously: ```{r} entries[[1]]$getFieldValue('smiles') ``` # Get data frame We can convert an entry into a data frame: ```{r} entries[[1]]$getFieldsAsDataframe() ``` Building a data frame for a list of entries is also possible: ```{r} mybiodb$entriesToDataframe(entries, fields=c('accession', 'formula', 'molecular.mass', 'inchikey', 'kegg.compound.id')) ``` # Search by name and mass Searching by name and/or mass is done through the `searchCompound()`. Searching by name: ```{r} chebi$searchCompound(name='aspartic', max.results=3) ``` Searching by mass: ```{r} chebi$searchCompound(mass=133, mass.field='molecular.mass', mass.tol=0.3, max.results=3) ``` Searching by name and mass: ```{r} ids <- chebi$searchCompound(name='aspartic', mass=133, mass.field='molecular.mass', mass.tol=0.3, max.results=3) ``` Display results in data frame: ```{r} mybiodb$entriesToDataframe(chebi$getEntry(ids), fields=c('accession', 'molecular.mass', 'name')) ``` # Convert CAS IDs Converting CAS IDs to ChEBI IDs: ```{r} chebi$convCasToChebi(c('87605-72-9', '51-41-2')) ``` If more than one ChEBI ID is found for a CAS ID, then a list of character vectors is returned: ```{r} chebi$convCasToChebi('14215-68-0') ``` This behaviour can be made the default one by setting `simplify` parameter to `FALSE`. # Convert InChI The method is similar to `convCasToChebi()`. Converting InChI to ChEBI IDs: ```{r} chebi$convInchiToChebi('InChI=1S/C8H11NO3/c9-4-8(12)5-1-2-6(10)7(11)3-5/h1-3,8,10-12H,4,9H2/t8-/m0/s1') ``` You can also use an InChI key: ```{r} chebi$convInchiToChebi('MBDOYVRWFFCFHM-SNAWJCMRSA-N') ``` # Closing biodb instance Do not forget to terminate your biodb instance once you are done with it: ```{r} mybiodb$terminate() ``` # Session information ```{r} sessionInfo() ``` # References