useR! 2014
Author: Martin Morgan (mtmorgan@fhcrc.org), Sonali Arora
Date: 30 June, 2014
R: a language and environment for statistical computing and graphics
Statistical features, e.g., factor(), NA
Vector, class, object
Atomic vectors: logical, integer, numeric, complex, character, raw (bytes)
matrix: atomic vector with 'dim' attribute
data.frame: list of equal length atomic vectors
Classes describe more complicated objects, e.g., the return value of lm(), below
Function, generic, method
Functions, e.g., rnorm(1000)
Generic functions, e.g., print()
Methods implement a generic for a particular class, e.g., print.factor; methods are invoked indirectly, via the generic.
Introspection
class(), str(), dim()
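The vector types and introspection functions listed above can be tried on a few small objects. This is a minimal base-R sketch; the object names (x, i, s, m, df) are made up for illustration.

```r
# Atomic vectors of different types; class() reports the type label.
x <- c(1.5, 2, 3.25)          # numeric
i <- 1:6                      # integer
s <- c("a", "b")              # character
class(x)
class(i)
class(s)

# A matrix is an atomic vector with a 'dim' attribute.
m <- i
dim(m) <- c(2, 3)             # 2 rows, 3 columns
is.matrix(m)
dim(m)

# A data.frame is a list of equal-length atomic vectors.
df <- data.frame(id = 1:3, value = x)
is.list(df)
dim(df)                       # rows, columns
str(df)                       # compact summary of the structure
```

class() and str() work on any object, whereas dim() is only meaningful for objects that have dimensions, such as m and df here.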
Help
?print: help on the generic print
?print.data.frame: help on print method for objects of class data.frame.
Example
x <- rnorm(1000) # atomic vectors
y <- x + rnorm(1000, sd=.5)
df <- data.frame(x=x, y=y) # object of class 'data.frame'
plot(y ~ x, df) # generic plot, method plot.formula
fit <- lm(y ~ x, df) # object of class 'lm'
methods(class=class(fit)) # introspection
## [1] add1.lm* alias.lm* anova.lm*
## [4] case.names.lm* confint.lm cooks.distance.lm*
## [7] deviance.lm* dfbeta.lm* dfbetas.lm*
## [10] drop1.lm* dummy.coef.lm effects.lm*
## [13] extractAIC.lm* family.lm* formula.lm*
## [16] hatvalues.lm* influence.lm* kappa.lm
## [19] labels.lm* logLik.lm* model.frame.lm*
## [22] model.matrix.lm nobs.lm* plot.lm*
## [25] predict.lm print.lm* proj.lm*
## [28] qr.lm* residuals.lm rstandard.lm*
## [31] rstudent.lm* simulate.lm* summary.lm
## [34] variable.names.lm* vcov.lm*
##
## Non-visible functions are asterisked
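To see how a generic dispatches to a method, it can help to write one. The sketch below defines a hypothetical generic, describe(), with a default method and a method for class 'lm'; none of these functions exist in R itself, they are invented here for illustration.

```r
# Hypothetical generic: dispatches on the class of its first argument.
describe <- function(x, ...) UseMethod("describe")

# Fallback method, used when no class-specific method is found.
describe.default <- function(x, ...) {
    paste("object of class", class(x)[1])
}

# Method for objects of class 'lm'.
describe.lm <- function(x, ...) {
    paste("linear model with", length(coef(x)), "coefficients")
}

set.seed(123)
x <- rnorm(100)
y <- x + rnorm(100, sd = .5)
fit <- lm(y ~ x)

describe(fit)    # dispatches to describe.lm
describe(x)      # no describe.numeric exists, so describe.default is used
```

The user always calls the generic, describe(); the method is chosen from the class of the argument, which is the same mechanism that leads print() to print.lm or plot() to plot.formula.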
Bioconductor: analysis and comprehension of high-throughput genomic data
Packages, vignettes, work flows
Objects
getClass(), showMethods(..., where=search()), selectMethod()
method?"substr,<tab>" to select help on methods, class?D<tab> for help on classes
Example
require(Biostrings) # Biological sequences
data(phiX174Phage) # sample data, see ?phiX174Phage
phiX174Phage
## A DNAStringSet instance of length 6
## width seq names
## [1] 5386 GAGTTTTATCGCTTCCATGAC...ATTGGCGTATCCAACCTGCA Genbank
## [2] 5386 GAGTTTTATCGCTTCCATGAC...ATTGGCGTATCCAACCTGCA RF70s
## [3] 5386 GAGTTTTATCGCTTCCATGAC...ATTGGCGTATCCAACCTGCA SS78
## [4] 5386 GAGTTTTATCGCTTCCATGAC...ATTGGCGTATCCAACCTGCA Bull
## [5] 5386 GAGTTTTATCGCTTCCATGAC...ATTGGCGTATCCAACCTGCA G97
## [6] 5386 GAGTTTTATCGCTTCCATGAC...ATTGGCGTATCCAACCTGCA NEB03
m <- consensusMatrix(phiX174Phage)[1:4,] # nucl. x position counts
polymorphic <- which(colSums(m != 0) > 1)
m[, polymorphic]
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## A 4 5 4 3 0 0 5 2 0
## C 0 0 0 0 5 1 0 0 5
## G 2 1 2 3 0 0 1 4 0
## T 0 0 0 0 1 5 0 0 1
showMethods(class=class(phiX174Phage), where=search())
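The introspection functions for formal classes can also be explored without Biostrings. The following is a minimal sketch using only the methods package that ships with R; the class ToySeq and the generic gcContent() are hypothetical and are not part of any Bioconductor package.

```r
library(methods)

# Hypothetical formal (S4) class: one sequence string plus a label.
setClass("ToySeq", slots = c(sequence = "character", label = "character"))

# Hypothetical generic, and a method for objects of class 'ToySeq'.
setGeneric("gcContent", function(x, ...) standardGeneric("gcContent"))
setMethod("gcContent", "ToySeq", function(x, ...) {
    bases <- strsplit(x@sequence, "")[[1]]
    mean(bases %in% c("G", "C"))     # fraction of G or C letters
})

s <- new("ToySeq", sequence = "GATTACAG", label = "toy")
gcContent(s)

getClass("ToySeq")                     # class definition: slots and their types
showMethods("gcContent")               # methods defined on the generic
selectMethod("gcContent", "ToySeq")    # the method selected for this class
```

The same three calls apply to Bioconductor classes such as DNAStringSet, where the where=search() argument to showMethods() widens the search to all attached packages.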
Exercise
List the vignettes in the Biostrings package with vignette(package="Biostrings"). Add another argument to the vignette function to view the 'BiostringsQuickOverview' vignette.
The following code loads some sample data, 6 versions of the phiX174Phage genome as a DNAStringSet object.
library(Biostrings)
data(phiX174Phage)
Explain what the following code does, and how it works.
m <- consensusMatrix(phiX174Phage)[1:4,]
polymorphic <- which(colSums(m != 0) > 1)
mapply(substr, polymorphic, polymorphic, MoreArgs=list(x=phiX174Phage))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## Genbank "G" "G" "A" "A" "C" "C" "A" "G" "C"
## RF70s "A" "A" "A" "G" "C" "T" "A" "G" "C"
## SS78 "A" "A" "A" "G" "C" "T" "A" "G" "C"
## Bull "G" "A" "G" "A" "C" "T" "A" "A" "T"
## G97 "A" "A" "G" "A" "C" "T" "G" "A" "C"
## NEB03 "A" "A" "A" "G" "T" "T" "A" "G" "C"
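One way to build intuition for the code above is a base-R analogue on ordinary character strings. The sketch below uses three made-up toy sequences and computes the nucleotide-by-position counts by hand; it is not the Biostrings implementation of consensusMatrix(), only an illustration of the same idea.

```r
# Three toy sequences of equal length (made-up data).
seqs <- c(ref = "GATTACA", v1 = "GACTACA", v2 = "GATTACG")

# Split into a sequences x positions matrix of single letters.
chars <- do.call(rbind, strsplit(seqs, ""))

# Count each nucleotide at each position (nucleotide x position).
nucl <- c("A", "C", "G", "T")
m <- t(sapply(nucl, function(n) colSums(chars == n)))

# A position is polymorphic when more than one nucleotide has a non-zero count.
polymorphic <- which(colSums(m != 0) > 1)

# mapply() calls substr(x = seqs, start, stop) once per polymorphic position.
# Each call returns one letter per sequence; the results are bound
# column-wise into a sequences x positions matrix.
res <- mapply(substr, polymorphic, polymorphic, MoreArgs = list(x = seqs))
res
```

Because start and stop are equal in each call, substr() extracts exactly one letter per sequence, so each column of the result shows the variants observed at one polymorphic position.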
Bioconductor is a large collection of R packages for the analysis and comprehension of high-throughput genomic data. Bioconductor relies on formal classes to represent genomic data, so it is important to develop a rudimentary comfort with classes, including seeking help for classes and methods. Bioconductor uses vignettes to augment traditional help pages; these can be very valuable in illustrating overall package use.