Martin Morgan (mtmorgan@fhcrc.org), Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
24 August 2014
Part I
Part II
Part III
1                # vector of length 1
[1] 1
c(1, 1, 2, 3, 5) # vector of length 5
[1] 1 1 2 3 5
Vector types: logical c(TRUE, FALSE), integer, numeric, complex, character c("A", "beta")
list(c(TRUE, FALSE), c("A", "beta"))
factor, NA
Assignment and names
x <- c(1, 1, 2, 3, 5)
y = c(5, 5, 3, 2, 1)
z <- c(Female=12, Male=3)
= and <- are the same
Operations
x + y        # vectorized
[1] 6 6 5 5 6
x / 5        # ...recycling
[1] 0.2 0.2 0.4 0.6 1.0
x[c(3, 1)]   # subset
[1] 2 1
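The vector types and operations above can be tried on small inputs. The following is a minimal sketch; the variable names (v, sex, recycled) are chosen here for illustration and are not part of the original material.

```r
# Each atomic vector has a single type
typeof(c(TRUE, FALSE))      # "logical"
typeof(1:3)                 # "integer"
typeof(c(1, 1, 2))          # "double", R's numeric type
typeof(c("A", "beta"))      # "character"

# NA marks a missing value and propagates through arithmetic
v <- c(1, NA, 3)
is.na(v)
mean(v)                     # NA
mean(v, na.rm = TRUE)       # 2, missing value dropped first

# A factor stores categorical data as integer codes plus levels
sex <- factor(c("Female", "Male", "Female"))
levels(sex)
table(sex)

# Names allow subsetting by label as well as by position
z <- c(Female = 12, Male = 3)
z["Male"]
names(z)

# Recycling: the shorter vector is repeated to match the longer one
recycled <- c(1, 2, 3, 4) + c(10, 20)
recycled                    # 11 22 13 24
```

Recycling is silent when the longer length is a multiple of the shorter one; otherwise R still recycles but issues a warning.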
Examples: c(), concatenate values; rnorm(), generate random normal deviates; plot()
x <- rnorm(1000)    # 1000 normal deviates
y <- x + rnorm(1000, sd = 0.5)
args(rnorm)
function (n, mean = 0, sd = 1) 
NULL
plot(x, y)
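The output of args(rnorm) shows three arguments, two of which have defaults. As a small sketch of how arguments can be matched by position or by name (the seed value 123 is arbitrary and only makes the two calls comparable):

```r
set.seed(123)
a <- rnorm(5, 10, 2)                   # positional: n = 5, mean = 10, sd = 2

set.seed(123)
b <- rnorm(sd = 2, mean = 10, n = 5)   # named: order no longer matters

identical(a, b)                        # same arguments, same seed, same values

# Inspect the argument list programmatically
names(formals(rnorm))                  # "n" "mean" "sd"
formals(rnorm)$sd                      # default value of 'sd'
```

Naming arguments after the first one or two tends to make calls easier to read, especially when defaults are skipped.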
 
formula: another way, plot(y ~ x)
Within R
?rnorm
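Besides ?rnorm, a few other base R functions help with finding and inspecting functions. This is a brief sketch; the interactive calls are commented out so that the block also runs non-interactively.

```r
# apropos(): find objects on the search path whose names match a pattern
hits <- apropos("^rnorm$")
hits

# exists(): check whether a name can be found
exists("rnorm")

# args(): show the argument list without opening the help page
args(rnorm)

# Interactive equivalents (uncomment in an R session)
# help("rnorm")            # same as ?rnorm
# help.search("normal")    # same as ??normal
# example(rnorm)           # run the examples from the help page
```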
Rstudio
Main sections
Motivation: manipulate complicated data
x and y from previous example are related to one another:
same length, element i of y is a transformation of element i of x
Solution: a “data frame” to coordinate access
df <- data.frame(X=x, Y=y)
head(df, 3)
        X       Y
1 -1.3692 -1.2625
2  1.9072  2.6103
3 -0.5395 -0.5987
class(df) # plain function
[1] "data.frame"
dim(df)   # generic & method for data.frame
[1] 1000    2
head(df$X, 4)  # column access
[1] -1.3692  1.9072 -0.5395 -1.3264
## create or update 'Z'
df$Z <- sqrt(abs(df$Y))
## subset rows and / or columns
head(df[df$X > 0, c("X", "Z")])
         X      Z
2  1.90720 1.6156
5  0.02705 0.5804
6  0.18376 0.5624
8  0.04149 0.2101
9  0.96177 0.3850
14 0.48720 1.0353
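Because the data frame above is built from random numbers, the same operations may be easier to follow on a tiny, fixed example. The sketch below uses a hypothetical data frame df2 (not part of the original session) and also shows subset() as an alternative spelling of the row and column selection.

```r
# Small fixed data frame, so the results can be checked by eye
df2 <- data.frame(X = c(-1, 2, -3, 4), Y = c(-2, 4, -9, 16))

# Create a new column from an existing one
df2$Z <- sqrt(abs(df2$Y))

# Rows where X > 0, columns X and Z: bracket form
pos <- df2[df2$X > 0, c("X", "Z")]

# The same selection written with subset()
pos2 <- subset(df2, X > 0, select = c(X, Z))

str(df2)                       # compact overview of columns and types
pos
isTRUE(all.equal(pos, pos2))   # both forms select the same rows and columns
```

The bracket form works everywhere; subset() is convenient interactively because column names can be used without the df2$ prefix.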
plot(Y ~ X, df) # Y ~ X, values from 'df'
## lm(): linear model, returns class 'lm'
fit <- lm(Y ~ X, df)
abline(fit)  # plot regression line
 
anova(fit)  
Analysis of Variance Table

Response: Y
           Df Sum Sq Mean Sq F value Pr(>F)    
X           1   1042    1042    4417 <2e-16 ***
Residuals 998    235       0                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
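To see what a fitted model object contains, it can help to fit a line whose coefficients are known in advance. The following sketch uses a hypothetical data frame d with Y = 3 + 2 * X exactly (no noise), so it illustrates only the accessor functions, not a realistic analysis.

```r
# Data lying exactly on the line Y = 3 + 2 * X
d <- data.frame(X = 1:10)
d$Y <- 3 + 2 * d$X

fit0 <- lm(Y ~ X, d)

class(fit0)                          # "lm"
coef(fit0)                           # intercept and slope, 3 and 2 up to rounding
predict(fit0, data.frame(X = 11))    # prediction for a new X value
max(abs(residuals(fit0)))            # essentially zero for an exact line
```

With real, noisy data the same accessors apply, and anova() or summary() then give meaningful tests.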
fit: object of class lm
anova(): generic, with a method for the class of fit
methods(anova)
[1] anova.glm*     anova.glmlist*
[3] anova.lm*      anova.lmlist* 
[5] anova.loess*   anova.mlm*    
[7] anova.nls*    
   Non-visible functions are asterisked
## class of object
class(fit)
## method discovery
methods(class=class(fit))
methods(anova)
## help on generic, and specific method
?anova
?anova.lm
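The relationship between a generic such as anova() and a method such as anova.lm() can be illustrated with a toy generic. In this sketch, describe() and its methods are hypothetical names invented for illustration; only UseMethod() and the naming convention generic.class come from R itself.

```r
# A generic function: dispatches on the class of its first argument
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no more specific method exists
describe.default <- function(x, ...) "some object"

# Method for objects of class "data.frame"
describe.data.frame <- function(x, ...) {
    paste("data frame with", nrow(x), "rows and", ncol(x), "columns")
}

describe(1:3)                             # uses describe.default
describe(data.frame(X = 1:2, Y = 3:4))    # uses describe.data.frame
```

Calling anova(fit) works the same way: R looks at class(fit), finds "lm", and calls anova.lm().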
Installed
length(rownames(installed.packages()))
[1] 227
Available
'Attached' (installed and available for use):
search()            # attached packages
ls("package:stats") # functions in 'stats'
Attaching (make installed package available for use)
library(ggplot2)
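Attaching with library() puts a package on the search path. As a sketch of the alternative, a function from an installed package can also be called with the :: operator without attaching it; the example uses the stats package, which ships with R.

```r
# Is the package attached? (stats is attached in a default session)
"package:stats" %in% search()

# Call a function through its namespace, independent of the search path
s <- stats::sd(c(1, 2, 3, 4))
s

# First few objects exported by an attached package
head(ls("package:stats"), 3)
```

The :: form avoids name clashes when two attached packages define a function with the same name.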
Installing CRAN or Bioconductor packages
source("http://bioconductor.org/biocLite.R")
biocLite("GenomicRanges")
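Before installing, it can be useful to check whether a package is already present. The sketch below defines a hypothetical helper, is_installed(), around the base function requireNamespace(); the package names in pkgs are only examples, and nothing is installed by this code.

```r
# Hypothetical helper: TRUE if the package's namespace can be loaded
is_installed <- function(pkg) {
    requireNamespace(pkg, quietly = TRUE)
}

is_installed("stats")            # part of every R installation

# Which of these still need to be installed?
pkgs <- c("stats", "GenomicRanges")
to_install <- pkgs[!vapply(pkgs, is_installed, logical(1))]
to_install

# CRAN packages can then be installed with install.packages();
# left commented out so the sketch has no side effects
# install.packages("ggplot2")
```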
Packages
Best bet
R
[R];
R-help mailing list
Bioconductor
Funding
People