Motivation

Genomic ranges describe…

Packages

library(GenomicRanges)
library(GenomicAlignments)
sessionInfo()
## R version 3.2.0 alpha (2015-03-25 r68090)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.2 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.1.2
##  [2] GenomicFeatures_1.19.36                
##  [3] AnnotationDbi_1.29.21                  
##  [4] Biobase_2.27.3                         
##  [5] BSgenome.Hsapiens.UCSC.hg19_1.4.0      
##  [6] BSgenome_1.35.20                       
##  [7] rtracklayer_1.27.11                    
##  [8] GenomicAlignments_1.3.33               
##  [9] Rsamtools_1.19.49                      
## [10] Biostrings_2.35.12                     
## [11] XVector_0.7.4                          
## [12] GenomicRanges_1.19.52                  
## [13] GenomeInfoDb_1.3.16                    
## [14] IRanges_2.1.43                         
## [15] S4Vectors_0.5.22                       
## [16] BiocGenerics_0.13.11                   
## [17] BiocStyle_1.5.3                        
## 
## loaded via a namespace (and not attached):
##  [1] knitr_1.9            zlibbioc_1.13.3      BiocParallel_1.1.21 
##  [4] stringr_0.6.2        tools_3.2.0          DBI_0.3.1           
##  [7] lambda.r_1.1.7       futile.logger_1.4    htmltools_0.2.6     
## [10] yaml_2.1.13          digest_0.6.8         formatR_1.1         
## [13] futile.options_1.0.0 bitops_1.0-6         biomaRt_2.23.5      
## [16] RCurl_1.95-4.5       RSQLite_1.0.0        evaluate_0.5.5      
## [19] rmarkdown_0.5.1      XML_3.98-1.1

Use cases

GRanges: simple genomic ranges
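
As a minimal sketch of what a GRanges object holds, the toy example below builds three ranges by hand. The coordinates and the `score` column are invented for illustration and do not come from any real annotation.

```r
library(GenomicRanges)

# Three made-up ranges on two chromosomes; coordinates are 1-based, closed.
gr <- GRanges(
    seqnames = c("chr1", "chr1", "chr2"),
    ranges   = IRanges(start = c(10, 50, 20), end = c(30, 80, 40)),
    strand   = c("+", "-", "+"),
    score    = c(1.5, 2.0, 0.7)   # extra arguments become metadata columns
)

seqnames(gr)             # chromosome of each range
start(gr); end(gr)       # coordinate accessors
width(gr)                # 21 31 21
mcols(gr)$score          # metadata column (hypothetical values)
plus <- gr[strand(gr) == "+"]   # vector-like subsetting keeps the 2 "+" ranges
```

Each accessor returns a vector with one element per range, so a GRanges can be subset and combined much like an ordinary R vector.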

GRangesList: nested genomic ranges
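
A GRangesList groups ranges into list elements, for example exons grouped by transcript. The sketch below uses invented exon coordinates and hypothetical transcript names (`txA`, `txB`), and assumes a reasonably recent Bioconductor release in which `lengths()` works on list-like objects.

```r
library(GenomicRanges)

# Five made-up exons on chr1 belonging to two hypothetical transcripts.
exons <- GRanges("chr1",
                 IRanges(start = c(100, 300, 500, 1000, 1200),
                         width = c(50, 60, 70, 80, 90)),
                 strand = "+")
tx_id <- c("txA", "txA", "txA", "txB", "txB")

exons_by_tx <- split(exons, tx_id)   # GRangesList, one element per transcript

length(exons_by_tx)                  # 2 transcripts
lengths(exons_by_tx)                 # exons per transcript: 3 and 2
tx_len <- sum(width(exons_by_tx))    # summed exon widths: 180 and 170
tx_span <- range(exons_by_tx)        # span per transcript: 100-569, 1000-1289
```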

Range-based operations

Intra-range operations
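
Intra-range operations transform each range independently of the others. A small sketch on two invented ranges, one per strand, shows how `shift()`, `resize()` and `flank()` behave; note that the last two are strand-aware.

```r
library(GenomicRanges)

# Two made-up ranges: 100-109 on "+" and 200-209 on "-".
gr <- GRanges("chr1", IRanges(start = c(100, 200), width = 10),
              strand = c("+", "-"))

shifted <- shift(gr, 5)           # 105-114 and 205-214
tss     <- resize(gr, width = 1)  # anchored at the 5' end: 100-100 and 209-209
up      <- flank(gr, width = 20)  # 20 bases upstream: 80-99 and 210-229
```

The result always has the same length as the input, one output range per input range.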

Inter-range operations
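
Inter-range operations summarize the ranges of one object relative to each other, so the result generally has a different length from the input. The ranges below are invented to include one overlapping pair.

```r
library(GenomicRanges)

# Made-up ranges 1-10, 5-15 and 20-30 on chr1 (strand unspecified).
gr <- GRanges("chr1", IRanges(start = c(1, 5, 20), end = c(10, 15, 30)))

merged   <- reduce(gr)    # overlapping ranges merged: 1-15, 20-30
pieces   <- disjoin(gr)   # non-overlapping pieces: 1-4, 5-10, 11-15, 20-30
envelope <- range(gr)     # single spanning range: 1-30
cvg      <- coverage(gr)  # per-base depth, one Rle per chromosome
```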

Between-object
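
Between-object operations relate the ranges of two objects, most commonly through overlaps. In this sketch `reads` and `genes` are hypothetical toy objects, not real data.

```r
library(GenomicRanges)

# Made-up "reads" (5-14, 50-59, 120-129, 400-409) and "genes" (1-60, 100-200).
reads <- GRanges("chr1", IRanges(start = c(5, 50, 120, 400), width = 10))
genes <- GRanges("chr1", IRanges(start = c(1, 100), end = c(60, 200)),
                 gene = c("g1", "g2"))

hits <- findOverlaps(reads, genes)       # pairs of (query, subject) indices
queryHits(hits)                          # 1 2 3
subjectHits(hits)                        # 1 1 2
per_gene <- countOverlaps(genes, reads)  # reads per gene: 2 1
kept <- subsetByOverlaps(reads, genes)   # the 3 reads that hit any gene
```

`findOverlaps()` returns indices rather than ranges, which makes it straightforward to look up metadata on either side, for instance `genes$gene[subjectHits(hits)]`.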

Lawrence et al. (2013), PLoS Comput Biol 9(8): e1003118

Working with Bioconductor classes and methods

What can I do with my GRanges instance?

methods(class="GRanges")
##   [1] !=                  $                   $<-                
##   [4] %in%                <                   <=                 
##   [7] ==                  >                   >=                 
##  [10] BamViews            NROW                Ops                
##  [13] ROWNAMES            ScanBamParam        ScanBcfParam       
##  [16] [                   [<-                 aggregate          
##  [19] anyNA               append              as.character       
##  [22] as.complex          as.data.frame       as.env             
##  [25] as.integer          as.list             as.logical         
##  [28] as.numeric          as.raw              bamWhich<-         
##  [31] blocks              browseGenome        c                  
##  [34] chrom               chrom<-             coerce             
##  [37] coerce<-            compare             countOverlaps      
##  [40] coverage            disjoin             disjointBins       
##  [43] distance            distanceToNearest   duplicated         
##  [46] elementMetadata     elementMetadata<-   end                
##  [49] end<-               eval                export             
##  [52] extractROWS         extractUpstreamSeqs findOverlaps       
##  [55] fixedColumnNames    flank               follow             
##  [58] gaps                getPromoterSeq      granges            
##  [61] head                high2low            intersect          
##  [64] isDisjoint          length              liftOver           
##  [67] mapCoords           mapFromAlignments   mapFromTranscripts 
##  [70] mapToAlignments     mapToTranscripts    match              
##  [73] mcols               mcols<-             metadata           
##  [76] metadata<-          mstack              names              
##  [79] names<-             narrow              nearest            
##  [82] order               overlapsAny         parallelSlotNames  
##  [85] pgap                pintersect          pmapCoords         
##  [88] pmapFromAlignments  pmapFromTranscripts pmapToAlignments   
##  [91] pmapToTranscripts   precede             promoters          
##  [94] psetdiff            punion              range              
##  [97] ranges              ranges<-            rank               
## [100] reduce              relist              relistToClass      
## [103] rename              rep                 rep.int            
## [106] replaceROWS         resize              restrict           
## [109] rev                 rowRanges<-         scanFa             
## [112] scanTabix           score               score<-            
## [115] seqinfo             seqinfo<-           seqlevelsInUse     
## [118] seqnames            seqnames<-          setdiff            
## [121] shift               shiftApply          show               
## [124] showAsCell          sort                split              
## [127] split<-             start               start<-            
## [130] strand              strand<-            subset             
## [133] subsetByOverlaps    summarizeOverlaps   table              
## [136] tail                tapply              tile               
## [139] trim                union               unique             
## [142] update              updateObject        values             
## [145] values<-            width               width<-            
## [148] window              window<-            with               
## [151] xtfrm              
## see '?methods' for accessing help and source code

What type of object(s) can I use findOverlaps() on (what methods exist for the findOverlaps() generic)?

methods(findOverlaps)
##  [1] findOverlaps,GAlignmentPairs,GAlignmentPairs-method          
##  [2] findOverlaps,GAlignmentPairs,Vector-method                   
##  [3] findOverlaps,GAlignments,GAlignments-method                  
##  [4] findOverlaps,GAlignments,Vector-method                       
##  [5] findOverlaps,GAlignmentsList,GAlignmentsList-method          
##  [6] findOverlaps,GAlignmentsList,Vector-method                   
##  [7] findOverlaps,GNCList,GenomicRanges-method                    
##  [8] findOverlaps,GRangesList,GRangesList-method                  
##  [9] findOverlaps,GRangesList,GenomicRanges-method                
## [10] findOverlaps,GRangesList,RangedData-method                   
## [11] findOverlaps,GRangesList,RangesList-method                   
## [12] findOverlaps,GenomicRanges,GIntervalTree-method              
## [13] findOverlaps,GenomicRanges,GNCList-method                    
## [14] findOverlaps,GenomicRanges,GRangesList-method                
## [15] findOverlaps,GenomicRanges,GenomicRanges-method              
## [16] findOverlaps,GenomicRanges,RangedData-method                 
## [17] findOverlaps,GenomicRanges,RangesList-method                 
## [18] findOverlaps,NCList,Ranges-method                            
## [19] findOverlaps,RangedData,GRangesList-method                   
## [20] findOverlaps,RangedData,GenomicRanges-method                 
## [21] findOverlaps,RangedData,RangedData-method                    
## [22] findOverlaps,RangedData,RangesList-method                    
## [23] findOverlaps,Ranges,IntervalTree-method                      
## [24] findOverlaps,Ranges,NCList-method                            
## [25] findOverlaps,Ranges,Ranges-method                            
## [26] findOverlaps,RangesList,GRangesList-method                   
## [27] findOverlaps,RangesList,GenomicRanges-method                 
## [28] findOverlaps,RangesList,IntervalForest-method                
## [29] findOverlaps,RangesList,RangedData-method                    
## [30] findOverlaps,RangesList,RangesList-method                    
## [31] findOverlaps,SummarizedExperiment,SummarizedExperiment-method
## [32] findOverlaps,SummarizedExperiment,Vector-method              
## [33] findOverlaps,Vector,GAlignmentPairs-method                   
## [34] findOverlaps,Vector,GAlignments-method                       
## [35] findOverlaps,Vector,GAlignmentsList-method                   
## [36] findOverlaps,Vector,SummarizedExperiment-method              
## [37] findOverlaps,Vector,Views-method                             
## [38] findOverlaps,Vector,ViewsList-method                         
## [39] findOverlaps,Vector,missing-method                           
## [40] findOverlaps,Views,Vector-method                             
## [41] findOverlaps,Views,Views-method                              
## [42] findOverlaps,ViewsList,Vector-method                         
## [43] findOverlaps,ViewsList,ViewsList-method                      
## [44] findOverlaps,integer,Ranges-method                           
## see '?methods' for accessing help and source code

How can I get help on functions, generics, and methods?

?"findOverlaps"          ## generic
?"findOverlaps,<tab>"    ## specific method
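
Beyond the help pages, the methods package can report which method would actually be dispatched for a given pair of classes. The sketch below assumes GenomicRanges is installed; the signature passed to `selectMethod()` is just one example.

```r
library(GenomicRanges)

# List the methods defined for the generic (long output, as shown earlier).
showMethods("findOverlaps")

# Find the method chosen for two GRanges arguments. Inheritance is taken
# into account, so this may resolve to a method defined for a parent class.
m <- selectMethod("findOverlaps", c("GRanges", "GRanges"))
is.function(m)   # TRUE

# Browse the vignettes shipped with a package for longer-form documentation.
vignette(package = "GenomicRanges")
```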

Other help?

Important parts of the sequence class menagerie

GAlignments and friends (GenomicAlignments)
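
GAlignments objects are normally created by reading a BAM file, but a small one can be built by hand to see how CIGAR strings translate into ranges. The positions and CIGARs below are invented, and the constructor arguments are written out explicitly as this sketch understands them.

```r
library(GenomicAlignments)

# Two made-up alignments: a plain 50-base match, and a spliced read with a
# 1000-base gap on the reference (the N operation) between two matched blocks.
gal <- GAlignments(
    seqnames = Rle(factor(c("chr1", "chr1"))),
    pos      = c(100L, 200L),
    cigar    = c("50M", "20M1000N30M"),
    strand   = Rle(strand(c("+", "-")))
)

width(gal)            # extent on the reference: 50 and 1050
njunc(gal)            # number of junctions per read: 0 and 1
blocks <- grglist(gal)   # one GRanges per read, split at the junctions
granges(gal)          # one range per read, junctions included
```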

DNAString and DNAStringSet (Biostrings)
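
A DNAStringSet is a vector of DNA sequences with its own accessors and sequence-aware operations. The three sequences below are invented for illustration.

```r
library(Biostrings)

# Three made-up sequences with hypothetical names.
dna <- DNAStringSet(c(a = "ACGTACGT", b = "GGGCCC", c = "ATATAT"))

width(dna)                 # 8 6 6
rc <- reverseComplement(dna)

# Proportion of G or C per sequence; the result is a one-column matrix.
gc <- letterFrequency(dna, "GC", as.prob = TRUE)
gc                         # 0.5, 1 and 0
```

`letterFrequency()` is also one of the ingredients listed for the GC content exercise later in this document.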

SummarizedExperiment (GenomicRanges)

TxDb (GenomicFeatures)

VCF (VariantAnnotation)

Lower-level classes

Deeper understanding

Classes and class hierarchies

R works efficiently on vectors

getClass("GRanges")
## Class "GRanges" [package "GenomicRanges"]
## 
## Slots:
##                                                                       
## Name:         seqnames          ranges          strand elementMetadata
## Class:             Rle         IRanges             Rle       DataFrame
##                                       
## Name:          seqinfo        metadata
## Class:         Seqinfo            list
## 
## Extends: 
## Class "GenomicRanges", directly
## Class "Vector", by class "GenomicRanges", distance 2
## Class "GenomicRangesORmissing", by class "GenomicRanges", distance 2
## Class "GenomicRangesORGRangesList", by class "GenomicRanges", distance 2
## Class "GenomicRangesORGenomicRangesList", by class "GenomicRanges", distance 2
## Class "RangedDataORGenomicRanges", by class "GenomicRanges", distance 2
## Class "Annotated", by class "GenomicRanges", distance 3
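
The slots listed above can be read as parallel vectors, one element per range, which is why operations on a whole GRanges are vectorized rather than looped. A minimal sketch with invented ranges:

```r
library(GenomicRanges)

# Three made-up ranges of width 5.
gr <- GRanges(c("chr1", "chr1", "chr2"),
              IRanges(start = c(10, 50, 20), width = 5),
              strand = c("+", "-", "+"))

slotNames(gr)                        # slots of the object
is(gr, "Vector")                     # TRUE: inherits vector-like behaviour

# Each component is as long as the object itself.
length(seqnames(gr)) == length(gr)   # TRUE
length(ranges(gr))   == length(gr)   # TRUE

# Vectorized computation over all ranges at once, no explicit loop.
long <- width(gr) > 3                # TRUE TRUE TRUE
```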

Vector and Annotated

List-like

Implementation: Vector plus partitioning
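
One way to see the "vector plus partitioning" idea is to flatten a GRangesList, operate on the flat vector, and restore the grouping afterwards. The exons and transcript names here are invented, and the unlist / transform / relist pattern is offered as a common idiom rather than the only approach.

```r
library(GenomicRanges)

# Five made-up exons grouped into two hypothetical transcripts.
exons <- GRanges("chr1",
                 IRanges(start = c(100, 300, 500, 1000, 1200), width = 50),
                 strand = "+")
grl <- split(exons, c("txA", "txA", "txA", "txB", "txB"))

# The partitioning records where each list element ends in the flat vector.
part <- PartitioningByEnd(grl)
end(part)                              # 3 5

# Flatten, apply a vectorized operation once, then restore the list shape.
flat    <- unlist(grl, use.names = FALSE)
shifted <- shift(flat, 1000)
back    <- relist(shifted, grl)

start(back)[["txA"]]                   # 1100 1300 1500
```

Working on the flat vector avoids calling a function once per list element, which matters when the list has many thousands of elements.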

Practical

1. Exon and transcript characterization

Ingredients

Goals

2. GC content

Ingredients

- BSgenome.Hsapiens.UCSC.hg19 BSgenome package
- TxDb.Hsapiens.UCSC.hg19.knownGene TxDb package
- ?"getSeq,BSgenome-method", letterFrequency()

Goals

3. CpG islands

Ingredients

Goal

4. Aligned reads

Ingredients

Goals