\documentclass{article}

%\VignetteIndexEntry{Introduction_to_caOmicsV}
%\VignetteDepends{caOmicsV}
%\VignetteKeyword{bioinformatics}
%\VignetteKeyword{genomics}
%\VignetteKeyword{caOmicsV}
%\VignettePackage{caOmicsV}

\usepackage{graphicx}
\setkeys{Gin}{width=0.9\textwidth}
\usepackage{hyperref}

\begin{document}
\SweaveOpts{concordance=TRUE}

\title{Introduction to caOmicsV}
\author{Hongen Zhang, Ph.D.\\
Genetics Branch, Center for Cancer Research,\\
National Cancer Institute, NIH}
\date{March 10, 2015}
\maketitle

\tableofcontents

\section{Introduction}
Translational genomics research in cancer, e.g., International Cancer Genome 
Consortium (ICGC) and The Cancer Genome Atlas (TCGA), has generated large 
multidimensional datasets from high-throughput technologies. Multidimensional 
data offers great promise to improve clinical applications of genomic
information in diagnosis, prognosis and therapeutics of cancers. Tools to 
effectively visualize integrated multidimensional data are important for 
understanding and describing the relationship between genomic variations and
cancers. The caOmicsV package provides methods to visualize multidimensional 
cancer genomic data in two layouts: a heatmap-like matrix layout, bioMatrix, 
and circular layouts superimposed on a biological network or graph, 
bioNetCircos. \\

The types of data that can be plotted with each layout are listed below: 

\begin{itemize}
\item
Clinical (phenotypes) data such as gender, tissue type, and diagnosis, plotted 
as colored rectangles
\item
Expression data such as RNASeq and miRNASeq, plotted as a heatmap 
\item
Category data such as DNA methylation status, plotted as colored box outlines 
on the bioMatrix layout and as bars on the bioNetCircos layout
\item
Binary data such as mutation status and DNA copy number variations, plotted as 
colored points
\item
Text labels for gene names, sample names, and summaries in text format
\end{itemize}

In addition, link lines can be plotted on the bioNetCircos layout to show the 
relationship between two samples. \\

The bioNetCircos layout requires the igraph and bc3net packages, which must 
be installed first.\\

\section{A Quick Demo}

The following code generates a bioMatrix layout image with the built-in demo 
data.

<<caOmicsVBioMatrixLayoutDemo, fig=TRUE>>=

library(caOmicsV)
data(biomatrixPlotDemoData)

plotBioMatrix(biomatrixPlotDemoData, summaryType="text")
bioMatrixLegend(heatmapNames=c("RNASeq", "miRNASeq"), 
    categoryNames=c("Methyl H", "Methyl L"), 
    binaryNames=c("CN LOSS", "CN Gain"),   
    heatmapMin=-3, heatmapMax=3, colorType="BlueWhiteRed")
@

Figure 1. caOmicsV bioMatrix layout plot\\

Running the code below produces a bioNetCircos layout image with the built-in 
demo data.

<<caOmicsVbionetCircosLayoutDemo, fig=TRUE>>=
library(caOmicsV)
data(bionetPlotDemoData)

plotBioNetCircos(bionetPlotDemoData)
dataNames <- c("Tissue Type", "RNASeq", "miRNASeq", "Methylation", "CNV")
bioNetLegend(dataNames, heatmapMin=-3, heatmapMax=3)
@

Figure 2. caOmicsV bioNetCircos layout plot\\

\section{Making a Plot Data Set}

To use the default plot functions shown above, the first step is to make a 
plot data set that holds all datasets in a list object. Two demo data sets are 
included with the package and can be explored with:

<<caOmicsVDemoData>>=
library(caOmicsV)
data(biomatrixPlotDemoData)
names(biomatrixPlotDemoData)

data(bionetPlotDemoData)
names(bionetPlotDemoData)
@ 

The caOmicsV package provides a function, getPlotDataSet(), to make a plot 
data set like those above. The inputs to pass to the function are:

\begin{itemize}
\item
sampleNames: required, character vector with names of samples to plot
\item
geneNames: required, character vector with names of genes to plot
\item
sampleData: required, data frame with rows for samples and columns for 
features. The first column must hold the sample names, identical to 
sampleNames above and in the same order
\item
heatmapData: list of data frames (maximum 2), continuous numeric data such 
as gene expression data
\item
categoryData: list of data frames (maximum 2), categorical data such as 
methylation High, Low, NO
\item
binaryData: list of data frames (maximum 3), binary data such as 0/1
\item
summaryData: list of data frames (maximum 2), summarization for genes or for 
samples
\item
secondGeneNames: names of a second set of genes, e.g. miRNA names, to label 
genes on the right side of the matrix layout
\end{itemize}

All genomic data must be held in data frames with rows for genes and columns 
for samples. The first column of each data frame must hold the gene names, 
identical to geneNames above and in the same order. The column names of each 
data frame must be the sample names, identical to sampleNames above and in the 
same order. The caOmicsV package contains functions to sort a data frame into 
a given order, as well as several supplementary functions that help extract 
the required data from big datasets by supplying the required gene names and 
sample names in a given order.\\
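
As a minimal sketch of the ordering requirement above, the base R chunk below 
reorders a toy data frame so that its rows follow a given gene order and its 
sample columns follow a given sample order. It relies only on match() and does 
not call the package's own sorting functions. All object names (toyExpr, 
wantedGenes, wantedSamples, toyOrdered) are hypothetical and chosen only for 
this illustration.

<<reorderToyData>>=
## Toy data: first column holds gene names, other columns are samples
toyExpr <- data.frame(
    gene = c("G3", "G1", "G2"),
    S2 = c(0.3, -1.2, 2.0),
    S1 = c(1.5, 0.4, -0.7),
    stringsAsFactors = FALSE
)
wantedGenes   <- c("G1", "G2", "G3")
wantedSamples <- c("S1", "S2")

## match() returns the row and column positions in the requested order
rowOrder <- match(wantedGenes, toyExpr[, 1])
colOrder <- match(wantedSamples, colnames(toyExpr))

## Stop early if a requested gene or sample is missing from the data
stopifnot(!anyNA(rowOrder), !anyNA(colOrder))

## Keep the gene-name column first, then the samples in order
toyOrdered <- toyExpr[rowOrder, c(1, colOrder)]
rownames(toyOrdered) <- NULL
toyOrdered
@

The stopifnot() call guards against silently dropping genes or samples that 
are absent from the input.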

To make a caOmicsV plot, the plot data set must contain at least one genomic 
data set (heatmap data, category data, or binary data).
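
The input format described in this section can be sketched with toy data as 
follows. The chunk builds a sample data frame and one heatmap data frame, then 
checks them against the layout rules with a small helper. The helper 
isValidOmicsFrame() and all toy objects are hypothetical and not part of 
caOmicsV. The getPlotDataSet() call is shown as a comment only, using the 
argument names listed above, and has not been run on this toy data.

<<toyPlotDataInputs>>=
toySamples <- c("S1", "S2", "S3")
toyGenes   <- c("G1", "G2")

## Sample data: rows for samples, first column is the sample name
toySampleData <- data.frame(
    sample = toySamples,
    tissue = c("Normal", "Tumor", "Tumor"),
    stringsAsFactors = FALSE
)

## Genomic data: rows for genes, first column is the gene name
toyHeatmap <- data.frame(
    gene = toyGenes,
    S1 = c(-1.0, 0.5), S2 = c(2.1, -0.3), S3 = c(0.0, 1.7),
    stringsAsFactors = FALSE
)

## Hypothetical helper: TRUE if a genomic data frame follows the layout
isValidOmicsFrame <- function(df, genes, samples) {
    identical(as.character(df[, 1]), genes) &&
        identical(colnames(df)[-1], samples)
}

stopifnot(
    identical(as.character(toySampleData[, 1]), toySamples),
    isValidOmicsFrame(toyHeatmap, toyGenes, toySamples)
)

## With valid inputs, the call would look like this (not run here):
## toyPlotData <- getPlotDataSet(sampleNames=toySamples,
##     geneNames=toyGenes, sampleData=toySampleData,
##     heatmapData=list(toyHeatmap))
@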

\section{Plot bioMatrix Layout Manually}

The default plot method, plotBioMatrix(), is a convenient and efficient way to 
make a bioMatrix plot. When necessary, users can follow the procedure below 
to generate a bioMatrix layout plot manually.\\

1.  Demo data

<<bioMatrixDemoData>>=
library(caOmicsV)
data(biomatrixPlotDemoData)
dataSet <- biomatrixPlotDemoData
names(dataSet)
@

2.  Initialize bioMatrix Layout

<<InitializeBioMatrix>>=
numOfGenes <- length(dataSet$geneNames);
numOfSamples <- length(dataSet$sampleNames); 
numOfPhenotypes <- nrow(dataSet$sampleInfo)-1;

numOfHeatmap <- length(dataSet$heatmapData);
numOfSummary <- length(dataSet$summaryInfo);
phenotypes   <- rownames(dataSet$sampleInfo)[-1];

sampleHeight <- 0.4;
sampleWidth <- 0.1; 
samplePadding <- 0.025;
geneNameWidth <- 1;
sampleNameHeight <- 2.5;
remarkWidth <- 2; 
summaryWidth <- 1;
rowPadding <- 0.1;

initializeBioMatrixPlot(numOfGenes, numOfSamples, numOfPhenotypes, 
    sampleHeight, sampleWidth, samplePadding, rowPadding, 
    geneNameWidth, remarkWidth, summaryWidth, sampleNameHeight)
caOmicsVColors <- getCaOmicsVColors()

png("caOmicsVbioMatrixLayoutDemo.png", height=8, width=12, 
        units="in", res=300, type="cairo")
par(cex=0.75)
showBioMatrixPlotLayout(dataSet$geneNames,dataSet$sampleNames, phenotypes)
@

3.  Plot tissue types on phenotype area

<<PlotBioMatrixPhenotype>>=
head(dataSet$sampleInfo)[,1:3]
rowIndex <- 2;

sampleGroup <- as.character(dataSet$sampleInfo[rowIndex,])
sampleTypes <- unique(sampleGroup)
sampleColors <- rep("blue", length(sampleGroup));
sampleColors[grep("Tumor", sampleGroup)] <- "red"

rowNumber <- 1
areaName <- "phenotype"
plotBioMatrixSampleData(rowNumber, areaName, sampleColors);

geneLabelX <- getBioMatrixGeneLabelWidth()
maxAreaX <- getBioMatrixDataAreaWidth()
legendH <- getBioMatrixLegendHeight()
plotAreaH <- getBioMatrixPlotAreaHeigth()
sampleH<- getBioMatrixSampleHeight()

sampleLegendX <- geneLabelX + maxAreaX
sampleLegendY <- plotAreaH + legendH - length(sampleTypes)*sampleH
colors <- c("blue", "red")
legend(sampleLegendX, sampleLegendY, legend=sampleTypes, 
    fill=colors,  bty="n", xjust=0)
@
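
The legend in the chunk above pairs a fixed color order with the group names 
returned by unique(), which assumes that the non-tumor group appears first 
among the samples. A minimal base R sketch of one way to keep fill colors and 
legend colors aligned regardless of sample order is given below. It uses a 
named lookup vector. The group labels and all object names (toyGroup, 
toyPalette, and so on) are hypothetical, and nothing is drawn.

<<toyGroupColors>>=
## Toy group labels; the real ones come from the sample information
toyGroup <- c("Tumor", "Normal", "Tumor", "Normal")

## Named lookup: each group name maps to exactly one color
toyPalette <- c(Normal="blue", Tumor="red")

## Per-sample fill colors, in sample order
toyFill <- unname(toyPalette[toyGroup])

## Legend entries and colors are taken from the same lookup
toyLegendNames  <- unique(toyGroup)
toyLegendColors <- unname(toyPalette[toyLegendNames])
data.frame(group=toyLegendNames, color=toyLegendColors)
@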

4.  Heatmap plot 

<<PlotBioMatrixHeatmap>>=
heatmapData <- as.matrix(dataSet$heatmapData[[1]][,]);
plotBioMatrixHeatmap(heatmapData, maxValue=3, minValue=-3)

heatmapData <- as.matrix(dataSet$heatmapData[[2]][,])
plotBioMatrixHeatmap(heatmapData, topAdjust=sampleH/2,  
    maxValue=3, minValue=-3);

secondNames <- as.character(dataSet$secondGeneNames)
textColors <- rep(caOmicsVColors[3], length(secondNames));
plotBioMatrixRowNames(secondNames, "omicsData", textColors, 
    side="right", skipPlotColumns=0);
@

5.  Draw an outline for each sample to show methylation status

<<PlotBioMatrixCategoryData>>=
categoryData <- dataSet$categoryData[[1]]
totalCategory <- length(unique(as.numeric(dataSet$categoryData[[1]])))

plotColors <- rev(getCaOmicsVColors())
plotBioMatrixCategoryData(categoryData, areaName="omicsData", 
    sampleColors=plotColors[1:totalCategory])
@

6.  Binary data plot 

<<PlotBioMatrixBinaryData>>=
binaryData <- dataSet$binaryData[[1]];
plotBioMatrixBinaryData(binaryData, sampleColor=caOmicsVColors[4]);

binaryData <- dataSet$binaryData[[2]];
plotBioMatrixBinaryData(binaryData, sampleColor=caOmicsVColors[3])
@

7.  Plot summary data on right side of plot area

<<PlotBioMatrixSummaryData>>=
summaryData  <- dataSet$summaryInfo[[1]][, 2];
summaryTitle <- colnames(dataSet$summaryInfo[[1]])[2];

remarkWidth <- getBioMatrixRemarkWidth();
sampleWidth <- getBioMatrixSampleWidth();
col2skip <- remarkWidth/2/sampleWidth + 2;

plotBioMatrixRowNames(summaryTitle, areaName="phenotype", 
    colors="black", side="right", skipPlotColumns=col2skip);

plotBioMatrixRowNames(summaryData, "omicsData", 
    colors=caOmicsVColors[3], side="right", 
    skipPlotColumns=col2skip)
@
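
The number of columns to skip is computed from the remark width and the sample 
width. As a worked example, the chunk below repeats that arithmetic with the 
values passed at initialization (remark width 2, sample width 0.1), assuming 
that the getter functions return those same values. The object names are 
hypothetical and kept separate from the variables used in the plot.

<<toyColumnsToSkip>>=
## Assumed values, matching those passed at initialization
toyRemarkWidth <- 2
toySampleWidth <- 0.1

## Half of the remark width, expressed in sample widths, plus 2
toyCol2skip <- toyRemarkWidth/2/toySampleWidth + 2
toyCol2skip
@

Under these assumptions, the summary text is offset by 12 sample columns.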

8.  Add legend

<<AddBioMatrixLegend>>=
bioMatrixLegend(heatmapNames=c("RNASeq", "miRNASeq"), 
    categoryNames=c("Methyl H", "Methyl L"), 
    binaryNames=c("CN LOSS", "CN Gain"),   
    heatmapMin=-3, heatmapMax=3, 
    colorType="BlueWhiteRed")
dev.off()
@

Running the code above should generate the same image as Figure 1. 


\section{Plot bioNetCircos Layout Manually}

With the default bioNetCircos plot method, the node layout and labelling 
rely on the igraph package, and the nodes and labels may not always end up in 
the desired locations. In that case, it is recommended to check the layout 
manually first and then plot each item. \\

The basic procedure for making a bioNetCircos plot is as follows:\\

1.  Demo data

<<BioNetCircosDemoData>>=
library(caOmicsV)

data(bionetPlotDemoData)
dataSet <- bionetPlotDemoData

sampleNames  <- dataSet$sampleNames
geneNames    <- dataSet$geneNames
numOfSamples <- length(sampleNames)

numOfSampleInfo <- nrow(dataSet$sampleInfo) - 1
numOfSummary <- ifelse(dataSet$summaryByRow, 0, ncol(dataSet$summaryInfo)-1)
numOfHeatmap <- length(dataSet$heatmapData)
numOfCategory <- length(dataSet$categoryData)
numOfBinary <- length(dataSet$binaryData)

expr <- dataSet$heatmapData[[1]]
bioNet <- bc3net(expr) 
@

2.  Initialize bioNetCircos layout

<<InitializeBioNetCircos>>=
widthOfSample    <- 100
widthBetweenNode <- 3
lengthOfRadius   <- 10

dataNum <- sum(numOfSampleInfo, numOfSummary, numOfHeatmap, 
    numOfCategory, numOfBinary)
trackheight <- 1.5
widthOfPlotArea  <- dataNum*2*trackheight

initializeBioNetCircos(bioNet, numOfSamples, widthOfSample, 
    lengthOfRadius, widthBetweenNode, widthOfPlotArea)
caOmicsVColors <- getCaOmicsVColors()
supportedType <- getCaOmicsVPlotTypes()

par(cex=0.75)
showBioNetNodesLayout()
@

3.  Manually label each node \\

At this point, each node is labelled with its index. Check the layout to 
choose the desired location for each node name (gene) label, then label each 
node (the node indices here may differ from those in your graph).\\

<<BioNetCircosNodeLayout>>=
par(cex=0.6)
onTop <- c(14, 15, 16, 9, 7, 20, 8, 24, 10, 25)
labelBioNetNodeNames(nodeList=onTop,labelColor="blue", 
    labelLocation="top", labelOffset = 0.7)

onBottom <- c(26, 22, 23, 18, 19, 3, 5)
labelBioNetNodeNames(nodeList=onBottom,labelColor="black", 
    labelLocation="bottom", labelOffset = 0.7)

onLeft <- c(2, 11, 21, 17)
labelBioNetNodeNames(nodeList=onLeft,labelColor="red", 
    labelLocation="left", labelOffset = 0.7)

onRight <- c(13, 12, 4, 1, 6)
labelBioNetNodeNames(nodeList=onRight,labelColor="brown", 
    labelLocation="right", labelOffset = 0.7)
@
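
When node indices are assigned to label positions by hand, a node can easily 
be listed twice or left out. The base R chunk below is a small sanity check 
for the four index lists used above. The total of 26 nodes is an assumption 
inferred from those lists, and the object names are hypothetical. Nothing is 
plotted.

<<toyNodeLabelCheck>>=
## Same index lists as above, kept under separate names
toyLabels <- list(
    top    = c(14, 15, 16, 9, 7, 20, 8, 24, 10, 25),
    bottom = c(26, 22, 23, 18, 19, 3, 5),
    left   = c(2, 11, 21, 17),
    right  = c(13, 12, 4, 1, 6)
)
toyNodeTotal <- 26  # assumed number of nodes in the demo graph

allLabelled   <- unlist(toyLabels, use.names=FALSE)
labelledTwice <- unique(allLabelled[duplicated(allLabelled)])
notLabelled   <- setdiff(seq_len(toyNodeTotal), allLabelled)

## Both counts should be zero when every node is labelled exactly once
length(labelledTwice)
length(notLabelled)
@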

Once all node names are labelled correctly, the plot area of each node can be 
erased to make room for the data tracks.

<<EraseBioNetCircosNodes>>=
eraseBioNetNode()
@

4.  Plot each data set

<<BioNetCircosBoundary>>=
inner <- lengthOfRadius/2
outer <- inner +  trackheight
@

Plot tissue type for each node. Repeat this step if there is more than one 
clinical feature.

<<PlotBioNetCircosPhenotype>>=
groupInfo <- as.character(dataSet$sampleInfo[2, ])
sampleColors <- rep("blue", numOfSamples);
sampleColors[grep("Tumor", groupInfo)] <- "red"

plotType <- supportedType[1]
groupInfo <- matrix(groupInfo, nrow=1)
bioNetCircosPlot(dataValues=groupInfo, 
    plotType, outer, inner, sampleColors)

inner <- outer + 0.5
outer <- inner +  trackheight  
@

Heatmap plot for each node. Repeat this step if there are more heatmap 
datasets. 

<<PlotBioNetCircosHeatmap>>=
exprData <- dataSet$heatmapData[[1]]
plotType <- supportedType[4] 
bioNetCircosPlot(exprData, plotType, outer, inner, 
    plotColors="BlueWhiteRed", maxValue=3, minValue=-3)

inner <- outer + 0.5 
outer <- inner +  trackheight
@

Category data plot for each node. Repeat this step if there are more category 
datasets.

<<PlotBioNetCircosCategoryData>>=
categoryData <- dataSet$categoryData[[1]]
plotType <- supportedType[2];
bioNetCircosPlot(categoryData, plotType, 
    outer, inner, plotColors="red")
inner <- outer + 0.5
outer <- inner +  trackheight
@ 

Binary data plot for each node. Repeat this step if there are more binary 
datasets.

<<PlotBioNetCircosBinaryData>>=
binaryData <- dataSet$binaryData[[1]]
plotType <- supportedType[3]
plotColors <- rep(caOmicsVColors[1], ncol(binaryData))
bioNetCircosPlot(binaryData, plotType, 
    outer, inner, plotColors)

inner <- outer + 0.5
outer <- inner +  trackheight 
@
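
Each track above starts 0.5 outside the previous one and is one track height 
wide. As a sketch of that bookkeeping, the chunk below computes the inner and 
outer boundaries of successive tracks in one step. The helper 
toyTrackBounds() is hypothetical and not part of caOmicsV. The gap of 0.5, 
the track height of 1.5 and the starting radius of 10/2 are the values used 
in this section.

<<toyTrackBoundaries>>=
## Hypothetical helper: inner/outer boundary of each successive track
toyTrackBounds <- function(numOfTracks, firstInner, trackHeight, 
        gap=0.5) {
    trackIndex <- seq_len(numOfTracks)
    ## Each track begins one (height + gap) step after the previous one
    innerAt <- firstInner + (trackIndex - 1)*(trackHeight + gap)
    data.frame(track=trackIndex, inner=innerAt, 
        outer=innerAt + trackHeight)
}

## Four tracks: phenotype, heatmap, category and binary data
toyTrackBounds(numOfTracks=4, firstInner=10/2, trackHeight=1.5)
@

Computing all boundaries up front makes it easier to see how far the outermost 
track extends before any data are drawn.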

Link samples on a node. Repeat this step for each node when needed.

<<LinkBioNetCircosSamples>>=
outer <- 2.5
bioNetGraph <- getBioNetGraph()
nodeIndex <- which(V(bioNetGraph)$name=="PLVAP")

fromSample <- 10 
toSample <- 50 
plotColors <- "red"
linkBioNetSamples(nodeIndex, fromSample, 
    toSample, outer, plotColors)

fromSample <- 40 
toSample <- 20 
plotColors <- "blue"
linkBioNetSamples(nodeIndex, fromSample, 
    toSample, outer, plotColors)
@

5.  Add legend

<<AddBioNetCircosLegend>>=
dataNames <- c("Tissue Type", "RNASeq", "Methylation", "CNV")
bioNetLegend(dataNames, heatmapMin=-3, heatmapMax=3)
@

The output from the code above should be the same as Figure 2.\\


\section{sessionInfo}
<<sessionInfo>>=
sessionInfo()
@ 
\end{document}