%
% NOTE -- ONLY EDIT THE .Rnw FILE!!! The .tex file is
% likely to be overwritten.
%
% \VignetteIndexEntry{goTools overview}
% \VignetteDepends{tools, GO.db}
% \VignetteSuggest{hgu133a.db}
% \VignetteKeywords{Gene Ontology}
% \VignettePackage{goTools}
\documentclass[11pt]{article}
\usepackage{amsmath,epsfig,fullpage}
%\usepackage[authoryear,round]{natbib}
\usepackage{hyperref}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\parindent 0in
%\bibliographystyle{abbrvnat}
\begin{document}
\title{\bf Getting started with the \Rpackage{goTools} package}
\author{Agnes Paquet$^1$ and (Jean) Yee Hwa Yang$^2$}
\maketitle
\begin{center}
1. {\tt paquetagnes@yahoo.com}\\
2. Department of Medicine, University of California, San Francisco, \url{http://www.biostat.ucsf.edu/jean}
\end{center}
% library(tools)
% setwd("c:/MyDoc/Projects/Jean/CVS/goTools/inst/doc")
% setwd("c:/MyDoc/Projects/madman/Rpacks-devel/goTools/inst/doc")
% Rnwfile<- file.path("c:/MyDoc/Projects/madman/Rpacks-devel/goTools/inst/doc","goTools.Rnw")
% Sweave(Rnwfile,pdf=TRUE,eps=TRUE,stylepath=TRUE,driver=RweaveLatex())
\tableofcontents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Getting started}
This document provides a tutorial for the \Rpackage{goTools} package, which
allows graphical comparisons of functional groups between two sets of
genes.
\paragraph{Installing the package:}
To install the \Rpackage{goTools} package, follow the detailed
instructions on the Bioconductor web site,
\url{http://www.bioconductor.org/help/faq}.
\paragraph{Help files:}
As with any R package, detailed information on
functions, classes and methods can be obtained in the help files. For
instance, to view the help file for the function {\tt ontoCompare} in a
browser, use {\tt help.start()} followed by {\tt ?ontoCompare}. \\
We demonstrate the functionality with a randomly
selected set of probe IDs from the Affymetrix hgu133a chip
(\Robject{affylist}). To load the \Robject{probeID}
dataset, use {\tt data(probeID)}, and to view a
description of the experiments and data, type {\tt ?probeID}.
\paragraph{Sweave:}
This document was generated using the \Rfunction{Sweave}
function from the R \Rpackage{tools} package. The source file is in the
{\tt /inst/doc} directory of the \Rpackage{goTools} package.
\paragraph{}
To begin, let's load the package and the \Robject{probeID} dataset into your R
session.
<<eval=TRUE,echo=TRUE>>=
library("goTools", verbose=FALSE)
data(probeID)
@
As shown below, \Robject{affylist} is a list of 3 vectors of
probe ids from the Affymetrix hgu133a chip.
<<eval=TRUE, echo=TRUE>>=
class(affylist)
length(affylist)
affylist[[1]][1:5]
@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Graphical comparisons of two sets of genes}
The Gene Ontology (GO) is organized as a Directed Acyclic Graph (DAG). It
provides three structured networks of defined terms to describe gene product
attributes: Molecular Function (MF), Biological
Process (BP) and Cellular Component (CC). A gene product has one or more
molecular functions and is used in one or more biological processes; it
might be associated with one or more cellular components. To learn more
about GO and DAGs, please refer to the Gene Ontology web site,
\url{http://www.geneontology.org/}.\\
We have created a set of R functions that use the GO structure to describe and
compare the composition of sets of genes (or probes). We use the
following algorithm:
\begin{enumerate}
\item{Read in a {\bf list} of the sets of probe ids you want to compare.}
\item{Map each probe id to corresponding ontologies in the GO tree, if any.}
\item{Create the set of GO ids of interest used to compare your
datasets (\Robject{endnode}). The function \Rfunction{EndNodeList()}
will create a set of nodes of the DAG located one level under MF, BP or
CC, but you can use any sets of GO ids.}
\item{For each GO id, go up the GO tree until
reaching the nodes in \Robject{endnode}. Search may be limited to MF, BP
or CC if specified in \Robject{goType}.}
\item{Compute the percentage of direct children found under each node in
\Robject{endnode}.}
\item{Return the results. Plot them if \Robject{plot=TRUE}.}
\end{enumerate}
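The steps above can be illustrated on a toy ontology. The following chunk is
only a sketch in base R: the parent table \Robject{parents}, the term ids
{\tt A} to {\tt F}, the probe-to-term map \Robject{probe2term} and the helper
\Rfunction{ancestors} are all made up for this illustration. They are not part
of \Rpackage{goTools}, whose internal implementation may differ.
<<eval=TRUE,echo=TRUE>>=
## Toy DAG (hypothetical): each term lists its direct parents.
parents <- list(A = "root", B = "root", C = "A",
                D = c("A", "B"), E = "C", F = "B")
## Hypothetical probe-to-term map; p4 has no annotation.
probe2term <- list(p1 = "E", p2 = "D", p3 = "F", p4 = character(0))
## Nodes one level under the root play the role of the end nodes.
endnode <- c("A", "B")

## Walk up the DAG from 'term' and return every term visited,
## including 'term' itself.
ancestors <- function(term) {
  seen <- character(0)
  todo <- term
  while (length(todo) > 0) {
    cur <- todo[1]
    todo <- todo[-1]
    if (!(cur %in% seen)) {
      seen <- c(seen, cur)
      todo <- c(todo, parents[[cur]])
    }
  }
  seen
}

## For each probe, find which end nodes are reached from its terms.
hits <- lapply(probe2term, function(terms) {
  intersect(endnode, unlist(lapply(terms, ancestors)))
})
## Count the probes found under each end node, then divide by the
## total number of probes (annotated or not).
counts <- table(factor(unlist(hits), levels = endnode))
counts / length(probe2term)
@
In this toy example a probe whose term has two parents (here {\tt p2}) is
counted under both end nodes; this is a choice made for the sketch.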
\subsection{How to use goTools}
The main function that we provide is \Rfunction{ontoCompare}. It takes
as its argument {\bf a list} of vectors of probe ids. The type of the
probe ids must be specified in the argument \Robject{probeType}. For
more details, refer to the corresponding help file by typing
\Rfunction{?ontoCompare}.
%\subsubsection{Examples}
%The following examples demonstrate how to use \Rfunction{ontoCompare} on the
%\Robject{probeID} dataset.
<<eval=FALSE,echo=TRUE>>=
library(GO.db)
subset <- list(L1=affylist[[1]][1:5], L2=affylist[[2]][1:5])
res <- ontoCompare(subset, probeType="hgu133a")
@
\subsubsection{Methods for computing percentages}
\Rfunction{ontoCompare} allows you to choose from 3 different methods to
estimate the percentage of probes under each element of
\Robject{endnode}. The default method is \Robject{TGenes}.
\begin{enumerate}
\item{\Robject{TGenes}: for each end node, return the number of direct
children found / total number of probe ids.\\
This includes oligos which do not have GO annotations.}
\item{\Robject{TIDS}: for each end node, return the number of direct
children found / total number of GO ids describing the list.}
\item{\Robject{none}: for each end node, return the number of direct
children found.}
\end{enumerate}
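As a small worked example of how the three methods differ, the chunk below
applies the definitions above to invented numbers. The counts in
\Robject{found} and the totals \Robject{nProbes} and \Robject{nGOids} are
hypothetical and were not produced by \Rfunction{ontoCompare}.
<<eval=TRUE,echo=TRUE>>=
## Hypothetical number of direct children found under three end nodes.
found <- c("binding" = 6, "catalytic activity" = 3,
           "transporter activity" = 1)
nProbes <- 20   # total number of probe ids, annotated or not
nGOids  <- 12   # total number of GO ids describing the list

TGenes <- found / nProbes   # denominator: all probe ids
TIDS   <- found / nGOids    # denominator: all GO ids
none   <- found             # raw counts, no normalization
round(rbind(TGenes, TIDS, none), 2)
@
Because \Robject{TGenes} divides by all probe ids, including those without GO
annotations, its values are never larger than the \Robject{TIDS} values
whenever the list has at least as many probe ids as GO ids, as is the case
here.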
\subsection{Plotting the results}
The plots are produced by the function \Rfunction{ontoPlot}. It is called by
\Rfunction{ontoCompare} when you set \Robject{plot=TRUE}. You can also call it
directly, passing the results of \Rfunction{ontoCompare} as its
argument. If only one set of genes is passed to
\Rfunction{ontoCompare}, \Rfunction{ontoPlot} will draw a pie
chart. Otherwise, it will draw a bar plot. You can modify the layout of
the \Rfunction{ontoPlot} output using the usual R graphics
parameters. For more details, type \Rfunction{?par}.
<<ontoPlotEg1,fig=TRUE,prefix=FALSE,echo=TRUE,include=FALSE, eval=FALSE>>=
library(GO.db)
subset <- list(L1=affylist[[1]][1:5], L2=affylist[[2]][1:5])
res <- ontoCompare(subset, probeType="hgu133a", plot=TRUE)
@
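The chunk below sketches the other way of getting a plot, calling
\Rfunction{ontoPlot} directly on a result of \Rfunction{ontoCompare}. It
assumes that \Robject{affylist} has been loaded as above; the object names
\Robject{oneSet} and \Robject{res1} are chosen only for this illustration.
Since a single set of genes is passed, a pie chart is expected, as described
in the text.
<<eval=FALSE,echo=TRUE>>=
library(GO.db)
## A list holding a single set of probe ids.
oneSet <- list(L1=affylist[[1]][1:5])
## Compute the results without plotting them.
res1 <- ontoCompare(oneSet, probeType="hgu133a")
## Plot the stored results in a separate step.
ontoPlot(res1)
@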
%See Figure \ref{fig:ontoPlotEg}
\section{How to set up the ``end nodes''}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Default list}
The default list of end nodes is defined by a call to the
function \Rfunction{EndNodeList}. It contains all children of
MF (GO:0003674), BP (GO:0008150) and CC (GO:0005575).
<<eval=TRUE,echo=TRUE>>=
EndNodeList()
@
\subsection{Customized end node list}
If you want to use more ontologies to describe your set of genes, you
can use the function\\
\mbox{\Rfunction{CustomEndNodeList(id,rank)}} to create a bigger
set of end nodes. It returns the GO ids of all children of \Robject{id},
up to \Robject{rank} levels below \Robject{id}.
<<eval=FALSE,echo=TRUE>>=
MFendnode <- CustomEndNodeList("GO:0003674", rank=2)
@
Finally, the code below shows you how to use a custom end node list, and also
how to modify the \Robject{goType} argument to select only Molecular
Function (MF) ontologies.
<<eval=FALSE,echo=TRUE>>=
res <- ontoCompare(subset, probeType="hgu133a", endnode=MFendnode, goType="MF")
@
You can also create your own
list of GO ids of nodes of interest and pass it directly to the
\Robject{endnode} argument of \Rfunction{ontoCompare}. GO ids must be in
the following format:
\mbox{\texttt{"GO:\textit{XXXXXXX}"}}, where \textit{XXXXXXX} is a
seven-digit number.
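As an illustration of passing a hand-picked set of end nodes, the chunk below
uses two GO ids from the Molecular Function ontology, GO:0003824 (catalytic
activity) and GO:0005488 (binding). The object name \Robject{myNodes} and the
choice of these two terms are assumptions made for this sketch, and
\Robject{subset} is assumed to have been created as in the earlier examples.
<<eval=FALSE,echo=TRUE>>=
## GO ids chosen only for illustration.
myNodes <- c("GO:0003824", "GO:0005488")
## Check the expected format: "GO:" followed by seven digits.
stopifnot(all(grepl("^GO:[0-9]{7}$", myNodes)))
## Use the hand-picked ids as end nodes, restricted to MF.
res <- ontoCompare(subset, probeType="hgu133a",
                   endnode=myNodes, goType="MF")
@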
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{figure} %[htpb]
%\begin{center}
%\begin{tabular}{cc}
%\includegraphics[width=3in,height=3in,angle=0]{ontoPlotEg1} %&
%\includegraphics[width=3in,height=3in,angle=0]{ontoPlotEg2} \\
%(a) & (b)\\
%\end{tabular}
%\end{center}
%\caption{Plots obtained using \Rfunction{ontoCompare} on \Robject{probeID} dataset}
%\protect\label{fig:ontoPlotEg}
%\end{figure}
%\bibliography{marrayPacks}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{document}