%\VignetteIndexEntry{EnrichmentBrowser Manual}
%\VignetteDepends{EnrichmentBrowser}
%\VignettePackage{EnrichmentBrowser}
%\VignetteEngine{utils::Sweave}

\documentclass[a4paper]{article}

<<style, eval=TRUE, echo=FALSE, results=tex>>=
BiocStyle::latex()
@

\usepackage{graphicx, subfig}
\usepackage[utf8]{inputenc} 

\bioctitle[EnrichmentBrowser]{EnrichmentBrowser: seamless navigation through combined \\
        results of set-based and network-based enrichment analysis}
\author{Ludwig Geistlinger} 

\begin{document}

\maketitle
\tableofcontents

\section{Introduction}
The \Biocpkg{EnrichmentBrowser} package implements essential functionality for
the enrichment analysis of gene expression data.
The analysis combines the advantages of set-based and network-based enrichment 
analysis to derive high-confidence gene sets and biological pathways 
that are differentially regulated in the expression data under investigation.
Besides, the package facilitates the visualization and exploration of such sets 
and pathways.\\
The following instructions will guide you through an end-to-end expression data
analysis workflow including:

\begin{enumerate}
\item Preparing the data
\item Preprocessing of the data
\item Differential expression (DE) analysis
\item Defining gene sets of interest
\item Executing individual enrichment methods
\item Combining the results of different methods
\item Visualize and explore the results
\end{enumerate}

All of these steps are modular, i.e.~each step can be executed individually and 
fine-tuned with several parameters. In case you are interested in a 
particular step, you can directly move on to the respective section.
For example, if you have differential expression already calculated for each gene, 
and your are now interested whether certain gene functions are enriched for differential expression, 
section \emph{Set-based enrichment analysis} would be the one you should go for. 
The last section \emph{Putting it all together} also demonstrates how to wrap 
the whole workflow into a single function, making use of suitably chosen 
defaults.

\section{Reading expression data from file}
Typically, the expression data is not already available in \R{} but 
rather has to be read in from file. 
This can be done using the function \Rfunction{read.eset}, which reads 
the expression data (\Rfunction{exprs}) along with the phenotype data 
(\Rfunction{pData}) and feature data (\Rfunction{fData}) into an 
\Rclass{ExpressionSet}. 
<<read-eset>>=
library(EnrichmentBrowser)
data.dir <- system.file("extdata", package="EnrichmentBrowser")
exprs.file <- file.path(data.dir, "exprs.tab")
pdat.file <- file.path(data.dir, "pData.tab")
fdat.file <- file.path(data.dir, "fData.tab")
eset <- read.eset(exprs.file, pdat.file, fdat.file)
@

The man pages provide details on file format and the \Rclass{ExpressionSet} 
data structure.

<<help, eval=FALSE>>=
?read.eset
?ExpressionSet
@

\newpage

\section{Types of expression data}

The two major data types processed by the \Biocpkg{EnrichmentBrowser} are 
microarray (intensity measurements) and RNA-seq (read counts) data.

\subsection{Microarray data}

To demonstrate the functionality of the package for microarray data, we 
consider expression measurements of patients suffering from acute lymphoblastic
 leukemia \cite{Chiaretti04}. A frequent chromosomal defect found among these 
patients is a translocation, in which parts of chromosome 9 and 22 swap places. 
This results in the oncogenic fusion gene BCR/ABL created by positioning the 
ABL1 gene on chromosome 9 to a part of the BCR gene on chromosome 22.\\
We load the \Biocpkg{ALL} dataset
<<load-ALL>>=
library(ALL)
data(ALL)
@
and select B-cell ALL patients with and without the BCR/ABL fusion as it has 
been described previously \cite{Scholtens05}.
<<subset-ALL>>=
ind.bs <- grep("^B", ALL$BT)
ind.mut <- which(ALL$mol.biol %in% c("BCR/ABL", "NEG"))
sset <- intersect(ind.bs, ind.mut)
all.eset <- ALL[, sset]
@
We can now access the expression values, which are intensity measurements 
on a log-scale for 12,625 probes (rows) across 79 patients (columns).

<<show-ALL>>=
dim(all.eset)
exprs(all.eset)[1:4,1:4]
@

As we often have more than one probe per gene, we compute gene expression values
as the average of the corresponding probe values.
<<probe2gene>>=
all.eset <- probe.2.gene.eset(all.eset)
head(featureNames(all.eset))
@
Note, that the mapping from probe to gene is done automatically as long as 
as you have the corresponding annotation package, here the 
\Biocpkg{hgu95av2.db} package, installed. Otherwise, the mapping can be 
defined in the \Rfunction{fData} slot.
<<show-probe2gene>>=
head(fData(eset))
@

\subsection{RNA-seq data}
To demonstrate the functionality of the package for RNA-seq data, we 
consider transcriptome profiles of four primary human airway smooth muscle cell 
lines in two conditions: control and treatment with dexamethasone \cite{Himes14}.

We load the \Biocpkg{airway} dataset
<<load-airway>>=
library(airway)
data(airway)
@

and create the  \Rclass{ExpressionSet}. 
For further analysis, we remove genes with very low read counts and 
measurements that are not mapped to an ENSEMBL gene ID.

<<preproc-airway>>=
expr <- assays(airway)[[1]]
expr <- expr[grep("^ENSG", rownames(expr)),]
expr <- expr[rowMeans(expr) > 10,]
air.eset <- new("ExpressionSet", exprs=expr, annotation="hsa")
dim(air.eset)
exprs(air.eset)[1:4,1:4]
@

\newpage

\section{Normalization}

Normalization of high-throughput expression data is essential to make results 
within and between experiments comparable.
Microarray (intensity measurements) and RNA-seq (read counts) data
exhibit typically distinct features that need to be normalized for.
The function \Rfunction{normalize} wraps commonly used functionality from 
\Biocpkg{limma} for microarray normalization and from \Biocpkg{EDASeq} for 
RNA-seq normalization.
For specific needs that deviate from these standard normalizations, the user
should always refer to more specific functions/packages.\\
Microarray data is expected to be single-channel. For two-color arrays, it is 
expected that normalization within arrays has been already carried out, 
e.g.~using \Rfunction{normalizeWithinArrays} from \Biocpkg{limma}.

A default quantile normalization based on
\Rfunction{normalizeBetweenArrays} from \Biocpkg{limma} can be carried out via

<<norm-ma>>=
before.norm <- exprs(all.eset)
all.eset <- normalize(all.eset, norm.method="quantile")
after.norm <- exprs(all.eset)
@

\begin{center}
<<plot-norm, fig=TRUE, width=12, height=6>>=
par(mfrow=c(1,2))
boxplot(before.norm)
boxplot(after.norm)
@
\end{center}

Note that this is only done for demonstration, as the ALL data has 
been already RMA-normalized by the authors of the ALL dataset.\\  
RNA-seq data is expected to be raw read counts. 
Note that normalization for downstream DE analysis, e.g.~with 
\Biocpkg{edgeR} and \Biocpkg{DESeq2}, is not ultimately necessary 
(and in some cases even discouraged) as many of these tools implement specific 
normalization approaches themselves.
See the vignette of \Biocpkg{EDASeq}, \Biocpkg{edgeR}, and \Biocpkg{DESeq2} for
details.
In case normalization is desired, between-lane normalization to adjust for 
sequencing depth can be carried out as demonstrated for microarray data.

<<norm-rseq>>=
norm.air <- normalize(air.eset, norm.method="quantile")
@

Within-lane normalization to adjust for gene-specific effects such as gene 
length and GC-content requires to retrieve this information first, 
e.g.~from BioMart or specific \Bioconductor annotation packages.
Both modes are implemented in the \Biocpkg{EDASeq} function \Rfunction{getGeneLengthAndGCContent}.

<<lgc, eval=FALSE>>=
ids <- head(featureNames(air.eset))
lgc <- EDASeq::getGeneLengthAndGCContent(ids, org="hsa", mode="biomart") 
@

Using precomputed information, normalization within and between lanes can be carried out via

<<norm2-rseq>>=
lgc.file <- file.path(data.dir, "air_lgc.tab")
fData(air.eset) <- read.delim(lgc.file)
norm.air <- normalize(air.eset, within=TRUE)
@

\newpage

\section{Differential expression}

Differential expression analysis between sample groups can be performed using 
the function \Rfunction{de.ana}.
As a prerequisite, the phenotype data should contain a binary group assignment 
for each patient.
For the ALL dataset, this indicates whether the BCR-ABL gene fusion is 
present (1) or not (0). 

<<sample-groups-ALL>>=
pData(all.eset)$GROUP <- ifelse(all.eset$mol.biol == "BCR/ABL", 1, 0)
table(pData(all.eset)$GROUP)
@

For the airway dataset, this indicates whether the cell lines have been treated 
with dexamethasone (1) or not (0).

<<sample-groups-airway>>=
pData(air.eset)$GROUP <- ifelse(colData(airway)$dex == "trt", 1, 0)
table(pData(air.eset)$GROUP)
@

Paired samples, or in general sample batches/blocks, can be defined via a 
\Robject{BLOCK} column in the \Rfunction{pData} slot.
For the airway dataset, the sample blocks correspond to the four different cell 
lines.

<<sample-blocks>>=
pData(air.eset)$BLOCK <- colData(airway)$cell
table(pData(air.eset)$BLOCK)
@

For microarray expression data, the \Rfunction{de.ana} function carries 
out a differential expression analysis between the two groups based on 
functionality from the \Biocpkg{limma} package.
Resulting fold changes and $t$-test derived $p$-values for each gene are 
appended to the \Robject{fData} slot.

<<DE-ana-ALL>>=
all.eset <- de.ana(all.eset)
head(fData(all.eset), n=4)
@
 
Raw $p$-values are already corrected for multiple testing 
(\Robject{ADJ.PVAL}) using the method from Benjamini and Hochberg implemented 
in the function \Rfunction{p.adjust} from the \CRANpkg{stats} package.\\
To get a first overview, we inspect the $p$-value distribution and the volcano 
plot (fold change against $p$-value).
 
\begin{center}
<<plot-DE, fig=TRUE, width=12, height=6>>=
par(mfrow=c(1,2))
pdistr(fData(all.eset)$ADJ.PVAL)
volcano(fData(all.eset)$FC, fData(all.eset)$ADJ.PVAL)
@
\end{center}

The expression change of highest statistical significance is observed for the 
ENTREZ gene \verb=7525=. 
<<DE-exmpl>>=
fData(all.eset)[ which.min(fData(all.eset)$ADJ.PVAL), ]
@
This turns out to be the YES proto-oncogene 1 
(\href{http://www.genome.jp/dbget-bin/www_bget?hsa:7525}{hsa:7525@KEGG}).
  
For RNA-seq data, the \Rfunction{de.ana} function carries out a differential 
expression analysis between the two groups either based on functionality from
\Biocpkg{limma} (that includes the \Rfunction{voom} transformation), or 
alternatively, the popular \Biocpkg{edgeR} or \Biocpkg{DESeq2} package. 
Here, we use the analysis based on \Biocpkg{edgeR} for demonstration.

<<DE-ana-airway>>=
air.eset <- de.ana(air.eset, de.method="edgeR")
head(fData(air.eset), n=4)
@

\newpage

\section{ID mapping}

Using genomic information from different resources often requires mapping 
between different types of gene identifiers.
Although primary analysis steps such as normalization and differential expression 
analysis can be carried out independent of the gene ID type, downstream exploration 
functionality of the \Biocpkg{EnrichmentBrowser} is consistently based on NCBI Entrez Gene IDs.
It is thus, in this regard, beneficial to initially map gene IDs of a different type 
to NCBI Entrez IDs.\\
The function \Rfunction{id.types} lists the available ID types for the mapping 
depending on the organism under investigation.

<<idmap-idtypes>>=
id.types("hsa")
@

ID mapping for the airway dataset (from ENSEMBL to ENTREZ gene ids) can then be
carried out using the function \Rfunction{map.ids}.

<<idmap-airway>>=
head(featureNames(air.eset))
air.eset <- map.ids(air.eset, from="ENSEMBL", to="ENTREZID")
head(featureNames(air.eset))
@

Now, we subject the ALL and the airway gene expression data to the enrichment 
analysis. 

\newpage

\section{Set-based enrichment analysis}
In the following, we introduce how the \Biocpkg{EnrichmentBrowser} package 
can be used to perform state-of-the-art enrichment analysis of gene sets. 
We consider the ALL and the airway gene expression data as processed in the 
previous sections. We are now interested whether there are not only single genes
 that are differentially expressed, but also sets of genes known to work together, 
e.g.~as defined in the Gene Ontology or the KEGG pathway annotation.\\
The function \Rfunction{get.kegg.genesets}, which is based on 
functionality from the \Biocpkg{KEGGREST} package, downloads all KEGG 
pathways for a chosen organism as gene sets.

<<get-kegg-gs, eval=FALSE>>=
kegg.gs <- get.kegg.genesets("hsa")
@

Analogously, the function \Rfunction{get.go.genesets} defines GO 
terms of a selected ontology as gene sets.

<<get-go-gs, eval=FALSE>>=
go.gs <- get.go.genesets(org="hsa", onto="BP", mode="GO.db")
@

User-defined gene sets can be parsed from the GMT file format 
<<parseGMT>>=
gmt.file <- file.path(data.dir, "hsa_kegg_gs.gmt")
hsa.gs <- parse.genesets.from.GMT(gmt.file)
length(hsa.gs)
hsa.gs[1:2]
@

Currently, the following set-based enrichment analysis methods are supported
<<sbea-methods>>=
sbea.methods()
@

\begin{itemize}
\item{ORA: Overrepresentation Analysis (simple and frequently used test based 
        on the hypergeometric distribution, see \cite{Goeman07} for a critical 
        review)}
\item{SAFE: Significance Analysis of Function and Expression 
        (resampling version of ORA, implements additional test statistics, e.g.~Wilcoxon's
        rank sum, and allows to estimate the significance of gene sets by 
        sample permutation; implemented in the \Biocpkg{safe} package)} 
\item{GSEA: Gene Set Enrichment Analysis (frequently used and widely accepted, 
        uses a Kolmogorovâ€“Smirnov statistic to test whether the ranks of the 
        $p$-values of genes in a gene set resemble a uniform distribution 
        \cite{Subramanian05})}
\item{PADOG: Pathway Analysis with Down-weighting of Overlapping Genes 
	(incorporates gene weights to favor genes appearing in few pathways versus genes 
	that appear in many pathways; implemented in the \Biocpkg{PADOG} package)}
\item{ROAST: ROtAtion gene Set Test 
	(uses rotation instead of permutation for assessment of gene set significance; 
	implemented in the \Biocpkg{limma} and \Biocpkg{edgeR} packages for microarray 
	and RNA-seq data, respectively)}
\item{CAMERA: Correlation Adjusted MEan RAnk gene set test
	(accounts for inter-gene correlations as implemented in the \Biocpkg{limma} 
	and \Biocpkg{edgeR} packages for microarray and RNA-seq data, respectively)}
\item{GSA: Gene Set Analysis
	(differs from GSEA by using the maxmean statistic, i.e.~the mean of the positive or 
	negative part of gene scores in the gene set; implemented in the \CRANpkg{GSA} package)}
\item{GSVA: Gene Set Variation Analysis
	(transforms the data from a gene by sample matrix to a gene set by sample matrix, 
	thereby allowing the evaluation of gene set enrichment for each sample;
	implemented in the \Biocpkg{GSVA} package)}
\item{GLOBALTEST: Global testing of groups of genes
	(general test of groups of genes for association with a response variable;
	implemented in the \Biocpkg{globaltest} package)}
\item{SAMGS: Significance Analysis of Microarrays on Gene Sets 
        (extending the SAM method for single genes to gene set analysis 
        \cite{Dinu07})}
\item{EBM: Empirical Brown's Method (combines $p$-values of genes in a 
gene set using Brown's method to combine $p$-values from dependent tests;
implemented in \Biocpkg{EmpiricalBrownsMethod})}
\item{MGSA: Model-based Gene Set Analysis
	(Bayesian modeling approach taking set overlap into account by working on all sets 
	simultaneously, thereby reducing the number of redundant sets; implemented in \Biocpkg{mgsa})}
\end{itemize}
For demonstration, we perform a basic ORA choosing a significance level 
$\alpha$ of 0.05.
<<sbea>>=
sbea.res <- sbea(method="ora", eset=all.eset, gs=hsa.gs, perm=0, alpha=0.05)
gs.ranking(sbea.res)
@

The result of every enrichment analysis is a ranking of gene sets by the 
corresponding $p$-value. The \Rfunction{gs.ranking} function displays only those
gene sets satisfying the chosen significance level $\alpha$.\\
While such a ranked list is the standard output of existing enrichment tools, 
the \Biocpkg{EnrichmentBrowser} package provides visualization 
and interactive exploration of resulting gene sets far beyond that point.
Using the \Rfunction{ea.browse} function creates a HTML summary from which each
gene set can be inspected in detail (this builds on functionality from the
\Biocpkg{ReportingTools} package). 
The various options are described in Figure \ref{fig:oraRes}. 

<<ea-browse, eval=FALSE>>=
ea.browse(sbea.res)
@

\begin{figure}
\centering
\includegraphics[width=14cm, height=8.75cm]{ora_ebrowse.png}
\caption{ORA result view. For each significant gene set in the ranking, the user
    can select to view (1) a gene report, that lists all genes of a set along
    with fold change and DE $p$-value, (2) interactive overview plots
    such as heatmap, $p$-value distribution, and volcano plot, (3) the pathway 
    in KEGG with differentially expressed genes highlighted in red.}\label{fig:oraRes}
\end{figure}

The goal of the \Biocpkg{EnrichmentBrowser} package is to provide
frequently used enrichment methods. However, it is also possible to exploit
its visualization capabilities with user-defined set-based enrichment methods.
This requires to implement a function that takes the characteristic arguments
\Robject{eset} (expression data), \Robject{gs} (gene sets), \Robject{alpha}
(significance level), and \Robject{perm} (number of permutations). 
In addition, it must return a numeric vector \Robject{ps} storing the resulting
$p$-value for each gene set in \Robject{gs}. The $p$-value vector must also be
named accordingly (i.e.~\Rcode{names(ps) == names(gs)}).\\
Let us consider the following dummy enrichment method, which randomly renders 
five gene sets significant and the remaining insignificant.

<<dummy-sbea>>=
dummy.sbea <- function(eset, gs, alpha, perm)
{
        sig.ps <- sample(seq(0,0.05, length=1000),5)
        insig.ps <- sample(seq(0.1,1, length=1000), length(gs)-5)
        ps <- sample(c(sig.ps, insig.ps), length(gs))
        names(ps) <- names(gs)
        return(ps)
}
@

\newpage

We can plug this method into \Rfunction{sbea} as before.

<<sbea2>>=
sbea.res2 <- sbea(method=dummy.sbea, eset=all.eset, gs=hsa.gs)
gs.ranking(sbea.res2)
@ 

\newpage
\section{Network-based enrichment analysis}
Having found sets of genes that are differentially regulated in the ALL data, 
we are now interested whether these findings can be supported by known 
regulatory interactions. 
For example, we want to know whether transcription factors and their target 
genes are expressed in accordance to the connecting regulations. 
Such information is usually given in a gene regulatory network derived from 
specific experiments, e.g.~using the \Biocpkg{GeneNetworkBuilder}, or compiled 
from the literature (\cite{Geistlinger13} for an example). 
There are well-studied processes and organisms for which comprehensive and 
well-annotated regulatory networks are available, e.g.~the \verb=RegulonDB= for
\emph{E.~coli} and \verb=Yeastract= for \emph{S.~cerevisiae}. 
However, there are also cases where such a network is missing. 
A first simple workaround is to compile a network from regulations in the KEGG
database.\\
We can download all KEGG pathways of a specified organism via the 
\Rfunction{download.kegg.pathways} function that exploits functionality
from the \Biocpkg{KEGGREST} package.

<<dwnld-pwys, eval=FALSE>>=
pwys <- download.kegg.pathways("hsa")
@

In this case, we have already downloaded a selection of human KEGG pathways.
We parse them making use of the \Biocpkg{KEGGgraph} package and compile the 
resulting gene regulatory network.   
<<compile-grn>>=
pwys <- file.path(data.dir, "hsa_kegg_pwys.zip")
hsa.grn <- compile.grn.from.kegg(pwys)
head(hsa.grn)
@

Now, we are able to perform enrichment analysis using the compiled network.
Currently, the following network-based enrichment analysis methods are supported
<<nbea-methods>>=
nbea.methods()
@
 
\begin{itemize}
\item{GGEA: Gene Graph Enrichment Analysis 
    (evaluates consistency of known regulatory interactions with the observed
    expression data \cite{Geistlinger11})}
\item{SPIA: Signaling Pathway Impact Analysis 
    (combines ORA with the probability that expression changes
	are propagated across the pathway topology;
	implemented in the \Biocpkg{SPIA} package)}
\item{PathNet: Pathway Analysis using Network Information
    (applies ORA on combined evidence for the observed signal for gene nodes and the signal 
	implied by connected neighbors in the network; 
	implemented in the \Biocpkg{PathNet} package)}
\item{DEGraph: Differential expression testing for gene graphs
	(multivariate testing of differences in mean incorporating underlying 
	graph structure; implemented in the \Biocpkg{DEGraph} package)}
\item{TopologyGSA: Topology-based Gene Set Analysis
	 (uses Gaussian graphical models to incorporate the dependence structure 
	among genes as implied by pathway topology;
	implemented in the \CRANpkg{topologyGSA} package)}
\item{GANPA: Gene Association Network-based Pathway Analysis
	(incorporates network-derived gene weights in the enrichment analysis;
	implemented in the \CRANpkg{GANPA} package)}
\item{CePa: Centrality-based Pathway enrichment
	(incorporates network centralities as node weights mapped from 
	differentially expressed genes in pathways;
		implemented in the \CRANpkg{CePa} package)}
\item{NetGSA: Network-based Gene Set Analysis
	(incorporates external information about interactions among genes
	as well as novel interactions learned from data;
	implemented in the \CRANpkg{NetGSA} package)}
\item{NEA: Network Enrichment Analysis 
    (applies ORA on interactions instead of genes; 
	implemented in the \Biocpkg{neaGUI} package)}
\end{itemize}

\newpage
For demonstration, we perform GGEA using the compiled KEGG regulatory network.

<<nbea>>=
nbea.res <- nbea(method="ggea", eset=all.eset, gs=hsa.gs, grn=hsa.grn)
gs.ranking(nbea.res)
@

The resulting ranking lists, for each statistically significant gene set, the 
number of relations of the network involving a member of the gene set under study 
(\texttt{NR.RELS}), the sum of consistencies over the relations of the set (\texttt{RAW.SCORE}), 
the score normalized by induced network size 
(\texttt{NORM.SCORE} = \texttt{RAW.SCORE} / \texttt{NR.RELS}), 
and the statistical significance of each gene set based on a permutation 
approach.\\
A GGEA graph for a gene set depicts the consistency of each interaction in the set. 
Nodes (genes) are colored according to expression (up-/down-regulated) and 
edges (interactions) are colored according to consistency, i.e.~how well the 
interaction type (activation/inhibition) is reflected in the correlation of the 
observed expression of both interaction partners.

\begin{center}
<<ggea-graph, fig=TRUE, width=12, height=6>>=
par(mfrow=c(1,2))
ggea.graph(
    gs=hsa.gs[["hsa05217_Basal_cell_carcinoma"]], 
    grn=hsa.grn, eset=all.eset)
ggea.graph.legend()
@
\end{center}

As described in the previous section, it is also possible to plug in user-defined
network-based enrichment methods.

\newpage
\section{Combining results}
Different enrichment analysis methods usually result in different gene set 
rankings for the same dataset.
To compare results and detect gene sets that are supported by different methods,
the \Biocpkg{EnrichmentBrowser} package allows to combine results from the 
different set-based and network-based enrichment analysis methods.
The combination of results yields a new ranking of the gene sets under 
investigation by specified ranking criteria, e.g.~the average rank across methods.
% or a combined $p$-value using Fisher's method or Stouffer's method \cite{Kim13}.\\
We consider the ORA result and the GGEA result from the previous sections and 
use the function \Rfunction{comb.ea.results}.

<<combine>>=
res.list <- list(sbea.res, nbea.res)
comb.res <- comb.ea.results(res.list)
@
 
The combined result can be detailedly inspected as before and interactively 
ranked as depicted in Figure \ref{fig:combRes}. 

<<browse-comb, eval=FALSE>>=
ea.browse(comb.res, graph.view=hsa.grn, nr.show=5)
@

\begin{figure}[!h]
\centering
\includegraphics[width=17cm, height=6.8cm]{comb_ebrowse.png}
\caption{Combined result view. By clicking on one of the columns 
    (ORA.RANK, ..., GGEA.PVAL) the result can be interactively ranked according
    to the selected criterion.}\label{fig:combRes}
\end{figure}

\newpage
\section{Putting it all together}
There are cases where it is necessary to perform certain steps of the demonstrated
enrichment analysis pipeline individually.
However, it is often more convenient to run the complete standardized pipeline.
This can be done using the all-in-one wrapper function \Rfunction{ebrowser}.
For example, the result page displayed in Figure \ref{fig:combRes} can also be 
produced from scratch via 

<<all-in-one, eval=FALSE>>=
ebrowser(   meth=c("ora", "ggea"), 
        exprs=exprs.file, pdat=pdat.file, fdat=fdat.file, 
        org="hsa", gs=hsa.gs, grn=hsa.grn, comb=TRUE, nr.show=5)
@

%\newpage
%\section{Advanced: Configuring the EnrichmentBrowser}
%TODO

\newpage
\section{Frequently asked questions}
\begin{enumerate}
\item \textbf{How to cite the \Biocpkg{EnrichmentBrowser}?}\\[0.25cm]
Geistlinger L, Csaba G and Zimmer R. Bioconductor's EnrichmentBrowser: 
seamless navigation through combined results of set- \& network-based enrichment
analysis. \textit{BMC Bioinformatics}, \textbf{17}:45, 2016.\\[0.25cm]
\item \textbf{Is it possible to apply the \Biocpkg{EnrichmentBrowser} to simple gene lists?}\\[0.25cm]
Enrichment methods implemented in the \Biocpkg{EnrichmentBrowser} are, except for ORA, 
expression-based (and also draw their strength from that).
The set-based methods GSEA, SAFE, and SAMGS use sample permutation,
involving recomputation of differential expression, for gene set
significance estimation, i.e.~they require the complete expression
matrix.
The network-based methods require measures of differential expression 
such as fold change and $p$-value to score interactions of the network.
In addition, visualization of enriched gene sets is explicitly designed for 
expression data.
Thus, for simple gene list enrichment, tools like \software{DAVID}
(\href{https://david.ncifcrf.gov}{https://david.ncifcrf.gov}) and
\software{GeneAnalytics} (\href{http://geneanalytics.genecards.org}{http://geneanalytics.genecards.org})
are more suitable, and it is recommended to use them for this purpose.
\end{enumerate}
%\bibliographystyle{unsrturl}
\bibliography{refs}
\end{document}