%\VignetteIndexEntry{Using weaver to process Sweave documents}
%\VignetteDepends{weaver}
%\VignetteKeywords{latex,sweave,cache}
%\VignettePackage{weaver}
\documentclass{article}
\usepackage{hyperref}
\textwidth=6.2in
\textheight=8.5in
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-.3in
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}
\newcommand{\Rcode}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}
\newcommand{\classdef}[1]{%
{\em #1}
}
\begin{document}
\title{How to use weaver for Sweave document processing}
\author{Seth Falcon}
\date{8 June, 2006}
\maketitle
\section{Introduction}
The \Rpackage{weaver} package provides extensions to the Sweave
utilities included in R's \Rpackage{utils} package. The focus of the
extensions is on caching computationally expensive (time consuming)
code chunks in Sweave documents.
\textit{Why would I want to cache code chunks?} If your Sweave
document includes one or more code chunks that take a long time to
compute, you may find it frustrating to make small changes to the
document. Each run requires recomputing the ``expensive'' code
chunks. If these chunks aren't changing, you can benefit from the
caching provided by \Rpackage{weaver}.
\textit{How does it work?} The details are in the code, of course,
but in a few words... You tell \Rpackage{weaver} which code chunks you
want cached by setting a chunk option (\Rcode{cache=TRUE}). A digest
(md5 sum) of the text representation of \textit{each} expression in
the code chunk is computed and the result of the expression is stored
in a file named by the expression's digest. Dependencies on
previously cached expressions are determined using functions from the
\Rpackage{codetools} package. After the cache files have been
created, subsequent runs load the cache instead of evaluating the
expression (this means side-effects are completely lost!). When
changes in the dependencies of an expression are detected (or when the
expression itself has changed), it is recomputed and the cache file is
updated.
\section{Using the expression caching feature}
If you add the chunk option \Rcode{cache=TRUE}, then caching will be
turned on for all expressions in the chunk. Here's an example:
\input{chunk-example1}
Side-effects, such as printing, plotting, definging S4 classes or
methods, or setting global options are not captured by the caching
mechanism. Avoid doing such things in a code chunk that has
\Rcode{cache=TRUE}. Treat cached code chunks as if you had set the
option \Rcode{results=hide}.
\subsection{Warnings about using the caching feature}
Do not stare directly at the cache! May cause blindness, headache,
shortness of breath, and dizzyness.
\begin{itemize}
\item Printing doesn't work in cached chunks since it is a side
effect.
\item The dependency detection is imperfect and will fail you. When
you've made important changes, you should remove all cache files
and rebuild the document. By default, the cache database is
stored in a directory named \verb+r_env_cache+ in the current
working directory. Removing this directory is the best way to
be certain that the following run will not use any cached data.
A log file is produced in the current working directory named
\verb+weaver_debug_log.txt+. Reviewing it can be useful in
determining what the \Rpackage{weaver} system thinks the
dependencies of a given expression are.
\item Caching is performed separately on each expression in a chunk
which has the option \Rcode{cache=TRUE} set. Be especially
careful with repeated calls to random number based functions
like \Rfunction{rnorm}. Repeated calls within cached chunks
will pull from the cache rather than computing a new stream of
random numbers.
\item The cache is not document specific. If you have two documents
in the same working directory that contain equivalent
expressions within a chunk that has caching turned on, you will
get the cached value. I think this is a feature and will be
useful for testing purposes, but could be surprising.
\end{itemize}
\section{Processing a document from inside R}
To process a document using \Rpackage{weaver}, load the
\Rpackage{weaver} package and then use \Rfunction{weaver()} as the
\Rcode{driver} argument to \Rfunction{Sweave}. Here is an example:
<<basicProc>>=
library("weaver")
testDocPath <- system.file("extdata/doc1.Rnw", package="weaver")
curDir <- getwd()
setwd(tempdir())
z <- capture.output(Sweave(testDocPath, driver=weaver()),
file=tempfile())
setwd(curDir)
@
Note that the calls to \Rfunction{setwd} are only needed here because
we are processing an Sweave document inside an Sweave document. Also,
\Rfunction{capture.output} was used to keep this document short and to
encourage you to run the examples yourself\footnote{In addition, some
of the output is sent to stderr and this is not captured when
running Sweave inside Sweave.}
Now we run another sample document.
<<another>>=
testDocPath <- system.file("extdata/doc2.Rnw", package="weaver")
curDir <- getwd()
setwd(tempdir())
z <- capture.output(Sweave(testDocPath, driver=weaver()),
file=tempfile())
setwd(curDir)
@
Finally, we run our first example document again. This time, you can
see that data from the cache is being used.
<<again>>=
testDocPath <- system.file("extdata/doc1.Rnw", package="weaver")
curDir <- getwd()
setwd(tempdir())
z <- capture.output(Sweave(testDocPath, driver=weaver()),
file=tempfile())
setwd(curDir)
@
\section{Sample convenience shell script}
You can use this shell script to make processing \texttt{Rnw} files
with \Rpackage{weaver} easier.
\begin{verbatim}
#!/bin/bash
echo "library(weaver); Sweave(\"$1\", driver=weaver())" \
| R --no-save --no-restore
\end{verbatim}
If you put that into a file \texttt{weaver.sh}, then you can do:
\begin{verbatim}
weaver.sh somefile.Rnw
\end{verbatim}
to process \texttt{somefile.Rnw} with \Rpackage{weaver}. Another
useful script is one that does the processing without using any cached
data. This is useful, for example, when you are ready to produce a
final draft of your document.
\begin{verbatim}
#!/bin/bash
echo "library(weaver); Sweave(\"$1\", driver=weaver(), use.cache=FALSE)" \
| R --no-save --no-restore
\end{verbatim}
\section{Session Info}
<<sessionInfo, results=tex>>=
toLatex(sessionInfo())
@
\end{document}