%\VignetteEngine{knitr}
%\VignetteIndexEntry{in-silico cleavage of polypeptides}
%\VignetteKeyword{polypeptide, cleavage}
%\VignettePackage{cleaver}
\documentclass[12pt,a4paper,english]{scrartcl}
\usepackage[super]{natbib}

<<style, eval=TRUE, echo=FALSE, results="asis">>=
BiocStyle::latex(use.unsrturl=FALSE)
@

\newcommand{\cleaver}{\Biocpkg{cleaver}}

\author{Sebastian Gibb%
  \thanks{\email{mail@sebastiangibb.de}}
}
\date{\today}

<<setup, include=FALSE, cache=FALSE>>=
library("knitr")
opts_chunk$set(tidy.opts=list(width.cutoff=45, tidy=FALSE), fig.align='center', comment=NA, prompt=TRUE)
@

<<libraries, echo=FALSE>>=
suppressPackageStartupMessages(library("UniProt.ws"))
suppressPackageStartupMessages(library("BRAIN"))
@

\begin{document}

\title{In-silico cleavage of polypeptides using the \cleaver{} package}

\maketitle

\tableofcontents

\section{Introduction}

Most proteomics experiments need protein (peptide) separation and cleavage
procedures before these molecules could be analyzed or identified
by mass spectrometry or other analytical tools. \\
\cleaver{} allows in-silico cleavage of polypeptide sequences to e.g. create
theoretical mass spectrometry data. \\
The cleavage rules are taken from the
\href{http://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html}{ExPASy
PeptideCutter tool}\citep{peptidecutter}.

\section{Simple Usage}

Loading the \cleaver{} package:
<<>>=
library("cleaver")
@

Getting help and list all available cleavage rules:
<<eval=FALSE>>=
help("cleave")
@

Cleaving of \emph{Gastric juice peptide 1 (P01358)} using \emph{Trypsin}:
<<>>=
## cleave it
cleave("LAAGKVEDSD", enzym="trypsin")
## get the cleavage ranges
cleavageRanges("LAAGKVEDSD", enzym="trypsin")
## get only cleavage sites
cleavageSites("LAAGKVEDSD", enzym="trypsin")
@

Sometimes cleavage is not perfect and the enzym miss some cleavage positions:
<<>>=
## miss one cleavage position
cleave("LAAGKVEDSD", enzym="trypsin", missedCleavages=1)
cleavageRanges("LAAGKVEDSD", enzym="trypsin", missedCleavages=1)
## miss zero or one cleavage positions
cleave("LAAGKVEDSD", enzym="trypsin", missedCleavages=0:1)
cleavageRanges("LAAGKVEDSD", enzym="trypsin", missedCleavages=0:1)
@

Combine \cleaver{} and the \Biocpkg{Biostrings} R package\citep{Biostrings}:
<<>>=
## create AAStringSet object
p <- AAStringSet(c(gaju="LAAGKVEDSD", pnm="AGEPKLDAGV"))

## cleave it
cleave(p, enzym="trypsin")
cleavageRanges(p, enzym="trypsin")
cleavageSites(p, enzym="trypsin")
@


\section{Insulin \& Somatostatin Example}

Downloading \emph{Insulin (P01308)} and \emph{Somatostatin (P61278)} sequences
from the \href{http://www.uniprot.org}{UniProt}\citep{uniprot} database using
the \Biocpkg{UniProt.ws} R package\citep{UniProt.ws}.
<<>>=
## load UniProt.ws library
library("UniProt.ws")

## select species Homo sapiens
UniProt.ws <- UniProt.ws(taxId=9606)

## download sequences of Insulin/Somatostatin
s <- select(UniProt.ws, keys=c("P01308", "P61278"), columns=c("SEQUENCE"))

## fetch only sequences
sequences <- setNames(s$SEQUENCE, s$UNIPROTKB)

## remove whitespaces
sequences <- gsub(pattern="[[:space:]]", replacement="", x=sequences)
@

Cleaving using \emph{Pepsin}:
<<>>=
cleave(sequences, enzym="pepsin")
@

\section{Isotopic Distribution Of Tryptic Digested Insulin}

A common use case of in-silico cleavage is the calculation of the
isotopic distribution of peptides (which were enzymatic digested in the
in-vitro experimental workflow). Here the
\Biocpkg{BRAIN} R package\citep{BRAIN, BRAIN2} is used to calculate the isotopic
distribution of \cleaver{}'s output.
(please note: it is only a toy example, e.g. the relation of intensity values
between peptides isn't correct).
<<>>=
## load BRAIN library
library("BRAIN")

## cleave insulin
cleavedInsulin <- cleave(sequences[1], enzym="trypsin")[[1]]

## create empty plot area
plot(NA, xlim=c(150, 4300), ylim=c(0, 1),
     xlab="mass", ylab="relative intensity",
     main="tryptic digested insulin - isotopic distribution")

## loop through peptides
for (i in seq(along=cleavedInsulin)) {
  ## count C, H, N, O, S atoms in current peptide
  atoms <- BRAIN::getAtomsFromSeq(cleavedInsulin[[i]])
  ## calculate isotopic distribution
  d <- useBRAIN(atoms)
  ## draw peaks
  lines(d$masses, d$isoDistr, type="h", col=2)
}
@

\section{Session Information}
<<sessioninfo, echo=FALSE, results="asis">>=
toLatex(sessionInfo())
@

\bibliographystyle{plainnat}
\bibliography{cleaver}

\end{document}

\end{document}