---
title: "GRNdata"
author: "Pau Bellot, Catharina Olsen, Patrick Meyer"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
bibliography: bibliography.bib
vignette: >
  %\VignetteIndexEntry{GRNdata}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

This package contains a large set of gene expression data generated by various 
simulators and collected in what we call a "datasource". 

The data generated by the simulators are free of noise. Noise can be added 
afterwards, which makes it possible to control its properties independently 
of the simulators and to provide fully reproducible tests. The data come 
from three different gene regulatory network (GRN) simulators:

###  GNW
The GNW simulator [@schaffter2011genenetweaver] generates network structures 
by extracting parts of known real GRN structures, thereby capturing several of 
their important structural properties. To produce gene expression data, the 
simulator relies on a system of non-linear ordinary 
differential equations (ODEs).

###  SynTReN
The SynTReN simulator [@van2006syntren] generates the underlying networks by 
selecting sub-networks from *E. coli* and yeast. The experiments are then 
obtained by simulating equations based on Michaelis-Menten 
and Hill kinetics under different conditions.
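
As a rough illustration of the kinetics mentioned above, the sketch below implements the textbook Hill activation function $x^n / (K^n + x^n)$, which reduces to the Michaelis-Menten form when $n = 1$. It is not SynTReN's actual model: the function name `hill_activation` and the parameter values are assumptions chosen only for this example.

```r
# Textbook Hill activation function (illustrative, not SynTReN's code).
# x: regulator concentration, K: half-saturation constant,
# n: Hill coefficient (n = 1 gives the Michaelis-Menten form).
hill_activation <- function(x, K, n = 1) {
  x^n / (K^n + x^n)
}

# Hypothetical regulator concentrations
x <- c(0, 0.5, 1, 2, 4)

# Michaelis-Menten response (n = 1) and a steeper Hill response (n = 4)
mm_response   <- hill_activation(x, K = 1, n = 1)
hill_response <- hill_activation(x, K = 1, n = 4)
```

In both cases the response equals one half when the regulator concentration equals `K`; a larger `n` makes the transition around `K` sharper.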

###  Rogers
The data generator described in [@Rogers15072005], referred to here 
as *Rogers*, relies on a power-law distribution of the number of 
connections per gene to generate the underlying network. 
The steady state of the system is obtained by integrating a system of 
differential equations, simulating only knockout data.
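
To make the idea of a power-law connectivity concrete, the following sketch draws a number of connections for each gene with probability proportional to $k^{-\gamma}$. This is only an illustration of the distribution, not the Rogers generator itself: the exponent `gamma`, the maximum degree `max_degree`, and the seed are assumed values.

```r
# Illustrative sampling of gene connectivities from a truncated power law.
# All parameter values below are assumptions made for this sketch.
set.seed(1)
n_genes    <- 1000
gamma      <- 2.5   # hypothetical power-law exponent
max_degree <- 50    # hypothetical upper bound on connections per gene

k <- seq_len(max_degree)
p <- k^(-gamma)
p <- p / sum(p)     # normalise to a probability distribution

# One degree per gene; most genes get few connections, a few get many
degrees <- sample(k, size = n_genes, replace = TRUE, prob = p)

# Frequency of each observed degree
degree_counts <- table(degrees)
```

Plotting `degree_counts` on log-log axes would show the approximately linear decay that characterises a power-law tail.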

### Datasources
Using these simulators, five large datasources involving many noise-free 
experiments have been generated. The characteristics of these datasources 
are detailed in the following table:

|    Datasource    |         Topology        | Experiments | Genes | Edges |
|:----------------:|:-----------------------:|:-----------:|:-----:|:-----:|
|  $Rogers_{1000}$ | Power-law tail topology |     1000    |  1000 |  1350 |
|  $SynTReN_{300}$ |         E. coli         |     800     |  300  |  468  |
| $SynTReN_{1000}$ |         E. coli         |     1000    |  1000 |  4695 |
|   $GNW_{1565}$   |         E. coli         |     1565    |  1565 |  7264 |
|   $GNW_{2000}$   |          Yeast          |     2000    |  2000 | 10392 |

To generate these datasources with SynTReN and GNW, we simulated 
multifactorial data, which is a less 
informative type of data [@marbach2010revealing].
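
Because the datasources are noise-free, noise can be added afterwards in a controlled and reproducible way. The sketch below shows one possible approach, assuming additive Gaussian noise whose standard deviation is a percentage of each gene's own standard deviation. The matrix `clean`, the helper `add_gaussian_noise`, and the `noise_pct` parameter are hypothetical and are not part of this package.

```r
# Illustrative only: add Gaussian noise to a noise-free expression matrix.
# Rows are experiments, columns are genes (hypothetical toy data).
set.seed(42)  # fixed seed so the noisy data can be reproduced exactly
clean <- matrix(runif(15), nrow = 5,
                dimnames = list(NULL, c("G1", "G2", "G3")))

# Hypothetical helper: noise sd is noise_pct percent of each gene's sd
add_gaussian_noise <- function(data, noise_pct = 20) {
  gene_sd <- apply(data, 2, sd)
  noise <- sapply(gene_sd, function(s) {
    rnorm(nrow(data), mean = 0, sd = s * noise_pct / 100)
  })
  data + noise
}

noisy <- add_gaussian_noise(clean, noise_pct = 20)
```

Keeping the noise step separate from the simulation means the same clean data can be reused with different noise levels or noise models.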

---
#### References: