This page was generated on 2021-05-06 12:34:51 -0400 (Thu, 06 May 2021).
R version 4.0.5 (2021-03-31) -- "Shake and Throw"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin17.0 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(EBarrays)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, sd, var, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, basename, cbind, colnames, dirname, do.call,
duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
tapply, union, unique, unsplit, which.max, which.min
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: lattice
> demo(ebarrays)
demo(ebarrays)
---- ~~~~~~~~
> library(EBarrays)
> ## EM algorithm
> ## Lognormal-Normal Demo
>
> ## mu10, sigmasq and tausq are the parameters of the LNNB model; pde is
> ## the proportion of differentially expressed genes; n is the
> ## total number of genes; nr1 and nr2 are the numbers of replicate
> ## arrays in each group.
>
> lnnb.sim <- function(mu10, sigmasq, tausq, pde, n, nr1, nr2)
+ {
+     de <- sample(c(TRUE, FALSE), size = n, replace = TRUE, prob = c(pde, 1 - pde))
+     x <- matrix(NA, n, nr1)
+     y <- matrix(NA, n, nr2)
+     mu1 <- rnorm(n, mu10, sqrt(tausq))
+     mu2.de <- rnorm(n, mu10, sqrt(tausq))
+     mu2 <- mu1
+     mu2[de] <- mu2.de[de]
+     for(j in 1:nr1) {
+         x[, j] <- rnorm(n, mu1, sqrt(sigmasq))
+     }
+     for(j in 1:nr2) {
+         y[, j] <- rnorm(n, mu2, sqrt(sigmasq))
+     }
+     outmat <- exp(cbind(x, y))
+     list(mu1 = mu1, mu2 = mu2, outmat = outmat, de = de)
+ }
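The simulator above can also be written without the explicit loops. The following is a minimal sketch in base R, not part of the EBarrays demo: the name lnnb.sim.vec and the seed are hypothetical, chosen only for illustration. It relies on rnorm() recycling the per-gene mean vector and on matrix() filling column by column, so that row i of every replicate column is drawn around the mean of gene i.

```r
# Hypothetical vectorized variant of lnnb.sim(); the name is illustrative only.
lnnb.sim.vec <- function(mu10, sigmasq, tausq, pde, n, nr1, nr2) {
  # TRUE marks a differentially expressed (DE) gene
  de <- sample(c(TRUE, FALSE), size = n, replace = TRUE, prob = c(pde, 1 - pde))
  # Log-scale gene means for group 1
  mu1 <- rnorm(n, mu10, sqrt(tausq))
  # Group 2 gets a fresh mean only for DE genes, otherwise it shares mu1
  mu2 <- ifelse(de, rnorm(n, mu10, sqrt(tausq)), mu1)
  # The mean vector (length n) is recycled over n * nr draws; because matrix()
  # fills column-wise, element (i, j) is drawn with mean mu[i]
  x <- matrix(rnorm(n * nr1, mu1, sqrt(sigmasq)), n, nr1)
  y <- matrix(rnorm(n * nr2, mu2, sqrt(sigmasq)), n, nr2)
  list(mu1 = mu1, mu2 = mu2, outmat = exp(cbind(x, y)), de = de)
}

set.seed(1)  # arbitrary seed, assumed here only to make the sketch reproducible
sim <- lnnb.sim.vec(2.33, 0.1, 2, 0.2, 2000, nr1 = 3, nr2 = 3)
dim(sim$outmat)                              # genes by arrays
all(sim$mu1[!sim$de] == sim$mu2[!sim$de])    # non-DE genes share one mean
```

The random number stream is consumed in a different order than in the loop version, so the two functions do not produce identical draws for the same seed; only the distribution of the output is meant to match.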
> ## simulating two data sets, each with tau^2 = 2 and P(DE) = 0.2:
> ## sim.data1 uses mu_0 = 2.33, sigma^2 = 0.1
> ## sim.data2 uses mu_0 = 1.33, sigma^2 = 0.01
>
> sim.data1 <- lnnb.sim(2.33, 0.1, 2, 0.2, 2000, nr1 = 3, nr2 = 3)
> de.true1 <- sim.data1$de ## true indicators of differential expression
> sim.data2 <- lnnb.sim(1.33, 0.01, 2, 0.2, 2000, nr1 = 3, nr2 = 3)
> de.true2 <- sim.data2$de ## true indicators of differential expression
> testdata <- rbind(sim.data1$outmat,sim.data2$outmat)
> hypotheses <- ebPatterns(c("1 1 1 1 1 1", "1 1 1 2 2 2"))
> em.out <- emfit(testdata, family = "LNN", hypotheses,
+                 cluster = 1:5,
+                 type = 2,
+                 verbose = TRUE,
+                 num.iter = 10)
Checking for negative entries...
Checking for negative entries...
Generating summary statistics for patterns.
This may take a few seconds...
Starting EM iterations (total 10 ).
This may take a while
Starting iteration 1 ...
Starting iteration 2 ...
Starting iteration 3 ...
Starting iteration 4 ...
Starting iteration 5 ...
Starting iteration 6 ...
Starting iteration 7 ...
Starting iteration 8 ...
Starting iteration 9 ...
Starting iteration 10 ...
Fit used 0.48 seconds user time
Checking for negative entries...
Generating summary statistics for patterns.
This may take a few seconds...
Starting EM iterations (total 10 ).
This may take a while
Starting iteration 1 ...
Starting iteration 2 ...
Starting iteration 3 ...
Starting iteration 4 ...
Starting iteration 5 ...
Starting iteration 6 ...
Starting iteration 7 ...
Starting iteration 8 ...
Starting iteration 9 ...
Starting iteration 10 ...
Fit used 1.42 seconds user time
Checking for negative entries...
Generating summary statistics for patterns.
This may take a few seconds...
Starting EM iterations (total 10 ).
This may take a while
Starting iteration 1 ...
Starting iteration 2 ...
Starting iteration 3 ...
Starting iteration 4 ...
Starting iteration 5 ...
Starting iteration 6 ...
Starting iteration 7 ...
Starting iteration 8 ...
Starting iteration 9 ...
Starting iteration 10 ...
Fit used 1.75 seconds user time
Checking for negative entries...
Generating summary statistics for patterns.
This may take a few seconds...
Starting EM iterations (total 10 ).
This may take a while
Starting iteration 1 ...
Starting iteration 2 ...
Starting iteration 3 ...
Starting iteration 4 ...
Starting iteration 5 ...
Starting iteration 6 ...
Starting iteration 7 ...
Starting iteration 8 ...
Starting iteration 9 ...
Starting iteration 10 ...
Fit used 2.29 seconds user time
Checking for negative entries...
Generating summary statistics for patterns.
This may take a few seconds...
Starting EM iterations (total 10 ).
This may take a while
Starting iteration 1 ...
Starting iteration 2 ...
Starting iteration 3 ...
Starting iteration 4 ...
Starting iteration 5 ...
Starting iteration 6 ...
Starting iteration 7 ...
Starting iteration 8 ...
Starting iteration 9 ...
Starting iteration 10 ...
Fit used 3.03 seconds user time
> em.out
EB model fit
Family: LNN ( Lognormal-Normal )
Model parameter estimates:
              mu_0    sigma.2  tao_0.2
Cluster 1 1.335812 0.00989453 2.063506
Cluster 2 2.268558 0.09983727 1.993214
Estimated mixing proportions:
          Pattern.1  Pattern.2
Cluster 1 0.4035865 0.09765197
Cluster 2 0.4026661 0.09609544
> post.out <- postprob(em.out, testdata)
> table(post.out$pattern[, 2] > .5, c(de.true1,de.true2))
        FALSE TRUE
  FALSE  3189  158
  TRUE     34  619
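The cross-tabulation above has the call (posterior probability of pattern 2 above 0.5) in rows and the simulated truth in columns. As a worked example, the usual summary rates can be read off it directly. This is a sketch in base R with the counts copied by hand from the table; the variable names are assumptions and nothing here is EBarrays output.

```r
# Counts copied from the table above: rows = called DE, columns = truly DE.
tab <- matrix(c(3189, 34, 158, 619), nrow = 2,
              dimnames = list(called = c("FALSE", "TRUE"),
                              truth  = c("FALSE", "TRUE")))

# Share of truly DE genes that were called DE
sensitivity <- tab["TRUE", "TRUE"] / sum(tab[, "TRUE"])
# Share of truly non-DE genes that were not called DE
specificity <- tab["FALSE", "FALSE"] / sum(tab[, "FALSE"])
# Share of called genes that are in fact not DE (false discovery proportion)
fdp <- tab["TRUE", "FALSE"] / sum(tab["TRUE", ])

round(c(sensitivity = sensitivity, specificity = specificity, fdp = fdp), 3)
# sensitivity 0.797, specificity 0.989, fdp 0.052
```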
> table((post.out$cluster[, 2] > .5)+1, c(rep("Cluster 1",2000),rep("Cluster 2",2000)))
    Cluster 1 Cluster 2
  1       139      1944
  2      1861        56
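In the table above the fitted cluster numbers are swapped relative to the simulation labels: fitted Cluster 1 has mu_0 near 1.34, which matches the second simulated data set (mu_0 = 1.33), labelled "Cluster 2" in the columns. Cluster labels from a mixture fit are arbitrary, so agreement is best judged under the better of the two label matchings. A small sketch in base R, with counts copied by hand from the table and illustrative variable names:

```r
# Counts copied from the table above: rows = assigned cluster, columns = simulated group.
tab <- matrix(c(139, 1861, 1944, 56), nrow = 2,
              dimnames = list(assigned  = c("1", "2"),
                              simulated = c("Cluster 1", "Cluster 2")))

# Agreement if fitted cluster k is taken to mean simulated group k
acc_identity <- sum(diag(tab)) / sum(tab)
# Agreement if the two fitted labels are swapped
acc_swapped <- (tab["1", "Cluster 2"] + tab["2", "Cluster 1"]) / sum(tab)

c(identity = acc_identity, swapped = acc_swapped)
# identity 0.04875, swapped 0.95125
```

Under the swapped matching, 3805 of the 4000 genes are assigned to the cluster they were simulated from.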
> plotMarginal(em.out,testdata)
> par(ask=TRUE)
> plotCluster(em.out,testdata)
> par(ask=FALSE)
> lnnmv.em.out <- emfit(testdata, family = "LNNMV", hypotheses, groupid=c(1,1,1,2,2,2),
+                       verbose = TRUE,
+                       num.iter = 10,
+                       p.init = c(0.95, 0.05))
Checking for negative entries...
Generating summary statistics for patterns.
This may take a few seconds...
Starting EM iterations (total 10 ).
This may take a while
Starting iteration 1 ...
Starting iteration 2 ...
Starting iteration 3 ...
Starting iteration 4 ...
Starting iteration 5 ...
Starting iteration 6 ...
Starting iteration 7 ...
Starting iteration 8 ...
Starting iteration 9 ...
Starting iteration 10 ...
Fit used 0.52 seconds user time
> lnnmv.em.out
EB model fit
Family: LNNMV ( Lognormal-Normal with modified variances )
Model parameter estimates:
      mu_0  tao_0.2
1 1.806385 2.241027
Estimated mixing proportions:
       Pattern.1 Pattern.2
p.temp 0.7892532 0.2107468
> post.out <- postprob(lnnmv.em.out, testdata, groupid=c(1,1,1,2,2,2))
> table(post.out$pattern[, 2] > .5, c(de.true1,de.true2))
        FALSE TRUE
  FALSE  3122  150
  TRUE    101  627
There were 50 or more warnings (use warnings() to see the first 50)
>
>
>
> proc.time()
   user  system elapsed
 11.617   0.700  12.315