Last updated: 2018-05-16
Loading required package: ashr
In the simplest case, there is no true deviation exists. \[c_{j} = \mu_{j} 1\] \[\hat{c}_{j} \sim N_{8}(c_{j}, \frac{1}{2}I)\] Let L be the contrast matrix. Therefore, \[\hat{\delta}_{j} = L\hat{c}_{j} \sim N_{7}(0, \frac{1}{2}LL')\]
We first generate the data:
set.seed(1)
data = sim_contrast1(nsamp = 10000, ncond = 8)
colnames(data$C) = colnames(data$Chat) = colnames(data$Shat) = c('CTL', paste0('T', 1:7))
Set up the contrast matrix and the mash contrast data object
L = rbind(c(-1,1,0,0,0,0,0,0),
c(-1,0,1,0,0,0,0,0),
c(-1,0,0,1,0,0,0,0),
c(-1,0,0,0,1,0,0,0),
c(-1,0,0,0,0,1,0,0),
c(-1,0,0,0,0,0,1,0),
c(-1,0,0,0,0,0,0,1))
row.names(L) = c('T1-CTL','T2-CTL','T3-CTL','T4-CTL','T5-CTL','T6-CTL','T7-CTL')
mash_data = mash_set_data(Bhat=data$Chat, Shat=data$Shat)
mash_data_L = mash_set_data_contrast(mash_data, L)
Set up the covariance matrices:
There are two types of covariance matrix you can use in mash: “canonical” and “data-driven”.
# canonical
U.c = cov_canonical(mash_data_L)
# data driven
mash_data_L.z = mash_data_L
mash_data_L.z$Bhat = mash_data_L$Bhat/mash_data_L$Shat
m.1by1 = mash_1by1(mash_data_L.z)
strong = get_significant_results(m.1by1,0.05)
# no strong
# no data driven
Fit mashcontrast
model
mashcontrast.model = mash(mash_data_L, U.c, algorithm.version = 'R')
- Computing 10000 x 169 likelihood matrix.
- Likelihood calculations took 1.04 seconds.
- Fitting model with 169 mixture components.
- Model fitting took 1.88 seconds.
- Computing posterior matrices.
- Computation allocated took 0.49 seconds.
length(get_significant_results(mashcontrast.model))
[1] 0
There is no discovery, which is as we expected. The true deviations are zero for all samples.
barplot(get_estimated_pi(mashcontrast.model),las = 2, cex.names = 0.7)
Indep.data = mash_set_data(Bhat = data$Chat%*%t(L))
U.c = cov_canonical(Indep.data)
Indep.model = mash(Indep.data, U.c, algorithm.version = 'R')
- Computing 10000 x 169 likelihood matrix.
- Likelihood calculations took 0.70 seconds.
- Fitting model with 169 mixture components.
- Model fitting took 1.64 seconds.
- Computing posterior matrices.
- Computation allocated took 0.08 seconds.
length(get_significant_results(Indep.model))
[1] 2818
There are more false positives.
barplot(get_estimated_pi(Indep.model),las = 2,cex.names = 0.7)
In this case, the observed deviations are truly independent.
# simulate true independent delta
D = matrix(0, nrow=10000, ncol = 7)
Shat = matrix(1, nrow=10000, ncol = 7)
set.seed(1)
E = matrix(rnorm(length(Shat), mean=0, sd=Shat), nrow=10000, ncol = 7)
Dhat = D + E
row_ids = paste0("sample_", 1:nrow(D))
col_ids = paste0("condition_", 1:ncol(D))
rownames(D) = row_ids
colnames(D) = col_ids
rownames(Dhat) = row_ids
colnames(Dhat) = col_ids
rownames(Shat) = row_ids
colnames(Shat) = col_ids
mash_delta = mash_set_data(Bhat = Dhat, Shat = Shat)
U.c = cov_canonical(mash_delta)
mash_delta_model = mash(mash_delta, U.c, algorithm.version = 'R')
- Computing 10000 x 169 likelihood matrix.
- Likelihood calculations took 0.83 seconds.
- Fitting model with 169 mixture components.
- Model fitting took 1.70 seconds.
- Computing posterior matrices.
- Computation allocated took 0.04 seconds.
length(get_significant_results(mash_delta_model))
[1] 0
barplot(get_estimated_pi(mash_delta_model),las = 2,cex.names = 0.7)
From the above comparisons, we show that the wrong assumption \(\hat{\delta}\sim N_{7}(0,I)\) produces many more false positives.
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mashr_0.2-8 ashr_2.2-7
loaded via a namespace (and not attached):
[1] Rcpp_0.12.16 knitr_1.20 magrittr_1.5
[4] REBayes_1.3 MASS_7.3-50 doParallel_1.0.11
[7] pscl_1.5.2 SQUAREM_2017.10-1 lattice_0.20-35
[10] foreach_1.4.4 plyr_1.8.4 stringr_1.3.0
[13] tools_3.4.4 parallel_3.4.4 grid_3.4.4
[16] rmeta_3.0 htmltools_0.3.6 iterators_1.0.9
[19] assertthat_0.2.0 yaml_2.1.19 rprojroot_1.3-2
[22] digest_0.6.15 Matrix_1.2-14 codetools_0.2-15
[25] evaluate_0.10.1 rmarkdown_1.9 stringi_1.2.2
[28] compiler_3.4.4 Rmosek_8.0.69 backports_1.1.2
[31] mvtnorm_1.0-7 truncnorm_1.0-8