Last updated: 2018-10-30
workflowr checks: (Click a bullet for more information) ✔ R Markdown file: up-to-date
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
✔ Environment: empty
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
✔ Seed:
set.seed(20180529)
The command set.seed(20180529)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
✔ Session information: recorded
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
✔ Repository version: adb05db
wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: analysis/.Rhistory
Ignored: docs/.DS_Store
Ignored: docs/figure/Test.Rmd/
Untracked files:
Untracked: analysis/mashMean.Rmd
Unstaged changes:
Modified: analysis/mashrEMlk.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | adb05db | zouyuxin | 2018-10-30 | wflow_publish(“analysis/susie_summarySTAT.Rmd”) |
html | 62d6309 | zouyuxin | 2018-10-09 | Build site. |
Rmd | e6b0f9d | zouyuxin | 2018-10-09 | wflow_publish(“analysis/susie_summarySTAT.Rmd”) |
html | 59d6d36 | zouyuxin | 2018-09-05 | Build site. |
Rmd | 046104b | zouyuxin | 2018-09-05 | wflow_publish(c(“analysis/susie_summarySTAT.Rmd”)) |
html | f4ebd91 | zouyuxin | 2018-09-05 | Build site. |
Rmd | ecdacb8 | zouyuxin | 2018-09-05 | wflow_publish(c(“analysis/VB_theory.Rmd”, “analysis/susie_summarySTAT.Rmd”, “analysis/SUmofSIngleEffect.Rmd”)) |
html | 89eafd2 | zouyuxin | 2018-08-20 | Build site. |
Rmd | ff75070 | zouyuxin | 2018-08-20 | wflow_publish(“analysis/susie_summarySTAT.Rmd”) |
html | 7b44c37 | zouyuxin | 2018-08-15 | Build site. |
Rmd | 85c8b0f | zouyuxin | 2018-08-15 | wflow_publish(“analysis/susie_summarySTAT.Rmd”) |
html | d48e72d | zouyuxin | 2018-07-17 | Build site. |
Rmd | 7fd7b1f | zouyuxin | 2018-07-17 | wflow_publish(“analysis/susie_summarySTAT.Rmd”) |
html | b2d26ee | zouyuxin | 2018-07-17 | Build site. |
Rmd | ed14aba | zouyuxin | 2018-07-17 | wflow_publish(c(“analysis/index.Rmd”, “analysis/susie.Rmd”, |
In SUSIE
, we need individual level data. However, individual level data is hard to obtain in GWAS. To implement SUSIE
using summary statistics, we only need a simple trick.
\[ \beta_{lj}|\gamma_{lj}=1 \sim_{q_{l}} N(\mu_{lj}, s_{lj}^{2}) \quad q(\gamma_{l}) = \alpha_{lj} \]
\[ s_{lj}^{2} = \left(\tau_{l} + \tau (X^{T}X)_{jj}\right)^{-1} \]
\[ \mu_{lj} = \frac{\tau(X^{T}\mathbf{y}_{res,-l})_{j}}{\tau_{l} + \tau (X^{T}X)_{jj}} = s_{lj}^{2}\tau (X^{T}\mathbf{y}_{res,-l})_{j} = s_{lj}^{2}\tau [(X^{T}\mathbf{y})_{j} - \sum_{k\neq j}(X^{T}X)_{kj}\alpha_{lk}\mu_{lk}] \]
\[ \alpha_{lj} \propto \pi_{j} N(\frac{(X^{T}\mathbf{y}_{res,-l})_{j}}{(X^{T}X)_{jj}}; 0, \frac{1}{\tau(X^{T}X)_{jj}} + \frac{1}{\tau_{l}} ) \]
\[ \frac{\alpha_{lj}}{1-\alpha_{lj}} = \frac{\pi_{j}}{1-\pi_{j}} \sqrt{\tau_{l}s_{lj}^2} \exp(\frac{\mu_{lj}^{2}}{2s_{lj}^{2}}) = \frac{\pi_{j}}{1-\pi_{j}} BF \]
We have \(X^{T}X\), so the only problematic part is \(X^{T}\mathbf{y}_{res,-j}\).
In susie
code, \(\mathbf{y}_{res,-l}\) is the R
(update_each_effect).
\[ \mathbf{y}_{res,-l} = R = Y - X\mathbf{r}_{l} \] where \(\mathbf{r}_{l} = \alpha_{l\cdot}\mu_{l\cdot}\).
We don’t have X
and Y
here. But we actually need \(X^{T}\mathbf{y}_{res,-l}\). \[
X^{T}\mathbf{y}_{res,-l} = X^{T}R = X^{T}Y - X^{T}X\mathbf{r}_{l}
\] which can be easily computed using \(X^{T}Y\) and \(X^{T}X\).
\(\mathbf{r} = \sum_{l=1}^{L} \mathbf{r}_{l}\)
\[ \begin{align*} L(q(\boldsymbol{\beta})) &= -\frac{n}{2}\log(2\pi) + \frac{n}{2}\log \tau - \frac{\tau}{2}\left[\|\mathbf{y} - X\mathbf{r}\|^{2} - \sum_{l=1}^{L}\|X\mathbf{r}_{l}\|^{2} + \sum_{l=1}^{L}\sum_{j=1}^{p}(X^{T}X)_{jj}\mathbb{E}(\beta_{lj}^2)\right] + \sum_{l=1}^{L} \left[ \log p(\mathbf{y}_{resid,-l}) + \frac{n}{2}\log(2\pi) - \frac{n}{2}\log \tau + \frac{\tau}{2}\left(\mathbf{y}_{resid,-l}^{T}\mathbf{y}_{resid,-l} - 2\mathbf{y}_{resid,-l}^{T}X\mathbb{E}_{q}\left[\boldsymbol{\beta}_{l}\right] + (X^{2})^{T}\mathbb{E}_{q}\left[\boldsymbol{\beta}_{l}\boldsymbol{\beta}_{l}^{T}\right] \right) \right] \\ &= -\frac{n}{2}\log(2\pi) + \frac{n}{2}\log \tau - \frac{\tau}{2}\left[\|\mathbf{y} - X\mathbf{r}\|^{2} - \sum_{l=1}^{L}\|X\mathbf{r}_{l}\|^{2} + \sum_{l=1}^{L}\sum_{j=1}^{p}(X^{T}X)_{jj}\mathbb{E}(\beta_{lj}^2)\right] + \sum_{l=1}^{L} \left[ \log \left[\sum_{j=1}^{p} \pi_{j} BF(\mathbf{y}_{resid,-l}, \mathbf{x}_{j})\right] + \log N(\mathbf{y}_{resid,-l}; 0, \frac{1}{\tau}I) + \frac{n}{2}\log(2\pi) - \frac{n}{2}\log \tau + \frac{\tau}{2}\left(\mathbf{y}_{resid,-l}^{T}\mathbf{y}_{resid,-l} - 2\mathbf{y}_{resid,-l}^{T}X\mathbb{E}_{q}\left[\boldsymbol{\beta}_{l}\right] + (X^{2})^{T}\mathbb{E}_{q}\left[\boldsymbol{\beta}_{l}\boldsymbol{\beta}_{l}^{T}\right] \right) \right] \\ &= -\frac{n}{2}\log(2\pi) + \frac{n}{2}\log \tau - \frac{\tau}{2}\left[\|\mathbf{y} - X\mathbf{r}\|^{2} - \sum_{l=1}^{L}\|X\mathbf{r}_{l}\|^{2} + \sum_{l=1}^{L}\sum_{j=1}^{p}(X^{T}X)_{jj}\mathbb{E}(\beta_{lj}^2)\right] + \sum_{l=1}^{L} \left[ \log \left[\sum_{j=1}^{p} \pi_{j} BF(\mathbf{y}_{resid,-l}, \mathbf{x}_{j})\right] - \tau \mathbf{y}_{resid,-l}^{T}X\mathbb{E}_{q}\left[\boldsymbol{\beta}_{l}\right] + \frac{\tau}{2}(X^{2})^{T}\mathbb{E}_{q}\left[\boldsymbol{\beta}_{l}\boldsymbol{\beta}_{l}^{T}\right] \right] \\ \end{align*} \]
\[ \|\mathbf{y} - X\mathbf{r}\|^2 = \mathbf{y}^{T}\mathbf{y} + \mathbf{r}^{T}X^{T}X\mathbf{r} - 2\mathbf{y}^{T}X\mathbf{r} = (n-1)Var(y) + \mathbf{r}^{T}X^{T}X\mathbf{r} - 2\mathbf{y}^{T}X\mathbf{r} \]
\[ \|X\mathbf{r}_{l}\|^{2} = \mathbf{r}_{l}^{T}X^{T}X\mathbf{r}_{l} \]
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] workflowr_1.1.1 Rcpp_0.12.19 digest_0.6.18
[4] rprojroot_1.3-2 R.methodsS3_1.7.1 backports_1.1.2
[7] magrittr_1.5 git2r_0.23.0 evaluate_0.12
[10] stringi_1.2.4 whisker_0.3-2 R.oo_1.22.0
[13] R.utils_2.7.0 rmarkdown_1.10 tools_3.5.1
[16] stringr_1.3.1 yaml_2.2.0 compiler_3.5.1
[19] htmltools_0.3.6 knitr_1.20
This reproducible R Markdown analysis was created with workflowr 1.1.1