Last updated: 2018-06-27

Code version: f6754bb

The summary statistics are from the paper "Mixed model association for biobank-scale data sets".

The \(\beta\) are coefficients from a mixed-effects model. Since fitting the mixed model with a large sample size is intractable, the authors estimate them using the \(\chi^2_{BOLT\_LMM\_inf}\) statistics. There is no information about how to compute the standard errors (se). The p-values are based on the \(\chi^2_{BOLT\_LMM}\) statistics. Details about BOLT-LMM are in BOLT_LMM.
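As a rough illustration of how these quantities relate, the sketch below converts a chi-square statistic to a p-value using a 1-degree-of-freedom reference distribution. It also backs out a standard error under the assumption that the infinitesimal-model statistic equals \((\beta / se)^2\). That relation is an assumption made here for illustration, not something stated by the source, and the variable names and numbers are hypothetical.

```r
# Hypothetical values for three SNPs; not taken from the actual data set.
beta      <- c(0.012, -0.034, 0.005)
chisq_inf <- c(4.2, 35.1, 0.8)   # stands in for chi^2_{BOLT_LMM_inf}
chisq_lmm <- c(4.5, 35.1, 0.9)   # stands in for chi^2_{BOLT_LMM}

# p-value from a chi-square statistic with 1 degree of freedom.
pval <- pchisq(chisq_lmm, df = 1, lower.tail = FALSE)

# ASSUMPTION: chisq_inf = (beta / se)^2, hence se = |beta| / sqrt(chisq_inf).
se <- abs(beta) / sqrt(chisq_inf)

print(data.frame(beta, se, pval))
```

If the assumed relation does not hold for the released statistics, the recovered `se` values should not be used.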

There are 23 phenotypes. Tri-allelic SNPs are excluded from the data. We took the union of the SNPs across phenotypes, so there are missing values in the \(\hat{B}\) matrix. The full data set contains 11988455 SNPs.
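A minimal sketch of why taking the union produces missing values: each phenotype contributes its own set of SNPs, and any SNP absent for a phenotype is left as `NA` in the combined matrix. The SNP IDs, effect sizes, and the `stats` list are hypothetical and only illustrate the layout, not the actual processing code.

```r
# Hypothetical per-phenotype summary statistics (made-up SNPs and effects).
stats <- list(
  Height = data.frame(snp = c("rs1", "rs2", "rs3"), beta = c(0.10, -0.20, 0.05),
                      stringsAsFactors = FALSE),
  BMI    = data.frame(snp = c("rs2", "rs3", "rs4"), beta = c(0.03, 0.07, -0.01),
                      stringsAsFactors = FALSE)
)

# Union of SNP IDs over all phenotypes.
all_snps <- sort(unique(unlist(lapply(stats, function(d) d$snp))))

# SNP-by-phenotype matrix, initialised to NA so absent SNPs stay missing.
Bhat <- matrix(NA_real_, nrow = length(all_snps), ncol = length(stats),
               dimnames = list(all_snps, names(stats)))
for (ph in names(stats)) {
  d <- stats[[ph]]
  Bhat[d$snp, ph] <- d$beta
}

print(Bhat)
```

Here `rs1` is missing for BMI and `rs4` is missing for Height, so the matrix has two `NA` entries.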

The phenotypes are

| Abbreviation | Phenotype |
|---|---|
| Eosinophil_Count | Eosinophil count |
| Height | Height |
| BMI | BMI |
| WHR | Waist hip ratio |
| BMD | Bone mineral density |
| FVC | Forced vital capacity |
| FEV1FVC | FEV1 FVC ratio |
| Red_Count | Red blood cell count |
| RBC_Dist_Width | RBC distribution width |
| White_Count | White blood cell count |
| Platelet_Count | Platelet count |
| BP | Blood pressure (systolic) |
| Cardiovascular | Cardiovascular disease |
| T2D | Type 2 diabetes |
| Respiratory | Respiratory disease |
| Allergy_Eczema | Allergy or eczema |
| Hypothyroidism | Hypothyroidism |
| Neuroticism | Neuroticism |
| MorningPerson | Chronotype (morning person) |
| Hair | Hair color |
| Tanning | Tanning ability |
| Edu_Years | Years of education |
| Smoking | Smoking status |

The procedure to select the strong SNP subset:

  1. Select SNPs with p-value less than \(5 \times 10^{-8}\).
  2. For each chromosome, pick SNPs that are at least 15000 bp apart.

The procedure to select the random SNP subset:

  1. Randomly select 800000 SNPs.
  2. For each chromosome, pick SNPs that are at least 15000 bp apart.
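Both procedures share the same distance-based thinning step, which can be sketched as a greedy pass over positions sorted within each chromosome. This is one plausible reading of "at least 15000 bp apart", not necessarily the code used to build the subsets. The function `thin_snps`, the column names `chr`, `pos`, and `p`, and the toy data are all hypothetical.

```r
# Greedy thinning (an assumed implementation): within each chromosome, walk
# along sorted positions and keep a SNP only if it lies at least `min_dist`
# bp beyond the last SNP that was kept.
thin_snps <- function(snps, min_dist = 15000) {
  pieces <- lapply(split(snps, snps$chr), function(d) {
    d <- d[order(d$pos), , drop = FALSE]
    keep <- logical(nrow(d))
    last_pos <- -Inf
    for (i in seq_len(nrow(d))) {
      if (d$pos[i] - last_pos >= min_dist) {
        keep[i] <- TRUE
        last_pos <- d$pos[i]
      }
    }
    d[keep, , drop = FALSE]
  })
  do.call(rbind, pieces)
}

# Hypothetical toy data: six SNPs on two chromosomes.
snps <- data.frame(
  chr = c(1, 1, 1, 1, 2, 2),
  pos = c(1000, 9000, 20000, 36000, 5000, 12000),
  p   = c(1e-9, 1e-10, 0.2, 3e-8, 1e-12, 4e-9)
)

# Strong subset: keep genome-wide significant SNPs, then thin.
strong <- thin_snps(snps[snps$p < 5e-8, ])

# Random subset: sample rows (4 here, standing in for 800000), then thin.
set.seed(1)
random <- thin_snps(snps[sample(nrow(snps), 4), ])

print(strong)
```

This greedy pass keeps the first SNP by position in each window. A variant could instead keep the SNP with the smallest p-value in each window; the text does not say which rule was used.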

The strong subset contains 60070 SNPs. The random subset contains 142075 SNPs.

Session information

sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.5

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] compiler_3.4.4  backports_1.1.2 magrittr_1.5    rprojroot_1.3-2
 [5] tools_3.4.4     htmltools_0.3.6 yaml_2.1.19     Rcpp_0.12.17   
 [9] stringi_1.2.2   rmarkdown_1.9   knitr_1.20      git2r_0.21.0   
[13] stringr_1.3.0   digest_0.6.15   evaluate_0.10.1
