Full results of Zhu and Stephens (2018)

Last updated: 2024-09-16

Knit directory: rss-gsea/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1).

Repository version: 0d0131b


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/index.Rmd) and HTML (docs/index.html) files.

File	Version	Author	Date	Message
html	dedfe81	Xiang Zhu	2024-09-16	Build site.
html	7223781	Xiang Zhu	2019-02-18	Build site.
html	ba77842	Xiang Zhu	2018-12-26	Build site.
Rmd	7baf5f9	Xiang Zhu	2018-12-26	wflow_publish("analysis/index.Rmd")
html	d526c1f	Xiang Zhu	2018-10-29	Build site.
Rmd	f1570df	Xiang Zhu	2018-10-29	wflow_publish("analysis/index.Rmd")
html	622097d	Xiang Zhu	2018-10-29	Build site.
Rmd	d5550b5	Xiang Zhu	2018-10-29	wflow_publish("analysis/index.Rmd")
html	fa4d958	Xiang Zhu	2018-10-19	Build site.
Rmd	bde6899	Xiang Zhu	2018-10-19	wflow_publish("analysis/index.Rmd")
html	4789ea4	Xiang Zhu	2018-10-06	Build site.
Rmd	956af9a	Xiang Zhu	2018-10-06	wflow_publish("analysis/index.Rmd")
html	da04234	Xiang Zhu	2018-10-06	Build site.
Rmd	6b08034	Xiang Zhu	2018-10-06	wflow_publish("analysis/index.Rmd")
html	d1b708d	Xiang Zhu	2018-10-06	Build site.
Rmd	d63db43	Xiang Zhu	2018-10-06	wflow_publish("analysis/index.Rmd")
html	32eb1dd	Xiang Zhu	2018-10-05	Build site.
Rmd	5c7ee1e	Xiang Zhu	2018-10-05	wflow_publish("analysis/index.Rmd")
html	1c85967	Xiang Zhu	2018-10-05	Build site.
html	810e15a	Xiang Zhu	2018-09-16	Build site.
Rmd	ebd4220	Xiang Zhu	2018-09-16	wflow_publish("analysis/index.Rmd")
html	ecf07e1	Xiang Zhu	2018-09-16	Build site.
Rmd	ec44ee0	Xiang Zhu	2018-09-16	wflow_publish("analysis/index.Rmd")
html	8f27e6d	Xiang Zhu	2018-07-02	Build site.
Rmd	c07c276	Xiang Zhu	2018-07-02	wflow_publish("index.Rmd")
html	c77f1e1	Xiang Zhu	2018-07-02	Build site.
Rmd	acbdfcb	Xiang Zhu	2018-07-02	wflow_publish("index.Rmd")
html	c1e0aac	Xiang Zhu	2018-07-02	Build site.
Rmd	16a82f8	Xiang Zhu	2018-07-02	wflow_publish("index.Rmd")
html	835a6be	Xiang Zhu	2018-07-02	Build site.
Rmd	447917e	Xiang Zhu	2018-07-02	wflow_publish("index.Rmd")
html	55087ae	Xiang Zhu	2018-07-01	Build site.
Rmd	07473ff	Xiang Zhu	2018-07-01	wflow_publish("index.Rmd")
html	2b35739	Xiang Zhu	2018-06-30	Build site.
Rmd	e364292	Xiang Zhu	2018-06-30	wflow_publish("index.Rmd")
html	de49400	Xiang Zhu	2018-06-29	Build site.
Rmd	f80d46e	Xiang Zhu	2018-06-29	wflow_publish("index.Rmd")
html	b30daba	Xiang Zhu	2018-06-27	Build site.
Rmd	bc9746a	Xiang Zhu	2018-06-27	wflow_publish("index.Rmd")
html	c749ac7	Xiang Zhu	2018-06-27	Build site.
Rmd	6006659	Xiang Zhu	2018-06-27	wflow_publish("index.Rmd")
html	0852676	Xiang Zhu	2018-06-27	Build site.
Rmd	46603f3	Xiang Zhu	2018-06-27	wflow_publish("index.Rmd")
html	ff6afd0	Xiang Zhu	2018-06-26	Build site.
Rmd	bf31edf	Xiang Zhu	2018-06-26	wflow_publish(files = c("analysis/index.Rmd", "analysis/license.Rmd"))
html	33aa5b3	Xiang Zhu	2018-06-26	Build site.
Rmd	02b392f	Xiang Zhu	2018-06-26	wflow_publish("analysis/index.Rmd")
Rmd	cee1a92	Xiang Zhu	2018-06-26	Start workflowr project.

Overview

This is my online notebook to document and share the full results of genome-wide enrichment and prioritization analyses described in the article:

Xiang Zhu and Matthew Stephens (2018). Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes. Nature Communications 9, 4361. https://doi.org/10.1038/s41467-018-06805-x.

We developed a new statistical method, RSS-E, to generate the results for this study. The software that implements RSS-E is freely available at stephenslab/rss. We also provide an end-to-end example illustrating how to use RSS-E to perform the reported genome-wide enrichment and prioritization analyses of GWAS summary statistics. This software can be referenced in a journal’s “Code availability” section.

In addition, all 4,026 pre-processed gene sets used in this study (including 3,913 biological pathways and 113 tissue-based gene sets) are freely available at xiangzhu/rss-gsea. These gene sets can be referenced in a journal’s “Data availability” section.

If you find the analysis results, the pre-processed gene sets, the statistical methods, and/or the open-source software useful for your work, please cite our article listed above, Zhu and Stephens (2018).

If you have any questions about the notebook and/or the article, please feel free to contact me: Xiang Zhu, xiangzhu[at]uchicago[and/or]stanford.edu.

Main results

Anthropometric phenotypes

Hematopoietic phenotypes

Metabolic phenotypes

Neurological phenotypes

Gene prioritization

Additional resources

How can I perform similar analyses on a new GWAS summary dataset using RSS-E?

The software that generated the results of this study is freely available at stephenslab/rss. I also wrote a step-by-step RSS-E tutorial that illustrates how to use this software to perform genome-wide enrichment and prioritization analyses on GWAS summary statistics.

Compared with most existing enrichment methods, the most appealing feature of RSS-E is the automatic gene prioritization in light of inferred enrichments. Is this gene prioritization feature available in your software?

Yes. This feature is implemented as the function compute_pip.m in RSS-E. The step-by-step RSS-E tutorial illustrates how to use this function.

There are two sanity checks for the more sophisticated RSS-E analysis in Zhu and Stephens (2018): an eyeball test and a likelihood ratio calculation. Do you have software for these sanity checks?

Yes. The eyeball test simply plots the marginal distribution of GWAS z-scores, stratified by SNP-level annotations based on a given gene set. Here we used ggplot2::geom_density (default setting). For the likelihood ratio check, I wrote a stand-alone script, ash_lrt_31traits.R. Please read the instructions in this script carefully. For more details on these two sanity checks, please see the caption of Supplementary Figure 17 in Zhu and Stephens (2018).
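
To make the idea of the eyeball test concrete, here is a minimal sketch of a stratified density estimate. It is not the code used in the paper: it is written in Python with only the standard library rather than R and ggplot2, the z-scores are simulated, and all names (nrd0_bandwidth, gaussian_kde, stratified_densities, in_gene_set) are hypothetical. The bandwidth rule follows R's bw.nrd0 with a Gaussian kernel, which is what geom_density uses by default.

```python
import math
import random
import statistics


def nrd0_bandwidth(values):
    """Silverman's rule of thumb, mirroring R's bw.nrd0."""
    n = len(values)
    sd = statistics.stdev(values)
    # method="inclusive" matches R's default (type 7) quantiles used by IQR().
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34)
    if spread == 0:
        # Same fallbacks as bw.nrd0 for degenerate samples.
        spread = sd or abs(values[0]) or 1.0
    return 0.9 * spread * n ** (-0.2)


def gaussian_kde(values, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]


def stratified_densities(z_scores, in_gene_set, grid):
    """Estimate one density per stratum: SNPs inside vs. outside the gene set."""
    groups = {True: [], False: []}
    for z, flag in zip(z_scores, in_gene_set):
        groups[bool(flag)].append(z)
    return {
        flag: gaussian_kde(vals, grid, nrd0_bandwidth(vals))
        for flag, vals in groups.items()
    }


# Simulated data (an assumption for illustration only): SNPs annotated as
# "inside" the gene set get z-scores with a larger spread than the rest.
rng = random.Random(0)
z_outside = [rng.gauss(0.0, 1.0) for _ in range(2000)]
z_inside = [rng.gauss(0.0, 1.5) for _ in range(500)]
z_scores = z_outside + z_inside
in_gene_set = [False] * len(z_outside) + [True] * len(z_inside)

# Evaluate both densities on a common grid from -8 to 8 in steps of 0.05.
grid = [i * 0.05 for i in range(-160, 161)]
densities = stratified_densities(z_scores, in_gene_set, grid)
```

Plotting the two curves in densities against grid gives the picture one would inspect by eye: if the gene set is enriched, the density for SNPs inside it should show visibly heavier tails than the density for SNPs outside it.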

Where can I download all 4,026 pre-processed gene sets used in this work?

All 4,026 gene sets used in this study are freely available at xiangzhu/rss-gsea, where the folder biological_pathway contains 3,913 biological pathways, and the folder tissue_set contains 113 GTEx tissue-based gene sets. More details about these gene sets can be found here.

Where can I find RSS-E “baseline” model fitting results of all 31 traits?

You can find summary results of “baseline” model fitting at xiangzhu/rss-gsea-baseline. For me, the baseline model fitting results are merely inferential “bases” for the enrichment model fitting results shown in the “Main results” section above. However, when I was presenting the enrichment results during my Ph.D. thesis defense, Prof. John Novembre and Prof. Xin He both pointed out these baseline results might be useful for other on-going research projects on the “fourth floor” (i.e. the fantastic computational space shared with the labs of Matthew Stephens, John Novembre and Xin He). Their comments motivated me to create a separate online notebook xiangzhu/rss-gsea-baseline to share the baseline summary results.

Where can I find “Round 1” RSS-E results of all 3,913 biological pathways?

Currently you need to contact me directly to view our “Round 1” results of all 3,913 pathways. When this work was under review, one referee pointed out that our previous online results, especially our “Round 1” analysis results, were “needlessly complicated” and did not have “any obvious benefit”. Hence, I removed the “Round 1” analysis results from this notebook to simplify the presentation. I sincerely hope that this change addresses the referee’s comment.