GWAS SNPs Annotation, Prioritization and Interpretation associated with Asthma

Overview

Genome-wide association studies (GWAS) have identified numerous single nucleotide polymorphisms (SNPs) associated with complex diseases such as asthma, but not all of these SNPs are functionally relevant. Asthma is genetically and biologically complex, so identifying functional variants and druggable gene targets is key for translational impact. This project implements a reproducible pipeline to filter, annotate, and prioritize asthma-associated SNPs. It takes a multi-omics, systems biology approach to prioritizing GWAS signals using:

Multi-mapping strategies (positional, eQTL, chromatin)
Pathway and tissue-enrichment analysis
Integration with the Open Targets and Pharos databases for clinical and drug candidacy

The workflow integrates Python (Jupyter Notebooks), R (biomaRt and visualization), and online bioinformatics resources (NHGRI GWAS Catalog, FUMA and Pharos) to identify biologically meaningful variants. The ultimate goal is to highlight candidate SNPs and genes with potential roles in asthma pathogenesis, providing a foundation for downstream functional studies and personalized medicine approaches.
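
The positional mapping strategy listed above can be illustrated with a minimal sketch. In this project the mapping itself is done by FUMA; the code below only shows the idea of assigning a SNP to every gene whose body, padded by a window, contains it. The `Gene` type, the `map_snp_to_genes` helper, the gene names, the coordinates, and the 10 kb window are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # first base of the gene body
    end: int    # last base of the gene body


def map_snp_to_genes(chrom: str, pos: int, genes: List[Gene],
                     window: int = 10_000) -> List[str]:
    """Return names of genes whose body, padded by `window` bp, contains the SNP."""
    hits = []
    for gene in genes:
        same_chrom = gene.chrom == chrom
        in_window = gene.start - window <= pos <= gene.end + window
        if same_chrom and in_window:
            hits.append(gene.name)
    return hits


# Hypothetical genes and coordinates, for illustration only.
genes = [
    Gene("GENE_A", "17", 100_000, 120_000),
    Gene("GENE_B", "17", 125_000, 140_000),
    Gene("GENE_C", "5", 100_000, 110_000),
]

# A SNP lying between GENE_A and GENE_B falls inside both padded windows.
print(map_snp_to_genes("17", 122_000, genes))
```

A SNP can map to several genes this way, which is one reason positional mapping is combined with eQTL and chromatin evidence rather than used alone.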

Tools & Technologies

Languages: Python (Jupyter Notebook) and R
Databases & Resources: NHGRI GWAS Catalog, FUMA GWAS, Open Targets and Pharos
Libraries: Python (pandas, os and csv); R (biomaRt, ggplot2, ggthemes, readxl, tidyr, data.table, dplyr, stringr and igraph)
Reproducibility: RMarkdown for reporting and GitHub for version control

Data Source

NHGRI GWAS Catalog: GCST010042 (Han Y. et al.), containing asthma-associated SNPs and their metadata.
Additional data integration from:
- FUMA GWAS for regulatory annotation and deleteriousness prediction
- Pharos for scoring and prioritizing eQTL genes

How to Run

Clone this repository:
    git clone https://github.com/Naila-Srivastava/GWAS-SNPs-Annotation-Prioritization
    cd GWAS-SNPs-Annotation-Prioritization
Install dependencies:
    pip install -r requirements.txt
Run the analysis
Generate the final report

Methodology

Features

Automated SNP filtering by p-value and trait relevance.
Functional annotation using Ensembl BioMart.
Prediction of regulatory effects using FUMA.
eQTL mapping for gene expression association.
Scoring & prioritization of SNPs integrating multiple evidence sources.
Multi-level visualization: Manhattan plots, graphs and networks.
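
The filtering and scoring features above can be sketched with the standard library alone. This is not the repository's actual code. It assumes a tab-separated table that uses the GWAS Catalog column names `SNPS`, `DISEASE/TRAIT`, and `P-VALUE`, plus two hypothetical annotation columns (`EQTL_GENE`, `CADD`). The rsIDs, the values, the evidence weights, and the CADD cutoff of 12.37 are assumptions chosen for illustration; 5e-8 is the conventional genome-wide significance threshold.

```python
import csv
import io
import math

# Toy rows with placeholder identifiers. EQTL_GENE and CADD are hypothetical
# annotation columns added for this sketch.
RAW = """SNPS\tDISEASE/TRAIT\tP-VALUE\tEQTL_GENE\tCADD
rs_demo_1\tAsthma\t3E-12\tGENE_A\t15.2
rs_demo_2\tAsthma\t2E-6\t\t4.1
rs_demo_3\tHeight\t1E-20\tGENE_B\t20.0
rs_demo_4\tAsthma (childhood onset)\t4E-9\t\t13.0
"""

P_THRESHOLD = 5e-8      # conventional genome-wide significance level
CADD_THRESHOLD = 12.37  # assumed deleteriousness cutoff
EVIDENCE_BONUS = 5.0    # assumed weight per extra line of evidence


def filter_snps(rows, trait="asthma", p_max=P_THRESHOLD):
    """Keep rows whose trait mentions `trait` and whose p-value is below `p_max`."""
    return [
        row for row in rows
        if trait in row["DISEASE/TRAIT"].lower() and float(row["P-VALUE"]) < p_max
    ]


def score(row):
    """Combine association strength with eQTL and CADD evidence."""
    total = -math.log10(float(row["P-VALUE"]))
    if row["EQTL_GENE"]:                      # SNP has an eQTL target gene
        total += EVIDENCE_BONUS
    if float(row["CADD"]) >= CADD_THRESHOLD:  # SNP is predicted deleterious
        total += EVIDENCE_BONUS
    return total


rows = list(csv.DictReader(io.StringIO(RAW), delimiter="\t"))
ranked = sorted(filter_snps(rows), key=score, reverse=True)
for row in ranked:
    print(row["SNPS"], round(score(row), 2))
```

An additive score keeps each evidence source visible and easy to reweight; a real pipeline would tune or replace these weights.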

Visualizations

Manhattan plot for genome-wide SNP significance
Network graph of relationships among the top genes
Minor allele frequency (MAF) distribution
Bar plot of SNP consequences
Top genes associated with asthma
Genes associated with the most frequent KEGG Pathways
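
For the Manhattan plot, each SNP needs a genome-wide x coordinate and a -log10(p) y coordinate. The sketch below shows one common way to derive them; it is an illustration rather than the plotting code used in this repository. The `manhattan_coordinates` helper and the toy SNPs are hypothetical, and the largest observed position on each chromosome stands in for the chromosome length.

```python
import math
from collections import defaultdict


def manhattan_coordinates(snps):
    """Convert a list of (chrom, pos, pvalue) tuples into (x, y, chrom) tuples.

    x is the position offset by the spans of all preceding chromosomes,
    y is -log10(pvalue).
    """
    # Largest observed position per chromosome approximates its length.
    chrom_span = defaultdict(int)
    for chrom, pos, _ in snps:
        chrom_span[chrom] = max(chrom_span[chrom], pos)

    # Numeric chromosomes first in numeric order, then X, Y, etc.
    order = sorted(
        chrom_span,
        key=lambda c: (not c.isdigit(), int(c) if c.isdigit() else 0, c),
    )

    # Cumulative offset at which each chromosome starts on the x axis.
    offsets, running = {}, 0
    for chrom in order:
        offsets[chrom] = running
        running += chrom_span[chrom]

    return [(offsets[c] + pos, -math.log10(p), c) for c, pos, p in snps]


# Toy SNPs with made-up positions and p-values.
snps = [("1", 100, 1e-3), ("1", 500, 1e-9), ("2", 200, 1e-8), ("X", 50, 1e-2)]
coords = manhattan_coordinates(snps)
for x, y, chrom in coords:
    print(chrom, x, round(y, 2))
```

The resulting (x, y) pairs can be passed to any scatter-plot function, coloring points by chromosome and drawing a horizontal line at the chosen significance threshold.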

Results

Genes with strong associations with asthma include IL13, IL4, IL4R, IL2RA, ORMDL3, GSDMB, ZPBP2, IKZF3, KIF3A, SMAD3, TLR1, RORA, RUNX3, LRRC32, C11orf30/EMSY, RAD50, and TNFSF4, among others.
The majority of these genes are enriched in pathways such as systemic lupus erythematosus, the JAK-STAT signaling pathway, and hematopoietic cell lineage.
The selected SNPs are intronic or intergenic.
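
Pathway enrichment of a gene list is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below illustrates that calculation only; the enrichments reported above come from the tools named in this README, not from this code. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """Probability of seeing at least `n_overlap` pathway genes by chance.

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the prioritized list
    n_overlap:    prioritized genes that are in the pathway
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 150-gene pathway,
# 50 prioritized genes, 5 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20_000, 150, 50, 5)
print(f"enrichment p-value: {p_value:.3g}")
```

When many pathways are tested, the resulting p-values should be corrected for multiple testing before they are interpreted.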

Key Takeaways

Successfully implemented a bioinformatics pipeline to prioritize asthma SNPs.
Identified candidate SNPs with high regulatory potential and disease association.
Integrated multiple datasets for a systems-level perspective of asthma genetics.
Established a reproducible framework for GWAS SNP annotation and visualization.

What’s Next?

Extend prioritization to other immune-related traits (e.g., atopy, allergic rhinitis).
Incorporate machine learning models for SNP classification.
Expand to multi-omics integration (epigenetics, transcriptomics).
Publish the pipeline as a ready-to-use workflow (Nextflow/Snakemake).

References

NHGRI GWAS Catalog: https://www.ebi.ac.uk/gwas/ [Han Y. et al. (GCST010042)]
FUMA GWAS: https://fuma.ctglab.nl/
Open Targets: https://www.opentargets.org/
Pharos: https://pharos.nih.gov/

Project Structure

Asthma pGWAS SNP Prioritization and Interpretation/  
│
├── README.md                                                  # You're reading this now   
├── requirements.txt                                           # Python dependencies   
│
├── GWAS_data_cleaning_&_preprocessing.ipynb                   # Jupyter notebook (Data cleaning & preprocessing)  
├── R/                                                         # R scripts     
├── Results/                                                   # Processed files and tables
├── Visuals/                                                   # Processed figures and plots  
└── data/                                                      # Input datasets
