26th Summer Institute in Statistical Genetics (SISG)


This module is currently full. Registrations are closed at this time.

Module 12: Pathway & Network Analysis for Omics Data

Mon, July 19 to Wed, July 21
Registration for this module closes July 12.

Live session timeframe (the module instructors will post the exact schedule of live sessions before the module starts): Monday: 8 a.m. – 2:30 p.m. Pacific (11 a.m. – 5:30 p.m. Eastern); Tuesday: 8 a.m. – 2:30 p.m. Pacific (11 a.m. – 5:30 p.m. Eastern); Wednesday: 8 a.m. – 11 a.m. Pacific (11 a.m. – 2 p.m. Eastern).

Networks represent the interactions among the components of biological systems. In the context of high-dimensional omics data, relevant networks include gene regulatory networks, protein-protein interaction networks, and metabolic networks. These networks provide a window into biological systems and complex diseases, and can be used to understand how biological functions are implemented and how homeostasis is maintained. Pathway-based analyses complement them by leveraging biological knowledge from the literature, gene ontologies, or previous experiments to identify the pathways associated with a disease or another outcome of interest.

This module covers statistical learning methods for reconstructing and analyzing networks from omics data, as well as methods for pathway enrichment analysis. Particular attention is paid to omics datasets with a large number of variables (e.g. genes) and a small number of samples (e.g. patients). The techniques discussed will be demonstrated in R. This course assumes familiarity with R or another command-line programming language. Suggested pairing: Modules 5, 10, 17.

Learning Objectives: After attending this module, participants will be able to:

  1. Evaluate the relative strengths and weaknesses of publicly available knowledge bases for gene set analysis.
  2. Choose an appropriate null hypothesis in gene-set analysis methods for specific biological questions.
  3. Using publicly available tools, test for over-representation of gene sets/pathways from individual gene association results.
  4. Estimate (partially) directed and undirected networks from high-dimensional omics data, using publicly available software appropriate for the data at hand.
  5. Perform network-based pathway enrichment analysis using publicly available software tools.
  6. Perform version control for metadata (e.g. pathway and network data) and analysis (code, hyperparameters) to ensure reproducibility of results.
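
As a rough illustration of the over-representation testing named in objective 3, the sketch below computes a one-sided hypergeometric p-value for the overlap between a gene set and a list of significant genes. It is not taken from the course materials: the module demonstrates its techniques in R, whereas this sketch uses Python with only the standard library. The function name over_representation_p and the gene identifiers g1 to g20 are hypothetical toy values chosen for the example.

```python
from math import comb


def over_representation_p(universe, gene_set, hits):
    """One-sided hypergeometric p-value, P(overlap >= observed).

    universe: all genes that were tested (the background)
    gene_set: genes annotated to the pathway of interest
    hits:     genes called significant in the per-gene analysis
    """
    universe = set(universe)
    gene_set = set(gene_set) & universe   # ignore genes outside the background
    hits = set(hits) & universe

    N = len(universe)          # background size
    K = len(gene_set)          # pathway genes in the background
    n = len(hits)              # significant genes
    k = len(gene_set & hits)   # observed overlap

    # Sum the hypergeometric probabilities for overlaps of size k or larger.
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical toy data: 20 background genes, a 5-gene pathway, 4 hits.
universe = [f"g{i}" for i in range(1, 21)]
pathway = ["g1", "g2", "g3", "g4", "g5"]
significant = ["g1", "g2", "g3", "g10"]

p_value = over_representation_p(universe, pathway, significant)
print(f"over-representation p-value: {p_value:.4f}")
```

The choice of background (here, every tested gene) determines the null hypothesis being tested, which is the kind of decision objective 2 refers to.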