25th Summer Institute in Statistical Genetics (SISG)


Module 10: Pathway & Network Analysis for Omics Data

Mon, July 20 to Wed, July 22

Module dates/times: Monday, July 20; Tuesday, July 21; and Wednesday, July 22. Live sessions will start no earlier than 8 a.m. Pacific and end no later than 2:30 p.m. Pacific, except on Wednesdays. For modules that end on Wednesday, live sessions will end by 11 a.m. Pacific. For modules that start on Wednesday, live sessions will begin no earlier than 11:30 a.m. Pacific.

Networks represent the interactions among components of biological systems. In the context of high-dimensional omics data, relevant networks include gene regulatory networks, protein-protein interaction networks, and metabolic networks. These networks provide a window into biological systems as well as complex diseases, and can be used to understand how biological functions are implemented and how homeostasis is maintained. Pathway-based analyses, in turn, leverage biological knowledge available from the literature, gene ontologies, or previous experiments to identify the pathways associated with a disease or another outcome of interest.

This module covers statistical learning methods for reconstructing and analyzing networks from omics data, as well as methods for pathway enrichment analysis. Particular attention is paid to omics datasets with a large number of variables (e.g., genes) and a small number of samples (e.g., patients). The techniques discussed will be demonstrated in R. This course assumes familiarity with R or other command-line programming languages. Suggested pairing: Modules 5, 10, 15.

Access 2019 course materials.

Learning Objectives: After attending this module, participants will be able to:

  1. Evaluate the relative strengths and weaknesses of publicly available knowledge bases for gene set analysis.
  2. Choose appropriate null hypotheses in gene-set analysis methods for specific biological questions.
  3. Using publicly available tools, test for over-representation of gene sets/pathways from individual gene association results.
  4. Estimate (partially) directed and undirected networks from high-dimensional omics data, using publicly available software appropriate for the data at hand.
  5. Perform network-based pathway enrichment analysis using publicly available software tools.
  6. Perform version control for metadata (e.g., pathway and network data) and analysis (code, hyperparameters) to ensure reproducibility of results.
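
As a rough illustration of objective 3, over-representation of a gene set among selected genes is often assessed with a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test). The sketch below is not taken from the course materials. It is written in Python for illustration, although the module itself uses R, and the function name and all counts are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_universe: number of genes tested overall (the background)
    n_in_set:   number of background genes annotated to the gene set
    n_selected: number of genes called significant
    n_overlap:  number of significant genes that fall in the gene set
    """
    if not (0 <= n_in_set <= n_universe and 0 <= n_selected <= n_universe):
        raise ValueError("set and selection sizes must lie within the universe")
    lowest = max(n_overlap, 0)
    highest = min(n_in_set, n_selected)
    # Count the ways to draw n_selected genes with at least n_overlap in the set.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(lowest, highest + 1)
    )
    return tail / comb(n_universe, n_selected)


# Hypothetical counts: 20,000 genes tested, 100 in the pathway,
# 500 called significant, 10 of which belong to the pathway.
p_value = overrepresentation_pvalue(20000, 100, 500, 10)
print(f"over-representation p-value: {p_value:.3g}")
```

This test treats genes as independent draws from the background, which is one of the null-hypothesis choices referred to in objective 2. Correlated genes can make the resulting p-values anti-conservative.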