24th Summer Institute in Statistical Genetics (SISG)

Module 17: Computational Pipeline for WGS Data

Wed, July 24 to Fri, July 26

Module dates/times: Wednesday, July 24, 1:30-5 p.m.; Thursday, July 25, 8:30 a.m.-5 p.m.; and Friday, July 26, 8:30 a.m.-5 p.m.

This module is designed to follow on from Module 14. It will be a hands-on introduction to whole genome sequence analysis pipelines, informed by the instructors' experience with the TOPMed project (www.nhlbiwgs.org), and in particular by that project's focus on pooled-data analysis to study the role of rare variants in disease outcomes.

It will begin with an overview of data formats (BAM, VCF, GDS), and then cover population structure and relatedness effects on association mapping; phenotype harmonization; association testing (single-variant, burden, and SKAT); variant annotation; WGS variant analysis pipelines, focusing on tools used in the TOPMed Analysis pipeline; and the role of cloud computing.

Access 2018 course materials through the Summer Institutes archives.

Suggested pairing: Modules 13 and 14.

Stephanie Gogarten is a Research Scientist in the Genetic Analysis Center at the University of Washington. She develops computational pipelines for GWAS and WGS data. She was lead author on “GWASTools: an R/Bioconductor package for quality control and analysis of Genome-Wide Association Studies.” Bioinformatics 28:3329-3331, 2012. She recently published “Analysis Commons, a team approach to discovery in a big-data environment for genetic epidemiology.” Nature Genetics 49:1560-1563, 2017.

Ken Rice is Professor of Biostatistics at the University of Washington. His research focuses primarily on developing and applying statistical methods for complex disease epidemiology, notably cardiovascular disease. He leads the Analysis Committee for the CHARGE consortium, a large group of investigators studying genetic determinants of heart and aging outcomes. He recently published “Large-scale genome-wide analysis identifies genetic variants associated with cardiac structure and function.” J. Clinical Investigation 127:1798-1812, 2017.

Tim Thornton is Associate Professor of Biostatistics at the University of Washington. His research interest is in the area of statistical genetics, with an emphasis on statistical methodology for genetic association studies of complex traits in samples with relatedness, ancestry admixture, and/or population structure. He recently published “Admixture mapping in the Hispanic Community Health Study/Study of Latinos reveals regions of genetic associations with blood pressure traits.” PLoS One 12:e0188400, 2017.