26th Summer Institute in Statistical Genetics (SISG)


This module is currently full. Registrations are closed at this time.

Module 18: Multivariate Analysis for Genetic Data

Wed, July 21 to Fri, July 23
Instructor(s):
Registration for this module closes July 14.

Live session timeframe (the exact schedule of live sessions will be posted by the module instructors before the module starts): Wednesday, 11:30 a.m. – 2:30 p.m. Pacific (2:30 – 5:30 p.m. Eastern); Thursday, 8 a.m. – 2:30 p.m. Pacific (11 a.m. – 5:30 p.m. Eastern); Friday, 8 a.m. – 2:30 p.m. Pacific (11 a.m. – 5:30 p.m. Eastern).

This module provides an introduction to multivariate analysis, with a strong emphasis on data visualization by means of multivariate graphics known as biplots. The course covers principal component analysis (PCA), multidimensional scaling (MDS), correspondence analysis (CA), canonical analysis, cluster analysis, discriminant analysis (DA) and some multivariate inference, illustrating these methods with genetic data. Some genetic datasets are compositional in nature, so basic principles of compositional data analysis, such as log-ratio transformations, are also covered. The module also addresses the use of multivariate methods for uncovering population substructure and cryptic relatedness. Suggested pairing: Modules 7 and 9.

Course materials can be accessed through the Summer Institutes archives.

Learning Objectives: After attending this module, participants will be able to:

  1. Describe the purpose of basic multivariate statistical methods.
  2. Select an appropriate multivariate method for a given data set.
  3. Apply appropriate transformations to a given data set.
  4. Perform multivariate statistical analysis on a computer in the R environment.
  5. Visualize multivariate data by means of biplot construction.
  6. Interpret biplots correctly and assess goodness-of-fit.
  7. Carry out basic multivariate hypothesis tests.
  8. State the peculiar nature of compositional data, and account for it in the analysis.