The 27th Summer Institute in Statistical Genetics

Module 18: Multivariate Analysis for Genetics Data

Wed, July 27 to Fri, July 29
Instructor(s):

This module provides an introduction to multivariate analysis, with a strong emphasis on data visualization by means of multivariate graphics known as biplots. The course covers principal component analysis (PCA), multidimensional scaling (MDS), correspondence analysis (CA), canonical analysis, cluster analysis, discriminant analysis (DA) and some multivariate inference, illustrating these methods with genetic data. Some genetic datasets are compositional in nature, so basic principles of compositional data analysis, such as log-ratio transformations, are also covered. The use of multivariate methods for uncovering population substructure and cryptic relatedness is addressed.

Suggested pairing: Module 7: Applications of Population Genetics and Module 8: Statistical Genetics

Course materials can be accessed through the Summer Institutes archives.

Learning Objectives: After attending this module, participants will be able to:

  1. Describe the purpose of basic multivariate statistical methods.
  2. Select an appropriate multivariate method for a given data set.
  3. Apply suitable transformations to a given data set.
  4. Perform multivariate statistical analysis on a computer in the R environment.
  5. Visualize multivariate data by means of biplot construction.
  6. Interpret biplots correctly and assess goodness-of-fit.
  7. Carry out basic multivariate hypothesis tests.
  8. State the peculiar nature of compositional data, and account for it in the analysis.