## 25th Summer Institute in Statistical Genetics (SISG)

#### Module 1: Probability and Statistical Inference

Mon, July 13 to Wed, July 15

Module dates: Monday, July 13, 8 a.m. - 1 p.m. Pacific; Tuesday, July 14, 8 a.m. - 1 p.m. Pacific; and Wednesday, July 15, 8 a.m. - 11 a.m. Pacific.

This module serves as an introduction to statistical inference using tools from mathematical statistics and probability. It introduces core elements of statistical modeling, beginning with a review of basic probability and some common distributions (such as the binomial, multinomial, and normal distributions). Maximum likelihood estimation is motivated and described. The central limit theorem and frequentist confidence intervals are introduced, along with simple Bayes methods.
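As a rough illustration of the maximum likelihood idea mentioned above (not taken from the course materials), the sketch below evaluates a binomial log-likelihood over a grid of candidate success probabilities and compares the maximizer with the closed-form estimate k/n. The counts `k = 7` and `n = 20`, the grid resolution, and the function name are all assumptions made for the example.

```python
import math

def binomial_log_likelihood(p, k, n):
    """Log-likelihood of observing k successes in n trials with success probability p."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

# Hypothetical data: 7 successes in 20 trials.
k, n = 7, 20

# Evaluate the log-likelihood on a grid of candidate values strictly inside (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_hat_grid = max(grid, key=lambda p: binomial_log_likelihood(p, k, n))

# For the binomial model, the maximum likelihood estimate has the closed form k / n.
p_hat_closed = k / n

print(p_hat_grid, p_hat_closed)
```

The grid search is only for intuition; in practice the closed-form estimate is used directly for this model.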

We then cover classical hypothesis testing scenarios such as one-sample tests, two-sample tests, chi-square tests for categorical data analysis, and permutation tests. The course concludes with an overview of resampling methods, such as the bootstrap and jackknife, and a discussion of multiple testing corrections such as false discovery rate control. This module serves as a foundation for almost all of the later modules.
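To give a flavor of the resampling methods named above, here is a minimal sketch of a percentile bootstrap confidence interval for a mean, using only the Python standard library. The sample values, the number of resamples, the seed, and the function name `bootstrap_mean_ci` are assumptions chosen for illustration, and the percentile interval is only one of several bootstrap interval constructions.

```python
import random
import statistics

def bootstrap_mean_ci(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample with replacement, recompute the mean each time, then sort.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical measurements.
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 4.4]
lower, upper = bootstrap_mean_ci(sample)
print(f"95% bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")
```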

Training in calculus is not a prerequisite for this module, but a willingness to attempt math problems and some comfort with basic algebra will be necessary. Suggested pairing: Modules 4 and 7.

Access 2019 course materials.

Learning Objectives: After attending this module, participants will be able to:

1. Describe the assumptions underlying the Binomial, Multinomial, and Normal probability models.
2. Define sensitivity, specificity and predictive values in the context of a binary screening test for a disease.
3. Explain how the likelihood function can be used for estimation and model selection.
4. Translate scientific questions into appropriate null and alternative hypotheses.
5. Describe the assumptions underlying z-tests, t-tests and chi-square tests and use these tests to statistically compare two samples.
6. Explain and interpret p-values and confidence intervals.
7. Recognize and explain the concepts of confounding and effect modification.
8. Explain the role of computer-intensive methods (bootstrap, jackknife, permutation tests) in hypothesis testing and confidence intervals.
9. Explain the false discovery rate approach to addressing the issue of multiple comparisons in hypothesis testing.
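
As an illustration of objective 2, the sketch below applies Bayes' rule to turn a screening test's sensitivity and specificity, together with the disease prevalence, into positive and negative predictive values. The numeric inputs (90% sensitivity, 95% specificity, 1% prevalence) and the function name are hypothetical, chosen to show how a rare disease can yield a low positive predictive value even for an accurate test.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary screening test via Bayes' rule."""
    # Joint probabilities of each test result and true disease status.
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence

    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative test)
    return ppv, npv

# Hypothetical test characteristics and prevalence.
ppv, npv = predictive_values(sensitivity=0.90, specificity=0.95, prevalence=0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these assumed inputs, most positive results are false positives, because healthy people vastly outnumber those with the disease.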