25th Summer Institute in Statistical Genetics (SISG)


Module 3: Introduction to R

Mon, July 13 to Wed, July 15
Instructor(s):

Module dates/times: Live sessions will start no earlier than 8 a.m. Pacific and end no later than 2:30 p.m. Pacific, except for Wednesdays. For modules that end on Wednesday, live sessions will end by 11 a.m. Pacific. For modules that start on Wednesday, live sessions will begin no earlier than 11:30 a.m.

This module introduces the R statistical environment, assuming no prior knowledge. It provides a foundation for the use of R for computation in later modules.

Beyond basic data management tasks in R, such as reading in data and producing summaries through R scripts, we will introduce R’s graphics functions, its powerful package system, and simple methods of looping.

Hands-on use of R is a major component of this module; participants require a laptop and will use it in all sessions. Examples and exercises will use data drawn from biological and medical applications, including infectious diseases and genetics. Suggested pairing: All later modules.

Learning Objectives: After attending this module, participants will be able to:

  1. Use R to perform descriptive statistics including graphics.
  2. Read and write data files.
  3. Perform basic data manipulations (e.g. creating new variables, merging data sets).
  4. Write and use R script files.
  5. Install and load R packages, and access the help system and other resources to facilitate their use.
  6. Perform basic inferential statistical analyses including regression analysis.
  7. Write and use R functions, and perform basic programming in R including loops.
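
As a rough illustration of the kind of workflow these objectives describe, a short R script might look like the sketch below. It is not taken from the course materials: the data frames, column names, values, and the `standardize` helper are all invented for illustration, and only base R functions are used.

```r
# Hypothetical toy data; all names and values are invented for illustration.
subjects <- data.frame(
  id     = 1:6,
  age    = c(34, 51, 29, 62, 45, 38),
  height = c(1.62, 1.75, 1.80, 1.68, 1.71, 1.66),  # metres
  weight = c(58, 82, 77, 70, 90, 61)               # kilograms
)
genotypes <- data.frame(
  id       = 1:6,
  genotype = c("AA", "AG", "GG", "AG", "AA", "AG")
)

# Objective 2: write a data file to a temporary location and read it back.
path <- tempfile(fileext = ".csv")
write.csv(subjects, path, row.names = FALSE)
subjects <- read.csv(path)

# Objective 3: create a new variable, then merge two data sets on a shared key.
subjects$bmi <- subjects$weight / subjects$height^2
merged <- merge(subjects, genotypes, by = "id")

# Objective 1: descriptive statistics and a basic plot.
print(summary(merged$bmi))
print(table(merged$genotype))
plot(merged$age, merged$bmi, xlab = "Age (years)", ylab = "BMI")

# Objective 5: packages and help (left commented out so the script runs offline).
# install.packages("ggplot2")
# library(ggplot2)
# help("merge")

# Objective 6: a simple linear regression of BMI on age.
fit <- lm(bmi ~ age, data = merged)
print(coef(fit))

# Objective 7: a small user-defined function applied inside a loop.
standardize <- function(x) (x - mean(x)) / sd(x)
for (v in c("age", "bmi")) {
  z <- standardize(merged[[v]])
  cat(v, ": standard deviation after standardizing =", sd(z), "\n")
}
```

Saving these lines in a script file (objective 4) and running it with `source()` or `Rscript` would execute the steps in order.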