26th Summer Institute in Statistical Genetics (SISG)


This module is currently full. Registrations are closed at this time.

Module 3: Introduction to R

Wed, July 7 to Fri, July 9
Instructor(s):
Registration for this module closes June 30. 

 

Live session timeframe (the exact schedule of live sessions will be posted by the module instructors before the module starts): Wednesday, 11:30 a.m. – 2:30 p.m. Pacific (2:30 – 5:30 p.m. Eastern); Thursday, 8 a.m. – 2:30 p.m. Pacific (11 a.m. – 5:30 p.m. Eastern); Friday, 8 a.m. – 2:30 p.m. Pacific (11 a.m. – 5:30 p.m. Eastern).

This module introduces the R statistical environment, assuming no prior knowledge. It provides a foundation for the use of R for computation in later modules.

In addition to basic data management tasks in R, such as reading in data and producing summaries through R scripts, we will introduce R’s graphics functions, its powerful package system, and simple methods of looping.

Hands-on use of R is a major component of this module; participants require a laptop and will use it in all sessions. Examples and exercises will use data drawn from biological and medical applications, including infectious diseases and genetics. Suggested pairing: All later modules.

Learning Objectives: After attending this module, participants will be able to:

  1. Use R to perform descriptive statistics including graphics.
  2. Read and write data files.
  3. Perform basic data manipulations (e.g. creating new variables, merging data sets).
  4. Write and use R script files.
  5. Install and load R packages, and access the help system and other resources to facilitate their use.
  6. Perform basic inferential statistical analyses including regression analysis.
  7. Write and use R functions, and perform basic programming in R including loops.