SISCER 2019 Module 19: Analyzing Data from Complex Surveys

Module info

  • Location: In Person
  • Room: SCC 308
  • Meeting Times: Fri, Jul 26, 8:30am-5pm PST
  • Instructor: Thomas Lumley

Data from multistage surveys such as NHANES and BRFSS have been important in health research for many years. More recently, two-phase sampling has allowed more efficient subsampling from existing cohorts and databases. This module will provide an overview of data analysis for complex surveys.

We will introduce the basic concepts that distinguish complex samples from more familiar data: clusters, strata, and weights. We will then cover basic summary statistics, exploratory analysis, and regression modelling for multistage surveys. Finally, we will briefly discuss two-phase sampling and the use of raking to bring in information from the whole cohort when analyzing a subsample.

Mathematical details will be kept to a minimum, and there will be data examples with code in R for all topics and in Stata for most. Familiarity with linear and logistic regression will be assumed.
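
As a rough illustration of why strata and weights matter, the following base-R sketch compares an unweighted mean with a weighted mean for a stratified sample. It is not taken from the module materials: the stratum sizes, the sample values, and the names `stratum_size` and `smp` are all hypothetical, and it deliberately ignores clustering and standard errors.

```r
# Hypothetical population: two strata with known sizes (illustrative numbers only).
stratum_size <- c(urban = 800, rural = 200)

# Hypothetical sample: 4 units from each stratum, so the small rural
# stratum is oversampled relative to its share of the population.
smp <- data.frame(
  stratum = rep(c("urban", "rural"), each = 4),
  y       = c(10, 12, 11, 13, 20, 22, 21, 25)
)

# Number of sampled units in each row's stratum.
smp$n_h <- ave(smp$y, smp$stratum, FUN = length)

# Sampling weight = stratum population size / stratum sample size,
# i.e. how many population units each sampled unit represents.
smp$weight <- unname(stratum_size[smp$stratum]) / smp$n_h

# The unweighted mean treats the sample as if it were a simple random sample.
unweighted_mean <- mean(smp$y)

# The weighted mean rescales each stratum to its population share.
weighted_mean <- weighted.mean(smp$y, smp$weight)

print(c(unweighted = unweighted_mean, weighted = weighted_mean))
```

With these made-up numbers the unweighted mean is 16.75 and the weighted mean is 13.6. The gap arises because the rural stratum makes up half of the sample but only a fifth of the population.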