Summer Institutes

The 9th Summer Institute in Statistics for Clinical and Epidemiological Research

Module 2: Small Area Estimation

Mon, July 11 to Wed, August 3

Instructor(s):

Schedule notice: This module consists of two sessions on July 11: 8:30 a.m. to noon and 1 to 4:30 p.m. Pacific (11:30 a.m. to 3 p.m. and 4 p.m. to 7:30 p.m. Eastern). The registration rates for this course are the same as for a two-day module, which also consists of two sessions.

Small area estimation (SAE) is an important endeavor in global health, epidemiology, economics, agriculture and demography. SAE is often based on data from complex surveys, and the analysis must account for the survey design so that bias is avoided and measures of uncertainty reflect sampling variability. Data in particular areas are often sparse (or non-existent), so smoothing with covariates (auxiliary information) and spatial proximity is advantageous: it ‘borrows strength’ across areas and makes fuller use of the available data. We will distinguish between area-level and unit-level approaches. Area-level approaches include direct (weighted) estimates and Fay-Herriot models, which incorporate covariates and area-level random effects. In these models, the survey design is addressed directly through the use of weighted estimates. Unit-level approaches are potentially more powerful, since they allow more detailed auxiliary information to be incorporated, but accounting for the design is more error-prone, and aggregation to the area level may not be straightforward. Model fitting and assessment for both classes of models will be discussed and illustrated with examples.
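
As a rough illustration of the area-level idea described above, the Python sketch below computes a design-weighted direct estimate for one area and then applies an intercept-only Fay-Herriot-style shrinkage across areas. It is a minimal sketch under strong assumptions, not the course's method and not the API of the R survey or SUMMER packages. The function names, the toy numbers, and the choice to treat the between-area variance (sigma2_u) as known are all assumptions made for illustration; in practice that variance is estimated, and covariates would normally enter the model.

```python
def hajek_estimate(values, weights):
    """Design-weighted (Hajek) mean for one area: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    return sum(w * y for w, y in zip(weights, values)) / total_weight


def fay_herriot_smooth(direct, sampling_var, sigma2_u):
    """Intercept-only Fay-Herriot-style shrinkage.

    direct       : direct (weighted) estimates, one per area
    sampling_var : design-based variances of those estimates (taken as known)
    sigma2_u     : between-area variance (ASSUMED known here; usually estimated)
    """
    # Precision-weighted estimate of the common mean across areas.
    precision = [1.0 / (sigma2_u + v) for v in sampling_var]
    mu = sum(p * d for p, d in zip(precision, direct)) / sum(precision)

    smoothed = []
    for d, v in zip(direct, sampling_var):
        # gamma near 1: trust the direct estimate; gamma near 0: shrink to mu.
        gamma = sigma2_u / (sigma2_u + v)
        smoothed.append(gamma * d + (1.0 - gamma) * mu)
    return mu, smoothed


# Toy (made-up) inputs: three areas, the last with a very noisy direct estimate.
direct_estimates = [0.20, 0.35, 0.80]
sampling_variances = [0.001, 0.002, 0.050]
mu_hat, smoothed_estimates = fay_herriot_smooth(
    direct_estimates, sampling_variances, sigma2_u=0.01
)
for d, s in zip(direct_estimates, smoothed_estimates):
    print(f"direct={d:.3f}  smoothed={s:.3f}")
```

Areas with large sampling variance are pulled more strongly toward the overall mean, which is the sense in which sparse areas borrow strength from the rest of the data.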

We will begin with introductions to complex survey data, SAE modeling, and Bayesian statistics. Example analyses will be presented using the new SAE module within the R survey package. This new module builds on the SUMMER R package, which the instructor developed with his SAE collaborators.