In this book, we demonstrate how to estimate poverty and inequality measures in a population using microdata collected from a complex survey sample. Most surveys administered by government agencies or larger research organizations use a sampling design that violates the assumption of simple random sampling (SRS), typically through features such as stratification, clustering, and unequal probabilities of selection.
Therefore, basic unweighted R commands such as mean() or glm() will account for neither the sampling weights nor the measures of uncertainty (such as sampling variance estimates and confidence intervals) that the survey design implies. For some examples of publicly available complex survey data sets, see http://asdfree.com.
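To make the contrast concrete, here is a minimal sketch that compares an unweighted mean with a design-based estimate. It assumes the survey package is installed and uses its bundled api example data (the apistrat stratified sample); the dataset and the variable api00 are illustrative choices, not data used elsewhere in this book.

```r
# Assumption: the 'survey' package is installed; 'apistrat' is its bundled
# stratified sample of California schools, used here only for illustration.
library(survey)
data(api)

# Naive estimate: treats the sample as if it were a simple random sample.
unweighted_mean <- mean(apistrat$api00)

# Design-based estimate: declare strata, weights, and finite population
# correction once, then let svymean() use them.
dstrat <- svydesign(
  id = ~1,
  strata = ~stype,
  weights = ~pw,
  fpc = ~fpc,
  data = apistrat
)
design_mean <- svymean(~api00, dstrat)

print(unweighted_mean)
print(design_mean)      # point estimate with a design-based standard error
print(confint(design_mean))
```

The two point estimates differ because the strata were sampled at different rates, and only the design-based call reports a standard error that reflects the stratification.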
Unlike other software, the R convey package does not require the user to respecify these design parameters at each step of the analysis. So long as the svydesign object or svrepdesign object has been constructed properly at the outset of the analysis, the convey package will incorporate the survey design automatically and produce statistics and variances that take the complex sample into account.
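The workflow described above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed recipe: it assumes the survey, convey, and laeken packages are installed, and it borrows the eusilc synthetic dataset from laeken, with rb030, db040, rb050, and eqincome as the cluster, stratum, weight, and income variables.

```r
# Assumptions: 'survey', 'convey', and 'laeken' are installed; 'eusilc' is a
# synthetic EU-SILC dataset shipped with 'laeken', used only as an example.
library(survey)
library(convey)
library(laeken)

data(eusilc)
names(eusilc) <- tolower(names(eusilc))

# Describe the sampling design once, at the outset.
des_eusilc <- svydesign(
  ids = ~rb030,
  strata = ~db040,
  weights = ~rb050,
  data = eusilc
)

# Store the full design so convey functions can linearize correctly.
des_eusilc <- convey_prep(des_eusilc)

# No design parameters are repeated in the estimation calls.
gini <- svygini(~eqincome, design = des_eusilc)
arpr <- svyarpr(~eqincome, design = des_eusilc)

print(gini)  # Gini coefficient with its standard error
print(arpr)  # at-risk-of-poverty rate with its standard error
```

After the design object is built and passed through convey_prep(), each estimation call needs only a formula and the design; the strata, clusters, and weights are picked up from the object.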
Survey analysts familiar with the dplyr-style syntax provided by the srvyr package, a wrapper around the survey library, might be interested in implementing specific convey functions by following the svygini() example published by srvyr author Greg Freedman Ellis. Note that the full design stored by convey_prep() may in some cases complicate this extension.