Defense Date


Document Type


Degree Name

Master of Science



First Advisor

Christine Schubert


The purpose of this research is to examine the bootstrap and jackknife as methods for estimating the variance of the AUC from a study using a complex sampling design and to determine which characteristics of the sampling design affect this estimation. Data from a one-stage cluster sampling design of 10 clusters were examined. Factors included three true AUCs (.60, .75, and .90), three prevalence levels (50/50, 70/30, and 90/10, non-disease/disease), and three numbers of clusters sampled (2, 5, or 7). A simulated sample was constructed for each of the 27 combinations of AUC, prevalence, and number of clusters. Both the bootstrap and jackknife methods provided unbiased estimates of the AUC. In general, the bootstrap variance estimation method provided smaller variance estimates. For both the bootstrap and jackknife, the rarer the disease in the population, the higher the variance estimate. As the true area increased, the variance estimate decreased for both methods. For both methods, the variance estimate decreased as the number of clusters sampled increased; however, the trend for the jackknife may be affected by outliers. The National Health and Nutrition Examination Survey (NHANES), conducted by the CDC, is a complex survey that uses the one-stage cluster sampling design. A subset of the 2001-2002 NHANES data was created that included only adult women. A separate logistic regression analysis was conducted for each of four hormones (FSH, LH, TSH, and T4) to determine whether exposure to certain furans in the environment has an effect on abnormal levels of that hormone in women. Bootstrap and jackknife variance estimation techniques were applied to estimate the AUC and its variance for each of the four logistic regressions. The AUC estimates provided by the bootstrap and jackknife methods were similar, with the exception of LH.
Unlike in the simulation study, the jackknife variance estimation method provided consistently smaller variance estimates than the bootstrap. AUC estimates for all four hormones suggested that exposure to furans affects abnormal hormone levels more than would be expected by chance. The bootstrap variance estimation technique provided better variance estimates for the AUC when many clusters were sampled. When only a few clusters are sampled or, as in the NHANES study, the entire population is treated as a single cluster, the jackknife variance estimation method provides smaller variance estimates for the AUC.
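
As an illustration only, the following minimal Python sketch shows one common way to form cluster-level jackknife and bootstrap variance estimates for an AUC. It is not taken from the thesis: the Mann-Whitney form of the AUC, the delete-one-cluster jackknife, the resampling of whole clusters with replacement, and all function names and simulation settings (`simulate_clusters`, cluster size, score distributions) are assumptions made for this example.

```python
import random
from statistics import mean


def auc(scores_pos, scores_neg):
    """Mann-Whitney AUC: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def pooled_auc(clusters):
    """AUC over all subjects in the clusters; a subject is (score, diseased)."""
    pos = [s for c in clusters for s, d in c if d]
    neg = [s for c in clusters for s, d in c if not d]
    if not pos or not neg:
        raise ValueError("need at least one diseased and one non-diseased subject")
    return auc(pos, neg)


def jackknife_variance(clusters):
    """Delete-one-cluster jackknife: (k - 1) / k * sum((theta_i - theta_bar)^2)."""
    k = len(clusters)
    reps = [pooled_auc(clusters[:i] + clusters[i + 1:]) for i in range(k)]
    m = mean(reps)
    return (k - 1) / k * sum((r - m) ** 2 for r in reps)


def bootstrap_variance(clusters, n_boot=200, seed=0):
    """Resample whole clusters with replacement; sample variance of replicates."""
    rng = random.Random(seed)
    k = len(clusters)
    reps = []
    for _ in range(n_boot):
        resample = [clusters[rng.randrange(k)] for _ in range(k)]
        try:
            reps.append(pooled_auc(resample))
        except ValueError:
            continue  # skip resamples that lack one of the two classes
    if len(reps) < 2:
        raise ValueError("too few usable bootstrap replicates")
    m = mean(reps)
    return sum((r - m) ** 2 for r in reps) / (len(reps) - 1)


def simulate_clusters(n_clusters=5, size=40, prevalence=0.3, shift=1.0, seed=1):
    """Hypothetical data: normal scores plus a shared cluster-level effect."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        effect = rng.gauss(0.0, 0.3)  # assumed between-cluster variation
        cluster = []
        for _ in range(size):
            diseased = rng.random() < prevalence
            score = rng.gauss(shift if diseased else 0.0, 1.0) + effect
            cluster.append((score, diseased))
        clusters.append(cluster)
    return clusters


if __name__ == "__main__":
    data = simulate_clusters()
    print("AUC:", pooled_auc(data))
    print("jackknife variance:", jackknife_variance(data))
    print("bootstrap variance:", bootstrap_variance(data))
```

Both estimators treat the cluster, not the individual subject, as the resampling unit, so that within-cluster correlation carries through to the variance estimate. With very few sampled clusters the jackknife has only that many replicates, which is one reason its estimates can be sensitive to a single unusual cluster.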


© The Author

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

June 2009

Included in

Biostatistics Commons