DOI
https://doi.org/10.25772/2FB3-AF10
Defense Date
2016
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Biostatistics
First Advisor
Roy T. Sabo
Second Advisor
Le Kang
Third Advisor
Adam Sima
Fourth Advisor
Qiqi Lu
Fifth Advisor
Edward L. Boone
Abstract
Clustered data often feature nested structures and repeated measures. When coupled with binary outcomes and large samples (>10,000), this complexity can lead to non-convergence of the desired model, especially if random effects are used to account for the clustering. One way to bypass the convergence problem is to split the dataset into sub-samples small enough for the desired model to converge, and then recombine the results from those sub-samples through meta-analysis. We consider two ways to generate sub-samples: the K independent samples approach, where the data are split into K mutually exclusive sub-samples, and the cluster-based approach, where naturally existing clusters serve as sub-samples. Estimates or test statistics from either of these sub-sampling approaches can then be recombined using a univariate or multivariate meta-analytic approach. We also provide an innovative approach for simulating clustered and dependent binary data by simulating parameter templates that yield the desired cluster behavior. This approach is used to conduct simulation studies comparing the performance of the K independent samples and cluster-based approaches to generating sub-samples, the results from which are combined using either univariate or multivariate meta-analytic techniques. These studies show that, for both the univariate and multivariate meta-analytic approaches, using natural clusters led to less biased test statistics than the K independent samples approach when the number of clusters and the treatment effect were large. The K independent samples approach was preferred when the number of clusters and the treatment effect were small. We also apply these methods to data on cancer screening behaviors obtained from electronic health records of n=15,652 individuals and show that the resulting estimates support the conclusions from the simulation studies.
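The split-and-recombine idea described in the abstract can be illustrated with a minimal sketch. It assumes fixed-effect inverse-variance pooling as the univariate recombination step, which is one common choice and not necessarily the estimator used in the dissertation. The function names `split_into_k` and `pool_fixed_effect` are hypothetical, and fitting the model within each sub-sample is left out: the per-sub-sample estimates and standard errors are taken as given inputs.

```python
import math
import random


def split_into_k(ids, k, seed=0):
    """Randomly partition ids into k mutually exclusive sub-samples."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    # Striding by k guarantees each id lands in exactly one sub-sample.
    return [shuffled[i::k] for i in range(k)]


def pool_fixed_effect(estimates, std_errors):
    """Pool per-sub-sample estimates with inverse-variance weights.

    Assumption: a fixed-effect model, so each estimate is weighted by
    1 / SE^2. Returns (pooled estimate, pooled SE, z statistic).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se, pooled / pooled_se
```

In use, each sub-sample returned by `split_into_k` would be passed to whatever model-fitting routine converges at that size, and the resulting treatment-effect estimates and standard errors would be handed to `pool_fixed_effect`. For the cluster-based approach, the natural clusters would replace the random partition.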
Rights
© The Author
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
5-11-2016