DOI
https://doi.org/10.25772/NC2H-DC55
Defense Date
2016
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Biostatistics
First Advisor
Dr. Kellie J. Archer
Abstract
Multigenic diagnostic and prognostic tools can be derived for ordinal clinical outcomes using data from high-throughput genomic experiments. A challenge in this setting is that the number of predictors is much greater than the sample size, so traditional ordinal response modeling techniques must be exchanged for more specialized approaches. Existing methods perform well on some datasets, but there is room for improvement in terms of variable selection and predictive accuracy. Therefore, we extended an impressive binary response modeling technique, Feature Augmentation via Nonparametrics and Selection (FANS), to the ordinal response setting. Through simulation studies and analyses of high-throughput genomic datasets, we showed that our Ordinal FANS method is sensitive and specific when discriminating between important and unimportant features in the high-dimensional feature space and is highly competitive in terms of predictive accuracy.
Discrete survival time is another example of an ordinal response. For many illnesses and chronic conditions, it is impossible to record the precise date and time of disease onset or relapse. Further, the HIPAA Privacy Rule prevents recording of protected health information, which includes all elements of dates (except year), so in the absence of a “limited data set,” the dates of diagnosis and death are not available for calculating overall survival. Thus, we developed a method that is suitable for modeling high-dimensional discrete survival time data and assessed its performance by conducting a simulation study and by predicting the discrete survival times of acute myeloid leukemia patients using a high-dimensional dataset.
Rights
© Kyle L. Ferber
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
11-17-2016