DOI
https://doi.org/10.25772/QPZS-YJ59
Author ORCID Identifier
https://orcid.org/0009-0006-8667-9609
Defense Date
2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Biostatistics
First Advisor
Dipankar Bandyopadhyay, Ph.D.
Second Advisor
Chenlu Ke, Ph.D.
Abstract
Recent advances in sequencing technologies have enabled the collection of extensive genome-wide data, significantly enhancing the diagnosis and prognosis of head and neck (H&N) cancer. Identifying predictive markers for survival time is crucial for developing prognostic systems and understanding the molecular drivers of cancer progression. Moreover, given the increasing recognition of HPV-associated H&N cancers, understanding the role of HPV is vital for prevention and for guiding treatment decisions. To address these needs, this dissertation develops model-free feature screening procedures for ultra-high-dimensional right-censored data that capture important features both (a) marginally and (b) conditionally (for example, on HPV status), while remaining robust to unknown censoring mechanisms. Specifically, the proposed two-stage approach first selects significant features using nonparametric reproducing-kernel-based ANOVA statistics, and then refines the selection under directional false discovery rate (FDR) control through a unified knockoff procedure. Theoretical properties, including sure screening and rank consistency, were studied. The finite-sample properties of the proposed method, and its novelty relative to existing alternatives, were explored through simulation studies. The methodology was illustrated via application to right-censored H&N cancer survival data derived from The Cancer Genome Atlas, and validated on a similar dataset from the Gene Expression Omnibus database. The R package DSFDRC, available on GitHub, implements the proposed methodology.
Rights
© The Author
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
5-28-2024
Included in
Biostatistics Commons, Data Science Commons, Genetics Commons, Microarrays Commons, Survival Analysis Commons