DOI

https://doi.org/10.25772/0T0F-7C81

Defense Date

2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Nanoscience and Nanotechnology

First Advisor

Richard Inho Joh

Abstract

Gene expression provides insight into the functional variations at the cellular level that shape biological phenomena. Several recent sequencing technologies have produced an abundance of expression profiles, spurring entirely new disciplines of biological study. With this myriad of data, the new task is deciding how to assess it and extract meaningful insights. Identifying genes with expression changes in disease conditions is often the first step in finding potential biomarkers for diagnosis and targets for pharmaceutical treatments. Parametric statistical tests at the individual gene level have been the conventional approach for finding differentially expressed genes. These tests exhibit high statistical power but rely on distributional assumptions that are difficult to validate, which has led to a vast number of selected genes, with very few being effective in clinical applications. Alternatively, machine learning algorithms developed to identify patterns in high-dimensional data can be readily applied to gene expression analysis. Here we present a novel algorithm for identifying aberrantly expressed genes (AEGs) in cancer. By comparing the expression pattern of individual genes to the cumulative pattern of the whole profile, we have developed a robust classification tool. We provide evidence that aberrant expression is effective in reporting biologically relevant gene signatures that may be overlooked by traditional methods. Because our approach uses general assumptions, we are able to demonstrate its ability to assess gene expression from multiple technologies (microarray, RNA-Seq, scRNA-Seq) and for multiple insights (disease associations, treatment associations, cell/tissue variability). Lastly, we apply our method to single-cell RNA profiles, where robust identification of AEGs is possible with fewer samples than conventional approaches require.
We hope these results inspire further research into developing a generalized framework for assessing gene expression patterns that can lead to the improvement of clinical outcomes and the development of personalized medicine.

Rights

© The Author

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

5-8-2024
