DOI
https://doi.org/10.25772/X140-N976
Author ORCID Identifier
https://orcid.org/0000-0003-1245-6411
Defense Date
2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Computer Science
First Advisor
Bridget T. McInnes
Abstract
Literature-based discovery (LBD) is a scientific process that uses automated methods to identify novel insights between non-interacting sets of literature. To date, numerous statistical and machine learning-based methods have been applied in the biomedical domain to find treatments for diseases such as Raynaud's disease, Parkinson's disease, and multiple sclerosis. However, the lack of standardized practices and the proliferation of bespoke methodologies make LBD difficult to adopt in real-world systems. Our work addresses these concerns in five critical areas: 1) mitigating error propagation within LBD's a priori dependent tasks, 2) exploring the integration of modern deep learning (DL) methods for LBD, 3) reducing the high barrier to entry and promoting open-ended research, 4) performing comparisons with related work, and 5) addressing the lack of confidence in LBD systems. We propose several methods that address these issues by: 1) improving performance in LBD's a priori tasks, 2) integrating various graph and transformer-based deep learning architectures for LBD, 3) evaluating our system against sets of random relationships to determine model efficacy, and 4) developing methods that reduce the reliance on subject matter experts and facilitate open-ended exploration for novel knowledge generation. Our work provides insights into several methodologies and future directions that can be used to build confidence in LBD systems and assess their capacity to generate novel knowledge.
Rights
© Clint Cuffy, July 2025
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
7-16-2025