DOI

https://doi.org/10.25772/X140-N976

Author ORCID Identifier

https://orcid.org/0000-0003-1245-6411

Defense Date

2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Computer Science

First Advisor

Bridget T. McInnes

Abstract

Literature-based discovery (LBD) is a scientific process that uses automated methods to identify novel insights between non-interacting sets of literature. To date, numerous statistical and machine learning-based methods have been applied in the biomedical domain to find treatments for diseases such as Raynaud's disease, Parkinson's disease, and multiple sclerosis. However, the lack of standardized practices and the creation of bespoke methodologies make LBD difficult to adopt in real-world systems. Our work addresses these concerns in five critical areas: 1) mitigating error propagation within LBD's a priori dependent tasks, 2) exploring the integration of modern deep learning (DL) methods for LBD, 3) reducing the high barrier to entry and promoting open-ended research, 4) performing comparisons with related work, and 5) addressing the lack of confidence in LBD systems. We propose several methods that address these issues by: 1) improving performance in LBD's a priori tasks, 2) integrating various graph and transformer-based deep learning architectures for LBD, 3) evaluating our system against sets of random relationships to determine model efficacy, and 4) developing methods that reduce the reliance on subject matter experts and facilitate open-ended exploration for novel knowledge generation. Our work provides insights into several methodologies and future directions that can be used to build confidence in LBD systems and assess their capacity to generate novel knowledge.

Rights

© Clint Cuffy, July 2025

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

7-16-2025
