DOI

https://doi.org/10.25772/C1P9-WG56

Defense Date

2019

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Computer Science

First Advisor

Dr. Bridget McInnes

Second Advisor

Dr. Alberto Cano

Third Advisor

Dr. Dayanjan S. Wijesinghe

Fourth Advisor

Dr. Halil Kilicoglu

Fifth Advisor

Dr. Than Dinh

Abstract

The exponential growth of scientific literature is creating an increased need for systems to process and assimilate knowledge contained within text. Literature Based Discovery (LBD) is a well established field that seeks to synthesize new knowledge from existing literature, but it has remained primarily in the theoretical realm rather than in real-world application. This lack of real-world adoption is due in part to the difficulty of LBD, but also due to several solvable problems present in LBD today. Of these problems, the ones in most critical need of improvement are: (1) the over-generation of knowledge by LBD systems, (2) a lack of meaningful evaluation standards, and (3) the difficulty interpreting LBD output. We address each of these problems by: (1) developing indirect relatedness measures for ranking and filtering LBD hypotheses; (2) developing a representative evaluation dataset and applying meaningful evaluation methods to individual components of LBD; (3) developing an interactive visualization system that allows a user to explore LBD output in its entirety. In addressing these problems, we make several contributions, most importantly: (1) state of the art results for estimating direct semantic relatedness, (2) development of set association measures, (3) development of indirect association measures, (4) development of a standard LBD evaluation dataset, (5) division of LBD into discrete components with well defined evaluation methods, (6) development of automatic functional group discovery, and (7) integration of indirect relatedness measures and automatic functional group discovery into a comprehensive LBD visualization system. Our results inform future development of LBD systems, and contribute to creating more effective LBD systems.

Rights

© Sam Henry

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

5-9-2019

Share

COinS