DOI

https://doi.org/10.25772/CG3N-FZ56

Defense Date

2021

Document Type

Thesis

Degree Name

Master of Science

Department

Chemical and Life Science Engineering

First Advisor

Dr. James Ferri

Second Advisor

Dr. David Tyler McQuade

Third Advisor

Mr. William Glandorf

Abstract

Machine learning models for chemical property prediction pose high-dimensional design challenges that span multiple disciplines. Free and open-source software libraries have streamlined model implementation, but the design complexity remains. To better navigate and understand the machine learning design space, model information needs to be organized and contextualized. In this work, instances of chemical property models and their associated parameters were stored in a Neo4j property graph database. Machine learning model instances were created with permutations of dataset, learning algorithm, molecular featurization, data scaling, data splitting, hyperparameters, and hyperparameter optimization techniques. The resulting graph contains over 83,000 nodes and 4 million edges and can be explored with interactive visualization software. The structure of the property graph is centered on models and molecules, which enables efficient and intuitive inter- and intra-model evaluation. We use a curated lipophilicity dataset to demonstrate graph use cases. Difficult-to-predict molecules were identified across multiple models simultaneously. Expressive graph queries were implemented to identify molecular fragments that were both prevalent and associated with high lipophilicity prediction error.

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

5-6-2021

Included in

Databases and Information Systems Commons, Data Science Commons, Other Chemical Engineering Commons, Other Chemistry Commons

Theses and Dissertations

Information Architecture for a Chemical Modeling Knowledge Graph
