DOI
https://doi.org/10.25772/AZZZ-EN83
Defense Date
2012
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Engineering
First Advisor
Preetam Ghosh
Abstract
Despite many past efforts, the inference (reverse engineering) of regulatory networks from microarray data remains an unsolved problem in systems biology. Such regulatory networks play a critical role in cellular function and organization and are of interest in the study of a variety of disease areas and in ecotoxicology, among other fields. This dissertation proposes information-theoretic methods and algorithms for inferring regulatory networks from microarray data. Most of the proposed algorithms can be applied to both time-series and multifactorial microarray data sets. The work infers regulatory networks while considering the following six factors: (i) computational efficiency for inferring genome-scale networks, (ii) incorporation of prior biological knowledge, (iii) selection of the optimal network that minimizes the joint network entropy, (iv) the impact of higher-order structures (specifically 3-node structures) on network inference, (v) the effects of the time sensitivity of regulatory interactions, and (vi) exploitation of the benefits of existing and proposed metrics and algorithms for reverse engineering through the concept of a consensus of consensus networks. Specifically, this dissertation presents an approach for incorporating knock-out data sets that is flexible enough to be easily adapted to existing or new approaches. While most information-theoretic approaches infer networks based on pairwise interactions, this dissertation discusses inference methods that score edges from complex structures. A new inference method for building consensus networks from the networks inferred by multiple popular information-theoretic approaches is also proposed. For time-series data sets, new information-theoretic metrics are proposed that consider the time lags of regulatory interactions estimated from the microarray data.
Finally, based on the scores predicted for each possible edge in the network, a probabilistic minimum description length (MDL) based approach is proposed to identify the optimal network, that is, the one minimizing the joint network entropy. Comparative analyses on in-silico and/or real time data sets have shown that the proposed algorithms achieve better inference accuracy and/or higher computational efficiency than other state-of-the-art schemes such as ARACNE, CLR, and Relevance Networks. Most of the methods proposed in this dissertation are general and can be easily incorporated into new methods and algorithms for network inference.
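For readers unfamiliar with the pairwise information-theoretic scoring that the abstract contrasts its methods against, the following is a minimal sketch in the spirit of a relevance-network baseline: each gene's expression profile is discretized, mutual information is estimated for every gene pair, and pairs scoring above a threshold become edges. It is not the dissertation's own algorithm. The function names, the equal-width binning, the bin count, and the threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(values, bins=3):
    """Equal-width binning of one gene's expression profile (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant profile carries no information; put everything in bin 0.
        return [0] * len(values)
    width = (hi - lo) / bins
    # Clamp so that the maximum value falls into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits from two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts.
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def relevance_network(expression, threshold, bins=3):
    """Score every gene pair by mutual information; keep pairs above threshold.

    `expression` maps a gene name to its list of expression values
    (hypothetical input format). Returns {(gene1, gene2): score}.
    """
    binned = {gene: discretize(vals, bins) for gene, vals in expression.items()}
    edges = {}
    for g1, g2 in combinations(sorted(binned), 2):
        score = mutual_information(binned[g1], binned[g2])
        if score > threshold:
            edges[(g1, g2)] = score
    return edges


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = {
        "geneA": [0.1, 0.9, 0.2, 0.8, 0.15, 0.85],
        "geneB": [1.0, 3.0, 1.1, 2.9, 1.05, 2.95],
        "geneC": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    }
    print(relevance_network(toy, threshold=0.5))
```

In this toy input, geneA and geneB rise and fall together while geneC is constant, so only the geneA/geneB pair is retained. A scheme like this scores each pair in isolation, which is the limitation the abstract's higher-order (3-node) and time-lagged metrics are meant to address.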
Rights
© The Author
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
December 2012