DOI
https://doi.org/10.25772/W0JE-H248
Author ORCID Identifier
0000-0001-8162-5069
Defense Date
2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Computer Science
First Advisor
Alberto Cano
Abstract
The rapid growth of data from sources such as mobile applications, sensors, and network monitoring has increased the need for machine learning algorithms capable of handling non-stationary data streams. Learning from such streams is challenging because they evolve over time and are subject to concept drift. One of the most complex problems is learning from imbalanced data streams, where shifting data distributions, combined with feature space drifts, complicate continuous adaptation. These challenges become even more pronounced in multi-class scenarios, which are common in real-world applications. Detecting concept drift in such contexts is particularly demanding, as it requires tracking changes across multiple classes. Moreover, obtaining labels for all instances is often infeasible in practice, so it is crucial to decide which instances to label, and when, in order to optimize performance while minimizing labeling costs. This dissertation explores several aspects of data stream learning, analyzes its key challenges, and proposes solutions to them. The primary objective is to advance data stream research by highlighting its complexity and diversity. Comprehensive experiments and benchmarks demonstrate the effectiveness of the proposed contributions on non-stationary, multi-class, imbalanced, and partially labeled data streams. Overall, the developed methods and strategies provide insights into data stream learning and contribute to more accurate, adaptive, and efficient machine learning algorithms for real-world applications.
Rights
© Gabriel Jonas Aguiar
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
4-15-2025