Author ORCID Identifier
https://orcid.org/0000-0002-0848-9790
Defense Date
2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Computer Science
First Advisor
Alberto Cano
Abstract
Learning from drifting data streams and learning from multi-label data are both challenging problems. When multi-label data arrives as a stream, the challenges of both problems must be addressed, along with additional challenges unique to the combined problem. Algorithms must be fast and flexible, able to match both the speed and the evolving nature of the stream. We propose four methods for learning from multi-label drifting data streams. First, a multi-label k Nearest Neighbors with Self Adjusting Memory (ML-SAM-kNN) exploits short- and long-term memories to predict the current and evolving states of the data stream. Second, a punitive k nearest neighbors algorithm with a self-adjusting memory (MLSAMPkNN) for multi-label, drifting data streams uses a novel punitive system that identifies and penalizes errant data examples early, removing them from the window. Third, a self-adapting algorithm for continual learning on multi-label data streams (MLSAkNN) uses a series of techniques to adapt to concept drifts and add robustness to data-level difficulties. Lastly, we compare and combine simple aging and rejuvenation strategies into a method (ARkNN) that updates and adapts a window of instances, yielding a competitive classifier that is highly resource efficient. Thorough experimental studies compare the proposed methods to state-of-the-art classifiers for streaming multi-label data on real-world and artificial data streams using multi-label performance metrics, evaluation time, and memory consumption.
Rights
© The Author
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
3-6-2024