"Adaptive Multi-label Classification on Drifting Data Streams" by Martha Roseberry

DOI

https://doi.org/10.25772/GTF3-KX79

Author ORCID Identifier

https://orcid.org/0000-0002-0848-9790

Defense Date

2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Computer Science

First Advisor

Alberto Cano

Abstract

Drifting data streams and multi-label data are both challenging problems. When multi-label data arrives as a stream, the challenges of both problems must be addressed along with additional challenges unique to the combined problem. Algorithms must be fast and flexible, able to match both the speed and evolving nature of the stream. We propose four methods for learning from multi-label drifting data streams. First, a multi-label k nearest neighbors algorithm with self-adjusting memory (ML-SAM-kNN) exploits short- and long-term memories to predict the current and evolving states of the data stream. Second, a punitive k nearest neighbors algorithm with a self-adjusting memory (MLSAMPkNN) for multi-label, drifting data streams uses a novel punitive system that identifies and penalizes errant data examples early, removing them from the window. Third, a self-adapting algorithm for continual learning on multi-label data streams (MLSAkNN) uses a series of techniques to adapt to concept drifts and add robustness to data-level difficulties. Lastly, we compare and combine simple aging and rejuvenation strategies into a method (ARkNN) that updates and adapts a window of instances, yielding a competitive classifier that is highly resource efficient. Thorough experimental studies compare the proposed methods to state-of-the-art classifiers for streaming multi-label data on real-world and artificial data streams using multi-label performance metrics, evaluation time, and memory consumption.
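
To make the idea of a punitive, window-based nearest-neighbor learner concrete, the following is a minimal Python sketch. It is not the dissertation's implementation: the class name WindowedMultiLabelKNN, the parameters k, max_window, and penalty_limit, the majority-vote prediction, and the rule "penalize a neighbor that disagrees on more than half of the labels" are all assumptions chosen for illustration. It omits the short- and long-term memories, aging, and rejuvenation mentioned in the abstract.

```python
import math


class WindowedMultiLabelKNN:
    """Illustrative sliding-window multi-label kNN with a simple penalty rule.

    Hypothetical sketch only; names and rules are assumptions, not the
    dissertation's algorithms.
    """

    def __init__(self, n_labels, k=3, max_window=100, penalty_limit=2):
        self.n_labels = n_labels
        self.k = k
        self.max_window = max_window
        self.penalty_limit = penalty_limit
        # Each entry is [features, labels, penalty_count], oldest first.
        self.window = []

    def _neighbors(self, x):
        # The k stored instances closest to x by Euclidean distance.
        ranked = sorted(self.window, key=lambda item: math.dist(item[0], x))
        return ranked[: self.k]

    def predict(self, x):
        neighbors = self._neighbors(x)
        if not neighbors:
            return [0] * self.n_labels
        # A label is predicted when more than half of the neighbors carry it.
        votes = [sum(item[1][j] for item in neighbors)
                 for j in range(self.n_labels)]
        return [1 if 2 * v > len(neighbors) else 0 for v in votes]

    def learn(self, x, y):
        # Assumed punitive rule: a neighbor that disagrees with the true
        # label vector on more than half of the labels receives a penalty.
        for item in self._neighbors(x):
            wrong = sum(a != b for a, b in zip(item[1], y))
            if 2 * wrong > self.n_labels:
                item[2] += 1
        # Remove instances that have reached the penalty limit.
        self.window = [it for it in self.window
                       if it[2] < self.penalty_limit]
        # Store the new instance and drop the oldest one if the window is full.
        self.window.append([tuple(x), tuple(y), 0])
        if len(self.window) > self.max_window:
            self.window.pop(0)


model = WindowedMultiLabelKNN(n_labels=2, k=1, penalty_limit=1)
model.learn((0.0, 0.0), (1, 0))
before_drift = model.predict((0.1, 0.0))
# A nearby instance with opposite labels simulates a concept drift.
model.learn((0.0, 0.1), (0, 1))
after_drift = model.predict((0.0, 0.0))
```

In this toy run the first stored instance disagrees with the newly arrived labels, is penalized, and leaves the window, so later predictions follow the newer concept. The prequential order used in stream evaluation (predict first, then learn) is left to the caller.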

Rights

© The Author

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

3-6-2024
