DOI
https://doi.org/10.25772/8F2J-1G28
Defense Date
2022
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Biomedical Engineering
First Advisor
Dean Krusienski
Abstract
In the last two decades, there have been many breakthrough advancements in non-invasive and invasive brain-computer interface (BCI) systems. However, the majority of BCI model designs still follow a paradigm in which neural signals are preprocessed and task-related features are extracted using static, generally customized, data-independent designs. Such BCI designs commonly optimize for narrow task performance at the expense of generalizability, adaptability, and robustness, leaving them poorly suited to individual user needs. If BCIs are one day to decode our higher-order cognitive commands and conceptual maps, their designs will need to be adaptive architectures that evolve and grow in concert with their users and with the ever-progressing landscape of technological innovation. Speech is a complex neural process involving planning, motor execution, auditory self-perception, and semantic encoding, which makes it an attractive target for the development of adaptive BCIs. Non-invasive BCIs, such as those utilizing scalp EEG, lack the spatial resolution and spectral bandwidth required to decode the complex dynamics of speech processes. The present work uses intracranial signals, from stereotactic EEG and electrocorticography, which possess signal characteristics better suited for the development of practical speech BCIs. Deep learning is a machine learning approach in which features and the classifier are jointly learned directly from the data, and such approaches have proven uniquely capable of modeling applications involving high-dimensional, unstructured data, such as computer vision and natural language processing. This work argues for universal design principles and deep learning architectures as the foundations for the development of robust, user-centered BCIs.

First, it is shown that combining traditional feature extraction techniques with deep learning models confers no performance benefit and comes at the cost of computational inefficiency and increased barriers to reproducibility. Then a novel model, SincIEEG, is presented for speech activity detection from intracranial neural signals. Its initial layers learn data-driven features corresponding to frequency bands, and these interpretable features are used to show that the models derive person-specific representations. Additionally, the results confirm that conventional feature extraction methods exclude frequency bands that are useful for detecting speech. Extending the analysis of SincIEEG, the model's transfer learning potential is systematically quantified, and the hyperparameters with the greatest impact are summarized.

Finally, the power of deep learning and data-driven modeling is showcased with HUBRIS, a first-of-its-kind modeling framework: a self-supervised, transformer-based transfer learning approach capable of training on unlabeled data pooled from multiple participants. This is enabled in part by a novel embedding of the neuroanatomical electrode locations in the model. The models learn self-derived pseudo-lexical speech representations and are evaluated on three disparate downstream speech classification tasks to highlight the generalizability of this design.
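The abstract's description of SincIEEG's initial layers, which learn data-driven features corresponding to frequency bands, is in the spirit of SincNet-style learnable band-pass filters. The following is a minimal, illustrative PyTorch sketch of that idea only; the class name SincBandpassLayer, filter count, sampling rate, kernel size, and initialization below are assumptions made for illustration and are not taken from the dissertation's actual SincIEEG architecture.

# Minimal sketch of a SincNet-style first layer for intracranial signals:
# band-pass filters whose cutoff frequencies are learned from data.
# All names, shapes, and hyperparameters are illustrative assumptions,
# not the dissertation's actual SincIEEG implementation.
import torch
import torch.nn as nn


class SincBandpassLayer(nn.Module):
    """Convolves the input with band-pass sinc filters whose low/high
    cutoffs are learnable, so each filter stays interpretable as a band."""

    def __init__(self, n_filters=16, kernel_size=129, fs=1000.0):
        super().__init__()
        self.kernel_size = kernel_size
        self.fs = fs
        # Initialize cutoffs roughly across 1-200 Hz (illustrative choice).
        self.low_hz = nn.Parameter(torch.linspace(1.0, 150.0, n_filters))
        self.band_hz = nn.Parameter(torch.full((n_filters,), 50.0))
        # Fixed time axis (seconds) and Hamming window for the sinc kernels.
        n = torch.arange(kernel_size) - (kernel_size - 1) / 2
        self.register_buffer("t", n / fs)
        self.register_buffer("window", torch.hamming_window(kernel_size))

    def forward(self, x):
        # x: (batch, 1, time) -- one intracranial channel per example (assumption).
        low = torch.abs(self.low_hz)
        high = torch.clamp(low + torch.abs(self.band_hz), max=self.fs / 2)
        t = self.t.unsqueeze(0)  # (1, kernel_size)

        def lowpass(fc):
            # Ideal low-pass impulse response: 2*fc*sinc(2*fc*t).
            return 2 * fc.unsqueeze(1) * torch.sinc(2 * fc.unsqueeze(1) * t)

        # Band-pass = difference of two low-pass filters, windowed and normalized.
        kernels = (lowpass(high) - lowpass(low)) * self.window
        kernels = kernels / kernels.norm(dim=1, keepdim=True).clamp_min(1e-8)
        return nn.functional.conv1d(x, kernels.unsqueeze(1),
                                    padding=self.kernel_size // 2)


# Example: two one-second segments sampled at 1 kHz.
x = torch.randn(2, 1, 1000)
features = SincBandpassLayer()(x)  # -> (2, 16, 1000)

Because the only learned parameters in such a layer are the cutoff frequencies, each filter remains directly readable as a frequency band, which is what makes it possible to ask whether the learned bands are person-specific or fall outside the bands used by conventional hand-crafted feature extraction.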
Rights
© The Author
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
12-12-2022
Included in
Artificial Intelligence and Robotics Commons, Bioelectrical and Neuroengineering Commons, Data Science Commons