Author ORCID Identifier

0000-0002-1818-0359

Defense Date

2026

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Healthcare Policy & Research

First Advisor

Andrew Barnes

Second Advisor

Katherine Tossas

Third Advisor

Bassam Dahman

Fourth Advisor

Maria Thomson

Fifth Advisor

Krzysztof Cios

Abstract

Cancer surveillance systems rely on standardized racial and ethnic classifications to monitor disparities in stage at diagnosis and survival. However, current federal standards classify South West Asian and North African (SWANA) individuals as White, rendering this population statistically invisible within registry-based data. As a result, little is known about cancer burden, stage at diagnosis, and survival outcomes among SWANA populations in the United States. This dissertation is organized into three papers that collectively address this gap by developing and applying a novel method for identifying SWANA individuals within cancer registry data and examining how classification shapes the measurement and interpretation of cancer disparities.

Paper 1 develops and validates a SWANA Surname Algorithm (SSA) to identify SWANA individuals in the absence of self-reported ethnicity within administrative data systems. Using a large dataset of surnames linked to country of birth and ethnicity indicators, the algorithm leverages linguistic features and probabilistic classification methods to assign SWANA identity. The SSA demonstrates strong discriminatory performance and provides a scalable approach for identifying SWANA individuals in population-based datasets where ethnicity is not directly captured.

Paper 2 examines how disaggregating SWANA individuals from the White category alters estimates of late-stage cancer diagnosis and disease-free survival using Virginia Cancer Registry data. Under standard Office of Management and Budget (OMB) racial classifications, SWANA individuals are embedded within the White reference group. Reclassification using the SSA reveals modestly higher odds of late-stage diagnosis in unadjusted models; however, these differences attenuate and are not statistically significant after demographic and clinical adjustment. In contrast, SWANA individuals exhibit a lower hazard of death in fully adjusted disease-free survival models, a pattern that is not observable under OMB classification. Sensitivity analyses demonstrate that these findings are robust to alternative approaches for handling missing data. Together, these results illustrate how racial misclassification can obscure meaningful patterns in cancer surveillance data.

Paper 3 uses in-depth qualitative interviews to examine how SWANA individuals experience cancer diagnosis, healthcare access, and racial classification within clinical and social contexts. Findings highlight themes of invisibility, negotiation of identity, and challenges in navigating healthcare systems that do not recognize SWANA identity. Participants described barriers to being heard, delays in diagnosis, and uncertainty in clinical interactions, underscoring how structural invisibility within data systems reflects broader experiences of marginalization.

Taken together, this dissertation demonstrates that the absence of a SWANA identifier within cancer surveillance systems limits the ability to accurately measure and interpret cancer disparities. By introducing a novel identification approach and integrating quantitative and qualitative evidence, this work highlights the importance of improving racial and ethnic classification in population health data. These findings have implications for federal race and ethnicity standards, cancer registry practices, and efforts to advance equity in cancer prevention and control.

Rights

© The Author

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

5-11-2026

Available for download on Saturday, May 10, 2031
