Author ORCID Identifier
0000-0002-1818-0359
Defense Date
2026
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Healthcare Policy & Research
First Advisor
Andrew Barnes
Second Advisor
Katherine Tossas
Third Advisor
Bassam Dahman
Fourth Advisor
Maria Thomson
Fifth Advisor
Krzysztof Cios
Abstract
Cancer surveillance systems rely on standardized racial and ethnic classifications to monitor disparities in stage at diagnosis and survival. However, current federal standards classify South West Asian and North African (SWANA) individuals as White, rendering this population statistically invisible within registry-based data. As a result, little is known about cancer burden, stage at diagnosis, and survival outcomes among SWANA populations in the United States. This dissertation is organized into three papers that collectively address this gap by developing and applying a novel method for identifying SWANA individuals within cancer registry data and examining how classification shapes the measurement and interpretation of cancer disparities.
Paper 1 develops and validates a SWANA Surname Algorithm (SSA) to identify SWANA individuals in the absence of self-reported ethnicity within administrative data systems. Using a large dataset of surnames linked to country of birth and ethnicity indicators, the algorithm leverages linguistic features and probabilistic classification methods to assign SWANA identity. The SSA demonstrates strong discriminatory performance and provides a scalable approach for identifying SWANA individuals in population-based datasets where ethnicity is not directly captured.
Paper 2 examines how disaggregating SWANA individuals from the White category alters estimates of late-stage cancer diagnosis and disease-free survival using Virginia Cancer Registry data. Under standard Office of Management and Budget (OMB) racial classifications, SWANA individuals are embedded within the White reference group. Reclassification using the SSA reveals modestly higher odds of late-stage diagnosis in unadjusted models; however, these differences attenuate and are not statistically significant after demographic and clinical adjustment. In contrast, SWANA individuals exhibit a lower hazard of death in fully adjusted disease-free survival models, a pattern that is not observable under OMB classification. Sensitivity analyses demonstrate that these findings are robust to alternative approaches for handling missing data. Together, these results illustrate how racial misclassification can obscure meaningful patterns in cancer surveillance data.
Paper 3 uses in-depth qualitative interviews to examine how SWANA individuals experience cancer diagnosis, healthcare access, and racial classification within clinical and social contexts. Findings highlight themes of invisibility, negotiation of identity, and challenges in navigating healthcare systems that do not recognize SWANA identity. Participants described barriers to being heard, delays in diagnosis, and uncertainty in clinical interactions, underscoring how structural invisibility within data systems reflects broader experiences of marginalization.
Taken together, this dissertation demonstrates that the absence of a SWANA identifier within cancer surveillance systems limits the ability to accurately measure and interpret cancer disparities. By introducing a novel identification approach and integrating quantitative and qualitative evidence, this work highlights the importance of improving racial and ethnic classification in population health data. These findings have implications for federal race and ethnicity standards, cancer registry practices, and efforts to advance equity in cancer prevention and control.
Rights
© The Author
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
5-11-2026
Included in
Community Health and Preventive Medicine Commons, Epidemiology Commons, Other Public Health Commons