Degree Name

Doctor of Philosophy



First Advisor

Robert A. Perera

Second Advisor

Chris Gennings



Weighted Quantile Sum Regression for Analyzing Correlated Predictors Acting Through a Mediation Pathway on a Biological Outcome


Bhanu M. Evani, Ph.D.

A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at Virginia Commonwealth University.

Virginia Commonwealth University, 2017.

Major Director: Robert A. Perera, Asst. Professor, Department of Biostatistics

This work examines the mediated effects of a set of correlated predictors using the recently developed Weighted Quantile Sum (WQS) regression method. Traditionally, mediation analysis has been conducted using the multiple regression approach first proposed by Baron and Kenny (1986) and since extended by several authors, including MacKinnon (2008).

Mediation analysis of a highly correlated predictor set is challenging because of multicollinearity. Weighted Quantile Sum (WQS) regression can be used as an alternative method for analyzing mediated effects when predictor correlations are high. As part of the WQS method, a weighted quantile sum index (WQSindex) is computed to represent the predictor set as a single entity. The predictor variables in classic mediation are then replaced with the WQSindex, allowing estimation of the total indirect effect of all the predictors on the outcome. Predictors with high relative importance in their association with the outcome can be identified by examining the empirical weights that WQS regression estimates for the individual predictors. Other constrained optimization methods (e.g., LASSO) instead focus on reducing the dimensionality of the correlated predictor set to reduce multicollinearity.
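The idea of replacing the predictors with a single index can be illustrated with a minimal sketch. It is not the estimation procedure used in this work: the weights below are fixed, illustrative values supplied by the caller, whereas WQS regression estimates them from the data. The function names (quantile_scores, wqs_index, indirect_effect), the choice of quartile scoring, and the product-of-coefficients estimate of the indirect effect are all assumptions made for illustration.

```python
import numpy as np

def quantile_scores(X, q=4):
    """Score each column of X into q quantile bins coded 0..q-1."""
    X = np.asarray(X, dtype=float)
    scores = np.empty_like(X)
    for j in range(X.shape[1]):
        # Interior cut points (the quartile boundaries when q=4).
        cuts = np.quantile(X[:, j], np.linspace(0, 1, q + 1)[1:-1])
        scores[:, j] = np.searchsorted(cuts, X[:, j], side="right")
    return scores

def wqs_index(X, weights, q=4):
    """Weighted sum of quantile scores; weights are nonnegative and sum to 1.

    ASSUMPTION: weights are given here. In WQS regression they are estimated.
    """
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return quantile_scores(X, q) @ w

def ols(y, *cols):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def indirect_effect(index, mediator, outcome):
    """Product-of-coefficients estimate a*b with the index as sole predictor."""
    a = ols(mediator, index)[1]            # index -> mediator
    b = ols(outcome, index, mediator)[2]   # mediator -> outcome, given index
    return a * b

# Usage on synthetic data (all values hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
idx = wqs_index(X, [0.5, 0.3, 0.2])
M = 0.5 * idx + rng.normal(size=1000)
Y = 0.4 * M + 0.1 * idx + rng.normal(size=1000)
print(indirect_effect(idx, M, Y))
```

Because the index is a single variable, the mediation model has one path into the mediator rather than one per predictor, which is what sidesteps the collinearity among the original predictors.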

WQS regression in the context of mediation is studied using Monte Carlo simulation for mediation models with two and three correlated predictors. Its performance is compared with that of classic OLS multiple regression and regularized LASSO regression. An application of the three methods to the National Health and Nutrition Examination Survey (NHANES) dataset examines the effect of serum concentrations of polychlorinated biphenyls (PCBs; the independent variables) on the liver enzyme alanine aminotransferase (ALT; the outcome), with chromosomal telomere length as a potential mediator.
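A Monte Carlo study of this kind can be sketched as follows. This is a toy design, not the simulation design of this work: the sample size, correlation, path coefficients, and the absence of a direct effect are hypothetical choices, and only the classic OLS estimator is shown. The sketch illustrates why high predictor correlation is a problem, since the per-predictor indirect-effect estimates become more variable as the correlation rho grows.

```python
import numpy as np

def simulate_once(rng, n=500, rho=0.9, a=(0.3, 0.2), b=0.5):
    """One dataset: two correlated predictors -> mediator -> outcome."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    M = X @ np.asarray(a) + rng.normal(size=n)
    Y = b * M + rng.normal(size=n)  # ASSUMPTION: no direct effect of X on Y
    return X, M, Y

def ols_slopes(y, A):
    """Least-squares slopes (intercept fitted, then dropped)."""
    A1 = np.column_stack([np.ones(len(y)), A])
    coef, *_ = np.linalg.lstsq(A1, y, rcond=None)
    return coef[1:]

def indirect_effects_ols(X, M, Y):
    """Per-predictor product-of-coefficients estimates a_j * b."""
    a_hat = ols_slopes(M, X)                             # predictor -> mediator
    b_hat = ols_slopes(Y, np.column_stack([X, M]))[-1]   # mediator -> outcome
    return a_hat * b_hat

def monte_carlo(reps=200, seed=0, **kw):
    """Mean and standard deviation of the estimates across replications."""
    rng = np.random.default_rng(seed)
    est = np.array([indirect_effects_ols(*simulate_once(rng, **kw))
                    for _ in range(reps)])
    return est.mean(axis=0), est.std(axis=0, ddof=1)

# Usage: compare the spread of the estimates at low and high correlation.
for rho in (0.0, 0.9):
    mean, sd = monte_carlo(rho=rho)
    print(rho, mean, sd)
```

With the default coefficients the true indirect effects are 0.3 × 0.5 and 0.2 × 0.5; the replication standard deviations show how precisely each is recovered at a given correlation.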

Keywords: Multicollinearity, Weighted Quantile Sum Regression, Mediation Analysis


© Bhanu M. Evani

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations
