DOI

https://doi.org/10.25772/3CD0-CH78

Defense Date

2013

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Biostatistics

First Advisor

Donna McClish

Abstract

Many continuous medical tests rely on a threshold for diagnosis. Two sequential testing strategies are of interest: Believe the Positive (BP) and Believe the Negative (BN). BP classifies a patient as positive if the first test result exceeds a threshold θ1, or if the first test is negative and the second test result exceeds θ2. BN classifies a patient as positive only if the first test result exceeds θ3 and the second test result exceeds θ4. Threshold pairs θ = (θ1, θ2) or (θ3, θ4), depending on the strategy, are defined as optimal if they maximize GYI = Se + r(Sp – 1), where Se is sensitivity and Sp is specificity. The question of interest is whether estimates of these optimal thresholds, or optimal operating points (OOPs), are “good” when calculated from a sample. This dissertation derives formulae to estimate θ under the assumption that the tests follow a binormal distribution, using the Newton-Raphson algorithm with ridging. A simulation study assesses bias, root mean square error, percentage of overestimation of Se/Sp, and coverage of simultaneous confidence intervals (SCIs) and confidence regions across sets of population parameters and sample sizes. The OOP estimates are also compared with estimates from the traditional empirical approach. Bootstrapping is used to estimate the variance of each optimal threshold pair estimate. The study shows that the area under the curve, the ratio of standard deviations of the disease classification groups within tests, the correlation between tests within a disease classification, the total sample size, and the allocation of sample size to each disease classification group all influence OOP estimation. The study also shows that this method is an improvement over the empirical estimate. Equations that researchers can use to estimate total sample size and SCI width are also developed. Although the models did not produce high coefficients of determination, they are a good starting point for researchers designing a study.
A pancreatic cancer dataset is used to illustrate the OOP estimation methodology for sequential tests.
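
The BP and BN decision rules and the GYI criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's binormal Newton-Raphson estimation procedure; it only evaluates the two classification rules and the GYI formula for a given threshold pair. The function names, the four-patient toy data, the threshold values, and the choice r = 1 are all hypothetical, chosen for illustration.

```python
def believe_positive(t1, t2, theta1, theta2):
    """BP rule: positive if test 1 exceeds theta1; otherwise
    positive if test 2 exceeds theta2."""
    return t1 > theta1 or t2 > theta2


def believe_negative(t1, t2, theta3, theta4):
    """BN rule: positive only if test 1 exceeds theta3 and
    test 2 exceeds theta4."""
    return t1 > theta3 and t2 > theta4


def gyi(predictions, diseased, r):
    """GYI = Se + r * (Sp - 1), computed from predicted and true status."""
    pos = [p for p, d in zip(predictions, diseased) if d]
    neg = [p for p, d in zip(predictions, diseased) if not d]
    se = sum(pos) / len(pos)                  # sensitivity
    sp = sum(not p for p in neg) / len(neg)   # specificity
    return se + r * (sp - 1)


# Hypothetical toy data: two diseased and two non-diseased patients.
test1 = [1.0, 3.0, 0.5, 2.5]
test2 = [2.0, 0.0, 0.2, 3.0]
diseased = [True, True, False, False]

# Hypothetical thresholds, used here for both strategies.
bp_pred = [believe_positive(a, b, 2.0, 1.5) for a, b in zip(test1, test2)]
bn_pred = [believe_negative(a, b, 2.0, 1.5) for a, b in zip(test1, test2)]

bp_gyi = gyi(bp_pred, diseased, r=1)
bn_gyi = gyi(bn_pred, diseased, r=1)
print(bp_pred, bp_gyi)
print(bn_pred, bn_gyi)
```

With the same thresholds, BP labels at least as many patients positive as BN does, so it trades specificity for sensitivity; the GYI weight r determines how that trade-off is scored.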

Rights

© The Author

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

August 2013

Included in

Biostatistics Commons
