Interest often focuses on estimating sensitivity and specificity of a
group of raters or a set of new diagnostic tests in situations where
gold standard evaluation is invasive or expensive. In a typical
situation, a group of raters or a series of diagnostic tests assess
disease status in a group of individuals. For situations in which no
gold standard evaluation is available, various authors have proposed
latent class modeling approaches for estimating diagnostic error and
prevalence. We discuss a potential problem with these approaches.
Namely, we show that when the conditional dependence between tests is
misspecified, estimates of sensitivity, specificity, and prevalence can
be severely biased. Importantly, we demonstrate that with a small
number of tests, likelihood comparisons and other model diagnostics
may not be able to distinguish between models with different dependence
structures. While these results caution against using these latent
class models, the difficulties of obtaining gold standard verification
remain a practical reality. We provide a compromise in which gold
standard information is collected on a subset of subjects. We propose
both semi-latent class models and a multiple imputation approach for
estimating diagnostic error and prevalence with partial gold standard
evaluation. Through analytic work and simulations, we show that, even
with a small percentage of verified individuals, these approaches are
in most cases substantially more robust than latent class models
without any gold standard information. We illustrate our methodological
work with data analyses from two medical examples. This work is joint
with Lori Dodd at NCI.
Meet the speaker in Room 212 Cockins Hall at 4:30 p.m.
Refreshments will be served.