Publication

Validation Data-Based Adjustments for Outcome Misclassification in Logistic Regression: An Illustration

Last modified
  • 02/20/2025
Authors
  • Robert Lyles, Emory University
  • Li Tang, Emory University
  • Hillary M. Superak, Emory University
  • Caroline C. King, Centers for Disease Control and Prevention
  • David D. Celentano, Johns Hopkins
  • Yungtai Lo, Albert Einstein College
  • Jack D. Sobel, Wayne State University
Language
  • English
Date
  • 2011-07
Publisher
  • Lippincott Williams & Wilkins
Copyright Statement
  • © 2011 Lippincott Williams & Wilkins, Inc.
ISSN
  • 1044-3983
Volume
  • 22
Issue
  • 4
Start Page
  • 589
End Page
  • 597
Grant/Funding Information
  • This work was supported by National Institute of Nursing Research Grant 1RC4NR012527-01, by National Institute of Environmental Health Sciences Grant 2R01-ES012458-5, and by PHS Grant UL1 RR025008 from the Clinical and Translational Science Award Program, National Institutes of Health, National Center for Research Resources. The HER Study was supported by the Centers for Disease Control and Prevention: U64/CCU106795, U64/CCU206798, U64/CCU306802, and U64/CCU506831.
Abstract
  • Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.
Author Notes
  • Address for correspondence: Robert H. Lyles; Department of Biostatistics and Bioinformatics, The Rollins School of Public Health of Emory University, 1518 Clifton Rd. N.E., Atlanta, GA 30322 (phone: 404-727-1310; fax: 404-727-1370; rlyles@sph.emory.edu)
Research Categories
  • Biology, Biostatistics
  • Health Sciences, Epidemiology
