Publication
Overall indices for assessing agreement among multiple raters
- Last modified
- 2025-05-15
- Authors
- Jeong Hoon Jang, Emory University
- Amita Manatunga, Emory University
- Andrew Taylor Jr., Emory University
- Qi Long, Emory University
- Language
- English
- Date
- 2018-12-10
- Publisher
- Wiley: 12 months
- Copyright Statement
- © 2018 John Wiley & Sons, Ltd.
- ISSN
- 0277-6715
- Volume
- 37
- Issue
- 28
- Start Page
- 4200
- End Page
- 4215
- Grant/Funding Information
- This research was supported by the National Institute of Diabetes and Digestive and Kidney Diseases under Grant 1R01DK108070-01A1.
- Abstract
- The need to assess agreement exists in various clinical studies where quantifying inter-rater reliability is of great importance. Use of unscaled agreement indices, such as total deviation index and coverage probability (CP), is recommended for two main reasons: (i) they are intuitive in a sense that interpretations are tied to the original measurement unit; (ii) practitioners can readily determine whether the agreement is satisfactory by directly comparing the value of the index to a prespecified tolerable CP or absolute difference. However, the unscaled indices were only defined in the context of comparing two raters or multiple raters that assume homogeneity of variances across raters. In this paper, we introduce a set of overall indices based on the root mean square of pairwise differences that are unscaled and can be used to evaluate agreement among multiple raters that often exhibit heterogeneous measurement processes in practice. Furthermore, we propose another overall agreement index based on the root mean square of pairwise differences that is scaled and extends the concept of the recently proposed relative area under CP curve in the presence of multiple raters. We present the definitions of overall indices and propose inference procedures in which bootstrap methods are used for the estimation of standard errors. We assess the performance of the proposed approach and demonstrate its superiority over the existing methods when raters exhibit heterogeneous measurement processes using simulation studies. Finally, we demonstrate the application of our methods using a renal study.
- Keywords
- coverage probability
- Public, Environmental & Occupational Health
- NORMAL VARIABLES
- Physical Sciences
- CATEGORICAL-DATA
- agreement
- STATISTICAL-METHODS
- CONCORDANCE CORRELATION-COEFFICIENT
- QUADRATIC-FORMS
- Statistics & Probability
- Medicine, Research & Experimental
- unscaled index
- Medical Informatics
- RENOGRAPHY
- multiple raters
- Mathematical & Computational Biology
- TOTAL DEVIATION INDEX
- Life Sciences & Biomedicine
- Science & Technology
- Mathematics
- Research & Experimental Medicine
- root-mean-square difference
- Research Categories
- Health Sciences, Radiology
- Biology, Biostatistics
Items
| Title | File Description | Date Uploaded | Visibility |
|---|---|---|---|
| Publication File - ts0q9.pdf | Primary Content | 2025-03-27 | Public |