Interrater Reliability in Systematic Review Methodology: Exploring Variation in Coder Decision-Making

Resource type
Journal Article
Authors/contributors
Belur, J.; Tompson, L.; Thornton, A.; Simon, M.
Title
Interrater Reliability in Systematic Review Methodology: Exploring Variation in Coder Decision-Making
Abstract
A methodologically sound systematic review is characterized by transparency, replicability, and a clear inclusion criterion. However, little attention has been paid to reporting the details of interrater reliability (IRR) when multiple coders are used to make decisions at various points in the screening and data extraction stages of a study. Prior research has mentioned the paucity of information on IRR including number of coders involved, at what stages and how IRR tests were conducted, and how disagreements were resolved. This article examines and reflects on the human factors that affect decision-making in systematic reviews via reporting on three IRR tests, conducted at three different points in the screening process, for two distinct reviews. Results of the two studies are discussed in the context of IRR and intrarater reliability in terms of the accuracy, precision, and reliability of coding behavior of multiple coders. Findings indicated that coding behavior changes both between and within individuals over time, emphasizing the importance of conducting regular and systematic IRR and intrarater reliability tests, especially when multiple coders are involved, to ensure consistency and clarity at the screening and coding stages. Implications for good practice while screening/coding for systematic reviews are discussed.
Publication
Sociological Methods and Research
Volume
50
Issue
2
Pages
837–865
Date
2021
Citation
Belur, J., Tompson, L., Thornton, A., & Simon, M. (2021). Interrater Reliability in Systematic Review Methodology: Exploring Variation in Coder Decision-Making. Sociological Methods and Research, 50(2), 837–865. https://doi.org/10.1177/0049124118799372