Large-Scale Simulation Study of Active Learning models for Systematic Reviews

Resource type
Preprint
Authors/contributors
Teijema, J. J., de Bruin, J., Bagheri, A., & van de Schoot, R.
Title
Large-Scale Simulation Study of Active Learning models for Systematic Reviews
Abstract
Active learning methods for prioritising records in systematic reviews have seen significant progress and innovation in recent years. This rapid development, however, has outpaced their rigorous evaluation, owing to constraints in simulation size, a lack of infrastructure, and reliance on few datasets. We embark on a large-scale simulation study involving over 27,000 simulations and over 156 million data points, designed to provide robust empirical evidence of the performance of active learning methodologies. We evaluate 13 combinations of classification models and feature extraction techniques across high-quality datasets sourced from the SYNERGY dataset. We run a single simulation for each possible combination of classification model, feature extraction technique, dataset, and relevant document. Performance varies considerably, from marginally better than random reading to near-flawless results; still, every model-feature extraction combination outperforms random screening. Results are publicly available for analysis and replication. This study advocates for large-scale simulations as the gold standard for assessing active learning methods; it underscores the importance of comprehensive testing to reduce reporting bias and enhance result reliability, and it highlights the need to curate diverse datasets for systematic review literature.
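The abstract's design amounts to enumerating a grid: one simulation per combination of classification model, feature extraction technique, dataset, and relevant prior document. Below is a minimal sketch of how such a grid could be enumerated; the model/feature names, dataset contents, and the run_simulation helper are illustrative assumptions, not the study's actual tooling.

```python
import itertools

# Illustrative placeholders: the study's actual 13 model/feature pairings,
# the SYNERGY dataset contents, and the simulation backend are not given
# in this record.
model_feature_pairs = [
    ("naive_bayes", "tfidf"),
    ("logistic_regression", "doc2vec"),
    ("svm", "sbert"),
]
datasets = {
    "dataset_a": [3, 17, 42],  # assumed indices of the relevant documents
    "dataset_b": [5, 9],
}

def run_simulation(model, feature, dataset, prior_record):
    """Hypothetical stand-in for one active-learning screening simulation,
    seeded with a single relevant document as the prior."""
    print(f"simulate: {model}+{feature} on {dataset}, prior={prior_record}")

# One simulation for every (model, feature, dataset, relevant document)
# combination, as described in the abstract.
for (model, feature), (name, relevant) in itertools.product(
    model_feature_pairs, datasets.items()
):
    for record in relevant:
        run_simulation(model, feature, name, record)
```

Scaled to 13 model-feature pairs and every relevant document across the SYNERGY datasets, a grid of this shape grows to the tens of thousands of runs the abstract reports.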
Date
2023
Accessed
28/11/2023, 12:39
Library Catalogue
Google Scholar
Extra
Publisher: PsyArXiv
Citation
Teijema, J. J., de Bruin, J., Bagheri, A., & van de Schoot, R. (2023). Large-Scale Simulation Study of Active Learning models for Systematic Reviews. https://doi.org/10.31234/osf.io/2w3rm