Can a Small Sample Size Capture a Big Truth? A Hypothetical Simulation Study on Diagnostic Accuracy of AI vs Radiologist in Tuberculosis Detection through X-rays

Authors

DOI:

https://doi.org/10.56450/JEFI.2026.v4i01.010

Keywords:

Artificial Intelligence, Radiologist, Tuberculosis, Monte Carlo Method, Sample Size, Reproducibility of Results

Abstract

Background: Artificial intelligence (AI) tools are increasingly being adopted in diagnostic radiology. However, early-phase evaluations often rely on small samples because of logistical constraints. It remains unclear whether such small studies can reliably detect real differences in diagnostic accuracy between AI models and human experts.

Objectives: To assess whether a small sample (n = 30) can reliably detect statistically significant differences in diagnostic accuracy between an AI model and a radiologist for tuberculosis (TB) detection from chest X-rays under varying performance scenarios.

Methods: A Monte Carlo simulation was conducted in R (v4.5.0) with 1000 iterations per scenario. Three diagnostic accuracy scenarios were tested: scenario 1, AI (98%) vs radiologist (70%); scenario 2, AI (90%) vs radiologist (75%); and scenario 3, AI (92%) vs radiologist (80%). Predictions were simulated from defined sensitivity and specificity values, and accuracy was computed from them. A one-sided paired t-test assessed whether the AI outperformed the radiologist. Empirical power was calculated as the proportion of simulations yielding p < 0.05.

Results: Scenario 1, with a large accuracy gap (28.2%), achieved 96% empirical power. Scenarios 2 and 3, with moderate and small differences (~14%), achieved only 46.8% and 37.5% power, respectively.

Conclusion: A sample of 30 can detect large differences in diagnostic performance but is unreliable for moderate or small differences.
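The simulation design described in the Methods can be illustrated with a minimal sketch. The study itself was run in R and its code is not reproduced here; the Python below is only an approximation under stated assumptions. It assumes sensitivity and specificity both equal the stated accuracy (so each reader is correct on any case with that probability, regardless of prevalence), that the two readers' errors are independent, and that rejecting when the t statistic exceeds the tabulated one-sided critical value for 29 degrees of freedom (1.699) stands in for p < 0.05. The function names `paired_t_statistic` and `empirical_power` and the seed are hypothetical.

```python
import math
import random

# Tabulated one-sided critical value of Student's t at alpha = 0.05, df = 29.
# It is only valid for n = 30 paired observations.
T_CRIT_DF29 = 1.699


def paired_t_statistic(diffs):
    """Return the paired t statistic for a list of per-case differences.

    Returns None when the differences have zero variance, because the
    statistic is undefined in that case.
    """
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        return None
    return mean / math.sqrt(var / n)


def empirical_power(acc_ai, acc_rad, iterations=1000, seed=1):
    """Estimate power as the share of simulated studies that reject H0.

    Each simulated study has 30 cases. Assumption: every reader is correct
    on a case with probability equal to the stated accuracy, independently
    of the other reader.
    """
    n = 30  # fixed, because T_CRIT_DF29 assumes df = 29
    rng = random.Random(seed)
    rejections = 0
    for _ in range(iterations):
        diffs = []
        for _ in range(n):
            ai_correct = 1 if rng.random() < acc_ai else 0
            rad_correct = 1 if rng.random() < acc_rad else 0
            diffs.append(ai_correct - rad_correct)
        t = paired_t_statistic(diffs)
        # Undefined statistics are counted as non-rejections.
        if t is not None and t > T_CRIT_DF29:
            rejections += 1
    return rejections / iterations


# The three scenarios from the abstract: (AI accuracy, radiologist accuracy).
scenarios = {
    "Scenario 1": (0.98, 0.70),
    "Scenario 2": (0.90, 0.75),
    "Scenario 3": (0.92, 0.80),
}

for name, (acc_ai, acc_rad) in scenarios.items():
    print(name, "empirical power:", empirical_power(acc_ai, acc_rad))
```

Because the sketch simplifies how predictions are generated, its estimates need not match the reported figures exactly. Setting both accuracies to the same value gives a rough check of the false-positive rate of the test.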


References

1. Derricho E. Can a Small Sample Reflect a Whole Population? A Simulation Study on Treatment Efficacy [Internet]. Medium. 2025 [cited 2025 May 10]. Available from: https://medium.com/@ederricho1/can-a-small-sample-reflect-a-whole-population-a-simulation-study-on-treatment-efficacy-50f0b2a24258

2. R Core Team. R: A Language and Environment for Statistical Computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2025. Available from: https://www.R-project.org/

3. Šimundić AM. Measures of Diagnostic Accuracy: Basic Definitions. EJIFCC. 2009 Jan 20;19(4):203-11.

4. Altman DG, Bland JM. Statistics notes: Diagnostic tests 1: Sensitivity and specificity. BMJ. 1994;308(6943):1552. doi: 10.1136/bmj.308.6943.1552.

Published

2026-03-31


How to Cite

Vaz FS. Can a Small Sample Size Capture a Big Truth? A Hypothetical Simulation Study on Diagnostic Accuracy of AI vs Radiologist in Tuberculosis Detection through X-rays. JEFI [Internet]. 2026 Mar 31 [cited 2026 May 18];4(1):84-90. Available from: https://efi.org.in/journal/index.php/JEFI/article/view/274
