Comparative accuracy of chatbots for queries related to methods of tobacco cessation: A systematic review

Authors

  • Sonal Bhatia, All India Institute of Medical Sciences
  • Harsh Priya, All India Institute of Medical Sciences
  • Nitesh Tiwari, All India Institute of Medical Sciences
  • Bharathi M. Purohit, All India Institute of Medical Sciences
  • Partha Haldar, All India Institute of Medical Sciences

DOI:

https://doi.org/10.56450/

Abstract

Background: Artificial intelligence (AI)-based conversational agents, including chatbots and machine learning (ML) platforms, are increasingly being used to disseminate health information and assist in tobacco cessation counselling. However, their comparative accuracy in providing evidence-based tobacco cessation advice remains unclear.

Objective: This systematic review aimed to evaluate and compare the accuracy of responses provided by chatbots and ML platforms to queries related to methods of tobacco cessation.

Methods: Following PRISMA guidelines, a comprehensive electronic search was performed in October 2025 across PubMed, EMBASE, Web of Science, and Scopus. The search strategy was structured using the PICO framework. Grey literature was identified through Google Scholar and OpenGrey, and reference lists of included studies were hand-searched. Eligible studies included descriptive, observational, cohort, cross-sectional, and case-control designs that compared the accuracy of chatbots or ML-based systems for tobacco cessation queries. Data were extracted and synthesized narratively, with quantitative measures of accuracy expressed as frequencies, proportions, or mean differences.

Results: A total of 2,516 records were identified (PubMed=1,587; EMBASE=371; Web of Science=558). After duplicate removal and screening, the preliminarily included studies mostly evaluated chatbots such as ChatGPT, Gemini, and Perplexity for smoking cessation counselling and pharmacotherapy recommendations. Early evidence indicates substantial variability in accuracy across platforms, with large language models (LLMs) achieving accuracy similar to that of earlier or rule-based systems. Prompt phrasing, question domain, and language significantly influenced accuracy.

Discussion: The findings highlight an evolving but uneven performance landscape among AI conversational agents for tobacco cessation. While newer generative AI models show improved reliability and contextual understanding, inconsistencies persist, particularly in region-specific or non-English responses. Further standardization of benchmarking methods and continuous model training on verified cessation guidelines are recommended to enhance clinical applicability and public health integration.

Published

2026-04-02


Section

EFICON 2025 Abstracts

How to Cite

1.
Bhatia S, Priya H, Tiwari N, Purohit BM, Haldar P. Comparative accuracy of chatbots for queries related to methods of tobacco cessation: A systematic review. JEFI [Internet]. 2026 Apr. 2 [cited 2026 Apr. 3];3(2Supp). Available from: https://efi.org.in/journal/index.php/JEFI/article/view/475