Babylon Health Symptom Checker diagnostic chatbot
Babylon Health's Diagnostic and Triage System (aka 'Symptom Checker') is a chatbot that uses artificial intelligence to ask users questions about their condition in order to identify what is wrong with them and prioritise treatment.
The system has drawn criticism for its perceived poor quality and safety, including alleged errors in diagnosing conditions such as heart attacks, for its opaque governance and misleading marketing claims, and for the manner in which the company responded to the criticism levelled at it.
Quality, safety
First launched in 2016, Babylon's Symptom Checker quickly attracted concerns. A 2017 WIRED assessment of three diagnostic tools rated Babylon's as the least accurate when it came to identifying common illnesses such as asthma and shingles.
In June 2018, the company said its chatbot could diagnose common diseases as well as human physicians - a claim that was openly questioned by the Royal College of General Practitioners, the British Medical Association, and the Royal College of Physicians, amongst others.
A 2018 paper published in The Lancet concluded that Babylon’s study failed to offer convincing evidence that its symptom checker performs 'better than doctors in any realistic situation, and there is a possibility that it might perform significantly worse.'
In February 2020, a BBC Newsnight investigation raised concerns about the bot's safety. In a December 2020 letter to NHS consultant Dr David Watkins, a persistent (pseudonymous) critic of the company's governance and products, the UK's medical device regulator, the MHRA, said that his concerns were 'valid' and 'ones that we share'.
Bias, discrimination
In September 2019, Dr Watkins accused Babylon's symptom checker of possible gender bias - a finding dismissed by Babylon as relevant to the health industry as a whole, but not evident in its diagnostic tool.
Governance, transparency, accountability
Babylon Health's corporate and product governance have been criticised as inadequate and opaque, and its marketing as misleading.
It has been accused by NHS consultants of rushing to market without adequate proof that its Symptom Checker worked, notably without testing it on real patients, and of failing to conduct any peer-reviewed, randomised controlled studies before launch.
Forbes reported in 2018 that 'Interviews with current and former Babylon staff and outside doctors reveal broad concerns that the company has rushed to deploy software that has not been carefully vetted, then exaggerated its effectiveness.'
In February 2020, the company dismissed Dr Watkins as a 'troll' and claimed he had 'targeted members of our staff, partners, clients, regulators and journalists and tweeted defamatory content about us.'
Operator: Babylon Health
Developer: Babylon Health
Country: UK; USA; Rwanda
Sector: Health
Purpose: Provide health information
Technology: Chatbot; NLP/text analysis; Deep learning; Neural network; Machine learning
Issue: Accuracy/reliability; Bias/discrimination - gender; Safety
Transparency: Governance; Complaints/appeals; Legal; Marketing
System
Babylon Health. Symptom Checker
Babylon Health (2022). Babylon Triage Product Information
Babylon Health (2020). Babylon results published after 2400 Twitter troll tests (pdf)
Babylon Health researchers (2018). A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis (pdf)
Legal, regulatory
UK Advertising Standards Authority (2018). ASA Ruling on Babylon Healthcare Services Ltd t/a GP at hand
Research, advocacy
Gilbert S., et al (2020). How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs
Fraser H., Coiera E., Wong D. (2018). Safety of patient-facing digital symptom checkers
Razzaki S., et al (2018). A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis
McCartney M. (2017). Innovation without sufficient evidence is a disservice to all
Investigations, assessments, audits
BBC Newsnight (2020). Digital Healthcare: Is it clinically effective?
BBC Horizon (2018). Diagnosis on Demand? The Computer Will See You Now
BBC Radio 4 Inside Health (2017). Robo-docs, using AI to diagnose; Pancreatic cancer; Statins and muscle aches
News, commentary, analysis
https://www.ft.com/content/19dc6b7e-8529-11e8-96dd-fa565ec55929
https://www.thetimes.co.uk/article/its-hysteria-not-a-heart-attack-gp-app-tells-women-gm2vxbrqk
https://inews.co.uk/news/babylon-health-doctor-app-nhs-does-it-work-266864
https://www.cnbc.com/2018/06/28/babylon-claims-its-ai-can-diagnose-patients-better-than-doctors.html
Page info
Type: System
Published: July 2023