A recent study suggests that artificial intelligence (AI) may enhance decision-making in primary healthcare, based on tests of ChatGPT versions 3.5 and 4.
The findings, published in Family Medicine and Community Health, an open-access journal from the BMJ, examined how ChatGPT evaluated hypothetical cases of mild and severe depression involving patients with symptoms of sadness, sleep problems, and loss of appetite.
The goal of the study was to determine whether an AI tool like ChatGPT could recommend treatment consistent with established clinical guidelines for depression as well as human doctors do – without bias based on gender or social status.
Globally, approximately 5% of adults suffer from depression, according to the World Health Organization (WHO), with general practitioners serving as the first avenue for medical assistance. Recommended treatment should follow evidence-based clinical guidelines appropriate to the severity of the depression.
The study, conducted by researchers from the UK and Israel, compared ChatGPT’s evaluations of hypothetical depression scenarios – mild and severe – against assessments from 1,249 French primary care physicians, 73% of whom were women.
According to the study, vignettes describing hypothetical patients with symptoms of depression during an initial consultation were input to the ChatGPT interface. Eight distinct versions of the vignette were designed, varying patient attributes including sex, socioeconomic status (blue-collar or white-collar worker), and depression severity (mild or severe). Each scenario was presented ten times to ChatGPT versions 3.5 and 4 to test consistency and reliability, beginning with the following question:
“What do you think a primary care physician should suggest in this situation?”
The possible responses included:
- Watchful waiting;
- Referral for psychotherapy;
- Prescribed drugs (for depression/anxiety/sleep problems);
- Referral for psychotherapy plus prescribed drugs; or
- None of these
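For readers curious how such a protocol could be reproduced programmatically, the sketch below sends one vignette to each model ten times via OpenAI’s Python SDK. Note that the researchers used the ChatGPT interface directly rather than the API, and the vignette wording and model identifiers here are illustrative assumptions, not the study’s actual materials.

```python
# Illustrative only: the study used the ChatGPT web interface, not the API.
# This sketch shows how comparable repeated prompting might be scripted.
# The vignette text and model names below are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A hypothetical vignette varying sex, occupation, and severity.
vignette = (
    "A 35-year-old female white-collar worker reports three weeks of "
    "sadness, sleep problems, and loss of appetite at a first consultation. "
    "What do you think a primary care physician should suggest in this situation?"
)

# Present the same scenario ten times per model, as the study did,
# to check the consistency of the recommendations.
for model in ("gpt-3.5-turbo", "gpt-4"):
    for run in range(10):
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": vignette}],
        )
        print(model, run + 1, response.choices[0].message.content[:80])
```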
When queried about appropriate treatments for the hypothetical patients, ChatGPT’s recommendations closely aligned with clinical guidelines. For instance, only 4% of family doctors recommended referral for psychotherapy alone in mild cases, whereas ChatGPT versions 3.5 and 4 favored this approach in 95% and 97.5% of instances, respectively.
For severe depression, ChatGPT-4 matched treatments to clinical guidelines even more closely. A plurality of the doctors (44.5%) recommended psychotherapy plus prescribed drugs, while ChatGPT 3.5 suggested that option 72% of the time and ChatGPT-4 suggested it 100% of the time.
According to the researchers, ChatGPT displayed no apparent bias related to patient gender or socioeconomic background, and its recommendations complied with clinical guidelines throughout.
AI Should Only Ever Be a ‘Complement’
Yet the researchers caution against unbridled optimism. They emphasized the importance of human judgment in clinical settings, arguing that AI tools should complement, not replace, human expertise. They also acknowledged potential limitations of their study, along with ethical and privacy considerations.
Their overarching message, however, remained positive: the research highlights ChatGPT’s potential as a valuable asset for refining decision-making in primary healthcare.
At the end of the day, claiming that ChatGPT or any AI tool “may be better than a doctor” at recognizing medical conditions and suggesting appropriate treatments is nothing short of dangerous.
Editor’s note: This article was written by an nft now staff member in collaboration with OpenAI’s GPT-4.