Artificial intelligence and mental health. Why won’t a chatbot replace a psychotherapist?

ChatGPT, M.D. How AI entered the world of medicine

Artificial intelligence had taken hold in medicine long before the release of ChatGPT, although for many the topic remained on the fringes of collective consciousness. Machine learning first entered medical practice in the early 1970s, when the first simple rule-based models were developed. An example is MYCIN, which, drawing on a set of about 600 rules, could produce a list of potential bacterial pathogens and suggest antibiotic treatment tailored to the patient’s weight. [1]
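The idea behind such early systems is simple enough to sketch in a few lines. The snippet below is purely illustrative – the rules, organisms, and dose figure are invented for the example and have nothing to do with MYCIN’s actual 600 rules – but it shows the general shape of a rule-based diagnostic aid:

```python
# Minimal sketch of a MYCIN-style rule-based system.
# The rules, pathogens, and dosing figure below are invented for illustration.

def match_pathogens(findings, rules):
    """Return pathogens whose rule conditions are all present in the findings."""
    return [r["pathogen"] for r in rules if r["if"] <= findings]

def dose_mg(weight_kg, mg_per_kg=15):
    """Weight-adjusted dose, echoing how MYCIN tailored antibiotics to patient weight."""
    return weight_kg * mg_per_kg

RULES = [
    {"if": {"gram_negative", "rod_shaped"}, "pathogen": "E. coli"},
    {"if": {"gram_positive", "clusters"}, "pathogen": "S. aureus"},
]

findings = {"gram_negative", "rod_shaped", "fever"}
print(match_pathogens(findings, RULES))  # ['E. coli']
print(dose_mg(70))  # 1050
```

Each rule is just a set of required findings; a pathogen is suggested only when all of its conditions appear in the patient’s data, which is why such systems were transparent but brittle.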

Among the numerous applications developed over the ensuing decades are models that assess cardiovascular risk and algorithms that continuously monitor blood sugar levels to optimize insulin doses. Machine learning has also been used to predict the decline in glomerular filtration rate in patients with polycystic kidney disease. [2]

It is impossible to ignore the achievements of algorithms in analyzing images obtained by various imaging methods – from MRI or CT scans to time-lapse recording from an endoscopic camera.

The branch of machine learning that includes GPT models and, more broadly, large language models (LLMs) has found widespread use in diagnostics. Sufficiently extensive databases of reliable data make it possible to design models that detect hard-to-grasp relationships in symptoms or laboratory results. From there, it is a straight road to an intelligent physician’s assistant: in a matter of seconds, it can analyze all available data, check it against its built-in knowledge, and generate a list of possible diagnoses, each with a proposal for further treatment.

An example from Poland’s backyard is ParrotAI, a tool being developed by specialists from Altar and scientists from Kielce University of Technology, which is designed to make doctors’ work easier by reducing bureaucratic duties and aiding the diagnosis process. Earlier this year, the developers presented the initial, promising results of their work. ParrotAI is expected to reach the first hospitals and clinics in early 2026. [3]

Beyond the functionality of AI-based systems, the key and truly revolutionary issue is cost. The system described above, which shows the potential to solve one of the biggest ills of the Polish health service – queues and “paperwork” that can eat up most of the time theoretically set aside for the patient – is expected to cost about PLN 11 million. Even if the final figure were ten times higher, these are small, almost negligible amounts compared with the enormity of the benefits and the funds allocated to health care in the Polish budget (PLN 190 billion). [4]

As can be seen, the applications of AI in medicine touch all sorts of areas, and the few mentioned above are just the tip of the iceberg. Not surprisingly, among representatives of numerous specialties, machine learning evokes almost exclusively enthusiasm.

In the flood of excitement, however, a rather clear line is drawn – the examples described above are from purely “bodily” branches of medicine. Much more controversy arises where therapy touches the human mind – in psychiatry or applied psychology.

Artificial intelligence in the role of a therapist. A dangerous fantasy or the foreseeable future?

Usually, when two patients present the same symptoms, the diagnosis provided by any family doctor with specialized medical knowledge will be identical, and the same will be true of the treatment process. In psychotherapy, the situation is quite different. Symptoms are only part of a larger puzzle, inextricably linked to the patient’s biography, past and present environment, and self-perception.

Concerns about the use of AI in psychological therapy do not need to be explained in depth. Intuitively, we understand the human fear of letting technologically created algorithms into the process of altering our characters – colloquially speaking, “into our heads”. The second, somewhat less visible aspect is the extraordinary sensitivity of the data, which must be protected with maximum security before one can even think about the systemic use of chatbots and other AI-based tools for such purposes.

It might seem that the above concerns would be similar, if not even stronger, for the scientific community. However, after reviewing the still severely limited literature, the hopes and potential benefits come to the fore.

Before we go any further, it is necessary to set the following paragraphs apart from the hundreds of diverse apps on the market – trainers, personal assistants, virtual friends, and so on. The vast majority are applications whose main purpose is to make money for their creators. There is a reason why words like “therapy” or “e-psychologist” rarely appear in the names of such programs – they increase liability and the risk of lawsuits. The few more professional apps are usually paid, and certainly none of them, even the best (at this point), can replace a therapist, who is simply another human being.

As reported by Hannah Zeavin, a researcher of the history of the human sciences at Indiana University in Bloomington, quoted in an article that appeared in Nature [5]:

Many apps are quick to co-opt the generally proven efficacy of CBT, stating that their methods are ‘evidence-based’. Yet, one review of 117 apps marketed to people with depression found that of the dozen that implement CBT principles, only 15% did so consistently. The low adherence could be explained by apps incorporating principles and exercises from multiple models of therapy. But another analysis of the claims made by mental-health apps found that among the platforms that cite specific scientific methods, one-third endorsed an unvalidated technique.

However, the imperfection of available solutions should not automatically disqualify potential applications. Paolo Raile of the Sigmund Freud Private University in Vienna writes about the hopes of using chatbots (in this case, the free version of ChatGPT-3.5) [6]:

In summary, ChatGPT is considered a potentially useful tool in healthcare. Concerns remain about accuracy, reliability, and jurisdictional impact.

Analyzing the content of other studies and testimonials from users who sought help on their own through chatbot conversations, the author concluded that ChatGPT could be a valuable addition to the therapeutic process. It is also the most advanced, fairly reliable, easily accessible, and free tool through which thousands of people are already seeking mental health guidance. Those who benefit the most are those who have not yet had the opportunity to receive professional help.

According to Raile, ChatGPT is an easily accessible, though imperfect tool for psychotherapists themselves as well. Based on his conversations and testimonials from professional colleagues, the author suggests the chatbot can be used, among other things, to get a second opinion on a diagnosis. It can also be used to get clues on the course of therapy and appropriate methods, or to re-analyze materials gathered during a session. 

The final decision, grounded in the therapist’s own knowledge, must be made by the therapist – the chatbot plays at most an auxiliary role here, and therapy can in no case be based solely on its answers.

ChatGPT and the diagnosis of depression. Family physicians lag far behind

More concrete data can be found in an article published in a journal under the aegis of the British Medical Journal. In the analysis, the researchers developed fictional profiles of people suffering from depression and examined what recommendations ChatGPT-3.5, the paid ChatGPT-4, and real family physicians would make for them.

In each analysis, both physicians and the respective versions of the GPT model were given a choice of five options: observation, referral to psychotherapy, drug therapy, psychotherapy combined with pharmacological intervention and none of the above.

For symptoms indicating mild to moderate depression, only 4 percent of doctors recommended psychotherapy, while ChatGPT-3.5 and ChatGPT-4 did so 95 percent and 97.5 percent of the time, respectively. This is a shocking result, especially since the algorithms are the ones in the right: according to current guidelines, patients with the symptoms described in the study should be placed under the care of a psychotherapist.

In more severe cases, as many as four in 10 doctors prescribed only medication, which (separate from psychotherapy) none of the ChatGPT versions recommended.

It turns out that, at least in the case of depression, publicly available versions of ChatGPT remain in line with the current state of medical knowledge and offer better solutions far more often than GPs do. However, the methodology should not be overlooked – the prompt (the input text entered into the chat text box) is of paramount importance to the response obtained. In a changed format (e.g., a casual conversation or a stream of words from a person in a fragile mental state), the chatbot could arrive at completely different results.

ChatGPT is unbiased but fixated on certain therapies

According to Raile, the people who can benefit most from a ChatGPT conversation are those who have already become aware of their mood disorder but have not yet decided to consult a specialist.

People with mental disorders who seek this kind of help in a conversation with ChatGPT receive empathetic answers free of the influence of stereotypes, as well as a private and safe environment in which they can open up. The author of this article has some reservations about this thesis – the data collected in the course of a conversation with the OpenAI program is well protected, but the creators themselves warn against entering sensitive data into conversations.

The content of the chatbot’s statements has its strengths – it regularly points out that ChatGPT is not a psychotherapist and that it is best to seek help from one. However, such suggestions do not always fit the situation, for example when the chatbot fails to catch irony.

The key problem is the model’s fixation on two methods of treating mental disorders – cognitive-behavioral therapy and psychodynamic therapy. To get information about other methods, you have to ask about them explicitly, and even then the information is quite incomplete and general.

In many cases another approach could work much better, but the chatbot, while explaining the issue and suggesting simple steps to improve one’s mental state, focuses primarily on the two methods mentioned. This is where another concern arises – the chatbot is unable to provide more personalized guidance, because during the conversation it does not ask for details about the user’s biography, symptoms, or possible suicidal thoughts. Yet, given the sensitivity of such data and the incomplete knowledge of a chatbot that was not created for therapeutic purposes, this restraint seems entirely appropriate.

Chatbot as a support line for people in mental health crisis. Why is this a terrible idea?

The fact that the chatbot does not attempt to replace a specialist at all costs and urges people with problems to visit a qualified therapist seems even more favorable when confronted with the results of a study that examined how ChatGPT performs in assessing the risk of a suicide attempt. [8]

The authors created brief descriptions (vignettes) of people who, for a variety of reasons, perceived their lives as burdensome and/or experienced thwarted belongingness. Some characterizations were more, others less troubling.

After the chatbot was presented with a given vignette, the researchers asked it six questions about the person described: (1) a rating of their psychological pain, (2) their level of stress/anxiety, (3) the risk of suicidal thoughts, (4) the risk of a suicide attempt, (5) the risk of a “serious suicide attempt” (as distinguished from a demonstrative attempt aimed at getting attention rather than directly at taking one’s own life), and (6) their level of mental resilience. The chatbot answered each question on a 7-point scale, and the scores were then compared statistically with those of a control group of mental health professionals (MHPs).
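The shape of that comparison can be sketched in a few lines. All scores below are invented for illustration and are not the study’s data; the point is only to show how per-question 7-point ratings from a model can be set against the mean of a group of human raters:

```python
# Illustrative sketch of the study design (all scores are invented):
# a chatbot and several mental health professionals (MHPs) rate a vignette
# on a 7-point scale, and the model's rating is compared with the MHP mean.

from statistics import mean

# Hypothetical scores for one vignette, two of the six questions.
mhp_scores = {"attempt_risk": [5, 6, 5], "resilience": [3, 2, 3]}
chatbot_scores = {"attempt_risk": 3, "resilience": 2}

for q in ("attempt_risk", "resilience"):
    gap = chatbot_scores[q] - mean(mhp_scores[q])
    print(f"{q}: chatbot deviates from the MHP mean by {gap:+.2f}")
```

A negative gap on both questions mirrors the pattern the article goes on to describe: the model rating both the attempt risk and the resilience of the vignette subjects lower than the professionals did.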

In every scenario, ChatGPT rated the risk of a suicide attempt lower than the specialists did. At the same time, the model also rated the psychological resilience of the hypothetical patients described in the vignettes lower. Whether the answers are simply inconsistent or the algorithm underestimates the impact of mental resilience on the risk of a suicide attempt, using it for such purposes would have dire consequences.

In almost every article cited here, two pieces of information appeared at the end. 

First, large language models demonstrate the potential to successfully serve people suffering from mental disorders. For this to happen, however, it is necessary to develop a more sophisticated model, “tailored” for such use, trained on a massive number of very high-quality sources. 

The second is a phrase that hardly any published article can do without: more research is required – on what we already have and on what is yet to come into the hands of doctors, therapists, helpline workers, and, above all, patients.

ChatGPT may be smart, but it is by no means a substitute for a human being with developed emotional intelligence. Nevertheless, the emergence of a smarter, more specialized sibling remains only a matter of time.


[1] Kaul V. et al., History of artificial intelligence in medicine.

[2] Briganti G. et al., Artificial intelligence in medicine: Today and Tomorrow.

[3] CC News, Altar and Kielce University of Technology unveiled ParrotAI, accessed on: 7.02.2024.

[4] Service of the Republic of Poland, Record funds for health care – we will allocate PLN 190 billion in 2024, accessed on: 7.02.2024.

[5] Graber-Stiehl I., Is the world ready for ChatGPT therapists?, Nature.

[6] Raile P., The usefulness of ChatGPT for psychotherapists and patients, Nature.

[7] Sharma A. et al., Human–AI collaboration enables more empathic conversations in text-based peer-to-peer mental health support, Nature.

[8] Elyoseph Z. et al., Beyond human expertise: the promise and limitations of ChatGPT in suicide risk assessment.

Cover photography: Pixabay

Written by: Marcin Szałaj

A graduate in cognitive science from Maria Curie-Skłodowska University in Lublin. A journalist and copywriter who has for years closely followed news from the world of science and works to popularize it.