Exploring AI challenges: common limitations of large language models such as those used by ChatGPT

05 October 2023

In recent years, significant advances have been made in the development of Large Language Models (LLMs) within the field of Artificial Intelligence (AI). Models like GPT-3.5 (used in ChatGPT) and GPT-4, created by OpenAI, have demonstrated unprecedented capabilities, challenging conventional understandings of learning and cognition. These LLMs show remarkable proficiency in solving complex tasks across diverse domains such as mathematics, coding, medicine, law, and psychology. In software engineering, LLMs have emerged as powerful assistants, capable of helping developers with even intricate tasks. This level of performance would have been unimaginable just a few years ago.

While acknowledging the undeniable power and utility of LLMs, this article explores the technological and societal limitations and challenges posed by GPT-4 and other LLMs, with a focus on their implications in the field of software engineering.

The Problem

The development of ever more powerful LLMs, such as GPT-4, is not going to stop anytime soon, and they will be relied upon more and more, in different forms (chat assistants, natural-language interfaces to databases, plugins for Integrated Development Environments (IDEs), and so on) and in all areas of life.

This raises significant questions and concerns about the capabilities, limitations, and future implications of these models. Understanding the limitations and challenges associated with LLMs is crucial for building trust, mitigating the potential negative impacts of their use, and identifying areas for improvement in AI technology.

Ethical and societal questions

LLMs can inadvertently perpetuate biases present in the training data, leading to biased, misinformed or discriminatory outputs. Addressing concerns related to transparency, accountability, and privacy is vital for responsible deployment and to ensure the technology benefits society as a whole.

The deployment of LLMs may have a significant impact on the workforce, potentially leading to job displacement and economic inequality. Understanding these implications is crucial for preparing the workforce and creating strategies to mitigate potential negative consequences.

Technological issues

LLMs have a propensity to generate errors without warning when faced with certain kinds of problems, including mathematical problems (even basic addition and multiplication), programming tasks, logical reasoning, and higher-level concepts such as jokes. Often referred to as hallucinations, these errors appear plausible and can be intertwined with correct statements, presented in a persuasive and confident manner. As a result, identifying them requires careful inspection and diligent fact-checking.
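
As a toy illustration of such fact-checking, here is a minimal sketch that verifies an LLM's arithmetic claims programmatically. The `ask_llm` helper is hypothetical and stubbed with a plausible-but-wrong hallucinated answer; a real implementation would call whichever model API you use:

```python
import re

def ask_llm(prompt: str) -> str:
    """Hypothetical helper: sends the prompt to an LLM and returns its raw
    text answer. Stubbed here with a plausible but wrong, hallucinated reply."""
    return "The product of 1234 and 5678 is 7006452."

def verify_multiplication(a: int, b: int) -> None:
    answer = ask_llm(f"What is {a} * {b}?")
    numbers = re.findall(r"\d+", answer)             # pull integers out of the reply
    claimed = int(numbers[-1]) if numbers else None  # assume the last one is the result
    expected = a * b
    if claimed == expected:
        print(f"OK: {a} * {b} = {claimed}")
    else:
        print(f"Possible hallucination: model said {claimed}, expected {expected}")

verify_multiplication(1234, 5678)
# Possible hallucination: model said 7006452, expected 7006652
```

Checks like this only work for claims that can be recomputed mechanically; for open-ended statements, human review remains necessary.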

The cause of hallucinations has been linked to the very architecture of LLMs: they generate tokens sequentially according to learned probabilities. They have limited capability for anticipation or reasoning, limited memory of what they have already written, and no ability to backtrack. This forward-only architecture also makes LLMs less capable of correcting their mistakes by themselves, even when these are pointed out.
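
To make this forward-only process concrete, here is a minimal sketch of autoregressive decoding over a toy vocabulary. The hard-coded probability table stands in for the neural network a real LLM uses, and for simplicity this toy model conditions only on the previous token, whereas a real LLM conditions on the entire context:

```python
import random

# Toy next-token distributions; a real LLM computes such probabilities
# with a neural network conditioned on the whole context so far.
NEXT_TOKEN_PROBS = {
    "The": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"down": 0.5, "<end>": 0.5},
    "ran": {"away": 0.6, "<end>": 0.4},
    "down": {"<end>": 1.0},
    "away": {"<end>": 1.0},
}

def generate(prompt: str, max_tokens: int = 10) -> str:
    tokens = prompt.split()
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS.get(tokens[-1], {"<end>": 1.0})
        # Sample the next token from the learned distribution.
        next_token = random.choices(list(probs), weights=list(probs.values()))[0]
        if next_token == "<end>":
            break
        tokens.append(next_token)  # committed: earlier tokens are never revised
    return " ".join(tokens)

print(generate("The"))  # e.g. "The cat sat down"
```

Each token is sampled and then committed; there is no mechanism to look ahead or to revise earlier choices, which is exactly the limitation described above.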

How can we solve these problems?

Continued development and refinement of large language models is crucial to addressing the challenges discussed above; the notable improvement in critical thinking and logical reasoning that ChatGPT gained with GPT-4 illustrates what such progress can achieve. Advances in training methodologies and model architectures will make LLMs less prone to biases and errors, as well as more interpretable.

Explainable AI (XAI) can be a valuable tool for addressing errors and hallucinations generated by LLMs. By making AI systems more transparent and interpretable, XAI techniques can help users identify errors and uncover their underlying causes, thereby enhancing the trustworthiness and reliability of LLM outputs.
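
One simple transparency signal along these lines is per-token log-probabilities, which many LLM APIs can return: tokens generated with low confidence can flag spans worth fact-checking. A minimal sketch, assuming the tokens and log-probabilities have already been obtained from whichever API you use (the values below are made up):

```python
import math

def flag_low_confidence(tokens, logprobs, threshold=0.5):
    """Return tokens the model generated with probability below `threshold`.

    tokens:   the generated tokens, in order
    logprobs: the model's log-probability for each generated token
    """
    flagged = []
    for token, lp in zip(tokens, logprobs):
        prob = math.exp(lp)  # convert log-probability back to a probability
        if prob < threshold:
            flagged.append((token, round(prob, 3)))
    return flagged

# Example with made-up values: the model was unsure about the year.
tokens = ["The", "treaty", "was", "signed", "in", "1821", "."]
logprobs = [-0.1, -0.3, -0.05, -0.2, -0.1, -2.3, -0.01]
print(flag_low_confidence(tokens, logprobs))  # [('1821', 0.1)]
```

This is only a rough signal: a model can be confidently wrong, so low-probability flags complement rather than replace the fact-checking discussed above.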

Training people to use ML applications, and ensuring they have a basic understanding of the underlying concepts at work, is essential so that they appreciate both the power and, most importantly, the limitations of ML. Using and trusting ML blindly, without comprehending its inner workings, carries significant risks.

Overall, technology can provide the means to tackle some of the limitations and challenges discussed here, while also paving the way for future advancements in artificial intelligence. 

Need help with Artificial Intelligence in your next project?

Artificial Intelligence, and Machine Learning in particular, has proven useful time after time in solving problems for which historical data is available.

Many of our customers (e.g. the European Space Agency, EUMETSAT, the European Central Bank, and Wella) are already benefiting from the AI + Software solutions Solenix has developed for them.

Contact us today at info@solenix.ch to discuss how we can help you with your AI project.

If you want to learn more about this topic, we found this article very insightful.