Errors in AI Model Reasoning

As AI models move from simple tools to agents, the "how" becomes increasingly important.

By: Carlos Pantsios Markhauser, Ph.D., IEEE*

As today's AI models grow in complexity and are integrated into more critical applications, it becomes harder to understand how they reason, chiefly because their architectures are so sophisticated. This is especially true of deep learning models and LLMs (Large Language Models).

The accuracy of recent AI models' responses has improved significantly, driving interest in their potential to support tasks such as medical diagnosis, therapy planning, and virtual tutoring.

As AI models are increasingly used as assistants rather than tools, several recent studies suggest that the way these models reason could have serious implications in critical areas such as healthcare, law, education, security, and consumer electronics.

AI reasoning models have three main limitations: 1) lack of true understanding, 2) dependence on the quality of training data, and 3) difficulty handling context and ambiguity.

First, AI models lack a real understanding of concepts. They are good at detecting and processing patterns in data, but they do not understand context or meaning. They misinterpret sarcasm, for example, despite having been trained on sentences with similar structures.

Second, reasoning in AI depends fundamentally on the quality and scope of the data. Biases, gaps, or noise in the training data directly affect a model's reasoning. For example, models trained on outdated information cannot reason about recent events, as with a chatbot that is unaware of political changes after 2021. Similarly, driverless vehicle systems trained in sunny climates may have problems on snowy or icy roads.

Third, AI models struggle with ambiguity and dynamic contexts. Human reasoning adapts to evolving situations by incorporating real-time feedback and external knowledge, but many AI models operate with fixed parameters. For example, a chatbot may fail when a conversation abruptly changes direction.

These limitations significantly affect the ability of AI models to replicate human reasoning and limit their practical application in complicated scenarios.

New research on reasoning in AI models suggests that the underlying problem is that they reason in a fundamentally different way from humans, which makes them less suited to solving subtle problems. A recent research paper published in Nature Machine Intelligence highlighted that models have difficulty distinguishing between beliefs and facts, and that multi-agent systems designed to provide medical advice are subject to reasoning errors that can lead to incorrect diagnoses.

As AI models move from simple tools to agents, the "how" becomes increasingly important, says James Zou, an associate professor of biomedical data science at Stanford School of Medicine and author of the Nature Machine Intelligence paper.

Distinguishing between facts and beliefs is a particularly important skill in areas such as law, therapy, and education, Zou says.

Experiments conducted on new reasoning models, such as OpenAI's o1 or DeepSeek's R1, showed good results in factual verification, with the models consistently scoring above 90%. In contrast, the new models had problems when processing false beliefs reported in the first person (i.e., "I believe that ... x", when x is incorrect): they showed mismatches in 52% to 62% of cases.

Flaws in the way AI models reach decisions could be particularly problematic in medical group discussions. Here, AI-based multi-agent systems work collaboratively, discussing problems with the aim of standing in for the team of doctors who diagnose complicated medical conditions, says Lequan Yu, an assistant professor of medical AI at the University of Hong Kong. The best multi-agent systems solved simple problems correctly, reaching 90% accuracy.

However, on complex problems requiring specialist knowledge, the systems collapsed, reaching only 27% accuracy. Part of the problem was that many of these multi-agent systems used the same LLM for all agents involved in the discussion, says Yinghao Zhu, one of the Ph.D. students who co-authored the research paper. In that setup, a gap in the model's knowledge leads all agents to agree on the same wrong answer.
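Why a shared underlying model undermines a panel of agents can be illustrated with a small probability sketch. It is not the method used in the study: it assumes a simplified right-or-wrong outcome per agent and a strict majority vote, and the per-agent accuracy (`P_SINGLE`) and panel size (`N_AGENTS`) are hypothetical values chosen only for illustration.

```python
from math import comb


def majority_accuracy_independent(p: float, n_agents: int) -> float:
    """Probability that a strict majority of independent agents is correct.

    Simplifying assumptions: each agent is right with probability p,
    errors are independent, and the number of agents is odd.
    """
    if n_agents % 2 == 0:
        raise ValueError("use an odd number of agents to avoid ties")
    majority = n_agents // 2 + 1
    return sum(
        comb(n_agents, k) * p**k * (1 - p) ** (n_agents - k)
        for k in range(majority, n_agents + 1)
    )


def majority_accuracy_shared_model(p: float, n_agents: int) -> float:
    """All agents wrap the same model and answer identically.

    The vote is then always unanimous, so the panel is exactly as
    accurate as a single agent, however many agents take part.
    """
    return p


P_SINGLE = 0.7  # hypothetical per-agent accuracy, not a figure from the study
N_AGENTS = 5    # hypothetical panel size

independent = majority_accuracy_independent(P_SINGLE, N_AGENTS)
shared = majority_accuracy_shared_model(P_SINGLE, N_AGENTS)

print(f"independent agents: {independent:.3f}")  # 0.837
print(f"shared model:       {shared:.3f}")       # 0.700
```

Under these assumptions, voting helps only when the agents' mistakes differ. When every agent inherits the same blind spot, adding more agents does not raise accuracy at all.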

In conclusion, robust and genuine reasoning in AI, particularly in LLMs, remains an unsolved challenge. While LLMs excel at generating fluent, human-like text based on statistical patterns, they often struggle with genuine, systematic, multilevel logical deduction and with verification, a weakness frequently exposed by hallucinations.

AI models that reason are limited by their inability to truly understand concepts, their heavy dependence on imperfect training data, and their difficulty handling ambiguous contexts and rapid changes over time. To address these limitations, it is still necessary to rely on human experts to supervise the processes and rigorously validate the data that is continuously fed to the models, along with hybrid symbolic AI approaches.

*Text written by Carlos Pantsios Markhauser, PhD, IEEE. He is a telecommunications engineer with a PhD in telecommunications electronics, a Master's in Communications from Simón Bolívar University, a Specialization in Satellite Telecommunications and Networks from The George Washington University School of Engineering & Applied Science, and a Specialization in Digital Telecommunications from the University of Colorado Boulder. He is a postgraduate professor in the telecommunications schools at Simón Bolívar University and Andrés Bello Catholic University, and a professional consultant on TV projects based in Argentina.

