Researchers Uncover Major Vulnerabilities in Large Language Models

Recent investigations by multiple research teams have exposed significant vulnerabilities in large language models (LLMs), showing that these AI systems can be easily manipulated into disclosing sensitive information. Despite rapid advances in artificial intelligence and strong benchmark performance, the findings indicate that security measures for LLMs remain inadequate.
The research highlights a troubling trend: LLMs can be tricked into revealing confidential data with run-on sentences and poorly constructed prompts. For instance, omitting punctuation from a prompt can confuse the model into bypassing its safety protocols. David Shipley, a security expert at Beauceron Security, emphasized this issue, stating, “The truth about many of the largest language models out there is that prompt security is a poorly designed fence with so many holes to patch that it’s a never-ending game of whack-a-mole.”
In addition to problems with text prompts, researchers at Palo Alto Networks’ Unit 42 have identified a “refusal-affirmation logit gap”: although LLMs are trained to refuse harmful queries, the underlying capacity to generate dangerous outputs remains. Attackers can exploit this gap by crafting prompts that never give the model a chance to reassert its safety measures. Success rates for such tactics varied, with researchers reporting 80% to 100% success against several mainstream models, including Google’s Gemini and OpenAI’s recent model, gpt-oss-20b.
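The gap described above can be illustrated with a toy sketch. This is not a real model and the logit values are invented for illustration: the point is only that a refusal is a decode-time decision between competing tokens, so an attacker who nudges the numbers can flip the outcome without removing any harmful knowledge from the model.

```python
# Toy illustration of a "refusal-affirmation logit gap" (hypothetical
# numbers, not real model outputs). At the first generated position the
# model weighs a refusal token (e.g. "Sorry") against an affirmation
# token (e.g. "Sure"); greedy decoding picks whichever logit is larger.
def first_token(logits: dict[str, float]) -> str:
    """Greedy decode: return the token with the highest logit."""
    return max(logits, key=logits.get)

# Normal prompt: refusal logit comfortably above affirmation -> refusal.
benign = {"Sorry": 4.2, "Sure": 1.1}          # gap = +3.1
print(first_token(benign))                    # Sorry

# Adversarial suffix: the harmful capability is untouched, but the gap
# has been closed at decode time, so the refusal never happens.
attacked = {"Sorry": 2.0, "Sure": 2.6}        # gap = -0.6
print(first_token(attacked))                  # Sure
```

The sketch also shows why the researchers describe the refusal as shallow: the “dangerous” continuation is still one of the model’s candidate outputs, merely outranked under normal conditions.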
The implications of these vulnerabilities are significant, particularly in professional settings where employees upload images to LLMs. Researchers from Trail of Bits demonstrated that images containing hidden messages could be used to extract data from systems such as the Google Gemini command-line interface (CLI). When resized, certain areas of these images changed color, revealing commands that the model executed without proper validation. The method poses a substantial risk because it can exfiltrate sensitive information without the user ever noticing.
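The scaling trick above can be sketched in a simplified form. This is a minimal toy, not the Trail of Bits exploit (which targets real resampling filters in production pipelines); all sizes and thresholds here are assumptions. It shows the core idea: a payload written only on the pixels a downscaler will sample is near-invisible at full resolution but dominates the thumbnail the model actually reads.

```python
import numpy as np

# Hypothetical sizes: a 256x256 upload downscaled to a 64x64 thumbnail.
SRC, DST = 256, 64
scale = SRC // DST  # each thumbnail pixel samples one of a 4x4 input block

# Benign-looking carrier: light random noise.
rng = np.random.default_rng(0)
carrier = rng.integers(200, 256, size=(SRC, SRC), dtype=np.uint8)

# Stand-in for hidden text: a dark bar defined in thumbnail coordinates.
payload = np.zeros((DST, DST), dtype=bool)
payload[10:20, 10:50] = True

# Write payload pixels only at the block centres the downscaler samples.
centres = np.arange(DST) * scale + scale // 2
rr, cc = np.nonzero(payload)
carrier[centres[rr], centres[cc]] = 0

# Centre-sampling nearest-neighbour downscale (implemented here directly,
# so the sampling behaviour is fixed and not library-dependent).
thumb = carrier[scale // 2::scale, scale // 2::scale]

dark_full = (carrier < 50).mean()  # fraction of dark pixels at full size
print(f"dark pixels in full image: {dark_full:.2%}")  # well under 1%
print(bool((thumb[10:20, 10:50] < 50).all()))         # True: payload survives
```

At full resolution the payload is a sparse scatter of isolated dark pixels lost in the noise; after downscaling, those are exactly the pixels that remain, which is why a human reviewing the upload and a model reading the thumbnail can see two different images.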
Security assessments by other firms, such as Tracebit, have raised similar alarms. Their findings suggest that a combination of prompt injection, inadequate validation, and poor user experience design can lead to significant security breaches. As the researchers outline, these issues create a cascade of vulnerabilities that often go undetected.
The fundamental problem lies in a lack of understanding regarding how LLMs operate. Valence Howden, an advisory fellow at Info-Tech Research Group, noted that applying effective security controls is challenging due to the complexity and dynamic nature of AI models. He pointed out that current security measures are not equipped to handle the nuances of natural language as a potential threat vector.
Shipley further highlighted that many AI systems have been constructed with security as an afterthought. This oversight has resulted in models that are “insecure by design,” compromising user safety. He likened the situation to a “big urban garbage mountain” covered with a thin layer of snow, suggesting that while it may appear functional, the underlying issues remain unresolved.
With the rapid evolution of AI technology, the need for robust security measures is paramount. As researchers continue to uncover these vulnerabilities, the call for a comprehensive approach to AI security becomes increasingly urgent. The landscape of artificial intelligence must evolve to address these critical gaps before they lead to real harm in everyday applications.