Researchers Uncover Major Vulnerabilities in Large Language Models

Recent investigations by multiple research teams have exposed significant vulnerabilities in large language models (LLMs), showing that these AI systems can be easily manipulated into disclosing sensitive information. Despite rapid advances in artificial intelligence and strong benchmark performance, the findings indicate that security measures for LLMs remain inadequate.
The research highlights a troubling trend: LLMs can be tricked into revealing confidential data with run-on sentences and poorly constructed prompts. For instance, omitting punctuation from a prompt can confuse the model into bypassing its safety protocols. David Shipley, a security expert at Beauceron Security, emphasized this issue, stating, “The truth about many of the largest language models out there is that prompt security is a poorly designed fence with so many holes to patch that it’s a never-ending game of whack-a-mole.”
In addition to problems with text prompts, researchers at Palo Alto Networks’ Unit 42 have identified a “refusal-affirmation logit gap”: although LLMs are trained to refuse harmful queries, the underlying capacity to generate dangerous outputs remains. Attackers can exploit this gap by crafting prompts that never give the model a chance to reassert its safety measures. Success rates for such tactics varied, with researchers reporting 80% to 100% success against several mainstream models, including Google’s Gemini and OpenAI’s recent model, gpt-oss-20b.
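The gap described above can be illustrated with a toy sketch. This is not a real model and the logit values are invented for illustration: the point is only that a refusal is a decode-time decision between competing tokens, so an attacker who nudges the numbers can flip the outcome without removing any harmful knowledge from the model.

```python
# Toy illustration of a "refusal-affirmation logit gap" (hypothetical
# numbers, not real model outputs). At the first generated position the
# model weighs a refusal token (e.g. "Sorry") against an affirmation
# token (e.g. "Sure"); greedy decoding picks whichever logit is larger.
def first_token(logits: dict[str, float]) -> str:
    """Greedy decode: return the token with the highest logit."""
    return max(logits, key=logits.get)

# Normal prompt: refusal logit comfortably above affirmation -> refusal.
benign = {"Sorry": 4.2, "Sure": 1.1}          # gap = +3.1
print(first_token(benign))                    # Sorry

# Adversarial suffix: the harmful capability is untouched, but the gap
# has been closed at decode time, so the refusal never happens.
attacked = {"Sorry": 2.0, "Sure": 2.6}        # gap = -0.6
print(first_token(attacked))                  # Sure
```

The sketch also shows why the researchers describe the refusal as shallow: the “dangerous” continuation is still one of the model’s candidate outputs, merely outranked under normal conditions.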
The implications of these vulnerabilities are significant, particularly in professional settings where employees upload images to LLMs. Researchers from Trail of Bits demonstrated that images containing hidden messages could be used to extract data from systems such as the Google Gemini command-line interface (CLI). When resized, certain areas of these images changed color, revealing commands that the model executed without proper validation. The method poses a substantial risk because it can exfiltrate sensitive information without the user ever noticing.
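The scaling trick above can be sketched in a simplified form. This is a minimal toy, not the Trail of Bits exploit (which targets real resampling filters in production pipelines); all sizes and thresholds here are assumptions. It shows the core idea: a payload written only on the pixels a downscaler will sample is near-invisible at full resolution but dominates the thumbnail the model actually reads.

```python
import numpy as np

# Hypothetical sizes: a 256x256 upload downscaled to a 64x64 thumbnail.
SRC, DST = 256, 64
scale = SRC // DST  # each thumbnail pixel samples one of a 4x4 input block

# Benign-looking carrier: light random noise.
rng = np.random.default_rng(0)
carrier = rng.integers(200, 256, size=(SRC, SRC), dtype=np.uint8)

# Stand-in for hidden text: a dark bar defined in thumbnail coordinates.
payload = np.zeros((DST, DST), dtype=bool)
payload[10:20, 10:50] = True

# Write payload pixels only at the block centres the downscaler samples.
centres = np.arange(DST) * scale + scale // 2
rr, cc = np.nonzero(payload)
carrier[centres[rr], centres[cc]] = 0

# Centre-sampling nearest-neighbour downscale (implemented here directly,
# so the sampling behaviour is fixed and not library-dependent).
thumb = carrier[scale // 2::scale, scale // 2::scale]

dark_full = (carrier < 50).mean()  # fraction of dark pixels at full size
print(f"dark pixels in full image: {dark_full:.2%}")  # well under 1%
print(bool((thumb[10:20, 10:50] < 50).all()))         # True: payload survives
```

At full resolution the payload is a sparse scatter of isolated dark pixels lost in the noise; after downscaling, those are exactly the pixels that remain, which is why a human reviewing the upload and a model reading the thumbnail can see two different images.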
Security assessments by other firms, such as Tracebit, have raised similar alarms. Their findings suggest that a combination of prompt injection, inadequate validation, and poor user experience design can lead to significant security breaches. As the researchers outline, these issues create a cascade of vulnerabilities that often go undetected.
The fundamental problem lies in a lack of understanding regarding how LLMs operate. Valence Howden, an advisory fellow at Info-Tech Research Group, noted that applying effective security controls is challenging due to the complexity and dynamic nature of AI models. He pointed out that current security measures are not equipped to handle the nuances of natural language as a potential threat vector.
Shipley further highlighted that many AI systems have been constructed with security as an afterthought. This oversight has resulted in models that are “insecure by design,” compromising user safety. He likened the situation to a “big urban garbage mountain” covered with a thin layer of snow, suggesting that while it may appear functional, the underlying issues remain unresolved.
With the rapid evolution of AI technology, the need for robust security measures is paramount. As researchers continue to uncover these vulnerabilities, the call for a comprehensive approach to AI security becomes increasingly urgent. The landscape of artificial intelligence must evolve to address these critical gaps before they lead to real harm in everyday applications.