Researchers Expose Vulnerabilities in Large Language Models

Researchers have uncovered significant vulnerabilities in large language models (LLMs), demonstrating that these systems can be manipulated into disclosing sensitive information with surprising ease. Despite claims of advanced training and the promise of artificial general intelligence (AGI), recent findings show that LLMs remain susceptible to exploitation through tactics as simple as run-on sentences and poor grammar.
Research from several labs indicates that LLMs fail in scenarios where human intuition would normally catch the problem. For instance, researchers found that lengthy prompts written without punctuation could trick models into revealing confidential details. David Shipley, a representative of Beauceron Security, remarked, “The truth about many of the largest language models out there is that prompt security is a poorly designed fence with so many holes to patch that it’s a never-ending game of whack-a-mole.”
The vulnerabilities stem from what researchers at Palo Alto Networks’ Unit 42 describe as a “refusal-affirmation logit gap.” In essence, LLMs are trained to reject harmful queries, but a gap remains between their tendency to refuse and their tendency to comply, and attackers can exploit that gap to elicit dangerous outputs. The Unit 42 team noted that bad grammar and run-on sentences can facilitate these exploits, achieving success rates between 80% and 100% across various mainstream models, including Google’s Gemini and OpenAI’s gpt-oss-20b.
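The gap itself can be thought of in terms of next-token logits. Below is a minimal sketch of measuring such a gap, assuming a Hugging Face causal language model; GPT-2 stands in for the models the researchers tested, and the “ Sorry” / “ Sure” token pair is an illustrative proxy for refusal versus affirmation, not the researchers’ exact methodology.

```python
# Minimal sketch: compare the logit of a refusal-style first token with an
# affirmation-style first token for a given prompt. GPT-2 and the token pair
# are stand-ins, not the setup used in the cited research.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "User: <some request>\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]  # logits for the next token

refusal_id = tokenizer(" Sorry", add_special_tokens=False).input_ids[0]
affirm_id = tokenizer(" Sure", add_special_tokens=False).input_ids[0]

gap = (next_token_logits[refusal_id] - next_token_logits[affirm_id]).item()
print(f"refusal-affirmation logit gap: {gap:.2f}")
# A positive gap means the model leans toward refusing; a prompt that narrows
# or inverts the gap is the kind of opening the researchers describe.
```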
Image Exploits and Data Breaches
The vulnerabilities extend beyond text prompts. Researchers from Trail of Bits demonstrated that LLMs can be manipulated through images containing harmful instructions that become discernible only when the images are scaled down, allowing for the extraction of sensitive data. In one notable experiment, the researchers got Google’s Gemini command-line interface to access calendar events through instructions hidden within an uploaded image.
The method exploits how LLM pipelines process images at different resolutions, and it highlights a significant risk for enterprise users who unknowingly upload images potentially containing sensitive information. Shipley was blunt about the broader pattern: “What this exploit shows is that security for many AI systems remains a bolt-on afterthought.”
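The mechanism hinges on the resampling that happens between upload and inference. Below is a minimal sketch of that step, assuming the Pillow imaging library; the filenames and target resolution are illustrative, not details from the Trail of Bits research.

```python
# Minimal sketch of the downscaling step this class of attack abuses. The
# attack crafts an image whose hidden text is unreadable at full resolution
# but becomes legible after the resize an AI pipeline applies before inference.
from PIL import Image

original = Image.open("uploaded_image.png")                          # what a human reviews
model_view = original.resize((512, 512), Image.Resampling.BICUBIC)   # what the model ingests
model_view.save("model_view.png")
# If the image was crafted for this exact resample, "model_view.png" can
# contain readable instructions that were invisible in "uploaded_image.png".
```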
A separate analysis by Tracebit added to these findings, revealing further vulnerabilities in Google’s Gemini command-line interface. The researchers showed that a combination of prompt injection and inadequate validation could allow malicious actors to access data without detection.
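One commonly recommended mitigation for this class of issue is strict validation of anything a model asks the host tool to execute. The sketch below is hypothetical and does not reflect Tracebit’s analysis or Gemini CLI internals; the allow-list and function name are invented for illustration.

```python
# Hypothetical sketch: validate model-requested commands against an explicit
# allow-list before executing anything. The command set is illustrative only.
import subprocess

ALLOWED_COMMANDS = {("ls",), ("git", "status")}  # hypothetical allow-list

def run_model_requested_command(command: str) -> str:
    """Execute a model-requested command only if it matches the allow-list."""
    parts = tuple(command.split())
    if parts not in ALLOWED_COMMANDS:
        raise PermissionError(f"Blocked command requested by model: {command!r}")
    return subprocess.run(parts, capture_output=True, text=True, check=True).stdout
```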
Reevaluating AI Security Measures
The persistent security issues can be traced back to fundamental misunderstandings about how AI systems operate. Valence Howden, an advisory fellow at Info-Tech Research Group, said that effective security controls cannot be established without a thorough understanding of model behavior. Many AI systems are trained primarily in English, which means contextual cues can be lost when prompts arrive in other languages.
Shipley noted that many AI models are built with inadequate security measures, leading to an environment where vulnerabilities can be easily exploited. He likened LLMs to “a big urban garbage mountain that gets turned into a ski hill,” suggesting that while they may appear polished on the surface, underlying issues remain hidden.
As these vulnerabilities come to light, it is increasingly clear that relying solely on internal alignment mechanisms to prevent harmful content is insufficient. The combination of easily exploitable prompts and oversights in security design underscores the need for more robust protective measures in the rapidly evolving AI landscape.
The implications are significant: researchers warn that the current state of security in many AI systems could lead to real-world harm if not addressed promptly. AI security must shift from afterthought to priority so that the technology remains safe for users across applications.