Science
Researchers Expose Security Flaws in Large Language Models

Recent research has uncovered significant vulnerabilities in large language models (LLMs), demonstrating that these systems can be easily manipulated to disclose sensitive information. Despite advancements in artificial intelligence, including high benchmark scores and claims about nearing artificial general intelligence (AGI), these models still struggle with basic human-like reasoning.
A series of studies from various research labs has highlighted how LLMs can be misled through simple techniques, such as run-on sentences or poorly structured prompts. These tactics exploit gaps in the models’ safety training, which is meant to make them refuse harmful requests. One effective approach, for instance, involves crafting lengthy instructions without punctuation, which can confuse a model’s safety mechanisms. According to David Shipley, a representative from Beauceron Security, “the truth about many of the largest language models out there is that prompt security is a poorly designed fence with so many holes to patch that it’s a never-ending game of whack-a-mole.”
Manipulating Language Models
Researchers at Palo Alto Networks’ Unit 42 have identified what they call a “refusal-affirmation logit gap.” The gap refers to the fact that safety training makes refusals more likely but does not eliminate the possibility of dangerous outputs. The researchers found that attackers can exploit this gap using specific grammatical structures, achieving success rates of 80% to 100% in their experiments against models such as Google’s Gemini, Meta’s Llama, and OpenAI’s gpt-oss-20b.
One significant finding illustrated how a lack of punctuation can allow attackers to bypass safety features. The researchers advised, “never let the sentence end — finish the jailbreak before a full stop and the safety model has far less opportunity to re-assert itself.” This raises concerns about the inherent weaknesses in LLMs, prompting calls for more robust security measures.
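To give a rough sense of the underlying idea, the sketch below compares the score a model assigns to a typical refusal opener against an affirmative one for the next token after a prompt. The model name, probe tokens, and prompt are placeholder assumptions for illustration only, not Unit 42’s actual methodology.

```python
# Minimal sketch: probing the "refusal-affirmation logit gap" idea.
# Model, prompt, and probe tokens are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM exposes next-token logits the same way
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Explain how to do X"  # stands in for a risky request
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores for the next token

# Compare a typical refusal opener with an affirmative opener.
refusal_id = tok.encode(" Sorry")[0]
affirm_id = tok.encode(" Sure")[0]
gap = (logits[refusal_id] - logits[affirm_id]).item()
print(f"refusal-affirmation logit gap: {gap:.2f}")
```

A large positive gap would mean the model leans toward refusing; the research suggests attackers craft prompts that shrink or invert this gap so an affirmative continuation wins out.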
Image Exploitation and Broader Implications
The vulnerabilities extend beyond textual manipulation. Research from Trail of Bits showed that images uploaded to LLMs can covertly carry instructions: in their experiments, instructions embedded in an image were imperceptible at full resolution but became legible once the image was scaled down. In one example, the model executed a hidden command to check a calendar and send event details without treating it as harmful.
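The attack hinges on the resize step that many multimodal pipelines apply before an image reaches the model. The sketch below shows only that preprocessing step; the target size, resampling filter, and synthetic image are assumptions for illustration, and a real attack would craft a specific high-resolution pixel pattern.

```python
# Minimal sketch of the downscaling step the image attack relies on.
# Target size and filter are assumptions; real pipelines vary.
from PIL import Image

# Stand-in for an uploaded high-resolution image (plain white here; an
# attacker would instead craft a pixel pattern at this resolution).
original = Image.new("RGB", (1024, 1024), "white")

# Interpolation blends neighbouring pixels during the resize, so a carefully
# crafted high-resolution pattern can resolve into readable instruction text
# only in the smaller image the model actually sees.
downscaled = original.resize((224, 224), Image.Resampling.BICUBIC)
downscaled.save("what_the_model_sees.png")
```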
The implications of these findings are significant. The vulnerability was shown to affect various systems, including Google Gemini’s command-line interface (CLI) and other interfaces like Vertex AI Studio and Google Assistant. Shipley expressed concern, stating that the security of many AI systems appears to be an afterthought, highlighting a long-standing issue where security measures are implemented reactively rather than proactively.
Moreover, a study by security firm Tracebit indicated that additional vulnerabilities could allow malicious actors to access sensitive data through a combination of prompt injection and poor user experience design. The researchers noted that these factors collectively create significant, undetectable risks.
A fundamental misunderstanding of AI’s operational mechanics contributes to these vulnerabilities. According to Valence Howden, an advisory fellow at Info-Tech Research Group, effective security controls cannot be established without a clear understanding of how models operate and respond to prompts. He emphasized that the dynamic and complex nature of AI makes static security measures less effective.
With around 90% of models trained primarily in English, the introduction of other languages further complicates security efforts, as contextual cues can be lost. Shipley remarked that the current state of AI security resembles a poorly managed landscape, stating, “there’s so much bad stuffed into these models…the only sane thing, cleaning up the dataset, is also the most impossible.”
As researchers continue to expose these vulnerabilities, it becomes increasingly clear that the security of LLMs needs substantial improvement. The industry must shift its approach to prioritize security from the ground up, rather than as an afterthought. The risks presented by these weaknesses highlight the urgent need for better safeguards in the evolving landscape of artificial intelligence.