Indirect Prompt Injection

13/05/2026

Building LLM-backed workflows has introduced a new threat vector that transcends traditional software vulnerabilities: the evolution of prompt injection from a simple chatbot bypass into a systemic risk. While initial security measures focused on direct injection, where a user provides malicious input to override model instructions, the rise of agentic workflows has shifted the focus toward indirect prompt injection. In this scenario, the model consumes data from external untrusted sources such as third-party plugins, emails or document repositories which may contain hidden malicious instructions. This shift is a fundamental change in the security perimeter, as the threat no longer originates only from the end-user but can be embedded within the data streams that powers the model.

The technical seriousness of this type of vulnerability lies in the unclear boundary between instructions and data. Unlike traditional computing systems that keep a strict separation between executable code and data, LLMs treat all inputs as part of a continuous context window. When a model uses a plugin to summarize an external webpage or process an incoming file, any malicious instruction hidden in that content can hijack the model's logic. This can lead to unauthorized data exfiltration or the silent manipulation of processes. This vulnerability is particularly critical in "Disposable Software" environments, where tools are quickly generated to solve specific problems bypassing the security reviews applied to enterprise-wide platforms.

Managing this risk requires a departure from the "break-fix" mentality towards a philosophy of strategic restriction. The proliferation of single-use AI-generated utility tools increases the probability for these type of attacks. The implementation of a "zero-trust" approach to model inputs treating every piece of external data, regardless of its source, as potentially executable code, is an obvious first step mitigating these threat vector.

A model that consumes unverified data creates a state of dangerous confidence among engineering teams. Technical leaders must prioritize the auditing of these models against actual production behavior, ensuring that decisions remain robust and informed by verified and validated data sources.

Tech Leaderism

Indirect Prompt Injection

More Posts