Prompt Injection

An attack technique where malicious instructions are inserted into content processed by an LLM, exploiting the model's inability to distinguish trusted commands from untrusted data.

Prompt injection exploits a fundamental architectural property of large language models: they process instructions and data in a shared context window without a formal grammar to separate them. Unlike SQL injection, which was solved through parameterized queries, prompt injection has no equivalent fix because natural language cannot be deterministically sanitized. Attacks range from direct (user-typed malicious prompts) to indirect (embedded in documents, emails, or tool outputs the model processes).

Also known as

injection attack, prompt attack, indirect prompt injection, direct prompt injection