
The world of Large Language Models (LLMs) has opened up for everyone. With access to powerful models and a range of tools, from GitHub and code-focused utilities to sophisticated platforms like Vanna for SQL database interaction and Prompt Perfect for optimizing your instructions, it seems like anyone can build the next big thing in tech.[1] All you need is a programming language and an internet connection to interact with an LLM.
So, what gives you an advantage in this crowded market? What innovation can you bring to a new product? The one element you can control is your data, or what we call in the LLM world, your “context.” Instead of immediately jumping to complex methods like Retrieval-Augmented Generation (RAG), graph-based RAG, or LoRA fine-tuning, the real key is to spend most of your time understanding your data and how to represent it effectively. Someone rightly called this “context engineering,” because how well you shape your context can be your biggest advantage.
But before we explore how to do that, let’s first understand the common ways your context can fail.
1. Context Poisoning
Context poisoning happens when false, irrelevant, or flawed data gets into your context. This misinformation can make your LLM behave in inconsistent and unpredictable ways.[2]
The term was notably used by the DeepMind team when they found that a Gemini agent playing a game would sometimes “hallucinate” or make up information. This happened because incorrect information about the agent’s goal crept into the conversation history and was repeatedly referenced, causing the agent to focus on a non-existent objective.
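One practical defense is to validate what enters the conversation history in the first place. Below is a minimal sketch, in plain Python, of a guard that quarantines suspect entries before they are appended to the context; the validation rules, the ContextGuard name, and the history structure are illustrative assumptions, not part of any specific framework.

```python
# Minimal sketch of guarding a conversation history against poisoning.
# The validation rules and the `history` structure are illustrative
# assumptions, not taken from any particular agent framework.

from dataclasses import dataclass, field

@dataclass
class ContextGuard:
    """Keeps a conversation history and rejects suspect entries."""
    max_entry_chars: int = 4_000
    banned_markers: tuple = ("lorem ipsum", "<placeholder>")
    history: list = field(default_factory=list)

    def is_trustworthy(self, entry: str) -> bool:
        # Reject empty, oversized, or obviously malformed entries so they
        # never get re-read by the model on later turns.
        if not entry.strip():
            return False
        if len(entry) > self.max_entry_chars:
            return False
        return not any(marker in entry.lower() for marker in self.banned_markers)

    def add(self, role: str, entry: str) -> bool:
        if self.is_trustworthy(entry):
            self.history.append({"role": role, "content": entry})
            return True
        return False  # quarantined instead of silently accepted

guard = ContextGuard()
guard.add("tool", "Player position: (12, 7). Goal: reach the exit at (20, 3).")
guard.add("tool", "")  # rejected, never enters the context
```

The point is not the specific checks but the shape: bad information that never enters the history can never be “repeatedly referenced” the way the DeepMind team observed.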
2. Context Distraction
Context distraction occurs when the context becomes so long that the model starts paying too much attention to the context itself, ignoring the valuable knowledge it learned during its initial training. As you build up a conversation history in your application, this accumulated context can become more of a distraction than a help.[5]
A study by Databricks found that the performance of Meta’s Llama 3.1 405B model started to decline once the context size went beyond 32,000 tokens, and this happened even earlier for smaller models. If models start to perform poorly long before their maximum context window is filled, it raises the question of how useful these massive context windows really are. This phenomenon, sometimes called “context rot,” shows that simply adding more information doesn’t always lead to better results.[5]
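A common mitigation is to keep the working context under an explicit budget rather than letting every past turn pile up. Here is a minimal sketch, assuming a rough four-characters-per-token estimate and an arbitrary 8,000-token budget; a real system would use the model’s tokenizer and likely summarize dropped turns instead of discarding them.

```python
# Minimal sketch of trimming a conversation history to a token budget.
# The 4-chars-per-token estimate and the budget value are rough
# assumptions for illustration only.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude approximation, not a real tokenizer

def trim_history(history: list[dict], budget_tokens: int = 8_000) -> list[dict]:
    """Keep the system message plus the most recent turns that fit the budget."""
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]

    kept, used = [], sum(estimate_tokens(m["content"]) for m in system)
    for message in reversed(rest):  # walk backwards from the newest turn
        cost = estimate_tokens(message["content"])
        if used + cost > budget_tokens:
            break  # older turns are dropped (or summarized elsewhere)
        kept.append(message)
        used += cost

    return system + list(reversed(kept))
```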
3. Context Confusion
Context confusion happens when the model uses less important information from the context to generate a low-quality response. If you put something in the context, the model is designed to pay attention to it. This might be irrelevant details or unnecessary tool definitions, but the model will still consider them.
This is a significant issue when building AI agents that use multiple tools. Studies have shown that models tend to perform worse when they are given more than one tool to work with.
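One way to limit this is a “tool loadout”: pass the model only the few tool definitions relevant to the current request instead of the full catalogue. The sketch below uses a naive keyword-overlap score as a stand-in for a real retrieval step (such as embedding similarity); the tool names and descriptions are made up for illustration.

```python
# Minimal sketch of selecting a small set of relevant tools before
# building the prompt. The scoring is deliberately simplistic and the
# tool definitions are hypothetical.

TOOLS = [
    {"name": "run_sql",      "description": "execute a sql query against the sales database"},
    {"name": "send_email",   "description": "send an email to a customer or colleague"},
    {"name": "create_chart", "description": "plot a chart from tabular query results"},
    {"name": "book_meeting", "description": "schedule a meeting on the calendar"},
]

def select_tools(user_request: str, tools: list[dict], top_k: int = 2) -> list[dict]:
    request_words = set(user_request.lower().split())
    def score(tool: dict) -> int:
        # Count how many words the request shares with the tool description.
        return len(request_words & set(tool["description"].split()))
    return sorted(tools, key=score, reverse=True)[:top_k]

# Only the two most relevant definitions reach the prompt:
print(select_tools("plot monthly sales from a sql query", TOOLS))
```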
4. Context Clash
Context clash occurs when information in the context conflicts with other information also present in the context. This is a more severe problem than context confusion because the conflicting information isn’t just irrelevant; it directly contradicts other data in the prompt.[9][10]
Researchers tested this by taking prompts from various benchmarks and splitting the necessary information across multiple, separate prompts. This “sharded” information led to a dramatic drop in performance, with an average decrease of 39%. This shows that when the model has to piece together conflicting or fragmented information, its ability to reason and provide accurate answers is severely hampered.
When building agents that gather data from multiple sources, you need to be particularly careful. There’s a high chance that the information from these different sources will conflict. Furthermore, if you connect your agent to tools that you didn’t create, their descriptions and instructions might clash with the rest of your prompt, leading to unreliable behavior.[9] Studies have shown that when there is a conflict, LLMs often tend to favor their internal, pre-trained knowledge over the new information provided in the prompt.
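Before packing multi-source data into one prompt, it helps to detect where the sources disagree and resolve (or at least surface) the clash explicitly. The following is a minimal sketch under that assumption; the field names, source names, and the find_conflicts helper are illustrative, and a real resolver would also weigh provenance and recency.

```python
# Minimal sketch of flagging conflicts between sources before they are
# merged into a single prompt. All data and names here are hypothetical.

from collections import defaultdict

def find_conflicts(facts: list[dict]) -> dict[str, set]:
    """Group claimed values by field and report fields with more than one value."""
    values_by_field = defaultdict(set)
    for fact in facts:
        values_by_field[fact["field"]].add(fact["value"])
    return {name: vals for name, vals in values_by_field.items() if len(vals) > 1}

facts = [
    {"source": "crm",     "field": "customer_tier", "value": "gold"},
    {"source": "billing", "field": "customer_tier", "value": "silver"},
    {"source": "crm",     "field": "region",        "value": "EMEA"},
]

conflicts = find_conflicts(facts)
if conflicts:
    # Resolve or annotate the clash before prompting, rather than letting
    # the model silently pick one side (often its pre-trained prior).
    print("Conflicting fields:", conflicts)
```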