Published: Jan 29, 2026 | 21 min read
Agents are LLM-based systems that operate in a partially observed environment: they interact with it, reason, plan, act, and adapt as the environment changes. They use tools in iterative cycles to gather information, because the information required to complete the task is hidden and must be obtained through online querying or tool use. If your environment is fully observed and deterministic, your tasks are repetitive, and you don't need to adapt to unplanned changes, a simple LLM pipeline is enough.
It is worth distinguishing an agentic system from an agentic task. An agentic system is an AI system capable of autonomously completing a user-defined task, taking multiple steps to reason and interact with an environment. An agentic task, on the other hand, has no clearly defined success path, takes place in a changing environment, and must be completed under partial observability. We can, if we want, use an agentic system for non-agentic tasks. The reverse does not hold: agentic tasks require agentic systems and cannot be completed with one-shot reasoning or a user-defined pipeline. The growing popularity and spectacular performance of AI agents make it tempting to apply them to every automation. However, an AI agent is not a silver bullet; sometimes it brings no clear benefit or can even degrade system performance.
Task No. 1: In the context, you are given a document containing a list of documents provided by the customer. Check in our database whether all documents were correctly posted in our system. For that purpose, use the sql_tool. Return the response in the following structure:
{
"All documents present": <yes/no>,
"List of missing documents": [],
}
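A response in this shape can be validated programmatically before it is passed downstream. A minimal sketch, with field names taken from the structure above (the helper name and the consistency rule are assumptions for illustration):

```python
import json

def parse_doc_check_response(raw: str) -> dict:
    """Parse and validate the agent's JSON answer for Task No. 1."""
    data = json.loads(raw)
    if data.get("All documents present") not in ("yes", "no"):
        raise ValueError("'All documents present' must be 'yes' or 'no'")
    missing = data.get("List of missing documents")
    if not isinstance(missing, list):
        raise ValueError("'List of missing documents' must be a list")
    # Consistency check: 'yes' implies an empty missing-documents list.
    if data["All documents present"] == "yes" and missing:
        raise ValueError("claims all documents present but lists missing ones")
    return data

print(parse_doc_check_response(
    '{"All documents present": "no", "List of missing documents": ["invoice_12"]}'
))
```

Validating the structured output at the workflow boundary keeps LLM formatting mistakes from propagating into downstream systems.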
Task No. 2: You are a helpful customer service assistant. Please analyse the problem the customer is facing and propose a solution. You can use the following tools for troubleshooting:
customer_history: stores customer-related logs
customer_profile: stores the customer's purchase history, scope of service, and warranty periods
manual: manuals for our products
ask_customer: query for information by asking the customer
knowledge_base: search our internal knowledge base for similar use cases or documentation
In case of trouble, or if you are unable to help the customer, you can pass the call to our specialised human assistance teams by invoking the human tool.
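A tool set like the one above is typically exposed to the model as JSON-schema function declarations; the exact wire format depends on the LLM API. The sketch below is a hypothetical registry using the tool names from the prompt (the schema layout and `lookup_tool` helper are assumptions):

```python
# Hypothetical tool declarations for Task No. 2; names mirror the prompt above.
TOOLS = [
    {"name": "customer_history",
     "description": "Fetch customer-related logs",
     "parameters": {"type": "object",
                    "properties": {"customer_id": {"type": "string"}},
                    "required": ["customer_id"]}},
    {"name": "ask_customer",
     "description": "Ask the customer a clarifying question",
     "parameters": {"type": "object",
                    "properties": {"question": {"type": "string"}},
                    "required": ["question"]}},
    # ... customer_profile, manual, knowledge_base, and the human
    # hand-off tool would be declared the same way.
]

def lookup_tool(name: str) -> dict:
    """Return the declaration for a tool the model asked to call."""
    return next(t for t in TOOLS if t["name"] == name)

print(lookup_tool("ask_customer")["description"])
```

The registry is what lets the runtime dispatch a model-requested tool call to actual code, and it doubles as the tool description injected into the prompt.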
Both tasks can be agentic tasks. However, Task No. 1 can easily be implemented as a workflow, so it is debatable whether an agent implementation is desirable here. The workflow implementation has a few advantages: it is deterministic, explainable, and optimisable, and it leaves the agent no freedom to make mistakes. Approaching Task No. 1 with a workflow can be done in three steps:
Based on the given document, extract the list of customer documents in the following format: {"missing_docs": [<doc1>, <doc2>, <doc3>, …]}
Call the sql_tool programmatically.
Parse the SQL response to the requested JSON output.
If the input document format is standardised, it can also be implemented fully programmatically, without LLM calls at all.
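The three workflow steps can be wired together in plain code, with the model confined to a single constrained extraction call. A minimal sketch, where `llm_extract_docs` and `sql_tool` are hypothetical stubs standing in for the real LLM call and database query:

```python
import json

def llm_extract_docs(document_text: str) -> list[str]:
    """Step 1: one constrained LLM call (stubbed here) that extracts
    the documents named by the customer."""
    return ["invoice_12", "contract_7"]  # pretend extraction result

def sql_tool(doc_ids: list[str]) -> set[str]:
    """Step 2: deterministic DB lookup (stubbed) returning the ids
    actually posted in our system."""
    posted = {"invoice_12"}  # pretend query result
    return {d for d in doc_ids if d in posted}

def check_documents(document_text: str) -> str:
    """Step 3: parse the SQL result into the requested JSON output."""
    expected = llm_extract_docs(document_text)
    found = sql_tool(expected)
    missing = [d for d in expected if d not in found]
    return json.dumps({
        "All documents present": "no" if missing else "yes",
        "List of missing documents": missing,
    })

print(check_documents("..."))
```

Because the control flow lives in code rather than in the model's reasoning, every run takes the same path, which is exactly the determinism argument made above.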
Task No. 2, on the other hand, has no clear solution path. We don't know the problem beforehand. The agent might ask the user to take certain actions to troubleshoot the problem, which changes the environment the agent operates in. The agent needs to iteratively query the user and the tools to analyse the situation, plan actions, and find the solution. The environment is not fully observed at the beginning; it is the agent's actions that reveal the depth of the problem, debug it, and make it possible to formulate a solution.
It’s worth mentioning that even though certain tasks are agentic today, they could be solved by a single LLM in the future. It is already the case for the GSM8K Math questions dataset. One year ago, LLMs needed a calculator tool to return the correct answer. Today, many LLMs can solve the benchmark problems without a calculator. The adjective “agentic” is relative to the current LLM capabilities.
Direct LLM prompting tasks reveal LLMs' linguistic knowledge, whereas agents' intelligence is measured based on exploration, adaptation, and coordination:
sustained multi-step interactions with an external environment
iterative information gathering under partial observability
adaptive strategy refinement based on environmental feedback.
Web browsing, financial trading, software engineering, and interactive planning are tasks that usually require agents, assuming a certain system complexity. For example, generating code in isolation can be completed in a single LLM call and is not agentic. It is the coordination of pursuing task goals, navigating repositories, testing, and refining the code that makes the task agentic.
Once we have identified an agentic task, the next step is to choose the agentic system design. There are two main choices: a Single-Agent System (SAS) and a Multi-Agent System (MAS).
Single-agent system
It is a single LLM agent with a unified memory stream, where it stores all reasoning and planning steps and tool-usage results. We can distinguish four main design patterns:
ReAct: the reason-and-act pattern. The agent reasons, uses tools, and reasons again in a loop until it is satisfied with the result.
Self-reflection: an extension of the ReAct pattern that adds a step of self-assessment and critique of the reasoning. The agent is asked to reformulate the reasoning step based on the self-reflection feedback.
Planning: a modification of the ReAct pattern in which the agent is asked to split the task into smaller subtasks and perform them, usually sequentially. The planning pattern can also be implemented on top of the self-reflection pattern; in that case, the agent self-reflects on the plan and reformulates it if necessary:
Decompose the task into smaller subtasks, creating an execution plan
Execute the subtasks sequentially, including tool calls
Formulate the answer, or re-plan if necessary
LATS (Language Agent Tree Search): the tree-search agent. The agent investigates multiple solutions by building a solution tree with multiple solution trajectories (branches). It progresses each branch iteratively, reflects on the results, and decides which paths to investigate further and which to cut off. If a trajectory fails, a reflection is generated and used as additional context for future trials. The agent needs to balance exploration with exploitation. Branches can split at any depth, creating a multi-solution space. LATS is the most resource-consuming pattern, as it investigates multiple approaches to solving the problem.
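The ReAct loop at the core of these single-agent patterns fits in a few lines. A minimal sketch, where `call_llm`, the tool registry, and the step dict shapes (`thought`/`action`/`input` vs. `final_answer`) are all assumptions for illustration:

```python
def react_loop(task, call_llm, tools, max_steps=10):
    """Minimal ReAct skeleton: reason -> act -> observe, until the model
    emits a final answer or the step budget runs out."""
    transcript = [f"Task: {task}"]  # the single agent's unified memory stream
    for _ in range(max_steps):
        step = call_llm("\n".join(transcript))
        if "final_answer" in step:
            return step["final_answer"]
        # Act on the environment with the requested tool, then record
        # thought, action, and observation back into the memory stream.
        observation = tools[step["action"]](step["input"])
        transcript.append(f"Thought: {step['thought']}")
        transcript.append(f"Action: {step['action']}({step['input']})")
        transcript.append(f"Observation: {observation}")
    return None  # budget exhausted without an answer

# Toy run: an LLM stub that asks for one lookup, then answers.
def fake_llm(prompt):
    if "Observation" not in prompt:
        return {"thought": "need data", "action": "lookup", "input": "x"}
    return {"final_answer": "42"}

print(react_loop("find x", fake_llm, {"lookup": lambda q: "x=42"}))
```

Self-reflection and planning extend this same loop: reflection inserts a critique step before the next reasoning turn, and planning replaces the single task with an ordered list of subtasks fed through the loop one by one.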
In contrast, multi-agent systems fragment the context each agent possesses and add communication and coordination overhead between agents. On the benefits side, a multi-agent system is like a team in which each member has unique skills and contributes to the final goal; the knowledge and context-switching capacity of a single person would not be enough to solve the complex task. Add agent parallelisation on top of this, and complex tasks can be delivered faster.
But is the unique value proposition of multi-agent collaboration really worth it? LLMs are gaining extended context windows, sophisticated tool use, and improved self-reflection, which might be enough to solve complex tasks with a single agent and a long context window. The Towards a Science of Scaling Agent Systems study elaborates on this in depth. The authors reveal that:
"... multi-agent performance is governed by quantifiable trade-offs: a tool-coordination trade-off where tool-heavy tasks suffer from coordination overhead, capability saturation where coordination yields diminishing returns beyond ~45% single-agent baselines, and architecture-dependent error amplification ranging from 4.4× (centralized) to 17.2× (independent). Performance gains vary dramatically by task structure, from +80.9% on Finance Agent to −70.0% on PlanCraft, demonstrating that coordination benefits depend on task decomposability rather than team size."
Long story short: it depends on task characteristics and architectural choices. MAS often brings no benefit over SAS, and in practice it can even yield worse results. In other cases, MAS brings a significant performance boost. The outcome depends mainly on the task's decomposability and on the coordination overhead the system must handle. On non-agentic benchmarks, MAS can bring benefits through the ensemble effect, where each agent's response votes for the final output.
In MAS, we can distinguish four main design patterns:
Independent: multiple agents work on the same task independently. The final response is aggregated by majority voting.
Decentralised: the same as independent, but agents can exchange information with each other directly.
Centralised: a master agent coordinates the information exchange and formulates the result from the partial results provided by the sub-agents. Agents do not communicate with each other directly.
Hybrid: like the centralised approach, but agents may also exchange information with each other directly.
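The simplest of the four, the independent pattern with majority voting, can be sketched as follows (the `agents` are hypothetical callables that each return a candidate answer):

```python
from collections import Counter

def independent_mas(task, agents):
    """Independent multi-agent pattern: every agent solves the task in
    isolation; the final response is picked by majority vote."""
    answers = [agent(task) for agent in agents]          # no communication
    winner, votes = Counter(answers).most_common(1)[0]   # ensemble effect
    return winner, votes

# Toy agents: two agree, one dissents -> the majority answer wins.
agents = [lambda t: "A", lambda t: "A", lambda t: "B"]
print(independent_mas("classify", agents))  # ('A', 2)
```

The other three patterns differ only in the communication topology: decentralised adds peer-to-peer message exchange, centralised routes everything through a master agent, and hybrid combines both.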
Agents interact with the environment to complete the task. The environment is not fully observed at the beginning; the agent must actively query it and use tools to find the solution. The environment can change over time, and the agent needs to adapt. Task completion requires multiple steps and cannot be done in a single LLM reasoning step.
Agents can be applied to non-agentic tasks; agentic tasks, however, require an agentic system. What counts as agentic is relative to current LLM capabilities: a task that requires an agent today may be solvable with a single LLM call in the future.
Multi-agent systems are not always better than single-agent systems. The design choice should be based on the task's decomposability and the expected communication overhead between agents; ultimately, the decision should be validated experimentally. MAS can benefit non-agentic tasks through the ensemble effect: aggregating multiple agents' responses with a voting strategy corrects the errors of a single LLM.
Do you wonder whether your automation is suitable for agents? Reach out to us, and we will discuss the task complexity, gather requirements, and propose a solution. If you like it, we can take a step further and implement it for you.