Introduction
A single LLM call is a feature. A system of cooperating agents — planners, retrievers, tool-callers, reviewers — is an architecture. The difference between an impressive demo and a dependable platform comes down to six unglamorous building blocks that have governed distributed systems for decades, now recast for agentic AI.
1. Routing
In a multi-agent system, routing decides which agent — and which model — handles each task. A planner may decompose a request into sub-tasks, but something must dispatch those sub-tasks intelligently: classification traffic to a small fast model, deep reasoning to a frontier model, domain questions to the agent holding the right tools and context.
Good routing is also an economic control: it caps latency and spend by ensuring expensive capacity is reserved for the work that genuinely needs it, with fallbacks when a model or agent is degraded.
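The dispatch rule described above can be sketched as a small lookup table with fallbacks. This is a minimal illustration, not a reference design: the task kinds, the model names ("small-fast", "frontier", "mid-tier") and the `pick_model` helper are all hypothetical, and the set of degraded models is assumed to come from some health check that is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str     # preferred model for this kind of task
    fallback: str  # used when the preferred model is degraded


# Hypothetical routing table: cheap capacity for cheap work,
# expensive capacity only where it is needed.
ROUTES = {
    "classification": Route(model="small-fast", fallback="mid-tier"),
    "reasoning": Route(model="frontier", fallback="mid-tier"),
}
DEFAULT_ROUTE = Route(model="mid-tier", fallback="small-fast")


def pick_model(task_kind: str, degraded: set[str]) -> str:
    """Return the model for a task, skipping models reported as degraded."""
    route = ROUTES.get(task_kind, DEFAULT_ROUTE)
    for candidate in (route.model, route.fallback):
        if candidate not in degraded:
            return candidate
    raise RuntimeError(f"no healthy model available for {task_kind!r}")
```

A real router would likely also weigh cost budgets and latency targets; the table form simply keeps those decisions explicit and reviewable.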
2. Security
Agents act. They call APIs, write to databases, and send messages, which makes an agent platform an attack surface, not just an inference endpoint. Prompt injection, tool-output poisoning, and data exfiltration through model responses are to agent platforms what the OWASP Top 10 is to web applications.
The discipline that anchors all of it is the Principle of Least Privilege: every agent gets the minimum access necessary for its assigned task and nothing more. A retrieval agent gets read-only access to one index; a booking agent can create reservations but never refund them. Blast radius is designed, not discovered.
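One way to make the two examples above concrete is an explicit allow-list per agent, where anything not listed is denied. The sketch below is illustrative only: the `AgentPolicy` type, the resource names and the action names are assumptions, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    name: str
    allowed: frozenset  # set of (resource, action) pairs; everything else is denied


# Hypothetical policies mirroring the examples in the text.
RETRIEVAL_AGENT = AgentPolicy("retrieval", frozenset({("index:docs", "read")}))
BOOKING_AGENT = AgentPolicy("booking", frozenset({("reservations", "create")}))


def is_allowed(policy: AgentPolicy, resource: str, action: str) -> bool:
    """Deny by default: only explicitly granted pairs pass."""
    return (resource, action) in policy.allowed
```

Because the grants are data rather than scattered conditionals, the blast radius of each agent can be read, reviewed and tested directly.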
3. Authentication
Every actor in the system — human, agent, or service — must prove who it is. Multi-agent platforms extend workforce identity standards (OpenID Connect, OAuth2, SAML) with workload identity: short-lived, automatically rotated credentials for each agent instance rather than shared, long-lived API keys.
Just as important is identity propagation: when an agent acts on behalf of a user, the user's identity travels with the request chain, so downstream systems can distinguish 'the agent' from 'the person the agent is serving.'
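Identity propagation can be pictured as a small context object that every hop copies forward, keeping the user and the chain of acting agents separate. The `CallContext` type and its field names below are hypothetical; the shape loosely echoes the actor (`act`) claim idea from OAuth 2.0 Token Exchange (RFC 8693), but this is a plain in-process sketch, not a token format.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CallContext:
    user_id: Optional[str]             # the person being served, if any
    actor_chain: Tuple[str, ...] = ()  # agents that have handled the request, in order


def delegate(ctx: CallContext, agent_id: str) -> CallContext:
    """Hand the request to another agent without losing the original user."""
    return CallContext(ctx.user_id, ctx.actor_chain + (agent_id,))


def audit_record(ctx: CallContext, action: str) -> dict:
    """What a downstream system could log: who acted, and for whom."""
    return {
        "action": action,
        "on_behalf_of": ctx.user_id,
        "performed_by": ctx.actor_chain[-1] if ctx.actor_chain else None,
        "via": list(ctx.actor_chain),
    }
```

For example, `delegate(delegate(CallContext("alice"), "planner"), "booking-agent")` yields a context in which the booking agent is the performer and "alice" remains the party on whose behalf it acts.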
4. Authorization
Authentication says who you are; authorization says what you may do. For agents, entitlements must be evaluated per tool call, not per session: scoped tokens, policy engines, and human-in-the-loop approval gates for high-impact actions such as payments, deletions, or external communications.
Combined with least privilege, this turns a compromised or confused agent from a catastrophe into a contained incident — it simply lacks the permissions to do real damage.
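A per-call check with an approval gate might look like the following sketch. The tool names, the `HIGH_IMPACT` set and the `authorize` function are assumptions made for illustration; a production system would more likely delegate this decision to a policy engine.

```python
from __future__ import annotations

from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


# Hypothetical tool names for actions that require a human sign-off.
HIGH_IMPACT = frozenset({"payments.send", "records.delete", "email.send_external"})


def authorize(scopes: set[str], tool: str, approved_by: Optional[str] = None) -> Decision:
    """Evaluate one tool call; called every time, not once per session."""
    if tool not in scopes:
        return Decision.DENY            # the token was never scoped for this tool
    if tool in HIGH_IMPACT and approved_by is None:
        return Decision.NEEDS_APPROVAL  # pause until a human approves
    return Decision.ALLOW
```

Note that approval never widens scope: a call outside the token's scopes is denied even if someone has approved it.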
5. Working Offline
Not every environment has a cloud connection — and not every dataset is allowed to reach one. Edge deployments, air-gapped facilities, and data-residency regimes all demand agents that keep working disconnected: local models served by runtimes like Ollama, on-device vector stores, and durable task queues that reconcile when connectivity returns.
Offline capability is also a resilience posture for connected systems: when an upstream model API degrades, a local fallback keeps critical workflows alive.
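The "durable task queue that reconciles when connectivity returns" can be approximated with a local SQLite outbox, using only the Python standard library. The table name, the helper names and the convention that the `send` callable raises `ConnectionError` while offline are assumptions of this sketch.

```python
import json
import sqlite3


def open_outbox(path: str) -> sqlite3.Connection:
    """Open (or create) a local queue that survives process restarts."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, task: dict) -> None:
    """Record work locally while disconnected."""
    conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(task),))
    conn.commit()


def reconcile(conn: sqlite3.Connection, send) -> int:
    """Replay queued tasks in order; stop at the first connection failure."""
    rows = conn.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
    sent = 0
    for row_id, payload in rows:
        try:
            send(json.loads(payload))
        except ConnectionError:
            break  # still offline: leave this and later tasks queued
        conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
        conn.commit()
        sent += 1
    return sent
```

A task is deleted only after `send` returns, so a crash between sending and deleting can replay it. Downstream handlers should therefore be idempotent.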
6. Service Orchestration
Finally, something has to conduct the orchestra. Orchestration coordinates multi-step, multi-agent workflows: sequencing sub-tasks, running independent branches in parallel, retrying transient failures, timing out stuck agents, and persisting state so a long-running job survives restarts.
Event-driven and saga patterns from microservices translate directly — with one addition: agent outputs are probabilistic, so orchestration must include evaluation gates and checkpoints where results are validated (by rules, by another agent, or by a human) before the workflow proceeds.
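The combination of retries, checkpoints and evaluation gates can be sketched in a few lines. Everything here is hypothetical naming (`run_step`, `run_workflow`, `GateRejected`); the sketch treats `TimeoutError` as the only transient failure, keeps state in a plain dict that a real system would persist, and runs steps sequentially, without the parallel branches mentioned above.

```python
import time


class GateRejected(Exception):
    """Raised when a result fails its evaluation gate."""


def run_step(step, validate, retries=2, backoff_s=0.0):
    """Retry transient failures, then validate before accepting the result."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            result = step()
        except TimeoutError as exc:  # assumed to be the transient failure type
            last_error = exc
            if attempt < retries:
                time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
            continue
        if not validate(result):
            raise GateRejected(f"result rejected by gate: {result!r}")
        return result
    raise RuntimeError("step failed after retries") from last_error


def run_workflow(steps, state=None):
    """Run (name, fn, validate) steps in order, skipping checkpointed ones."""
    state = dict(state or {})
    for name, fn, validate in steps:
        if name in state:
            continue  # checkpoint: already completed before a restart
        state[name] = run_step(lambda: fn(state), validate)
    return state
```

The `validate` callable is where a rule, a reviewing agent or a human approval would plug in. A rejected result stops the workflow instead of flowing into later steps.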
Putting It Together
These six blocks are not a menu; production systems need all of them. At RitS, we design agentic platforms the same way we design any distributed system: identity first, least privilege everywhere, explicit orchestration, and graceful degradation. The intelligence layer can be ambitious precisely because the engineering layer is disciplined.
