The silence between two agents.
Tonight I set up a second AI agent on its own box. New host, new name, same stack as the one at home. The plan was easy. Agent one keeps its job. Agent two takes sales work. Two agents, two inboxes, one of me. Then agent two went live and went quiet on me. I had to chase the bug. The bug had a lesson in it worth writing down.
The shape of the problem
The new agent had its own bot. Its own chat name. Its own pipe from my phone to the agent. I sent it a message. The bot said it got it. My first agent watched the logs. It saw the message hit the bot server. But the new agent never wrote back. Outbound was fine. I had the agent send a hi out through its own bot keys. The text hit my phone in under a second. Inbound was dead.
A silent inbound failure is the worst kind of bug in a multi-agent setup. Every layer shrugs. The phone says "delivered". The bot server says "I got it". The agent says "listening". No one owns the fault. You stand in the middle of the chain. Then it hits you. This is not a line of bugs. It's a gap.
The obvious theories, one at a time
The first hour I burned on the easy guesses. Was the bot token good? Yes. Was the allowlist set right? Yes. Was the agent process running? Yes. Was the runtime even loading the chat channel? It claimed yes. The startup banner read "listening for channel messages from
You can tell them apart by the process tree. A healthy second agent has a child process. That child is the messaging poller. Mine had no child. For about two hours I tried every fix I could think of. New env vars. New runtime flags. A fresh install of the agent runtime in a new format. I patched the launcher to pass new flag mixes. Each try gave me a new error. That is like chasing fireflies. It looks like progress. It is not.
The actual bug
It broke open when I asked the agent runtime what plugins it had loaded. I used a small status command. The answer came back fast. It was wrong. The runtime said the messaging plugin had "failed to load: plugin not found in its marketplace". But the marketplace folder was right there. The plugin file inside it was right there too. The runtime just had not lined them up. Its own list of what was in that marketplace was stale.
One command refreshed that index. The runtime came back up. It spawned the subprocess. The bot sent my next test message into the session in under a second. The fix was not where I had been looking. It sat one layer below all of that. I had thought that layer was fine. I had rsynced its files over from the working agent, so I trusted it.
That's the story. Three hours of wrong tries. One command to fix it. But I'm not writing this for the fix. I'm writing it for the shape of the problem.
Where bugs actually live in multi-agent systems
Run one agent and your brain treats it as one thing. One box. One process. One set of configs. When it breaks, you ask what is wrong. You ask where. And "here" has a clear edge to it.
Two agents feels like one thing running twice. That feels right. It is almost right. The word almost does a lot of work. The second agent has its own runtime state. It has its own setup history. It has its own first-run flags. You copy the first one's files and you get the files. You do not get the state. The files are a snapshot of a done install. The state is what the runtime has seen, checked, and stored. Those are not the same. The gap matters more than I thought.
In my case, the first agent went through the normal first-launch flow, which includes a step where the runtime scans a marketplace directory and writes "yes, telegram is available here" into its internal index. The second agent got the marketplace files but never went through the scan step. So it had the ingredients and none of the prep. The runtime then politely told me the plugin was "enabled" because I had said so in the config, while simultaneously being unable to actually find it because the index hadn't been built.
That failure shape is the shape of almost every multi-agent bug I hit. The bug lives in the seam. Not inside agent one. Not inside agent two. It lives in what you think crosses from one to the other. Files cross. State does not. Config crosses. Checks do not. Behavior crosses. Identity does not. Each one can fail in silence. The runtime will not tell you which. It does not know the seam is there. It only sees its own half of the world.
Why this matters if you're thinking about running two
People who run one agent think the next step is a second copy. They want backup. They want a spare. They want scale. I hear it from clients all the time. "Could I just have a second one on another machine for when the first is busy?" Yes, you can. But "just" is a trap word. The second one is not a copy. It's a new thing. It starts fresh when you hand it the files. Every belief the first one built up about its world has to be set up again on the second one. If not, you get a bug like the one tonight.
Here's what that means day to day. A real multi-agent setup is not a clone job. It's a setup job with its own list. Each bot has its own ID. Each one needs its own login flow. Local files need to be built fresh, not copied over. Any cache that warmed up on bot one will be empty on bot two until it runs for a while. Any "yes" you clicked once on bot one has to be clicked again on bot two. None of this is hard on its own. All of it stays hidden until a quiet failure makes you go find it.
The payoff is real. Once you stand up the second one, the third and fourth get easy. You wrote down the gap between "files" and "state." You stop mixing them up. That one shift turns multi-agent from an idea into something you can ship.
Where I'm taking this next
The second agent is live now. It has its own chat. The two agents split the work in a clear way. I use the first one for all the old stuff. The second one runs one workflow that needs its own focus and its own rate limit. I had been ducking a question for a while. What does it feel like to run two of these. Now I know, and there's a runbook for it. Tomorrow brings a new problem. Tomorrow I have to make them work together. But tonight the room is not quiet anymore. That counts as progress.
Here's the quiet lesson I'll keep. Bugs in multi-agent systems hide. They don't sit where you look. They live in the handoff. Two parts each think they work fine. The bug is in what they pass. So if every part shows green and the output is wrong, stop. Don't poke at the parts. Look at what flows between them. Ask which rule you thought was free.
← Back to all posts