Your AI agent had 94,000 conversations last month. Your team read 212. That gap isn't a reading problem — it's a category problem. None of the tools in your stack are built to read free text.
The conversations are logged. The meaning is not.
Every AI agent writes two kinds of output. The structured part — tool calls, retries, latency, status codes — lands cleanly in your observability stack. The unstructured part — the conversation itself, the thing the user actually typed — lands in a trace store and stops there. Your analytics warehouse never sees it.
This isn't a bug. Analytics tools are built to count events. A conversation isn't an event. It's a paragraph, with intent, hedging, and repair, and no database schema can describe it.
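To make the category gap concrete, here is a toy sketch (the field names are invented for illustration): a click-style event is a flat record you can aggregate with one line, while a conversation turn's payload is a paragraph, so the same aggregation is meaningless.

```python
from collections import Counter

# A click-style analytics event: flat fields, trivially countable.
events = [
    {"name": "checkout_click", "user": "u1"},
    {"name": "checkout_click", "user": "u2"},
    {"name": "signup", "user": "u3"},
]
print(Counter(e["name"] for e in events))
# → Counter({'checkout_click': 2, 'signup': 1})

# An agent conversation turn: the payload is free text, not an enum.
turns = [
    {"role": "user", "text": "I tried cancelling twice, maybe I'm doing it wrong?"},
    {"role": "user", "text": "Can you cancel my plan? The billing page errors out."},
]
# Counting distinct 'text' values tells you nothing: every paragraph is
# unique, even when two users are asking for the same thing.
print(len({t["text"] for t in turns}))  # → 2
```

Both users above want to cancel, but no group-by ever surfaces that: the shared intent lives in the wording, not in any field a warehouse can key on.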
What each tool actually measures
- Analytics tools count clicks, sessions, conversions. An AI agent has none of these.
- Observability tools check whether the system stayed up and responded within the SLA. They can't read what it said.
- Evals score model output against a fixed test set. They don't know what your actual users are asking.
None of these tools is wrong. Each was built for a category of data that existed before AI agents. None of them was built for the thing AI agents produce most: plain-language conversations, at scale.
What's actually lost
A product team with a traditional analytics stack can tell you what users clicked. A product team with an AI agent often can't tell you what users asked for. The difference is everything — one is about the product's surface, the other is about what the product was used for.
“The output of an AI agent is a conversation. The output of our analytics stack is a number. There is no tool that turns one into the other.”
What to do about it
Locus reads every conversation in your trace store and turns it into a readable shape: behavioural groups, named use cases, and the handful of patterns that are actually changing week to week. Read a live sample, or book a call and we'll build a memo from your own traces.
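Locus's actual pipeline isn't described here, but as a toy illustration of what "turning conversations into named use cases" means, here is a deliberately naive keyword bucketer. The labels, keywords, and sample texts are all invented; a real system would use embeddings and clustering rather than string matching.

```python
from collections import defaultdict

# Hypothetical use-case labels and trigger keywords (illustration only).
USE_CASES = {
    "billing": ("refund", "invoice", "charge", "cancel"),
    "onboarding": ("sign up", "get started", "install"),
}

def bucket(conversations):
    """Assign each free-text conversation to the first matching use case."""
    groups = defaultdict(list)
    for text in conversations:
        lowered = text.lower()
        label = next(
            (name for name, kws in USE_CASES.items()
             if any(kw in lowered for kw in kws)),
            "unreviewed",  # nothing matched: flag for a human to read
        )
        groups[label].append(text)
    return groups

sample = [
    "How do I get a refund for last month's charge?",
    "Where do I sign up for the team plan?",
    "The agent keeps repeating itself.",
]
groups = bucket(sample)
print({name: len(texts) for name, texts in groups.items()})
# → {'billing': 1, 'onboarding': 1, 'unreviewed': 1}
```

Even this crude version shows the shape of the output: a small number of named groups with counts, instead of 94,000 transcripts nobody reads.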