Unlike ordinary software bugs, AI hallucinations recur on every inference – making real-time detection a permanent infrastructure layer, not a stopgap.
ENTRY ANGLES
Specialized evaluation and control platforms for specific industries · Vertical-specific AI error detection and hallucination catching · Domain-focused AI governance solutions beyond generic monitoring
CAPABILITIES
AI evaluation and error detection technology, Domain-specific context understanding for target vertical, AI governance and control platform development
More companies every month are deploying AI agents to handle autonomous tasks – processing e-commerce returns, initiating refunds, sending outreach emails, and similar work.
The problem is that AI hallucinates. And unlike a conventional software bug that you can fix once and forget, hallucinations in AI agents can't be tested away. An AI agent doesn't follow a static script – it reasons fresh each time it acts, even when the prompt stays the same.
Anyone who has asked ChatGPT the same question twice has seen this: the answer can differ from one attempt to the next. Ask again a month later and it may have shifted even more, because the underlying model has been updated or the earlier conversation context is no longer available.
For business processes, a 3–10% hallucination rate – which is typical even for current frontier models, depending on topic complexity – translates directly into refunds issued for items never returned, emails sent to high-value prospects describing features that don't exist, and other errors that cost money and relationships. And crucially, this rate can never be driven all the way to zero.
Salus built a platform to catch those errors in real time – blocking bad AI outputs before they execute as real-world actions.
Integrating Salus requires a few lines of code added to your agent stack. From there, the platform intercepts agent outputs and decides in real time whether to allow or block each action.
Evaluation happens in two ways:
- Rule-based constraints defined by the platform administrator – expressed in config files or plain language.
- Historical deviation checks: Salus logs every agent action over time and flags outputs that diverge significantly from established patterns. A sudden outlier might signal an error.
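The two evaluation modes above can be illustrated with a minimal Python sketch. This is not Salus's API: `Action`, `max_refund`, `deviates`, and `evaluate` are hypothetical names, and the z-score test is only one assumed way to measure "diverges significantly from established patterns".

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    kind: str      # hypothetical action type, e.g. "refund"
    amount: float  # value the agent wants to act on


# A rule returns a rejection reason, or None if the action passes.
Rule = Callable[[Action], Optional[str]]


def max_refund(limit: float) -> Rule:
    """Rule-based constraint: refunds above `limit` are rejected."""
    def check(action: Action) -> Optional[str]:
        if action.kind == "refund" and action.amount > limit:
            return f"refund {action.amount:.2f} exceeds limit {limit:.2f}"
        return None
    return check


def deviates(history: List[float], value: float, z_max: float = 3.0) -> bool:
    """Historical check: flag values more than z_max standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough history to judge
    sigma = stdev(history)
    if sigma == 0:
        return value != history[0]
    return abs(value - mean(history)) / sigma > z_max


def evaluate(action: Action, rules: List[Rule], history: List[float]) -> Tuple[bool, str]:
    """Return (allowed, reason). Explicit rules run first, then the deviation check."""
    for rule in rules:
        reason = rule(action)
        if reason is not None:
            return False, reason
    if deviates(history, action.amount):
        return False, "amount is an outlier relative to past actions"
    return True, "ok"


rules = [max_refund(500.0)]
past_refunds = [20.0, 25.0, 22.0, 19.0, 24.0]
for amount in (23.0, 90.0, 900.0):
    print(amount, evaluate(Action("refund", amount), rules, past_refunds))
```

In this sketch a 90.00 refund passes the explicit rule but is still blocked, because it sits far outside the range of earlier refunds. That is the kind of case a fixed rule alone would miss.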
When an error is caught, Salus can instruct the agent to retry – passing back an augmented prompt that explains why the previous output was rejected.
In 58% of retry cases, that correction is enough: the agent comes back with a valid response and the original task completes successfully.
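The retry step can be sketched as a small loop. Everything here is an assumption for illustration: `run_with_retry`, the wording of the augmented prompt, and the toy agent and validator are hypothetical, not taken from Salus.

```python
from typing import Callable, Optional, Tuple

Agent = Callable[[str], str]                # prompt -> proposed output
Validator = Callable[[str], Optional[str]]  # output -> rejection reason, or None if valid


def run_with_retry(agent: Agent, validate: Validator, prompt: str,
                   max_retries: int = 1) -> Tuple[Optional[str], int]:
    """Return (accepted output or None, number of retries used)."""
    current = prompt
    for attempt in range(max_retries + 1):
        output = agent(current)
        reason = validate(output)
        if reason is None:
            return output, attempt
        # Augment the original prompt with the reason for rejection.
        current = (
            f"{prompt}\n\n"
            f"Your previous answer was rejected: {reason}\n"
            f"Previous answer: {output}\n"
            "Please produce a corrected answer."
        )
    return None, max_retries


# Toy stand-ins for a real agent and a real rule set.
def fake_agent(prompt: str) -> str:
    return "refund 20" if "rejected" in prompt else "refund 900"


def refund_limit(output: str) -> Optional[str]:
    amount = float(output.split()[1])
    return None if amount <= 500 else "refund exceeds 500 limit"


print(run_with_retry(fake_agent, refund_limit, "Process the return for order A"))
```

When no retry produces a valid output, the sketch returns `None`, which a caller could treat as "block the action and escalate".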
Administrators get a full error dashboard, where they can tighten or loosen rules, identify recurring issues, and queue improvement tasks. Newly updated agents can be run against the full historical error log to verify they perform better before going back into production.
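Replaying an updated agent against the historical error log amounts to a regression check. A minimal sketch, assuming the log can be reduced to a list of prompts that previously produced blocked outputs; `replay_error_log` and the toy agents are hypothetical names:

```python
from typing import Callable, Iterable, Optional

Agent = Callable[[str], str]
Validator = Callable[[str], Optional[str]]  # returns a rejection reason, or None if valid


def replay_error_log(agent: Agent, validate: Validator,
                     logged_prompts: Iterable[str]) -> float:
    """Fraction of previously failing prompts that the agent now handles correctly."""
    prompts = list(logged_prompts)
    if not prompts:
        return 1.0  # nothing to regress against
    fixed = sum(1 for p in prompts if validate(agent(p)) is None)
    return fixed / len(prompts)


def refund_limit(output: str) -> Optional[str]:
    amount = float(output.split()[1])
    return None if amount <= 500 else "refund exceeds 500 limit"


def old_agent(prompt: str) -> str:
    return "refund 900"


def new_agent(prompt: str) -> str:
    # The updated agent still mishandles one category of request.
    return "refund 900" if "bulk" in prompt else "refund 20"


error_log = ["return A", "return B", "bulk return C", "return D"]
print("old:", replay_error_log(old_agent, refund_limit, error_log))
print("new:", replay_error_log(new_agent, refund_limit, error_log))
```

A deployment gate could then require the new agent's score to exceed the old one's before it returns to production.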
Salus entered Y Combinator in January and published its launch announcement on the YC site a few days ago.
Hallucination is not a bug in the traditional sense – it's a structural property of how large language models work. Some experts argue that *all* AI output is a form of hallucination; the question is just how close to truth it lands.
After careful testing and refinement, deployed agents can still fail unpredictably, even on entirely familiar inputs. There is no patch that removes this.
What that means in practice: at the rates cited above, roughly 3–10% of autonomous AI actions in a live system can be expected to be wrong. Apply that rate to hundreds or thousands of daily agent actions, and the business exposure adds up quickly.
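To make the exposure concrete, here is a back-of-the-envelope calculation. The daily volumes are illustrative assumptions, and the "at least one error" figure additionally assumes each action fails independently, which real systems may not satisfy.

```python
def expected_bad_actions(actions_per_day: int, error_rate: float) -> float:
    """Expected number of wrong actions per day at a given error rate."""
    return actions_per_day * error_rate


def prob_at_least_one_error(actions: int, error_rate: float) -> float:
    """Chance that at least one of `actions` is wrong, assuming independent errors."""
    return 1 - (1 - error_rate) ** actions


for volume in (100, 1_000, 10_000):  # assumed daily volumes
    low = expected_bad_actions(volume, 0.03)
    high = expected_bad_actions(volume, 0.10)
    print(f"{volume:>6} actions/day: about {low:.0f} to {high:.0f} bad actions/day")

print(f"P(at least one error in 100 actions at 3%): {prob_at_least_one_error(100, 0.03):.3f}")
```

Even at the low end of the range, a system handling 1,000 actions a day should expect about 30 bad actions daily unless something intercepts them.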
The only viable response is real-time oversight.
In high-stakes, lower-volume workflows, humans can play that role. Crosby ([related review](/review/bolee-prostaja-model-dlja-sozdanija-perspektivnogo-ii-produkta)) raised $25.8M for an AI legal firm where AI does the heavy analytical lifting – but qualified attorneys review every output before it goes out the door.
For higher-volume, lower-stakes automation, human review doesn't scale. That's where AI watchdog systems come in: autonomous monitors that catch and block questionable actions before they execute, or route them for correction. That's exactly what Salus is building.
Salus operates in the AI evaluation and governance market – a space that hit $1.4B in 2024 and could reach $7.9B or more by 2033.
This market spans many categories beyond hallucination detection: regulatory compliance, data privacy enforcement, output consistency, and more. Unconstrained AI can cause damage across all of them.
The direction worth pursuing is building specialized evaluation and control platforms for a specific industry or use case. Deep vertical focus tends to outperform generic monitoring, because the right error detection logic depends heavily on the context of what the AI is actually trying to do.
All that remains is picking the domain where you want to build it.