← back to problems
problemin-progress

Catching the 0.1% AI error before it compounds

The hard-to-detect drifts — directionally correct but wrong answer. What does the eval, observability, and human-review system look like for workflows where mistakes are hard to observe but costly?

The hard-to-detect drifts, directionally correct but wrong answer. What does the eval, observability, and human-review system look like for workflows where mistakes are hard to observe but costly?.

For example, I provide a website copy for this project, AI agents decided to follow the website copy but change the word slightly to fit the design pattern. Some are the right edits, but a few places it made up things I didn't include in the original copy or altered the tone slightly even after I gave specific instructions to use the website copy. Let's call it 2% drift - small but meanfuly enough to change the authenticity of the site.