The most useful thing I have learned about applied AI is also the least convenient: its abilities are jagged. It can do some tasks that look hard to us with ease, and stumble on others that look just as hard — or easier. Researchers studying knowledge work have called this the "jagged technological frontier," and it is the single idea I wish more business owners carried into their first AI project.
The evidence is striking in both directions. In a widely cited study of consultants run with Boston Consulting Group, professionals using a leading AI model on tasks suited to it completed about 12 percent more tasks, finished them about 25 percent faster, and produced results rated roughly 40 percent higher in quality than peers without the tool. On a task deliberately chosen to sit just outside the model's strengths, the picture flipped: AI users were about 19 percentage points less likely to reach the correct answer, because the tool was confidently helpful in the wrong direction.
That last point is the trap. A human who is unsure usually signals it — they hedge, they ask, they leave the draft rough. AI does not. The output looks just as clean whether the model was operating well inside its competence or well outside it. Fluency is not accuracy, and AI is extraordinarily fluent.
The hidden cost: rework
This is where a real, unbudgeted expense appears. Writers in Harvard Business Review have given it a memorable name, "workslop": AI-generated work that looks polished but lacks the substance to actually move a task forward. In their survey, a large share of professionals reported receiving it, and each instance cost meaningful time to untangle: roughly two hours, on average, to figure out what was wrong and redo it.
Think about what that does to a naive automation. A workflow that "drafts" something in seconds feels like a win. But if a colleague then spends two hours verifying and repairing it, you have not saved time — you have moved the cost downstream and made it harder to see. The demo looked great. The quarter did not improve.
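To make that arithmetic concrete, here is a minimal sketch in Python. Only the roughly two-hour rework figure comes from the survey mentioned above. The function name and the other inputs (how long the task takes by hand, how often a draft needs repair) are hypothetical placeholders you would replace with your own measurements.

```python
def net_minutes_saved(manual_minutes, ai_minutes, rework_rate, rework_minutes=120):
    """Expected minutes saved per task once downstream rework is counted.

    manual_minutes: time a person needs to do the task without AI (assumption)
    ai_minutes:     time the automated draft takes (assumption)
    rework_rate:    fraction of AI drafts a colleague has to repair (assumption)
    rework_minutes: cost of one repair; defaults to the ~2 hours cited above
    """
    expected_rework = rework_rate * rework_minutes
    return manual_minutes - (ai_minutes + expected_rework)
```

With a 45-minute manual task, a 1-minute AI draft and a 10 percent rework rate, this returns roughly 32 minutes saved per task. At a 50 percent rework rate it returns -16: the workflow now costs more time than it saves, even though the draft still appears in seconds.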
Why review belongs inside the workflow
The instinct is to fix this with a rule: "always have a human check the AI." In practice, a check that lives outside the system is the first thing to get skipped under deadline pressure. The more durable answer is to build the review step into the workflow itself, so it is structural rather than optional.
Concretely, that means designing automation that knows the difference between what it can ship and what a person needs to see. A few patterns we rely on:
- Scope to the frontier. Automate the tasks AI does dependably; keep judgment-heavy tasks human, with AI as support rather than author.
- Route by confidence. When the system is unsure, it should hand off to a person instead of producing a confident guess.
- Make a review queue the default. High-stakes output — anything client-facing or financial — passes through a person before it goes out, not after a complaint.
- Log what happened. Keep a record of what the workflow did and on what basis, so quality can be audited rather than assumed.
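The four patterns above can be sketched as a single routing step. This is an illustration under stated assumptions, not a description of any particular product: the `Draft` fields, the `AUTOMATABLE` task list, the `CONFIDENCE_FLOOR` threshold and the in-memory `audit_log` are all hypothetical names, and how a system estimates its own confidence is left open.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Draft:
    task_type: str      # e.g. "meeting_summary" or "client_email" (hypothetical labels)
    text: str
    confidence: float   # 0.0 to 1.0, however your system estimates it (assumption)
    high_stakes: bool   # client-facing or financial output

AUTOMATABLE = {"meeting_summary", "data_entry"}  # hypothetical tasks AI does dependably
CONFIDENCE_FLOOR = 0.8                           # hypothetical threshold; tune per workflow

audit_log = []  # stand-in for a real log store

def route(draft):
    """Decide whether a draft ships or goes to a person, and record why."""
    if draft.task_type not in AUTOMATABLE:
        # Scope to the frontier: judgment-heavy work stays with a human.
        decision, reason = "review", "task outside automated scope"
    elif draft.high_stakes:
        # Review queue by default for client-facing or financial output.
        decision, reason = "review", "high-stakes output"
    elif draft.confidence < CONFIDENCE_FLOOR:
        # Route by confidence: hand off instead of guessing.
        decision, reason = "review", "confidence below threshold"
    else:
        decision, reason = "ship", "in scope, low stakes, confident"

    # Log what happened and on what basis, so quality can be audited later.
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "task_type": draft.task_type,
        "confidence": draft.confidence,
        "decision": decision,
        "reason": reason,
    })
    return decision
```

The order of the checks is a design choice: scope and stakes are tested before confidence, so a high-stakes draft reaches a person even when the system reports that it is sure. That matters because a reported confidence score can be as fluent and as wrong as the output itself.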
This is a feature, not a tax
It is tempting to see review as friction that slows down the magic. I would frame it the other way. Oversight is what lets a business actually trust an automated workflow with real work. Without it, you are forced to keep everything low-stakes, which caps the value AI can deliver. With it, you can let automation handle more, because you have a dependable way to catch the cases where it gets things wrong.
The companies that get the most from AI will not be the ones that automate the most aggressively. They will be the ones that automate with their eyes open — clear about where the frontier is, and willing to put a person exactly where the machine is weak. That is not a brake on progress. It is the thing that makes the progress safe to keep.