In traditional software, a button either submits the form or it doesn't. The behaviour is specified, implemented and verified. Design can run ahead on static screens because the underlying logic is predictable: you are mostly deciding what things look like and how they are arranged.
AI features break that assumption. The output is generated, not specified. The same input can produce different results. The system can be confidently wrong. It can be slow. It can refuse. And the user is now a participant who has to judge, correct and sometimes overrule what the product produced.
None of that is visible in a static mockup. You cannot annotate your way to confidence about an experience whose core behaviour is uncertain. You have to feel it.
The uncertainty lives in places mockups can't show
When you design an AI product, the hardest questions are rarely about layout. They are about what happens in the messy middle:
- Probabilistic output. The result varies from run to run, so the interface has to hold a range of outputs gracefully, not one perfect example.
- Edge cases and failure. Empty results, partial results, low-confidence results, refusals, timeouts. These are not rare; for AI they are routine, and they need real design.
- Trust and correction. Users need to understand where output came from, how sure the system is, and how to fix it when it's wrong.
- Latency. A two-second wait changes the entire feel of an interaction. You only learn that by experiencing it.
- Human review. Most useful AI products keep a human in the loop. Designing that hand-off (approve, edit, reject, escalate) is the actual product.
You cannot review your way to a good AI experience. You have to put a behaving thing in front of a person and watch what breaks.
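To make the human-review hand-off above concrete, here is a minimal sketch of the four decisions as data a prototype could act on. Everything in it is an assumption for illustration: the names `Decision`, `ReviewOutcome` and `resolve` are hypothetical, and the rule that an escalation produces no final text is one possible choice, not a recommendation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    """The four reviewer actions named in the list above."""
    APPROVE = "approve"
    EDIT = "edit"
    REJECT = "reject"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class ReviewOutcome:
    """What the prototype does next (hypothetical shape)."""
    final_text: Optional[str]      # None when nothing is published
    needs_second_reviewer: bool    # True only when the reviewer escalates


def resolve(draft: str, decision: Decision,
            edited_text: Optional[str] = None) -> ReviewOutcome:
    """Turn a reviewer's decision on an AI draft into an outcome."""
    if decision is Decision.APPROVE:
        return ReviewOutcome(final_text=draft, needs_second_reviewer=False)
    if decision is Decision.EDIT:
        if edited_text is None:
            raise ValueError("an edit decision needs the edited text")
        return ReviewOutcome(final_text=edited_text, needs_second_reviewer=False)
    if decision is Decision.REJECT:
        return ReviewOutcome(final_text=None, needs_second_reviewer=False)
    # ESCALATE: assumed here to publish nothing and hand off to someone else.
    return ReviewOutcome(final_text=None, needs_second_reviewer=True)
```

Writing the hand-off down this way tends to surface design questions quickly, for example what the reviewer sees after rejecting, or who receives an escalation.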
Prototyping moves uncertainty forward, where it's cheap
Every product has a fixed amount of uncertainty to burn down before it ships. The only question is when you pay for it. Discover a broken trust model in a prototype and it costs a week. Discover it after engineering has built the pipeline, and it costs a quarter.
A realistic prototype, even a faked or scripted one, lets you answer the expensive questions early:
- Does the interaction model actually make the AI feel controllable?
- Do users trust the output enough to act on it, and distrust it enough to check?
- What does the experience feel like when the model is wrong, slow or empty?
- Is the human-review step a relief or a chore?
You do not need a working model to answer these. You need something that behaves convincingly enough to provoke a real reaction. That is what a prototype is for.
What this looks like in practice
For AI work I prototype the behaviour before the pixels: the states, the confidence signals, the correction paths, the failure modes, often with canned or Wizard-of-Oz responses standing in for the real model. The goal isn't a demo that always works; it's a demo that reveals where the experience falls apart while changing it is still cheap.
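As one way to picture canned responses standing in for the real model, the sketch below replays a fixed script of outcomes, including the unhappy ones, with simulated latency. It is a sketch under stated assumptions: `CannedResponse`, `ScriptedModel`, the state names, the sample texts and the delays are all hypothetical and not taken from any real project or library.

```python
import time
from dataclasses import dataclass
from itertools import cycle
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class CannedResponse:
    """One scripted outcome the prototype should be able to display."""
    state: str                          # e.g. "ok", "low_confidence", "empty"
    text: str = ""                      # what the fake model "generated"
    confidence: Optional[float] = None  # None when no score applies
    delay_s: float = 0.0                # simulated wait before the response


# Hypothetical script: one happy path plus the failure modes worth feeling.
SCRIPT = [
    CannedResponse("ok", "Invoice total: 1,240.00", 0.93, delay_s=0.4),
    CannedResponse("low_confidence", "Invoice total: 1,240.00?", 0.41, delay_s=2.0),
    CannedResponse("empty", delay_s=0.8),
    CannedResponse("refusal", "I can't help with that request.", delay_s=0.3),
    CannedResponse("timeout", delay_s=10.0),
]


class ScriptedModel:
    """Stands in for a real model by replaying a script in order, forever."""

    def __init__(self, script: Sequence[CannedResponse],
                 sleep: Callable[[float], None] = time.sleep):
        self._responses = cycle(script)
        self._sleep = sleep  # injectable so tests can skip the real wait

    def generate(self, prompt: str) -> CannedResponse:
        # The prompt is ignored on purpose: behaviour is scripted, not computed.
        response = next(self._responses)
        self._sleep(response.delay_s)
        return response
```

Calling `ScriptedModel(SCRIPT).generate("anything")` repeatedly walks the interface through every state in turn, so a test participant meets the slow, empty and refused cases as a matter of course rather than by luck.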
Traditional software lets you design forward and validate at the end. AI products invert that. The earlier you make the behaviour tangible, the less you pay for the uncertainty you were always going to have to face.