Rethinking Product Design in an Agentic World

Xiangyu Wang, Nitin Mangal, Jian Wang, Yuchen Wu

We recently tried to ship an AI-native product.

Like most teams, we started with a proof-of-concept demo, and it looked great — 95%+ correct, smooth interactions, and a strong sense of readiness.

However, when we attempted to put it into production, everything slowed down. Edge cases appeared everywhere. The system behaved inconsistently in ways that were hard to predict. And suddenly, that last 5% gap between “usually right” and “reliably correct” felt impossibly far away.

Initially, we assumed this was a model or system problem and focused on refining the system design.

We were wrong.

It turned out the core issue was not an engineering failure but a fundamental product design challenge.

The instinct to reach 100%

If you’ve built SaaS products before, your instincts are well-trained: You define features and flows, and make sure every action produces the correct, predictable result. In this paradigm, when the system deviates, it is immediately treated as a bug. The core assumption is:

If it’s not always correct and predictable, it’s not ready.

That assumption starts to break in an agentic world.

Agentic systems don’t merely execute predefined logic; they interpret intent, generate structure, and operate in open-ended spaces. As they become more capable, they also become less predictable. This introduces a tension that doesn’t exist in traditional software:

The more capability you unlock, the harder it becomes to guarantee the outcomes.

A different way to see the problem

That tension forces a critical reframing. We stopped asking:

“How do we get from 95% to 100%?”

And started asking:

“What actually happens when we don’t?”

That shift led us to a different mental model.

Instead of thinking purely in terms of accuracy, it became more useful to think in terms of capability and reliability, and more importantly, the boundary between acceptable and unacceptable outcomes.

Figure: Capability–predictability landscape with a red-line boundary.

In SaaS terminology, reliability meant 100% predictable behavior plus high accuracy. This definition assumes deterministic systems, where correctness is tightly coupled with repeatability. Under this definition, SaaS systems sit in a relatively stable region on the left of the capability–predictability landscape: a narrow range of tasks handled with very high reliability.

Agentic systems, conversely, expand that range significantly. They move to the right, covering a much broader set of use cases. But that expanded capability comes with variability: reliability is no longer uniform across the entire surface.

Under this lens, the capability–reliability curve feels inevitable: as systems become more capable (and more probabilistic), they appear less reliable.

However, that’s only true if we hold onto the traditional definition of reliability.

Redefining reliability in agentic systems

A more practical way to think about reliability is from the user’s perspective:

X% predictable system behavior + Y% UX that bridges the remaining gap = 100% user expectation satisfaction

The key idea is that reliability is no longer purely a property of the model — it is a property of the entire system, including UX.
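As a rough illustration of the formula above, here is a minimal Python sketch. It assumes one particular reading of the formula: the system is right on its own for some share of requests, and the UX (previews, edits, fallbacks) turns a share of the remaining requests into outcomes the user accepts. The function name, parameters, and numbers are hypothetical, not measurements from our product.

```python
def perceived_reliability(system_rate: float, ux_recovery_rate: float) -> float:
    """Share of requests that end in an outcome the user accepts.

    system_rate: fraction of requests the system handles correctly unaided.
    ux_recovery_rate: fraction of the remaining requests that the UX
        turns into an acceptable outcome (an assumed, simplified model).
    """
    if not (0.0 <= system_rate <= 1.0 and 0.0 <= ux_recovery_rate <= 1.0):
        raise ValueError("rates must be between 0 and 1")
    return system_rate + (1.0 - system_rate) * ux_recovery_rate


# Hypothetical numbers: a 95% system whose UX recovers 4 of every 5 misses.
print(round(perceived_reliability(0.95, 0.8), 4))
```

Under this reading, the same 95% system can feel very different to users depending on how much of the remaining 5% the interaction design lets them catch and correct.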

Instead of forcing the model to be perfectly predictable, we design interaction patterns that set appropriate expectations, surface uncertainty clearly, enable easy correction and iteration, and provide guardrails and fallbacks.

Once you adopt this definition, the tradeoff looks different.

The capability–reliability curve still exists — but its meaning changes. Increasing capability does not necessarily reduce perceived reliability. UX absorbs and manages uncertainty, allowing the system to still achieve 100% reliability from the user’s perspective.

We are not eliminating uncertainty—we are designing around it.

The new design principle: designing for uncertainty

If the system goal is no longer 100% predictability and correctness, the product design principle must evolve too. It’s tempting to frame this shift as “we must handle errors better”. That’s incomplete.

SaaS design is fundamentally about pushing more features into a region of near-zero bugs and fully predictable UX.

Agentic design is different. It has three simultaneous goals:

Capability extends as far right as possible
User experience bridges the gap between what the system can do and what users expect
Nothing falls below the red line

A new way to measure system capabilities

Adopting this new design principle changes how engineering teams think about and measure system capabilities.

In a traditional SaaS system, improvement is mostly linear: fewer bugs, higher correctness, and a slow, steady march toward new features with 100% predictability (reliability).

In agentic systems, that framing doesn’t quite hold. What matters instead is the frontier of the capability–predictability curve.

When the system improves — through better prompting, better tooling, or better models — it doesn’t simply become “more correct” in place. Instead, the curve itself shifts outward. Tasks that were previously unreliable become reliable enough. More of the capability surface moves above the red line.

Progress, in this sense, is defined by:

Expanding the AUC (Area Under the Curve) where the system is both useful and (sufficiently) trustworthy
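One way to make this measurable, sketched here under stated assumptions: sample the curve as (capability, predictability) points, pick a red-line threshold, and integrate only the part of the curve that stays at or above it. The scores, the threshold, and the `usable_area` helper are all hypothetical; how to score capability and predictability for a real product is left open.

```python
from typing import Sequence, Tuple

# (capability, predictability): hypothetical scores, both in [0, 1].
Point = Tuple[float, float]


def usable_area(curve: Sequence[Point], red_line: float) -> float:
    """Trapezoid-rule area under the curve, counting only segments whose
    two endpoints both sit at or above the red line (a conservative estimate)."""
    pts = sorted(curve)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if y0 >= red_line and y1 >= red_line:
            area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up curves before and after an improvement (better prompts, tools, or models).
before = [(0.0, 1.0), (0.25, 0.9), (0.5, 0.7), (0.75, 0.4), (1.0, 0.2)]
after = [(0.0, 1.0), (0.25, 0.95), (0.5, 0.85), (0.75, 0.7), (1.0, 0.4)]

RED_LINE = 0.6  # assumed threshold for "acceptable"
print(usable_area(before, RED_LINE), usable_area(after, RED_LINE))
```

In this toy data the improvement does not just raise scores in place; it brings one more segment of the capability range above the red line, which is what grows the area.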

UX fills the gap

This is where the gap between system capability and user expectation becomes concrete.

If you look at the capability–predictability curve, there is always a gap between what the system can reliably do and what users expect it to do. That gap doesn’t disappear just because the model improves. It has to be designed away.

This defines the role of UX in an agentic system: not just to make things look clean or intuitive, but to reshape interaction so that users can operate safely above the curve.

In practice, that means:

Structuring outputs so they are easy to review and verify
Exposing uncertainty where it matters
Guiding users toward better inputs
Introducing checkpoints only where necessary (so as not to overwhelm users)

UX now becomes a critical component of the product design. It is no longer just polishing the system; it is completing the system.

Designing above the red line

Once you accept that predictability is uneven across capability, you will soon notice that somewhere beneath that capability–predictability curve, there is a boundary — the Red Line. Below it, outcomes are no longer just unpredictable; they are unacceptable: they break user trust, they require significant effort to fix, or they introduce real risk.

The design question then becomes:

How do we keep the experience above the red line across as much of that surface as possible?

Two fundamental levers exist for managing the boundary:

Figure: Two fundamental levers for staying above the red line: constrain capability and introduce human review.

The first is to constrain capability: narrowing the problem space with structure, templates, and guardrails to reduce ambiguity and increase reliability.

The second is to introduce human (user) review: preserving flexibility but adding checkpoints such as previews, edits, or confirmations before execution.

Most successful production systems combine both, but neither lever works well without good UX: constraints feel limiting and review feels burdensome. With good UX, both become natural parts of the interaction.
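A minimal sketch of how the two levers might combine in code, assuming a hypothetical product where the agent proposes an action by name. The action names, the `NEEDS_REVIEW` set, and the `gate` function are illustrative only.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Lever 1 (constrain capability): the agent may only pick from a fixed set.
    DRAFT_REPLY = "draft_reply"
    UPDATE_RECORD = "update_record"
    DELETE_RECORD = "delete_record"


# Lever 2 (human review): actions that are costly to undo wait for confirmation.
NEEDS_REVIEW = {Action.UPDATE_RECORD, Action.DELETE_RECORD}


@dataclass
class Decision:
    action: Action
    requires_confirmation: bool


def gate(proposed: str) -> Decision:
    """Turn the agent's free-form proposal into a constrained, gated decision."""
    try:
        action = Action(proposed)  # anything outside the enum is rejected
    except ValueError:
        raise ValueError(f"unsupported action: {proposed!r}") from None
    return Decision(action, requires_confirmation=action in NEEDS_REVIEW)


print(gate("draft_reply"))
print(gate("delete_record"))
```

Which actions go into the review set is a product decision about where the red line sits, not something the code can decide on its own.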

From system design to product design

These design choices don’t just affect the interface; they reshape the entire product development process. This way of thinking leads to a consistent system pattern:

The agent explores possibilities
The user applies judgment
The system executes deterministically
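The three steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not our implementation: `explore` is a stub standing in for a model call, `choose` stands in for whatever review UI the product offers, and the rename operation is a made-up example of a deterministic action.

```python
from typing import Callable, Dict, List, Optional

Plan = Dict[str, str]


def explore(request: str) -> List[Plan]:
    """Stand-in for the agent: returns candidate plans for the request.
    A real system would call a model here; this stub returns fixed options."""
    return [
        {"op": "rename", "target": request, "to": request.lower()},
        {"op": "rename", "target": request, "to": request.upper()},
    ]


def execute(plan: Plan, store: Dict[str, str]) -> None:
    """Deterministic step: the same approved plan always has the same effect."""
    if plan["op"] != "rename":
        raise ValueError(f"unknown op: {plan['op']}")
    store[plan["to"]] = store.pop(plan["target"])


def run(
    request: str,
    store: Dict[str, str],
    choose: Callable[[List[Plan]], Optional[Plan]],
) -> bool:
    candidates = explore(request)   # the agent explores possibilities
    chosen = choose(candidates)     # the user applies judgment
    if chosen is None:              # the user rejected every option
        return False
    execute(chosen, store)          # the system executes deterministically
    return True


store = {"Report": "quarterly numbers"}
run("Report", store, lambda options: options[0])
print(store)
```

The uncertain part (generating options) never touches the store directly; only a plan the user has picked reaches the deterministic step.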

But more importantly, it changes how products are built.

In the SaaS world, the process is largely linear: define behavior → design the interface → build the system.

In an agentic world, that sequence no longer holds.

You don’t fully know what the system can reliably do until you experiment with it. So instead of designing everything upfront, you start by probing the system, observing where it performs well, where it struggles, and where it crosses the red line. Design then becomes the act of shaping an experience around those realities.

This also changes how teams divide responsibility:

Product defines the red line and user expectations
Engineering shapes the capability–reliability curve
UX fills the gap above the curve to meet expectations

The product can only succeed when all three components are aligned.

A different kind of product thinking

Taken together, these shifts lead to a fundamentally different product mindset.

SaaS products are built around a clear promise:

“This will work.”

Agentic systems operate under a different one:

“This will remain safe and usable, even when there is uncertainty.”

This mindset isn’t about tolerating errors, but about designing systems where uncertainty does not translate into failure. All of this ultimately leads to a different way of evaluating success.

We’re not replacing SaaS. We’re recomposing it:

Agents expand capability
Humans provide judgment
Deterministic systems guarantee execution
UX bridges the gap between them

And the central question shifts from

“Is this always correct and predictable?”

to:

Does the system stay above the red line—and how much of the capability surface can we make usable through design?

That’s the new frontier.