
When AI Features Pass Every Test and Still Get Shut Down

Posted on March 27, 2026

OpenAI shut down Sora this week. Not a quiet deprecation: a full shutdown, thirty days' notice, a billion-dollar Disney deal unwound along with it. The stated reason: the cost per generated video minute was "economically irreconcilable" with what people would actually pay.

Products get killed all the time; what struck me was the why. This one matters because of what it says about a problem most product teams building AI features right now are probably not solving for.

The Technology Worked

Sora wasn't a failed experiment. The videos were impressive, people actually used it, and OpenAI still had to kill it.

This is one of the most well-resourced companies on the planet, with more AI infrastructure, more engineering talent, and more distribution than almost any product team will ever have access to. And it still couldn't make the product work commercially.

The product wasn't flawed. What killed it was a validation gap: the technology was proven, but the economics never were.

We've Been Asking The Wrong Question

Most product teams building AI features are spending almost all of their validation energy on one question: does it work?

In practice, that means some version of the same checklist:

  • Does the AI do the task?
  • Is the output good enough?
  • Does it pass QA?
  • Does it hold up in edge cases?

Legitimate questions. Worth asking. Sora answered all of them correctly.

The question that gets far less attention: will people pay for this, often enough, at a price that makes it viable to offer? That's a user behavior question. It requires a different kind of testing than most teams are running, and it has to be on the list. The specific challenges of testing AI features make this harder than it sounds.
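To make the shape of that viability question concrete, here's a deliberately toy back-of-the-envelope sketch in Python. Every number is an invented placeholder, not Sora's actual economics:

```python
# Purely illustrative unit economics -- all values are assumptions,
# not any real product's numbers.
cost_per_generated_minute = 0.50   # assumed inference cost, $ per video minute
minutes_per_user_per_month = 120   # assumed usage by an engaged subscriber
monthly_price = 20.00              # assumed subscription price

serving_cost = cost_per_generated_minute * minutes_per_user_per_month
margin = monthly_price - serving_cost

print(f"Serving cost per user: ${serving_cost:.2f}/month")  # $60.00/month
print(f"Margin per user: ${margin:.2f}/month")              # -$40.00/month
```

With these placeholder numbers, every engaged user loses the business $40 a month, and no amount of output quality closes that gap. That's the kind of arithmetic a QA pass never surfaces.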

This Is Happening Everywhere, Just Quietly

Sora made headlines because OpenAI makes headlines.

Everywhere else, teams just quietly move on. AI features that shipped and then never got mentioned in another release note. Copilot integrations that users learn to ignore after the first week. "Smart" capabilities that got announced with a blog post and then slowly disappeared from the product without anyone saying anything.

Those teams don't put out a press release when they kill something, so you don't see it. But it's happening.

Gartner has predicted that over 40% of agentic AI projects will be canceled by the end of 2027, citing escalating costs and unclear business value. Neither one is a technical failure.

That's the Sora story, smaller and quieter, playing out everywhere.

What Actually Needs To Be Tested

The questions that catch this kind of failure are harder to set up than a standard QA pass, but they're not complicated. They're about behavior, not capability. Does the feature earn a place in how users actually work, or does it just pass a demo? Do users come back to it after the novelty wears off, or do they quietly route around it? What happens to usage three weeks after launch, not just at launch?

Those answers don't come from a QA pass. They require longer observation windows and users who are doing their actual jobs, not running through a script. Most teams skip that part, and the metrics that actually matter for ship decisions paint a very different picture than a QA scorecard does.
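As a minimal sketch of what "three weeks after launch" means in practice, here's one way to compute it. The event log of (user_id, days_since_launch) pairs, the function name, and the toy data are all illustrative assumptions, not any particular product's telemetry:

```python
def retention_in_week(events, week):
    """Share of launch-week adopters who use the feature again in `week` (1-indexed)."""
    adopters = {user for user, day in events if day < 7}
    if not adopters:
        return 0.0
    lo, hi = (week - 1) * 7, week * 7
    retained = {user for user, day in events if user in adopters and lo <= day < hi}
    return len(retained) / len(adopters)

# Toy data: three users try the feature at launch; only one is still using it.
usage = [("a", 1), ("b", 2), ("c", 3),
         ("a", 15), ("a", 16)]
print(f"Week-3 retention: {retention_in_week(usage, 3):.0%}")  # -> 33%
```

A launch-week QA scorecard for this feature would look perfect; the week-3 number is where the Sora-style story shows up.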

The Question Isn't Going Away

The Sora story will stop being news in about a week. Something else will happen and we'll all move on.

But how do you validate an AI feature for real-world value, not just technical performance? A lot of product teams are wrestling with that right now without a clear answer.

It's one of the central topics at the Product AI Summit on April 16 in Irvine, where product leaders are gathering to talk honestly about how AI is changing the way products get built, tested, and shipped.

If that's a conversation you want to be in, there's still time. Register here.
