So, I’ve been thinking a lot about Sierra recently. If you don’t know, Sierra is this AI startup that’s got Bret Taylor at the helm. And I’ve got to say, I’m pretty bullish on what they’re doing around AI agents.

I also found myself nodding along while reading their post on pricing the work of these agents. But it left me stuck on one question.

How do we actually validate what these AI agents produce?

Take customer success. On the surface it looks straightforward - count the number of tickets closed. But it gets messy real quick. Let’s say you’re a customer success manager. You’re using an AI agent to help you close tickets. The agent closes 100 tickets in a day. Is that a good outcome? It depends.

What if the agent closed 100 tickets, but they were all low-value tickets? What if the agent closed 100 tickets, but the customer satisfaction score went down? What if the agent closed 100 tickets, but the customer churn rate went up?
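To make the "it depends" concrete, here's a rough sketch of how the same day of work can look great or terrible depending on what you count. Everything in it is made up for illustration: the `AgentDay` fields, the 0-to-1 ticket value weight, and the idea of discounting by the CSAT ratio are my assumptions, not how Sierra or anyone else actually measures this.

```python
from dataclasses import dataclass


@dataclass
class AgentDay:
    """One day of agent work. All fields are hypothetical."""
    tickets_closed: int
    avg_ticket_value: float  # assumed weight from 0 (trivial) to 1 (critical)
    csat_before: float       # average CSAT before the agent handled tickets
    csat_after: float        # average CSAT after the agent handled tickets


def naive_outcome(day: AgentDay) -> float:
    # The "obvious" metric: just count closed tickets.
    return float(day.tickets_closed)


def adjusted_outcome(day: AgentDay) -> float:
    # Discount the count by ticket value and by any drop in CSAT.
    # A CSAT improvement is capped at 1.0 so it never inflates the score.
    csat_factor = min(day.csat_after / day.csat_before, 1.0)
    return day.tickets_closed * day.avg_ticket_value * csat_factor


# Two made-up days: lots of low-value closes with a CSAT drop,
# versus fewer high-value closes with CSAT holding steady.
busy = AgentDay(tickets_closed=100, avg_ticket_value=0.2,
                csat_before=4.5, csat_after=3.6)
careful = AgentDay(tickets_closed=40, avg_ticket_value=0.9,
                   csat_before=4.5, csat_after=4.6)

for name, day in [("busy", busy), ("careful", careful)]:
    print(name, naive_outcome(day), round(adjusted_outcome(day), 1))
```

By raw count the "busy" day wins easily. Once you weight by value and satisfaction, the "careful" day comes out ahead. And that's with weights I invented. Pick different ones and the ranking can flip again, which is kind of the whole problem.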

And this is just customer success - arguably the easier case. When you move to development and design, it becomes even more complicated. How do you define a “good” outcome?

If Replit’s AI agent creates a UI that looks like it’s from the 90s on its first attempt, is that the outcome we measure? Or do we look at what finally goes to production after multiple iterations and feedback loops? What if the agent wrote 1,000 lines of code, but it was all buggy? What if the agent wrote 1,000 lines of code, but it was all unreadable? What if the agent wrote 1,000 lines of code, but it didn’t meet the requirements? These are all valid questions. And they’re all hard to answer.

Now let’s say you’re a designer. You’re using an AI agent to help you create designs. The agent creates 10 designs in a day. Is that a good outcome? You guessed it. It depends. What if the agent created 10 designs, but they were all ugly? What if the agent created 10 designs, but they were all unusable? What if the agent created 10 designs, but they didn’t meet the requirements?

How do you price something when you can’t even properly define what success looks like? It’s like trying to price a product when your north star metric keeps changing direction.

I don’t have the answers. But I’m excited to see how this space evolves, and how PMs tackle this problem. It’s going to be interesting. It’s going to be hard. But it’s also going to be important, because AI agents are the future, and we need to figure out how to price them.