Steven Forth is the Co-founder and Chief Value Officer at Ibbaka, a leading value and pricing consulting firm. With deep expertise in AI applications for pricing and value modeling, Steven is at the forefront of developing intelligent agents that help businesses understand and communicate value more effectively. His work focuses on the intersection of artificial intelligence, pricing strategy, and value creation, making him a pioneer in applying AI to solve complex pricing challenges.
In this episode, Steven shares his insights on how benchmarking is revolutionizing both AI development and pricing strategy. Drawing parallels between how AI models are improved through benchmarking and how pricing models should be evaluated, he introduces a framework for measuring pricing effectiveness that could transform how we approach pricing decisions. He and Mark explore the challenges of establishing “truth” in pricing, the role of synthetic data, and the future of AI-powered pricing tools.
Why you have to check out today’s podcast:
- Discover how AI benchmarking principles can revolutionize pricing model evaluation.
- Understand how to evaluate pricing models from both buyer and seller perspectives.
- Explore the future of AI-powered pricing tools and what it means for pricing professionals.
“We don’t start with the truth. We have to work our way towards truth through multiple iterations and applications.”
– Steven Forth
Topics Covered:
02:15 – How Intercom’s FinAI agent uses daily benchmarking to improve ticket resolution performance
05:30 – Why AI’s success is built on benchmarking and how it emerged from the ImageNet competition
08:45 – The critical problem: pricing lacks standardized benchmarking like AI models have
11:20 – Michael Mansard’s 12-factor pricing model assessment and its potential as an industry standard
14:10 – Why pricing models must be evaluated from both buyer and seller perspectives
17:25 – How market segmentation and use cases complicate pricing model benchmarking
20:40 – The role of synthetic data in pricing research and model validation
24:15 – Why “vibe coding” could disrupt traditional pricing consulting within 3 years
27:30 – The search for truth in pricing: hedonic pricing models and market assumptions
31:45 – Introduction to ValueIQ: Ibbaka’s new AI agent for value-based selling
Key Takeaways:
“Anyone who says that they’re data centric or data driven actually has to be model driven first, because they’re using some form of model to organize the data.” – Steven Forth
“We should have done this 20 years ago. What were we thinking? Well, we weren’t thinking. And we didn’t have ways to do this for us anyway.” – Steven Forth (on developing pricing benchmarks)
“Benchmarking every day, I think, is going to be critical to the success of agents that do important business things.” – Steven Forth
“You can always improve your measurement, but at some point the return of improving the measurement is lower than the cost of increasing the validity of the measurement.” – Steven Forth
Resources and People Mentioned:
- Douglas Hubbard’s How to Measure Anything (book): https://www.amazon.com/How-Measure-Anything-Intangibles-Business/dp/1118539273
- ImageNet: https://image-net.org/
- Michael Mansard’s 12-Factor Pricing Model: https://www.insead.edu/bio/michael-mansard-0
- Intercom’s FinAI: https://www.intercom.com/help/en/articles/8205718-fin-ai-agent-resolutions
- Lovable, Replit, Bolt: https://linkblink.medium.com/bolt-vs-cursor-vs-replit-vs-lovable-ai-coders-comparison-guide-3b9d41e75810
- ValueIQ: https://www.ibbaka.com/ibbaka-market-blog/get-ready-for-valueiq-sign-up-now-for-beta-access
Connect with Steven Forth:
- LinkedIn: https://www.linkedin.com/in/stevenforth/
- Email: steven@ibbaka.com
Connect with Mark Stiving:
- LinkedIn: https://www.linkedin.com/in/stiving/
- Email: mark@impactpricing.com
Full Interview Transcript
(Note: This transcript was created with an AI transcription service. Please forgive any transcription or grammatical errors. We probably sounded better in real life.)
Steven Forth
Benchmarking every day, I think, is going to be critical to the success of agents that do important business things.
[Intro / Ad]
Mark Stiving
Welcome to Impact Pricing, the podcast where we discuss pricing, value, and the profitable relationship between them. I’m Mark Stiving, and I run boot camps to help companies get paid more. Our guest today is the one and only Mr. Steven Forth. Welcome, Steven.
Steven Forth
Hello, Mark, it’s good to be back.
Mark Stiving
It is going to be fun. Well, for you, it’s going to be fun. I’m not sure about for me. So I’m about to go on vacation. When people hear this, I’ll probably be back from vacation, but I needed to get a couple of these recorded. And so, I call my good friend, Steven. I say, Steven, I need you on the podcast. What do you want to talk about?
And he goes, benchmarking. Oh my gosh. So now I’ve been researching benchmarking for the last several hours, trying to figure out what the heck we’re going to talk about. So Steven, what the heck are we about to talk about?
Steven Forth
So, as you know, Mark, I’m obsessed with AI and pricing, and with building applications that will allow us to understand value better, to price better, to communicate better, to win more deals, and so on. So I spend a lot of my time in the AI world and the generative AI world. And it’s become clear to me that benchmarking is a core part of how these applications and models work.
But I’m going to start with something that will, I hope, resonate with the pricing community. So when we talk these days about outcome-based pricing, a lot of us talk about Intercom’s FinAI agent. And as people will remember, that’s the agent that Intercom brought out in order to resolve customer support tickets, that charges per ticket. So we’ve talked that one to death.
Mark Stiving
Charges per resolved ticket.
Steven Forth
Resolved ticket. Thank you. Yes, per resolved ticket. So we’ve talked a lot about that one. And it was one of the things that Kyle Poyar and I talked about yesterday in our webinar. But one of the things Kyle mentioned, from when he interviewed the people at Intercom who developed the FinAI agent and its pricing, is that they obsessed over their benchmarking every day.
And the way that they were benchmarking their agent is: how many of the tickets does it successfully resolve? Every day, they would measure that ticket resolution performance and claw their way up from, I think, down in the low 20%, which would not be a successful agent either for the buyer or for their pricing model.
But they fought their way up till they’re now in the high eighties, I think. And benchmarking was a core part of their development process. So that was part of the thing that got me focused on benchmarking, because Ibbaka is developing its own set of agents right now. And I wanted to build benchmarking and measurement of agent performance right into the agent itself. So we are developing a number of different ways to benchmark our agents.
So, one example: one of the things that our agents both produce and rely on is value models. And then one of the questions becomes, how good is this value model? So we’ve developed a way of assessing value models to see how good they are, to make sure that we are generating good value models in different contexts. So that’s part of how we’re benchmarking our agent.
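A minimal sketch of what that kind of daily resolution-rate benchmark might look like in code; the dates, ticket counts, and percentages here are invented for illustration:

```python
from datetime import date

# Hypothetical daily ticket log: (tickets resolved by the agent, total tickets).
daily_log = {
    date(2025, 6, 23): (212, 980),    # ~22%: roughly where Fin reportedly started
    date(2025, 6, 24): (241, 1010),
    date(2025, 6, 25): (873, 1004),   # ~87%: the "high eighties" Steven mentions
}

for day, (resolved, total) in sorted(daily_log.items()):
    rate = resolved / total
    # Under per-resolved-ticket pricing, this one number is both the quality
    # benchmark and the revenue driver, which is why it gets watched daily.
    print(f"{day}: {rate:.1%} of {total} tickets resolved")
```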
Mark Stiving
Okay. Before we move too far, I did some research on benchmarking to try to understand what the heck we’re talking about. And of course, to do my research, I go to ChatGPT and have conversations. And so the first thing I want to know is, let’s step away from value models. And let’s just talk about pricing, right?
So you set a price for a product. I set a price for a product. How do we know which one set the price better, right? Especially because it’s two different products, two different markets, two different whatevers. How do we know that you did a better job than I did? Besides the fact that it’s you and me, right?
Steven Forth
Well, we know you’re going to do better because you’ve got a PhD. So that’s actually a really good question, Mark, but can we just pop up to a higher level and talk about how benchmarking works in AI more generally?
Mark Stiving
Okay, so if we do that, can you answer the following question along the way? The thing I love about the FinAI example is I know if we resolved something or not. So there was a legitimate outcome, a measurable outcome. And in pricing or value models or many, many, many other benchmarks, I don’t see that measurable outcome. So take us up a step to a higher level and talk about measurable outcomes.
Steven Forth
Okay. There’s a wonderful book by the way, called How to Measure Anything. Do you have that book? It’s a fantastic book. And it walks you through the process of measuring any business outcome you’re interested in, but it also points out that there’s a cost to measurement. So you can always improve your measurement, but at some point the return of improving the measurement is lower than the cost of increasing the validity of the measurement.
So anyway, it’s a fantastic book. I’ll send you a link to it later. So in order to measure something, you have to have a model. You’re always measuring against a model. The model could be the metric system so that you’re measuring things in centimeters and meters and kilometers and so on. But there are many different models that you can measure against, but without a model, you can’t measure.
So anyone who says that they’re data centric or data driven actually has to be model driven first, because they’re using some form of model to organize the data. So I’m just going to take us way back here to the beginning of deep learning. I’m probably going to get my dates wrong here, but back around 2012, there was a collection put together called ImageNet, and ImageNet had millions of images and, I think, 20,000 categories.
And the competition was to categorize images into the correct category. And they had run this competition for a couple of years when, all of a sudden, you know, I think it was 2012. Sorry, I should have checked all these dates beforehand, but I believe it was 2012. The founder of ImageNet, Fei-Fei Li, a famous AI researcher at Stanford, was pregnant, so she wasn’t actually taking part in the competition that year.
She got a call from her research assistant, who said, something weird is going on here. We have a model now that is outperforming everyone else by a huge margin. Like, normally the models are half a percent or 1% better. This one was way better. And he said, and the really weird thing is that it’s using that old-fashioned neural network technology that nobody uses anymore because it doesn’t work.
And she said, okay, well, who submitted it? And he said, oh, it’s this guy Geoffrey Hinton from the University of Toronto. Do you know anything about him? And she goes, yeah, yeah, he’s the real deal. We should take this seriously. That competition and that benchmarking actually led to the current deep learning revolution.
And virtually all of the AIs that you and I are using today are based on deep learning and they’re the result of that competition. So AI, the current deep learning based version of AI, emerged through benchmarking. And now whenever any one of the big foundation model companies releases a new model, whether it’s OpenAI or Anthropic or Cohere or DeepSeek and so on.
One of the things they do is they benchmark their model against a set of standard indices. You know, how well does this model compare to a human taking the LSAT? Or how well does it do on standard mathematical tests or reasoning tests, and so on and so forth? And the reference point always seems to be humans.
Mark Stiving
I just read one this morning about EQ and how it’s doing better in EQ than humans do.
Steven Forth
That’s a scary thought. Benchmarking is built into the whole generative AI process. And that’s how generative AIs are improved, by making sure that they’re performing better against benchmarks.
So with that as background, I want to come back to your question, how do we know if we are doing a good job on, let’s say, designing pricing models? That’s a great question because we don’t really have any good way, or good standardized way of assessing pricing models. And that is something that we can fix.
Mark Stiving
So I’m sorry, I just want to go back to something you said and reiterate it for importance because it’s an aha moment to me. When we think about benchmarking an AI on the LSAT, we actually know what the right or wrong answers are on the LSAT.
Or at least we could have someone grade it the way they would grade any other LSAT that came across. And so, the thing that we’re going to struggle with in this world of pricing or value models is we don’t have that benchmark yet. We don’t have that truth yet.
Steven Forth
Right. So that’s what we need to get to for both value models and for pricing models. And I think that, you know, people like you and I and the pricing industry as a whole have been negligent in not developing this. So we should have done this 20 years ago. You know, what were we thinking? Well, we weren’t thinking. And we didn’t have whys’ to do this for us anyway.
Mark Stiving
I’m sorry for interrupting again. Remember the webinar you did recently with Michael Mansard on the compass and how we’re going to evaluate pricing metrics. In all honesty, as I was watching that, I was like, well, there’s nothing different here. It’s just that we’ve never gone out to do this before. Right.
Steven Forth
Yeah. I mean, I like Michael’s 12 factor approach, but as you say, it’s not new and it’s not yet validated. But that sort of brings me to where we are today. So Ibbaka has decided to adopt Michael’s approach and to use those 12 factors.
We are running it against hundreds of different pricing pages right now, because the pricing pages are easily available and we’re trying to validate some things. So you have to calibrate a model, right? So right now we’re going through a process of calibrating the 12 factor pricing model assessment.
And we’ve elaborated on it quite a bit, but one of Michael’s claims is that each of those 12 factors is independent of the others. Maybe, probably yes, because, you know, Michael’s a smart guy and he’s thought hard about this and tested it, but that’s still an assumption. So let’s test that assumption. And then if they are orthogonal, independent of each other, great.
If they’re not orthogonal, why are they not orthogonal? So there’s a lot of work that needs to be done on, on the rubric or what we’re going to test pricing models against. The other thing that I think is really important is what does it mean to test a pricing model? Because a pricing model, and I think there’s two dimensions to this.
First, is the pricing model a good pricing model for the seller? That’s what we always assume, right? We’re building pricing models for the seller or the vendor, and that’s what we focus on. But the pricing model also has to work for the buyer. So one of the things that we did was we said, okay, let’s take this 12 factor model that Michael’s developed.
And we developed a bunch of prompts, because we’re doing this using AI, to run it from both a buyer’s point of view and a seller’s point of view, and then compare the results. That’s proved to be incredibly interesting and generates really interesting conversations, because at least now we have some measurements that we can put up and say: how good is this pricing model for the buyer? How good is this pricing model for the seller?
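As an illustration, here is a minimal sketch of how that dual-perspective scoring might be wired up. The factor names, the scoring stub, and the sample pricing page are all hypothetical placeholders, not Ibbaka’s actual prompts or Michael Mansard’s published factor list:

```python
import random

# Placeholder factor names; Mansard's assessment has 12 factors.
FACTORS = ["value alignment", "predictability", "scalability"]

def llm_score(prompt: str) -> float:
    """Stand-in for a real LLM call that would return a 1-10 rubric score."""
    return random.uniform(1, 10)  # replace with an actual model call

def score_pricing_page(page_text: str, perspective: str) -> dict:
    scores = {}
    for factor in FACTORS:
        prompt = (f"From the {perspective}'s point of view, rate this pricing "
                  f"model on '{factor}' from 1 to 10.\n\n{page_text}")
        scores[factor] = llm_score(prompt)
    return scores

page = "Starter $29/mo, Pro $99/mo, Enterprise: contact us"  # hypothetical page
buyer = score_pricing_page(page, "buyer")
seller = score_pricing_page(page, "seller")
# Large seller-minus-buyer gaps flag potential friction points.
gaps = {f: round(seller[f] - buyer[f], 1) for f in FACTORS}
print(gaps)
```

The interesting output is the gap between the two perspectives: a factor that scores high for the seller and low for the buyer is exactly the kind of friction point the conversation turns to next.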
Mark Stiving
Okay, before you go too far, I have way too many thoughts already. And so, two quick conversations. I’m going to go back to the LSAT conversation we just had. If I write a multiple-choice question and there is absolutely a right or wrong answer, we know what we’re doing, right? I mean, I guess we kind of do.
Maybe someone wrote the wrong question or wrote the wrong answers. If we have an essay as part of the LSAT and we have a professor read it and grade it and say, is this acceptable? We don’t know if that’s really right. We just know that someone did that. In other words, the truth depends on who made up the test.
Steven Forth
Yep.
Mark Stiving
Who’ve graded the test. And so, let’s go to Michael’s 12 points. We still don’t know, even if we use those 12 and we score it, we still don’t know that that was the right pricing metric. Nothing against Michael, you’re right, he’s brilliant, right? But we don’t know if that’s the right pricing metric.
Steven Forth
I agree.
Mark Stiving
We know those are 12 points that might be orthogonal.
Steven Forth
Yeah. And as you say, might be. That’s a hypothesis that needs to be tested. So these pricing model benchmarks need an awful lot of work. They need an awful lot of validation and verification. And we are, I think as an industry, just at the beginning of this process.
Mark Stiving
Okay, so the other point I want to make is about the buyers, if I may, right? So when I think about the model for the seller, as a seller, we get to choose from an infinite array of different pricing models we might want to use. As a buyer, we get to choose from which sellers have offered which pricing models.
So we don’t get to choose, it isn’t like we get to design our optimal pricing metric. We get to choose that I like this one better than that one. And this company is selling that way and this company is selling this way. And so, I feel it’s different. And how do you think about that?
Steven Forth
So I agree. It is different, because unless you’re a very powerful buyer who can dictate how you’re going to pay, you have to take the pricing model that the vendor provides. And the pricing model is just going to be, in most cases, a relatively small factor in your buying decision, but it still gives you insights.
Now, if the pricing model works very well for the vendor on one factor and poorly for the buyer, you want to dig in and understand: why is that the case? And is it an issue? So in some cases, the vendor may not care and say, well, screw the buyer.
They have to accept what we’re doing. But in other cases, they might say, okay, so we should expect the buyer to be resisting this part of our pricing model. At the very least, we have to be able to communicate around why we’ve priced it this way. And we may want to adjust our pricing model so that it doesn’t create that resistance for the buyer. Very fair point.
So it gives you insight that you otherwise would miss. And as you’re designing things, you have to design them for the users. And I think pricing models are designed things, but there’s two users in this case, right? The vendor uses the pricing model, but the buyer also uses the pricing model and you need to consider both perspectives.
Mark Stiving
Okay. Perfectly fair. Sorry for interrupting. Let’s move on.
Steven Forth
The other thing that we realized as we were working on this is: okay, is this a good pricing model, but for what, in what context? Most solutions are sold to different buyers that you can define through market segmentation, or there’s various different ways to define this, right? These days, people are using ideal customer profiles a lot.
You know, you and I have used market segmentations a lot. So one of the things you have to ask, you know, you can ask, is this a good pricing model in general? But you will often get much more insight if you say, is this a good pricing model for this segment? Or is this a good pricing model for this customer profile? So I did this for one company recently where I compared their pricing model for three different market segments.
So one market segment was small, scrappy, early stage companies. Another segment, and these are all segments that they address in their solution, was global e-commerce companies. And then the third segment was international banks. And their pricing model worked pretty well, you know, for the small users. From their point of view, it also worked pretty well for the sort of mid-sized users.
And it did not work at all well for the very large users. From the buyer perspective, it also worked pretty well for the small users. It did not work well for the mid-sized users, and it was a disaster for the large users. And this is not surprising because this is a company that started with selling to smaller companies. It evolved its pricing model in that context.
Then it moved to good, better, best pricing and stuck on more tiers and made it more complicated. But they never really rethought their pricing model for these different segments. So this benchmarking process gave insight as to where the pricing model was going to work, where it was not going to work, and where one would need to focus on fixing it.
Mark Stiving
I think that’s really interesting. I often think about market segments and it forces me to think, am I thinking about market segments correctly? Cause I usually don’t consider company size or customer size as the segment. I usually think of it as the, what’s the problem we’re trying to solve for the customer.
Steven Forth
Yeah. And that’s a better way to do it. But in this case, you know, I went with their conventional segmentation. You can’t solve too many problems at once.
Mark Stiving
Well, and it may be that the enterprise customers are actually trying to solve a different problem than the small customers.
Steven Forth
Yeah, absolutely. So that brings up the next point: when one is benchmarking pricing models, I think it’s not just about segments. It’s also about use cases. Because many of these solutions can be used for more than one use case. So what you’re really asking is: is this a good pricing model for this segment, for this use case?
Now we’re getting stuff pretty complicated here because right now we have 12 factors from a buyer perspective and a seller perspective for each different segment, for each of the relevant use cases in that segment. So as a human, I would not want to have to do this, but it’s pretty easy to get an AI to do it for me.
So I can navigate through these different ways of benchmarking the pricing model, and I can find out where I need to focus on improving it. The other really fun thing, though, is going to be comparing different pricing models. You know, there are lots of different ways you can design a pricing model for any solution. Let’s generate 20 different pricing models, run them against the benchmark, and see which ones work in which situations.
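To make the combinatorics concrete, here is a tiny sketch of the grid that falls out of that description; the segment and use-case labels are placeholders:

```python
from itertools import product

factors = [f"factor_{i}" for i in range(1, 13)]    # the 12 assessment factors
perspectives = ["buyer", "seller"]
segments = ["early-stage", "e-commerce", "banks"]  # example segments from above
use_cases = ["use_case_a", "use_case_b"]           # assume two use cases each

# Each cell would hold an AI-generated rubric score; here we only size the grid.
grid = {cell: None for cell in product(factors, perspectives, segments, use_cases)}
print(len(grid))  # 12 x 2 x 3 x 2 = 144 scores for a single pricing model
```

Run 20 candidate pricing models through that grid and you are already at nearly 3,000 scores, which is trivial for an AI to generate and navigate but hopeless to maintain by hand.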
And then, over time, because somebody is thinking right now, well, this is all just hand-waving because it’s not being compared against actual data. Well, you can also start comparing it against actual data, to the extent that you’re able to collect that data. And, you know, we have more and more access to data every day, and we can also generate synthetic data that has pretty good verisimilitude quite easily.
Mark Stiving
So what data would we have access to? I could imagine the way to test this is: I have two different pricing models that are actually in the real world, and one works better than the other. So I know that, hey, that one beat that one. I mean, it’s kind of like doing a logistic regression or something. But I need lots of examples of this.
Steven Forth
Yeah. So we’re not going to get there by not starting.
Mark Stiving
Steven, I’m old.
Steven Forth
Let me give you a different example of something that I’ve been working on over the last couple of days. And it does, I think, relate back to benchmarking. So there’s this very big meme in the software world right now: vibe coding. And that’s the ability to generate code by using prompts and natural language.
And there’s all these different vibe coding applications out there. We use Lovable, there’s Replit, there’s Bolt. The big foundation model companies all have some version of this tucked away in there somewhere. And more and more applications and agents are actually being developed through vibe coding. So I said, that’s interesting.
So I started playing with Lovable and getting it to replicate common applications that I use all the time, to see if it could do that. Because that’s actually one of the reasons vibe coding took off, right? Reid Hoffman used one of these, I think it was Bolt, to replicate LinkedIn. And he said, this vibe coding just generated a better version of LinkedIn than exists today in about an hour.
You know, this transforms how we think about innovation and software development. But one of the things I did was say, okay, now how would you price this? So I used the vibe coding to actually develop the pricing. Now, the results were terrible. The vibe coding apps do not do a good job pricing today. That doesn’t mean they won’t do a good job pricing tomorrow.
So then I got worried about my business, because I thought, hmm, if the majority of applications are developed in vibe coding, and the vibe coding applications can be used to generate packaging and pricing, then all of our business goes away. And I think this is going to happen. In six months? Probably not. In 18 months? Probably not. In three years? Almost certainly.
So, so it’s, it’s a good thing you and I are not too young. Right. As you say, I should plan my retirement now, but then, so I put up a poll on LinkedIn as I like to do, and I ran it across a number of different groups and that was interesting. And I’ll share the results at some point next week. But then what I also did is I wrote a series of prompts to ask, how would, and I defined 25 different roles.
So how would people in each of these roles be likely to respond to the poll? And it gave me answers. And I thought, okay, that was moderately interesting. And I said, now simulate a Monte Carlo, you know, where you run lots of iterations. And so my AI was able to simulate a Monte Carlo and give me ranges and probabilities for the answers. And then I got it to do the same thing for, you know, each of the four possible answers.
It was absolutely fascinating. And now I’m cross-correlating this with the, well, what’s the difference between correlating and cross-correlating? I don’t think there is any. Now I’m correlating this with the real people, which is a bit of work. But, you know, we’ve got a light week coming up, because it’s Canada Day on the 1st and, you know, the American holiday on July 4th. Something. I think it has to do with the fireworks.
Mark Stiving
Yeah.
Steven Forth
But anyway, so, you know, I’ve got a little bit more time than usual next week. So I’m going to compare the synthetic data with the actual poll data and see how close they are. But this is the sort of thing that we can now do increasingly for all of our pricing research.
Don’t have the data? Okay. Generate a bunch of data, test it against what data you do have, use it to extrapolate and then use it to benchmark and investigate different alternatives. This is the world that we are now living in.
Mark Stiving
Okay, Steven, we’re back to a year or two years ago when you tried to tell me that I needed to know more about AI. And you convinced me, right? I mean, I’m deep into this stuff, but I’m not bought in yet. I’m not bought into synthetic data. I’m not bought into benchmarking yet. And it’s because I don’t see the truth. I mean, someplace there has to be truth.
Steven Forth
Yeah, so let’s divide those into two separate conversations. So benchmarking, the way that we’re going to get to truth is by being transparent, by testing, by having people discuss it, by finding different rubrics that we can use to test it and comparing the results from those different rubrics.
But we’re going to be able to do what would have taken 10 years, you know, three years ago, will now go very, very quickly once people adopt it. So I would think that within six to 12 months, we’ll have made huge advances on how we benchmark pricing models.
Mark Stiving
I think it’s going to take a whole bunch of these conversations to figure this out. I have to say, I’m sure you were part of them. I used to argue on LinkedIn conversations all the time. What’s the definition of value? And we don’t have an answer to that question, right? I mean, I have my own answer, but there is not a consensus in our industry on what that means. And so, how are we going to get to, you know, this is truth when it comes to benchmarking pricing models or value models or anything.
Steven Forth
Yeah. Let’s put value models aside because that’s a, and just focus on, on the pricing models. Okay. You know, so we’re going to propose using Michael Mansard’s 12 factor model. We are building a bunch of technology around that. We are going to test it and test it and test it, and then analyze the results of those tests.
And then see, for the pricing models we’ve designed, how it correlates with actual results and impact. And over time, there will be a shared understanding of what makes a good pricing model, and a shared understanding of how you measure how good a pricing model is. It’s not going to happen overnight. As you said, we’re going to have to have lots of conversations.
We’re going to need to be transparent. We’re going to need to share data and results. Other people will have different perspectives. You know, think of how many different ways that foundation models are benchmarked today. There’s like 20 different things that they’re benchmarked against. I couldn’t name them all.
Mark Stiving
Yeah, but each one of those benchmarks, when we’re reading our AI reports and we’re reading these benchmarks, each one of these benchmarks was specifically designed to be a benchmark to say, hey, how does it perform in this specific situation? It doesn’t mean that it’s right.
It doesn’t mean it does a good job for how you and I want to use it. It just means, hey, this is what it does and you can win at this game. And so certainly we could do that. But that’s not what I’m after. I’m after truth, right?
Steven Forth
But truth will emerge from multiple iterations and applications. We don’t start with truth. We have to work our way towards truth.
Mark Stiving
Okay. When I was a doctoral student, one of the things that we would do is something called a hedonic pricing model. And hedonic pricing was pretty interesting in that we took the actual prices as the dependent variable, regressed on the product attributes, to figure out how much each attribute was actually worth. And the assumption was that the company could price correctly, right? That’s the underlying assumption.
So one could make the argument that if you went out and gathered data from a ton of different pricing pages and said, these are the different pricing models, and we make the assumption that they’ve chosen an optimal pricing model, maybe it’s not perfect, but it’s probably more optimal than 90% of the other options that were out there. And then you said, look, let’s assume that’s truth. I think we get pretty close to saying, hey, what is a good pricing model? Yeah.
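For readers who have not met hedonic pricing, here is a toy version of the regression Mark describes, with invented data: observed prices are the dependent variable, and the fitted coefficients are the implicit dollar values of the attributes.

```python
import numpy as np

# Invented plans: columns are seats, storage_gb, has_sso.
X = np.array([
    [5,    10, 0],
    [10,   50, 0],
    [25,  100, 1],
    [50,  500, 1],
    [100, 1000, 1],
], dtype=float)
prices = np.array([49.0, 99.0, 299.0, 599.0, 1099.0])  # observed monthly prices

# Ordinary least squares with an intercept: price ~ attributes.
X1 = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(X1, prices, rcond=None)
intercept, per_seat, per_gb, sso_premium = coef
print(f"per seat: ${per_seat:.2f}, per GB: ${per_gb:.2f}, SSO: ${sso_premium:.2f}")
```

The “truth” assumption Mark flags is baked in: the coefficients only mean anything if the observed prices were set reasonably well in the first place.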
Steven Forth
So, and that’s what we are doing. And as we wrap more and more AI and support around this, we’ll be able to do it faster and faster for more and more pricing models, and we’ll get a richer and richer dataset. But what we really then want to say, okay, what’s the purpose of the pricing model? Because different companies have different strategies. So then we have to look at how successful the pricing model is in supporting this specific strategy.
So there’s a huge amount of work to be done. And it would have been way too much work to do a year ago, but we now increasingly have the tools or can build the tools that will allow us to do this work and start getting closer and closer to, you know, a shared understanding and a shared way of measuring what a good pricing model is.
Mark Stiving
So Steve, I want to share with you how I do this stuff. And then I’d like to hear, is this what AI could do? So typically, I know what I know because I’ve crafted a series of frameworks. And I look for examples where it doesn’t work. And I figure out, okay, why doesn’t it work? It’s a different context. It’s a different situation. My framework was wrong.
And so, it’s time to tweak the framework. And that’s why I know what I know today. Now it feels like AI could do that, but again, the key is that I’ve got to search for examples that don’t fit.
Steven Forth
I think that’s an incredibly good insight. So just to bring it back to benchmarking pricing models, let’s think, okay, so this pricing model scores poorly on the 12 factor assessment. However, we know that the company that is using this pricing model has been doing incredibly well. So then we need to dig in and say, why has it been doing incredibly well?
It could be that it’s been doing incredibly well because it has so many other things that it’s doing right, that the fact that it has a crappy pricing model, you know, gets washed out because of all the other good stuff. Or it could be that it’s actually a really good pricing model and our measurement system is wrong. So we need to start investigating and asking ourselves those questions.
But like I said, you know, we’ve got to get started, and then all of this stuff is going to come out in the wash. It’s not going to be perfect out of the box, and we’re going to have to iterate quickly and get lots of people involved, so that we can get to some form of consensus and then test, evolve, find the exceptions, use the exceptions to drive improvements. It’s going to be a great voyage.
Mark Stiving
Steven, as always, we’re way over time. Before we wrap, instead of asking you for a final piece of advice, give us a quick advertisement for the new ValueIQ.
Steven Forth
Thank you. So ValueIQ is an agent that Ibbaka is going to be bringing into beta in July. And what ValueIQ does is it makes it easy for a salesperson to sell on value. It does that by generating a value model in the background, the salesperson never needs to look at that value model, then by applying that value model to the specific deal involved.
And then by using that to create a value story. So it basically, you know, supports a conversation that the salesperson has with the agent to understand the value in the context of this deal, and then generates the story that they need to talk about value in a sales interaction.
Mark Stiving
Awesome. So I signed up to be one of your beta testers, one of your early testers. Once I get access and play with it, let’s have another podcast. And I’m just going to ask you hard questions about ValueIQ. Now, you know, they may not be sales type questions. It may just be really challenging questions, but if anyone can handle them, you can.
Steven Forth
Well, I might bring Amar into that as well, because Amar Dhaliwal has led a lot of the work on the ValueIQ agent.
Mark Stiving
No worries, no worries. But I’m fascinated by it, right? I cannot wait to see it.
Steven Forth
We were benchmarking it today, you know, and that’s really the essence of the story: when Intercom said that, you know, we fought every day to improve the performance of FinAI, I thought, man, that’s exactly what Ibbaka has to be doing with ValueIQ. And we’re also going to be coming out with a pricing IQ agent, probably late this year. We just need to be benchmarking every day.
And today’s performance was disappointing. It was overestimating the value. And I think I know why. There were probably three reasons why it was doing that. So there’s just a lot of tuning that we need to be doing on it. But benchmarking every day, I think, is going to be critical to the success of agents that do important business things.
Mark Stiving
Okay. Steven, fascinating. Thank you very much for your time today. If anybody wants to contact you, how can they do that?
Steven Forth
Connect with me on LinkedIn, I’m easy to find. Steven with a V, Forth, F-O-R-T-H. And my email is Steven, S-T-E-V-E-N, at Ibbaka.com.
Mark Stiving
And Ibbaka is spelled with two B’s and one K?
Steven Forth
That’s right. I-B-B-A-K-A.
Mark Stiving
Man, I always get that. It’s either two B’s or two K’s and I always confuse it. So it’s two B’s and one K. Yeah.
Steven Forth
Well, three syllables. Ib-ba-ka.
Mark Stiving
Ah. Now it’s starting to make sense.
Steven Forth
You have to get Karen Chiang on and she will explain the reason for the three syllables.
Mark Stiving
Okay, perfect. To our listeners, thank you for your time. If you enjoyed this, would you please leave us a rating and a review? And if you have any questions or comments about the podcast, or if you want to get paid more for the value you deliver, email me, mark@impactpricing.com. Now go make an impact.
[Ad / Outro]