OpenAI buys Windsurf

People are discussing on x dot com why OpenAI acquired Windsurf. It’s clearly not for the revenue: a $100 million revenue stream won’t add much value to OpenAI, which is already valued at $300B. The Windsurf team is certainly talented and acquiring them is beneficial, but I believe the main reason is that although OpenAI has become the default application for general chat, it is lagging behind Claude and the rapidly improving Gemini models when it comes to coding capabilities.

How does OpenAI catch up? When building better models, they typically conduct Reinforcement Learning from Human Feedback (RLHF) through partners like Mercor and Scale AI, who help with labeling tasks and provide feedback on model outputs. Imagine you are trying to improve iOS code generation: you would need a lot of iOS experts from Mercor, Scale AI, and Turing AI to give you feedback on the generated code, because there is not enough iOS code out in the wild for foundation models to train on. This becomes expensive. It consists of repeated transactions with human labellers who have no direct relationship with the foundation model company and are instead sourced through these third parties. OpenAI of course doesn’t want these operational expenses (additional headcount of experts) on its books, nor does it want to keep going through third parties that cut into its margins.

The best approach is to acquire software that enables an ongoing feedback loop. Currently, when people use ChatGPT to generate code, OpenAI receives no feedback about whether users applied the code, what iterations they made, or whether they even pushed the recommended code to GitHub. There’s no feedback loop: OpenAI provides output in the ChatGPT client, and there is no accept or reject button and no response from the user on whether it served the purpose. Users might not even say “thank you, this worked” because that costs additional tokens. And Sam Altman himself has complained about how much money OpenAI burns because of these “thank you”s in general conversations.

With Windsurf, users interact with agents that rely on foundation models, and in doing so they provide continuous feedback. The system tracks user satisfaction with responses and subsequent actions. More importantly, because Windsurf is a wrapper around multiple models, acquiring it gives OpenAI feedback not just on its own models but also on competitors’ models. This creates a permanent data and feedback loop that will help build better coding agents.

Amjad from Replit has suggested that the key to achieving AGI is having the perfect coding agent, as it can code whatever you want and eventually self-replicate. I don’t remember where he said it, so apologies if I misremember. Foundation model companies know this. Anthropic now has Claude Code, and OpenAI recently launched a standalone coding product that you can run from your terminal, called OpenAI Codex CLI.

However, these products might not be sufficient because they’re terminal based and targeted mostly at professional developers. Yes, pro devs use IDEs too, but my point is that what foundation model companies have shipped so far are terminal based coding agents. Windsurf and Cursor have access to much more data because their user-friendly interfaces attract not just professional developers but also novice coders and “vibecoders”, essentially the entire spectrum of developers. This results in significantly higher volumes of data and feedback on response quality. I believe an IDE with the largest base of coders will always outperform the current iterations of Claude Code and OpenAI Codex when it comes to data generation and feedback loops. After reviewing dozens of transcripts from companies like Mercor and Scale AI on Tegus, I’ve come to understand the costs that foundation model companies and application layer products incur to ensure optimal performance in specialised tasks like coding through human-in-the-loop feedback.

Coding is a P0 use case for foundation model companies, and it seems inevitable that Anthropic will also try to acquire a coding IDE. Sundar Pichai following the Cursor account on x dot com has led to speculation that Google might eventually acquire Cursor. There will be significant competition for Cursor though. With their impressive $300 million in revenue and recent $900 million raise, they likely won’t sell cheaply anytime soon.

Google has recently released their own coding IDE, but I doubt people will adopt it given Google’s history of shutting down products that don’t achieve their desired scale.