OpenAI partners with Stack Overflow to make models better at coding



A week after partnering with the Financial Times, OpenAI has announced it is teaming up with Stack Overflow, the open platform for discussing and voting on common coding challenges, to improve its models’ ability to handle programming-related tasks.

Under the engagement, the terms of which remain undisclosed, OpenAI developers will have access to Stack Overflow’s vetted technical knowledge, code and community to improve the performance of their models. The partnership will also make ChatGPT better at programming, both companies confirmed in a statement.

Since ChatGPT took the world by storm, Stack Overflow traffic has declined as programmers turn to the chatbot to complete code and resolve questions and issues, reducing their need for Stack Overflow.

The development also comes more than a year after Stack Overflow banned ChatGPT-generated code and content on its platform over concerns about inaccurate information hitting the site. Now, Stack Overflow is moving to help improve that same chatbot.

What to expect from the OpenAI-Stack Overflow partnership?

Stack Overflow has built a subscription-based API service, dubbed OverflowAPI, that gives enterprises looking to train and fine-tune large language models continuous access to its public datasets (accumulated over the last 15 years). The company roped in Google as its first customer with a deal to provide data for the Gemini family of models.

Now, with the latest partnership, the company is giving this API (and the technical content it provides access to) to developers leveraging OpenAI models, including the GPT-4 family. This will help them improve the models using vetted technical content and feedback from the Stack Overflow community.

As the API is used, ChatGPT, which also runs on the GPT family of models, will begin to provide improved coding-related answers that draw on technical knowledge from Stack Overflow. According to the companies, this will give users of the chatbot access to trusted, attributed, accurate and highly technical knowledge and code backed by millions of developers.

However, given the credibility concerns many may still have, each ChatGPT response using Stack Overflow data will also link back to the most relevant posts that influenced the summarized answer. This way, users can go to the platform for “deeper engagement.”

But sharing data is just one part of the deal.

Stack Overflow also said that it will use OpenAI’s models to accelerate the development of OverflowAI, a series of generative AI capabilities on both the public Stack Overflow site and its enterprise offering, Stack Overflow for Teams. The partnership with Google also helped the company build out gen AI features for its knowledge platform. The first set of integrations and capabilities developed with OpenAI will be available before mid-2024.

“Stack Overflow is the world’s largest developer community, with more than 59 million questions and answers. Through this industry-leading partnership with OpenAI, we strive to redefine the developer experience, fostering efficiency and collaboration through the power of community, best-in-class data, and AI experiences,” Stack Overflow CEO Prashanth Chandrasekar said.

“Our goal with OverflowAPI, and our work to advance the era of socially responsible AI, is to set new standards with vetted, trusted, and accurate data that will be the foundation on which technology solutions are built and delivered to our user,” he added.

OpenAI is racing to gather data

The partnership is OpenAI’s latest effort to improve its models with data from credible sources. Just a week ago, the company partnered with the FT to expand the journalistic content available via ChatGPT.

Similar deals have also been signed with other sources, including the Associated Press, German media company Axel Springer and the American Journalism Project. Back in January, The Information reported that OpenAI is paying between $1 million and $5 million to sign licensing deals with media firms.