OctoAI’s OctoStack Helps Deploy Private AI Models in the Enterprise




Seattle-based OctoAI has a new offering called OctoStack, designed to help enterprises deploy private generative AI models. Companies can run this “turn-key production platform” in a virtual private cloud or on-premises, with access to highly optimized inference, model customization and asset management. In doing so, OctoAI wants to give companies the freedom to build and run gen AI applications in the way they see fit.

“Enabling customers to build viable and future-proof Generative AI applications requires more than just affordable cloud inference,” Luis Ceze, OctoAI’s chief executive, said in a statement. “Hardware portability, model onboarding, fine-tuning, optimization, load-balancing: these are full-stack problems that require full-stack solutions.”

OctoStack supports fine-tuning and deployment of a range of open source and commercial AI models, such as Meta’s Llama family, Mistral’s Mixtral 8x7B and Stable Diffusion models. However, it doesn’t include Anthropic’s Claude, because the AI is only offered in the cloud via Anthropic. “But we offer a lot of these super capable open source models that you can fully control and customize for,” Ceze said.

How the OctoStack platform works in the enterprise. Image credit: OctoAI

From Fully Managed to Do-It-Yourself

This isn’t the startup’s first attempt to provide companies with a packaged AI offering. Last year, OctoAI released its self-optimizing infrastructure service. As Ceze explains, the difference is that last year’s service is a fully managed solution. “That means that you call our APIs, offers highly efficient inference, and we have support for customizing the model,” he told VentureBeat. “We have support for building model cocktails and so on, all with the enterprise and production in mind.”

In contrast, OctoAI’s OctoStack is a self-managed offering. Ceze said that once the company’s customers started to “do many billions of tokens a day” and there were “millions of images” generated on the platform daily, it became clear there was a need “for more private deployments of our technology.” It’s comparable to having your blog hosted on WordPress.com versus being on your own private server — an analogy Ceze didn’t dispute.

“As enterprises start getting serious about deploying AI, they’re nervous about sending data over an API outside their control,” Ceze explains. “What we do with OctoStack is they can choose their model, customize their models, and offer that as a totally private API. And we provide all of the infrastructure for it. That means we take care of how the model becomes reliable and efficient across their GPUs.”

Hundreds of customers currently use OctoAI’s fully managed solution, but Ceze declined to share how many have signed up for OctoStack. Instead, he referred me to those listed in the company’s press release: Apate.ai, Otherside AI, Latitude Games and CapitalAI. However, I’m told the companies being targeted are those that are already experimenting with gen AI tools and are now looking to deploy these models into a production environment.

A Wide Open Market for Enterprise AI

There is a tremendous opportunity for generative AI adoption within the enterprise. A Menlo Ventures report found that enterprises spent $400 billion on cloud software last year. About $70 billion of that (18 percent) went to AI, while gen AI accounted for just $2.5 billion, less than 1 percent.

Enterprise investment in generative AI is small compared to enterprise budgets for traditional AI and cloud software. Image credit: Menlo Ventures

“The current usage and availability of Generative AI in the enterprise is technically high, with over half of CIOs having some plans to formally deploy Generative AI and the popularity of services and models such as Microsoft Copilot, ChatGPT, Midjourney and many others,” Amalgam Insights chief executive and analyst Hyoun Park shared with me. “But the capabilities associated with customization, fine-tuning and augmenting models is still low…”

Constellation Research founder and principal analyst Ray Wang shares that right now, “most organizations are trying to optimize for a multi-vendor world, hence there have not been any pure-gen AI stacks. Bringing your own app frameworks, models and data is the predominant approach.” He describes OctoStack as a good thing because “it’s easier to have the stack in one place.”

OctoAI may not have long to rest on its laurels. It faces stiff competition not only from fellow startups but also from enterprise incumbents, including Nvidia, Databricks and SambaNova Systems. Ceze says he’s not worried: “I’m pretty sure this is a hot space and we have to expect that others who will have offerings that compete with this and the way we continue to differentiate is again using our unique expertise in doing cross-tech optimizations. That’s the DNA of our company. That’s how we started.”