"I personally think that if we do this right, we don't need MLOps," says Luis Ceze, OctoML CEO, regarding the company's bid to make deployment of machine learning just another function of the DevOps software process.
The field of MLOps has arisen as a way to get a handle on the complexity of industrial uses of artificial intelligence.
That effort has so far failed, says Luis Ceze, who is co-founder and CEO of startup OctoML, which develops tools to automate machine learning.
"It's still pretty early to turn ML into a common practice," Ceze told ZDNet in an interview via Zoom.
"That's why I'm a critic of MLOps: we're giving a name for something that's not very well defined, and there's something that's very well defined, called DevOps, that's a very well defined process of taking software to production, and I think that we should be using that."
"I personally think that if we do this right, we don't need MLOps," Ceze said.
"We can just use DevOps, but for that you need to be able to treat the machine learning model as if it was any other piece of software: it has to be portable, it has to be performant, and doing all of that is something that's very hard in machine learning because of the tight dependence between the model, and the hardware, and the framework, and the libraries."
Ceze contends that what is needed is to solve dependencies that arise from the highly fractured nature of the machine learning stack.
OctoML is pushing the notion of "models-as-functions," referring to ML models. It claims the approach smooths cross-platform compatibility and synthesizes the otherwise disparate development efforts of machine learning model building and conventional software development.
OctoML began life offering a commercial service version of the open-source Apache TVM compiler, which Ceze and fellow co-founders invented.
On Wednesday, the company announced an expansion of its technology, including automation capabilities to resolve dependencies, among other things, and "Performance and compatibility insights from a comprehensive fleet of 80+ deployment targets." Those targets include a myriad of public cloud instances from AWS, GCP, and Azure, with support for different CPU architectures (x86 and ARM), GPUs, and NPUs from multiple vendors.
"We want to get a much broader set of software engineers to be able to deploy models on mainstream hardware without any specialized knowledge of machine learning systems," said Ceze.
The code is designed to address "a big challenge in the industry," said Ceze, namely, "the maturity of creating models has increased quite a bit, so, now, a lot of the pain is shifting to, 'Hey, I have a model, now what?'"
The average time to go from a new machine learning model to deployment is twelve weeks, notes Ceze, and half of all models don't get deployed.
"We want to shorten that to hours," said Ceze.
If done right, said Ceze, the technology should lead to a new class of programs called "Intelligent Applications," which OctoML defines as "apps that have an ML model integrated into their functionality."
OctoML's tools are meant to serve as a pipeline that abstracts the complexity of taking machine learning models and optimizing them for a given target hardware and software platform.
That new class of apps "is becoming most of the apps," said Ceze, citing examples of the Zoom app allowing for background effects, or a word processor doing "continuous NLP," or natural language processing.
"ML is going everywhere, it's becoming an integral part of what we use," observed Ceze, "it should be able to be integrated very easily - that's the problem we set out to solve."
The state of the art in MLOps, said Ceze, is "to make a human engineer understand the hardware platform to run on, pick the right libraries, work with the Nvidia library, say, the right Nvidia compiler primitives, and arrive at something they can run.
"We automate all of that," he said of the OctoML technology. "Get a model, turn it into a function, and call it," should be the new reality, he said. "You get a Hugging Face model, via a URL, and download that function."
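To make the "turn it into a function, and call it" idea concrete, here is a minimal Python sketch of what treating a model as an ordinary function might look like from the application developer's side. It is not OctoML's API or tooling: the names `make_model_function`, `WEIGHTS`, and `BIAS` are hypothetical, and a tiny linear scorer with made-up parameters stands in for a real trained model.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for a trained model: made-up parameters for a
# tiny linear scorer. A real model would be far larger and would come
# from a training framework, not from constants in the source file.
WEIGHTS = [0.5, -0.25, 1.0]
BIAS = 0.1


def make_model_function(
    weights: Sequence[float], bias: float
) -> Callable[[Sequence[float]], float]:
    """Package model parameters behind a plain function interface."""

    def predict(features: Sequence[float]) -> float:
        # Reject inputs whose shape does not match the model.
        if len(features) != len(weights):
            raise ValueError(
                f"expected {len(weights)} features, got {len(features)}"
            )
        # Weighted sum of the inputs plus the bias term.
        return sum(w * x for w, x in zip(weights, features)) + bias

    return predict


# The application sees only a callable; nothing about frameworks,
# libraries, or hardware leaks into the calling code.
predict = make_model_function(WEIGHTS, BIAS)
score = predict([2.0, 4.0, 1.0])
```

The point of the sketch is only the shape of the interface: the caller imports a function and calls it, the same way it would call any other piece of software that goes through a DevOps pipeline.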
The new version of the software makes a special effort to integrate with Nvidia's Triton inference server software.
Nvidia said in prepared remarks that Triton's "portability, versatility and flexibility make it an ideal companion for the OctoML platform."
Asked about the addressable market for OctoML as a business, Ceze pointed to "the intersection of DevOps and AI and ML infrastructure." DevOps is "just shy of a hundred billion dollars," and AI and ML infrastructure is multiple hundreds of billions of dollars in annual business.