
Domino Data Lab adds autoscaling to MLOps

Feb 09, 2022 Hi-network.com

As Big on Data bro Andrew Brust reported last fall, Domino Data Lab has of late been taking a broader view of MLOps, from experiment management to continuous integration/continuous delivery of models, feature engineering, and lifecycle management. In the recently released 5.0 version, Domino focuses on obstacles that typically slow deployment into production.

Chief among the new capabilities is autoscaling. Before this, data scientists had to either play the role of cluster engineers or work with them to get models into production and manage compute. The new release allows this step to be automated, leveling the playing field with cloud services such as Amazon SageMaker and Google Vertex AI, which already offer autoscaling, and Azure Machine Learning, which offers it in preview. Further smoothing the way, it is certified to run on the Nvidia AI Enterprise platform (Nvidia is one of the investors in Domino).

The autoscaling features build on support for Ray and Dask (in addition to Spark) that was added in the previous 4.6 release; these frameworks provide APIs for building distributed computing directly into the code.
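To illustrate what those APIs look like, here is a minimal Dask sketch (generic Dask, not Domino-specific): the same code runs on a laptop or fans out across a cluster, which is the property an autoscaling backend can exploit.

```python
# Minimal Dask example (not Domino-specific). Dask builds a lazy task
# graph that a scheduler can spread across however many workers exist.
import dask.array as da

# A 1,000 x 1,000 array of ones, split into 250 x 250 chunks; each
# chunk is an independently schedulable unit of work.
x = da.ones((1000, 1000), chunks=(250, 250))

# Nothing executes until .compute() triggers the scheduler.
total = x.sum().compute()
print(total)  # 1000000.0
```

The point of the chunked, lazy design is that scaling out requires no code change, only more workers, which is what platform-level autoscaling supplies.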

Another new 5.0 feature tackling deployment is a new library of data connectors, so data scientists don't have to reinvent the wheel each time they connect to Snowflake, AWS Redshift, or AWS S3; other data sources will be added in the future.

Rounding out the 5.0 release is built-in monitoring. This integrates what was previously a standalone capability that had to be manually configured. With 5.0, Domino automatically sets up monitoring once a model is deployed, capturing live prediction streams and running statistical checks of production vs. training data. And for debugging, it captures snapshots of the model: the version of the code, data sets, and compute environment configurations. With a single click, data scientists can spin up a development environment of the versioned model to do debugging. The system, however, does not at this point automate detection or make recommendations on where models need to be repaired.
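Domino's exact statistical checks aren't detailed here, but comparing production against training data typically amounts to a two-sample drift test per feature. A minimal, hypothetical sketch (not Domino's implementation) using SciPy's Kolmogorov-Smirnov test:

```python
# Hypothetical drift check, not Domino's actual monitoring code:
# compare a feature's distribution in training data vs. live
# production data with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
training = rng.normal(loc=0.0, scale=1.0, size=5000)    # feature at training time
production = rng.normal(loc=0.5, scale=1.0, size=5000)  # shifted in production: drift

stat, p_value = ks_2samp(training, production)
if p_value < 0.01:
    print(f"Drift detected (KS statistic={stat:.3f}, p={p_value:.2e})")
```

In practice a monitoring service runs checks like this on a schedule over the captured prediction stream and alerts when the test rejects; deciding *why* the model drifted and how to repair it remains manual, which matches the caveat above.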

The spark (no pun intended) for the 5.0 capabilities is tackling operational headaches that force data scientists to perform system or cluster engineering tasks themselves or rely on admins to perform them.

But there is also the data engineering bottleneck, as we found from research we performed for Ovum (now Omdia) and Dataiku back in 2018. From in-depth discussions with over a dozen chief data officers, we found that data scientists typically spend over half their time on data engineering. The 5.0 release tackles one major hurdle in data engineering -- connecting to popular external data sources -- but currently, Domino does not address the setting up of data pipelines or, more elementally, automating data prep tasks. Of course, the latter (integration of data prep) is what drove DataRobot's 2019 acquisition of Paxata.

The 5.0 features reflect how Domino Data Lab, and other ML lifecycle management tools, have had to broaden the focus from the model lifecycle to deployment. That, in turn, reflects the fact that, as enterprises get more experienced with ML, they are developing more models more frequently and need to industrialize what had originally been one-off processes. We wouldn't be surprised if Domino next pointed its focus at feature stores.

