Integrated Deployment, a new addition to the KNIME analytics platform, aims to eliminate the gap between the development of data science models and their deployment.
KNIME, an open source analytics vendor founded in 2004 and based in Zurich, Switzerland, unveiled Integrated Deployment on Wednesday during KNIME Spring Summit 2020, the company's virtually held user conference.
One of the many problems that arise during the analytics process is how to take a machine learning model that's been prepared and developed by data scientists and deliver it to business users so they can use it in the decision-making process.
Normally, what in theory should be a simple transfer is actually far from simple. It requires that data be moved in exactly the right form for consumption, and that can mean that data scientists must manually replicate the model's settings as the model is transferred into the workflows of end users.
With Integrated Deployment, the KNIME analytics platform automates the labor involved in the transfer, eliminating the need to manually replicate data models repeatedly, as well as making the data instantly available for analysis.
"The idea is that the data scientists don't need to struggle with anybody to do the translation of whatever they are creating -- they can directly deploy that -- and the business users on the opposite end of the food chain benefit because they get to see the changes literally within a minute," said Michael Berthold, co-founder and CEO of KNIME. "They don't have to wait for anybody in the middle to make some translation or move the models around or do some adjustments."
Berthold likened developing a model and then deploying it in a form end users can use to a chef who creates recipes in a high-end restaurant kitchen and then has to rewrite them so that people can recreate the dishes in their own kitchens without the restaurant's tools.
"The data scientists are using wild Python libraries and whatever tool they can get their hands on, downloading them from academic sites," Berthold said. "They're trying out deep learning architectures, but in the end on the deployment side, in order to actually get that running you're very often limited to a different type of setup and there needs to be some sort of translation in the middle."
That translation in the middle that the KNIME analytics platform is addressing has long been needed, said Mike Leone, senior analyst at Enterprise Strategy Group.
"There are so many dependencies going from golden model to production that it's significantly delaying time-to-production and time-to-value," he said. "Additionally, over time it's not uncommon for data drift to force organizations to have to retrain models and re-deploy updated models. The gaps and roadblocks KNIME is addressing with Integrated Deployment will enable organizations to rapidly and effectively put models to work in production environments."
Similarly, Doug Henschen, an analyst at Constellation Research, said that the gap between model development and deployment can be a significant problem -- sometimes even debilitating -- but noted that KNIME is not the first vendor to address the problem.
"KNIME is not the only vendor to try to overcome this gap with what's billed as an end-to-end platform," he said. "The large public cloud vendors and multiple software vendors, both large and small, have also made efforts to close this gap."
Specifically, Henschen noted that AWS, Microsoft, Google, SAS, IBM, Oracle, Tibco and Domino Data Lab have all tried to tackle the development-to-deployment process.
He also cautioned that, as with any potential implementation, enterprises need to compare tools before deciding on which vendor's product fits best within their organization.
"Would-be customers should organize a multi-disciplinary team to consider available options," Henschen said. "Will the end-to-end platform require the DevOps side of the house to license and learn new software, and to what degree will it complement current tools, processes and practices? Change management and software adoption can be significant obstacles to realizing promised benefits."
Beyond Integrated Deployment, Berthold noted that some of the significant future additions to the KNIME analytics platform -- the vendor releases updates twice a year -- will center on the KNIME Hub, the vendor's exchange platform for workflows, which it first made available in March 2019.
"It is an open source project," said Berthold. "We have a lot of community contributions that are adding their new algorithms to the platform, and at the same time we're also providing wrappers around some of these open source developments."