A buyer's guide to selecting the right big data analytics software
A collection of articles that takes you from defining technology needs to purchasing options
KNIME is an open source data analytics, reporting and integration platform developed and supported by KNIME.com AG. Through the use of a graphical interface, KNIME enables users to create data flows, execute selected analysis steps and review the results, models and interactive views.
Written in Java and built on Eclipse, KNIME uses Eclipse's plug-in mechanism to extend its functionality. Available plug-ins integrate methods for text mining, image mining and time series analysis. KNIME also integrates various other open source projects, including machine learning algorithms from Weka, the R statistics project and the JFreeChart charting library. It supports wrappers to call other code and provides nodes that let users run Java, Python, Perl and other code fragments. Because extensions are delivered as plug-ins, connector and extender nodes for a wide range of systems and platforms continue to be added.
In addition to the KNIME open source data analytics platform, KNIME offers the following commercial products:
- KNIME Personal Productivity, which provides a way to efficiently build and maintain KNIME workflows. Code snippets and meta nodes in a workflow can more easily be managed, reused and shared.
- KNIME Partner Productivity, which provides consulting organizations with the ability to encrypt and lock encapsulated meta nodes to be shared with clients, while protecting their intellectual property.
- KNIME Team Space, which improves team collaboration efforts by providing a means for storing data flows and analysis workflows centrally to be shared and worked on collaboratively by multiple team members.
- KNIME Server Lite, which provides advanced collaboration capabilities -- such as basic user authentication and user rights, remote scheduled execution, report generation, shared data space, workflow repository, and meta nodes and priority updates.
- KNIME Server, which adds advanced features to the Server Lite extension, including more advanced user authentication and user rights, Web services support, workflow versioning and commercial support.
- KNIME Big Data Extension, which provides nodes for accessing data stored in Hadoop Distributed File System (HDFS) and Hive databases from within KNIME.
- KNIME Cluster Execution, which provides a thin connection layer between KNIME and a cluster to improve performance by helping to optimize cluster use with KNIME.
Version 2.11 of KNIME enhances database connectivity, including improvements to the GroupBy node's aggregation methods, database-specific aggregation methods and aggregation column selection based on pattern matching. Additional database nodes improve database integration and handling, and a new connector node supports connections to HP Vertica databases.
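To make "aggregation column selection based on pattern matching" concrete, here is a minimal Python sketch of the idea. It is not KNIME code: KNIME configures this through its node dialogs, and the function name `group_by`, the sample rows and the column names are all hypothetical.

```python
import re
from collections import defaultdict


def group_by(rows, group_col, column_pattern, aggregate):
    """Group rows by group_col and aggregate each column whose name matches column_pattern.

    Illustrative only: all names here are assumptions, not part of KNIME.
    """
    if not rows:
        return {}
    pattern = re.compile(column_pattern)
    # Select aggregation columns by name pattern instead of listing them one by one.
    columns = [c for c in rows[0] if c != group_col and pattern.fullmatch(c)]
    # Collect the values of each selected column per group.
    buckets = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for col in columns:
            buckets[row[group_col]][col].append(row[col])
    # Apply the aggregation function to every collected list of values.
    return {
        key: {col: aggregate(values) for col, values in cols.items()}
        for key, cols in buckets.items()
    }


rows = [
    {"region": "north", "sales_q1": 10, "sales_q2": 20, "visits": 3},
    {"region": "north", "sales_q1": 5, "sales_q2": 15, "visits": 4},
    {"region": "south", "sales_q1": 7, "sales_q2": 8, "visits": 9},
]

# Sum every column whose name starts with "sales_", grouped by region.
result = group_by(rows, "region", r"sales_.*", sum)
print(result)
```

The pattern picks up both `sales_q1` and `sales_q2` without naming them individually, while `visits` is left out because it does not match. That is the convenience pattern-based selection offers when a table has many similarly named columns.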
Enhancements to KNIME Big Data Extension (Commercial Extension) include new Impala Connector and loader nodes, as well as new nodes that extend file-handling node capabilities to operate on HDFS. Connectors for well-known databases and data repositories are available to enable KNIME to extract data from those sources.
Executable versions of KNIME Analytics Platform are available for Microsoft Windows and Linux -- both 32- and 64-bit -- as well as Mac OS X.
KNIME licensing and pricing
KNIME commercial extensions are available from KNIME, as well as other partners and resellers.
KNIME support is generally provided through the website's online forums and community support. Additional support is available from KNIME GmbH with the purchase of the KNIME Server commercial extension. KNIME GmbH also offers other support contracts.