This article originally appeared on the
When deciding on a business intelligence solution, one of the most important decisions is what type of data integration product to buy. The data integration sector is inundated with products, and selecting one can be unnerving. Because these products vary so widely in capability, it is important to consider the scope of the project when selecting one. Capabilities range from a simple Extract, Transform, Load (ETL) tool that imports data from spreadsheets to a tool that can support a large Enterprise Information Integration (EII) project connecting numerous data warehouses, databases, data marts and other data sources.
ETL is used to gather data from various sources (usually operational environments), cleanse and transform it, and then load it into database tables. ETL tools are found in many different products, ranging from database platforms to data integration suites. While some of these products are rather limited and simple, others provide very comprehensive data integration capabilities.
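The extract, cleanse/transform and load steps described above can be sketched in a few lines. This is only a minimal illustration, not how any particular ETL product works: the sample data, the `customers` table and the function names are all assumptions made for the example, and an in-memory SQLite database stands in for the target.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from an operational system.
RAW_CSV = """customer_id,name,country
1, Alice Smith ,us
2,bob jones,US
2,bob jones,US
3,,de
"""


def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Cleanse and transform: trim, normalize case, drop blanks and duplicates."""
    seen = set()
    clean = []
    for row in rows:
        name = row["name"].strip().title()
        if not name:
            continue  # cleansing rule (assumed): skip rows without a name
        key = int(row["customer_id"])
        if key in seen:
            continue  # cleansing rule (assumed): keep the first row per id
        seen.add(key)
        clean.append((key, name, row["country"].strip().upper()))
    return clean


def load(conn, records):
    """Load: write the cleansed records into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Run the three steps in order and return what ended up in the table."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(RAW_CSV)))
    rows = conn.execute(
        "SELECT customer_id, name, country FROM customers ORDER BY customer_id"
    ).fetchall()
    conn.close()
    return rows
```

Calling `run_etl()` leaves two rows in the table: the duplicate and the nameless record are dropped during the transform step.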
Simpler ETL tools extract data from disparate sources and are generally used in smaller-scale solutions. The data is then cleansed and consolidated into relational tables. These tools load data in batches and cannot accommodate real-time loading. They also offer only basic transformation types, such as read and write, copy column and string manipulation, along with support for custom scripts. As a result, most complex transformations have to be hand-coded in custom scripts. These limitations make such tools better suited to smaller-scale projects.
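To make the basic transformation types concrete, here is a rough sketch of copy column, string manipulation and a custom script applied to rows held as dictionaries. The function names and the sample row are hypothetical; real tools expose these operations through their own interfaces.

```python
def copy_column(rows, src, dst):
    """Copy column: duplicate the value of one column under a new name."""
    return [{**row, dst: row[src]} for row in rows]


def map_string(rows, column, func):
    """String manipulation: apply a string function to one column."""
    return [{**row, column: func(row[column])} for row in rows]


def custom_script(rows, func):
    """Custom script: hand the whole row to user-written code."""
    return [func(dict(row)) for row in rows]


def add_price_cents(row):
    # Hand-coded logic that the built-in transformation types cannot express.
    row["price_cents"] = round(float(row["price"]) * 100)
    return row


def run_simple_transforms():
    rows = [{"sku": " ab-1 ", "price": "10.50"}]  # hypothetical input row
    rows = copy_column(rows, "sku", "sku_raw")
    rows = map_string(rows, "sku", lambda s: s.strip().upper())
    rows = custom_script(rows, add_price_cents)
    return rows
```

Anything beyond the first two helpers ends up in `custom_script`, which is where the hand-coding effort accumulates as transformations grow more complex.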
More comprehensive data integration suites are better suited to larger-scale data integration projects. These products can create complex ETL transformations as well as Enterprise Information Integration (EII) and Enterprise Application Integration (EAI) solutions. Whereas an EII solution pulls data from multiple data sources and delivers it in real time, an EAI solution integrates transactions from multiple applications. These larger integration efforts can be addressed with comprehensive data integration suites.
Products equipped with complex built-in transformations facilitate complex ETL processes. They also cut down on the need for custom scripts and hand coding, thereby reducing the cost and time required to create an integration solution. They have robust loading options as well, such as parallel processing and distributed processing. In parallel processing, multiple batch loads are processed simultaneously. Robust loading options also allow for much better server memory management. EII solutions can be created with products that use web services as a source of real-time data.
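Processing multiple batch loads simultaneously might look roughly like the sketch below. It uses a thread pool from the Python standard library, and `StagingTable` is a hypothetical in-memory stand-in for a target database table; a real product would manage its own workers, connections and memory.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StagingTable:
    """Hypothetical in-memory stand-in for a target database table."""

    def __init__(self):
        self._rows = []
        self._lock = threading.Lock()

    def insert_many(self, rows):
        # The lock keeps concurrent batch inserts from interleaving.
        with self._lock:
            self._rows.extend(rows)

    def count(self):
        with self._lock:
            return len(self._rows)


def load_batch(table, batch):
    """Load one batch and report how many rows were written."""
    table.insert_many(batch)
    return len(batch)


def parallel_load(table, batches, workers=4):
    """Submit every batch to a pool so several loads run at the same time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda batch: load_batch(table, batch), batches))
    return sum(counts)
```

For example, passing eight batches of 100 rows each to `parallel_load` loads all 800 rows, with up to four batches in flight at once. Threads suit loads that spend most of their time waiting on the database or network; CPU-heavy transformations would need a different approach.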
Some of these data integration products are not restricted to accessing database platforms. They can also connect to LDAP directories, XML documents, flat files and packaged applications such as SAP and PeopleSoft. The ability to use XML and flat files can cut down on memory costs when transferring data. The ability to connect to packaged applications is valuable because it allows Enterprise Resource Planning (ERP) and Supply Chain Management (SCM) systems to be included in EAI and EII solutions. Together, these features provide flexibility in managing and transforming data for larger data integration solutions.
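As a small illustration of reading non-database sources, the sketch below parses an XML document and a pipe-delimited flat file into one common record format. The sample data, field names and delimiter are assumptions made for the example; connecting to LDAP or to packaged applications such as SAP requires vendor-specific connectors and is not shown.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample sources; real feeds would come from files or services.
XML_SOURCE = """<orders>
  <order id="1"><customer>Acme</customer><total>19.99</total></order>
  <order id="2"><customer>Globex</customer><total>5.00</total></order>
</orders>"""

FLAT_SOURCE = "3|Initech|42.50\n4|Umbrella|7.25\n"


def read_xml(text):
    """Turn each <order> element into a record dictionary."""
    root = ET.fromstring(text)
    return [
        {
            "id": int(order.get("id")),
            "customer": order.findtext("customer"),
            "total": float(order.findtext("total")),
        }
        for order in root.findall("order")
    ]


def read_flat(text):
    """Turn each pipe-delimited line into the same record shape."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    return [
        {"id": int(fields[0]), "customer": fields[1], "total": float(fields[2])}
        for fields in reader
    ]


def read_all():
    """Combine both sources into one list of uniformly shaped records."""
    return read_xml(XML_SOURCE) + read_flat(FLAT_SOURCE)
```

Once every source yields the same record shape, the downstream transformation and loading steps no longer need to know where a record came from.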
Knowing the capabilities of data integration products can help determine whether a project calls for a simple ETL, complex ETL, EII or EAI solution. While choosing the right product can cut overhead costs and labor when creating data integration solutions, choosing the wrong one can increase both. Therefore, it is important to be familiar with the various data integration products and their capabilities before choosing one.