
Customer data integration best practices

In the second part of her inaugural article for Business Intelligence Network's CDI and CDM channel, Jill Dyche sets out to bust some prevalent myths about customer data integration and set

This article originally appeared on the BeyeNETWORK.

First, I want to thank those of you who wrote to welcome me to the Business Intelligence Network. When it comes to customer data integration (CDI) and customer data management (CDM), I truly believe we are in the very early days, so it will be fun to watch as these nascent disciplines evolve. I’m happy that you’ll be accompanying me on the journey.

In Part I of this series, we did some Q&A on CDI, particularly as it relates to current data warehousing and business intelligence (BI) disciplines. We introduced the concept of a customer data hub, explained the differences between CDI and the ODS, and showed some examples of CDI in action. In this article, we’ll discuss some of the modern fiction around CDI, offering some of the common – but misinformed – arguments against CDI hubs (usually by people who are…er…“change resistant”). Here is a list of the most frequent claims about CDI made by those who are likely to have been led astray by their own biases and paradigms:

CDI is data warehousing on steroids.
First, let’s reiterate our definition of CDI:

Customer data integration is the collection of processes, controls, automation and skills necessary to standardize and integrate customer data from different sources.

As explained in Part I, CDI technologies are a different breed and specialize in more operational processing. Most CDI solutions focus on bringing data together from disparate sources (true of a data warehouse, too), and standardizing, reconciling and integrating it in real time. CDI hubs then make the newly standardized customer master data available to heterogeneous applications in the enterprise for various uses. This ensures that the widespread systems across your company can all process the same “master” version of a customer record.
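To make the standardize-and-reconcile step concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular CDI product works: the source system names, the record fields and the match rule (normalized name plus email) are all assumptions, and commercial hubs use far more sophisticated matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRecord:
    source: str  # hypothetical source system name
    name: str
    email: str


def standardize(record: SourceRecord) -> tuple[str, str]:
    """Build a simple match key: lowercased name with collapsed spaces, plus lowercased email."""
    name = " ".join(record.name.lower().split())
    email = record.email.strip().lower()
    return name, email


def build_master(records: list[SourceRecord]) -> dict[tuple[str, str], list[str]]:
    """Group source records under one master entry per match key."""
    master: dict[tuple[str, str], list[str]] = {}
    for record in records:
        master.setdefault(standardize(record), []).append(record.source)
    return master


# Three records from three assumed systems; the first two are the same person.
records = [
    SourceRecord("billing", "Ann  Lee", "ANN.LEE@example.com"),
    SourceRecord("web", "ann lee", " ann.lee@example.com"),
    SourceRecord("call_center", "Bob Ray", "bob.ray@example.com"),
]
master = build_master(records)
```

In this sketch the billing and web records collapse into one master entry, which is the "same master version" idea in miniature. A real hub would do this per transaction rather than over a batch.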

Data warehouses, while often high performing, were designed to handle complex queries and analytical processing in support of business decisions. This means that the data model, access methods, data acquisition and APIs are different from those of the CDI hub. Most data warehouse products focus on providing database management functionality, not the loading, integration and application access functionality inherent to CDI.

CDI is an application.
It could be considered an application, but I see it more as infrastructure. CDI doesn’t provide end-user business functionality. It’s actually more like middleware, supporting application functionality by simplifying data access and delivery.

Take business-to-business (B2B) hierarchy management, a specific function within the CDI repertory. B2B hierarchy management means that the hub can reconcile a business’ headquarters, divisions, departments, subsidiaries and business partners into a solid corporate hierarchy. Such a capability sounds straightforward – hence, the “application” moniker – but in reality it’s algorithmically quite complex. B2B hierarchy management capabilities can save a company untold millions of dollars by driving more accurate sales territory assignments, reducing mailing costs, preventing cannibalization of accounts, and ensuring that business customers – who often represent the bulk of a company’s revenues – are communicated with in a relevant and consistent way. CDI solution providers such as Initiate Systems and Purisma provide robust B2B hierarchy management.
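As a rough illustration of one small piece of hierarchy management, the sketch below rolls business entities up to their ultimate parent once the parent links are already known. The company names and the `parent_of` mapping are hypothetical, and the hard part in practice (deciding which records belong to which legal entity in the first place) is not shown.

```python
def ultimate_parent(entity: str, parent_of: dict[str, str]) -> str:
    """Follow parent links upward until reaching an entity with no parent."""
    seen = {entity}
    while entity in parent_of:
        entity = parent_of[entity]
        if entity in seen:
            raise ValueError(f"cycle in hierarchy at {entity!r}")
        seen.add(entity)
    return entity


def group_by_root(entities: list[str], parent_of: dict[str, str]) -> dict[str, list[str]]:
    """Group entities under the top of their corporate hierarchy."""
    groups: dict[str, list[str]] = {}
    for entity in entities:
        groups.setdefault(ultimate_parent(entity, parent_of), []).append(entity)
    return groups


# Hypothetical child -> parent links, as a hub might hold them after matching.
parent_of = {
    "Acme Retail East": "Acme Retail",
    "Acme Retail": "Acme Holdings",
    "Acme Leasing": "Acme Holdings",
}
accounts = ["Acme Retail East", "Acme Leasing", "Globex"]
groups = group_by_root(accounts, parent_of)
```

Grouping accounts this way is what would let a sales territory or mailing rule treat "Acme Retail East" and "Acme Leasing" as parts of one business customer.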

When positioned ideally, CDI also serves as a centralized “clearinghouse” for a company’s party data. A “party” signifies an entity a company does business with, be it a customer, a supplier, a business partner, a vendor or a service provider. The “hub” label is an apt one, because your CDI solution should sit in the middle of your IT ecosystem and reconcile party data from across different data sources.

CDI will solve our data quality problem.
CDI certainly automates data quality. In fact, data quality functions, such as validation, match/merge and standardization, are usually part and parcel of a CDI solution. However, CDI can only respond to data standardization. If the data value isn’t there, the CDI hub can’t create it, and many of our clients’ data quality problems are due to data that’s simply missing.

Moreover, data management functions such as ongoing data quality oversight are absolutely critical to the success of CDI. As the CDI hub processes data, a data steward will need to be ready to refine the business rules that the hub uses to match and reconcile its party data, or to fix broken records – for instance, records that were merged and shouldn’t have been. Ongoing, active data stewardship ensures that the applications in need of customer data get the best data possible. Indeed, some of our clients have used CDI as a pretext for stepping up formal data quality efforts and refining data stewardship roles.

There’s no business case for CDI in my industry.
This is only true in cases where a company has very few customers, as is the case for a high-tech manufacturing company we work with. The company is a $200 million global company, but it only has 56 customers. Those customers are themselves quite complex and important, but they are all uniquely managed by real people whose full time job it is to track their every move. No need for a customer hub – though a product hub is only a matter of time.

However, any company that sells commodity products usually has a list of suppliers, and in this case, CDI technology can still be beneficial.

There are diverse and industry-specific uses for CDI. One of the most popular is the real-time reconciliation of customers across different sales and service channels.

For instance, we know of a retailer who can’t yet reconcile an online customer with the same person when she buys from the catalog or shops in the brick-and-mortar store. The retailer’s parent company has diverse guidelines for its retailers when it comes to consumer marketing. Some of its retailers are beholden to strict rules about consumer mailings, avoiding overcommunicating and over-mailing to the same household. However, some of the retailers recognize that there are two distinct buyers in the same household and want to send duplicate catalogs. In the latter instance, the CDI hub will know not to consolidate customers within a household. The importance of the business rules inherent to the CDI hub makes all the difference here.
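The household example can be pictured as a per-retailer business rule. The sketch below is a simplified assumption of how such a rule might look: the retailer names, the `household_id` field and the rule table are invented for illustration.

```python
def mailing_list(customers: list[dict], consolidate_household: bool) -> list[dict]:
    """Return the customers to mail: one per household if consolidating, otherwise everyone."""
    if not consolidate_household:
        return list(customers)
    seen_households = set()
    recipients = []
    for customer in customers:
        if customer["household_id"] not in seen_households:
            seen_households.add(customer["household_id"])
            recipients.append(customer)
    return recipients


# Hypothetical rule table: each retailer decides whether households are consolidated.
RETAILER_RULES = {"retailer_a": True, "retailer_b": False}

customers = [
    {"name": "Pat Kim", "household_id": "H1"},
    {"name": "Sam Kim", "household_id": "H1"},
    {"name": "Lee Wu", "household_id": "H2"},
]
strict = mailing_list(customers, RETAILER_RULES["retailer_a"])
duplicates_ok = mailing_list(customers, RETAILER_RULES["retailer_b"])
```

The same customer data yields two catalogs for one retailer and three for the other; only the rule differs, which is the point of keeping such rules in the hub.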

Certainly, though, CDI is most relevant – and even critical – in industries for which customer relationship details are important. And that’s most industries.

Our vendor is going to link its data quality offering with its ETL offering, so we’ll be using that for CDI.
It’s not that simple. The whole premise of CDI is to have a centralized service to maintain the most accurate and current version of a customer, and it’s predicated on a service-oriented architecture (SOA). While it’s true that many extract, transform and load (ETL) tools support record-at-a-time processing, CDI is about operational access.

Here’s what we look for in a bona fide CDI solution:

1. Is it transparent to the applications’ functionality?

2. Is it technology agnostic (e.g., not dependent on SQL or a specialized interface)?

3. Does it support operational (e.g., online transaction processing/OLTP) access and availability?

Companies who look to put CDI in place don’t want to build a specialty application to support it. An ETL tool might work just fine if the customer information is relatively static and can be updated in a bulk fashion, which isn’t the case with most companies we know.

We’ll have to duplicate our data all over again.
Not necessarily. Some CDI products offer a “registry” style approach, meaning that they directly index to customer records on the source systems, avoiding physical storage of customer data. The CDI hub creates a link key for every customer and maintains that key, in effect “closing the loop” between the source system data and the data on the hub. When the source system updates reference data about a customer, the hub will be updated too.
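The registry idea can be sketched in a few lines. This is a toy model under stated assumptions, not a vendor's design: the `RegistryHub` class, the `CUST-n` link key format and the in-memory source systems are all hypothetical. The hub stores only link keys and pointers, and reads customer details from the sources when asked.

```python
import itertools


class RegistryHub:
    """Toy registry-style hub: keeps link keys and pointers, not customer data."""

    def __init__(self, sources: dict[str, dict[str, dict]]):
        self.sources = sources  # source name -> {local id -> record}
        self.links: dict[str, list[tuple[str, str]]] = {}  # link key -> pointers
        self._counter = itertools.count(1)

    def link(self, *pointers: tuple[str, str]) -> str:
        """Assign one link key to source records judged to be the same customer."""
        key = f"CUST-{next(self._counter)}"
        self.links[key] = list(pointers)
        return key

    def view(self, key: str) -> list[dict]:
        """Assemble the customer view by reading the source systems at request time."""
        return [self.sources[source][local_id] for source, local_id in self.links[key]]


# Two assumed source systems holding the same person under different local ids.
sources = {
    "billing": {"B-17": {"name": "Ann Lee", "phone": "555-0100"}},
    "web": {"W-9": {"name": "Ann Lee", "email": "ann.lee@example.com"}},
}
hub = RegistryHub(sources)
key = hub.link(("billing", "B-17"), ("web", "W-9"))

# A later change in the source system is visible through the hub without copying data.
sources["billing"]["B-17"]["phone"] = "555-0199"
current = hub.view(key)
```

Because this sketch reads the sources on every request, a source update shows up in the hub view immediately. A real registry hub would more likely be notified of changes and maintain its index accordingly.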
Other hubs that follow a more “persistent” style physically store customer records on the hub. Persistent style hubs also reduce processing and storage because they don’t distribute and replicate the processing of maintaining customer information – it’s now done in a centralized way. The whole premise of a CDI hub is to centralize duplicated processing and storage. Thus, both the registry and persistent CDI architectures, in fact, eliminate the need for data duplication.

CDI is a fad.
This is only true if you consider your customer data to be a fad. The concept of an operational repository for accurate customer detail has been around for 20 years, so the idea isn’t new. What is new is that vendors are providing off-the-shelf solutions for customer data integration when in the past, companies had to build their own capabilities. I say it’s about time!

Jill Dyché is a partner and co-founder of Baseline Consulting, a technology and management consulting firm specializing in data integration and business analytics. Jill is the author of three acclaimed business books, the latest of which is Customer Data Integration: Reaching a Single Version of the Truth, co-authored with Evan Levy. Her blog, Inside the Biz, focuses on the business value of IT.
