How Do You Make 2+2=5? Integrate!

Synergy is all about creating greater value than just the sum of the constituent parts, and this applies to your data as much as anything else.

Getting greater value from the data can be delivered through data integration, enriching each source with dimensions, facts, and attributes of the others. A classic case in the world of digital marketing would be integrating Web data with customer profile data or other offline data sources to build up a more complete picture of customer behavior across multiple touch points. How might these types of integrations be achieved and what does that mean for information consumers and analysts alike?

Data integration typically occurs on two levels – the macro level and the micro level.

Macro Integration

Macro integration is the process of pulling the outputs of the various data systems together into one place so you can see relationships between different data sources. Excel is often the ultimate macro data integration tool. Legions of analysts all over the world spend hours pulling together spreadsheets from multiple data sources to provide a summary of what’s going on. Usually macro integration is where the data is brought together on an aggregated level and joined on some common attribute like time or geography. An example would be pulling traffic data from a Web analytics system and overlaying it with marketing data such as GRPs, impression levels, etc.
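To make that concrete, here is a minimal sketch of a macro-level join in Python, doing what the spreadsheet would otherwise do. The two weekly extracts, their column names, and the figures are hypothetical and exist only to illustrate the idea of joining aggregated data on a common attribute, in this case the week.

```python
# Hypothetical weekly extracts; names and numbers are illustrative only.
web_traffic = [
    {"week": "2024-W01", "visits": 12000},
    {"week": "2024-W02", "visits": 15500},
    {"week": "2024-W03", "visits": 14100},
]
media_plan = [
    {"week": "2024-W01", "grps": 80},
    {"week": "2024-W02", "grps": 140},
]


def macro_join(left, right, key):
    """Left-join two aggregated tables on a shared attribute such as week."""
    right_by_key = {row[key]: row for row in right}
    # Collect every non-key column that appears in the right-hand table.
    right_fields = sorted({f for row in right for f in row if f != key})
    joined = []
    for row in left:
        match = right_by_key.get(row[key], {})
        merged = dict(row)
        for field in right_fields:
            # Weeks with no media activity get None rather than being dropped.
            merged[field] = match.get(field)
        joined.append(merged)
    return joined


weekly_overview = macro_join(web_traffic, media_plan, "week")
for row in weekly_overview:
    print(row)
```

A left join is used here so that weeks with traffic but no media spend still show up in the summary; whether that is the right choice depends on the question being asked.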

Micro Integration

Micro data integration is when different data sources are integrated at a much more granular level, for example at the customer level. Often this has meant moving data from one system to another and developing ETL (Extraction, Transformation, and Loading) processes to get the data into the right shape to connect it to the other data sources. Typically this is managed by the enterprise BI team, though analysts may also perform ad hoc extractions to create datasets for analysis, which can be a slow and painful process. However, over the past couple of years we have seen the development of “data blending” to overcome some of those challenges.
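As a rough illustration of what a small ETL step at the customer level might look like, the sketch below joins Web sessions to customer profiles. It assumes, purely for the example, that both sources carry an email address that can serve as the join key once it has been cleaned up; the records, field names, and helper functions are all hypothetical.

```python
from collections import Counter

# Hypothetical extracts; in practice these would come from a CRM and a Web analytics export.
crm_profiles = [
    {"customer_id": "C001", "email": "alice@example.com", "segment": "loyal"},
    {"customer_id": "C002", "email": "bob@example.com", "segment": "new"},
]
web_sessions = [
    {"email": " Alice@Example.com ", "pages": 5},
    {"email": "alice@example.com", "pages": 2},
    {"email": "carol@example.com", "pages": 7},
]


def normalize_email(raw):
    """Transform step: make the join key comparable across systems."""
    return raw.strip().lower()


def build_customer_view(profiles, sessions):
    """Attach per-customer Web activity to each profile record."""
    sessions_per_email = Counter()
    pages_per_email = Counter()
    for session in sessions:
        email = normalize_email(session["email"])
        sessions_per_email[email] += 1
        pages_per_email[email] += session["pages"]

    view = []
    for profile in profiles:
        email = normalize_email(profile["email"])
        # Counter returns 0 for customers with no recorded Web activity.
        view.append({
            **profile,
            "sessions": sessions_per_email[email],
            "pages": pages_per_email[email],
        })
    return view


customer_view = build_customer_view(crm_profiles, web_sessions)
for row in customer_view:
    print(row)
```

Note that the session for carol@example.com has no matching profile and is silently left out of the result. Deciding what to do with unmatched records like that is a large part of what makes micro integration slow in practice.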

Data Blending

Data blending is considered to be different from data integration in that the data is not integrated as a permanent data store but brought together by the analysts to address specific business issues. A number of technologies are now promoting data blending capabilities, such as Tableau, Alteryx, and Pentaho to name a few. These technologies create connections to a wide variety of data sources and allow the end user to create the “joins” between the different data streams. Those joins may be at a macro level or at a micro level depending on the attributes of the data itself.

The perceived advantage of data blending is that the tools and processes are owned by the analysts rather than an enterprise BI function and all they need are the connections to the data sources. This potentially enables more speed and agility in the analysis process, as analysts are essentially sucking the data into their own system to do the manipulation there without the need to touch the underlying core systems.

System of Record

However, while data blending brings opportunities, it must also be seen in the context of an overall data integration strategy. Data blending technologies are different from data warehousing technologies in that they are not generally designed to be the system of record or the “single customer view.” While they can overcome some of the challenges found in changing or adapting corporate systems, those corporate systems are usually wrapped in a layer of data governance to ensure that the data integrity is of the highest order. One of the potential hitches in data blending approaches is ensuring that the blender understands the facets and the nuances of the data that they are mixing. What “customer” means in one database may not be what “customer” means in another!
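One simple way to catch that kind of mismatch before blending is to profile the join key on both sides. The sketch below is only an illustration under assumed data: a hypothetical billing extract with one row per account and a hypothetical CRM extract with one row per contact person, so “customer” means something different in each.

```python
# Hypothetical extracts: billing holds one row per account,
# while the CRM holds one row per contact, with several contacts per account.
billing_customers = [
    {"account_id": "A1"},
    {"account_id": "A2"},
    {"account_id": "A3"},
]
crm_customers = [
    {"account_id": "A1", "contact": "Alice"},
    {"account_id": "A1", "contact": "Arun"},
    {"account_id": "A2", "contact": "Bea"},
    {"account_id": "A4", "contact": "Dev"},
]


def profile_join_key(left, right, key):
    """Summarize duplicates and unmatched values of a join key in two sources."""
    left_keys = [row[key] for row in left]
    right_keys = [row[key] for row in right]
    left_set, right_set = set(left_keys), set(right_keys)
    return {
        # Duplicates hint that a row is not "one customer" in that source.
        "left_duplicates": len(left_keys) - len(left_set),
        "right_duplicates": len(right_keys) - len(right_set),
        # Unmatched keys hint that the two sources cover different populations.
        "only_in_left": sorted(left_set - right_set),
        "only_in_right": sorted(right_set - left_set),
    }


report = profile_join_key(billing_customers, crm_customers, "account_id")
print(report)
```

Duplicate keys on one side would inflate counts after a join, and unmatched keys would quietly drop or orphan records, so a check like this is worth running before trusting any blended figure.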

Blend, Then Integrate

I think a really useful application of data blenders is for rapid prototyping of data integration projects. No one wants to wait for ages for their particular analytical dataset to get created, competing with the various priorities of the business. Data blending is a way of getting to some results fast and also learning along the way about what value that particular integration of the data can deliver. That in itself can then be used to build the business case for a full-scale implementation.

Data integration issues are still often cited by marketers as one of their top challenges in creating better customer experiences, so whatever the approach, it’s now more imperative and potentially easier to get your data out of its silos and synergize!
