Getting Your Data Into Shape

  |  August 10, 2004   |  Comments

Is your data in the most conducive format for use? Two case studies to help firm up your data.

What does your company have in common with both the U.S. government and an e-commerce/catalog company? Very likely, you share a problem: Your data is not in the format most conducive to extracting useful information.

Two projects I worked on illustrate the problem and its implications, as well as suggest a beginning to the solution. One project involves data supplied by the U.S. government; the other involves data from an e-commerce/catalog company.

Case Study: U.S. Census Bureau

The U.S. government project involved determining the best target markets for a particular product. The marketers didn't have a house file with information about their best prospects. However, they did have a reasonable idea of their target market's general demographics.

Armed with this knowledge, the marketers investigated the U.S. Census data as a source of demographic knowledge about different U.S. regions. The marketers then compared their target demographics against regional demographics and determined which areas were most appropriate for their product rollout. The U.S. Census is an excellent resource; it's comprehensive and available free of charge.

The marketers went to the U.S. Census Web site, where they found good news and bad. The good news was data are available for free via an FTP download. The bad news was the necessary data were scattered across an innumerable number of different tables and data files, not to mention different file formats.

Converting the available data from ASCII to another database format (Access and SAS instructions are provided) then to a statistical analysis format can be a formidable task. It could involve combining data from over 76 different files!

Yet since this is the land of opportunity, some enterprising entrepreneurs realized marketers probably don't want to go through this process. They completed much of the work for us. GeoLytics is one such company.

This type of data provider offers to put data into a format that's much more useful to marketers. It will provide either one of its CD products that already contain what you need, or a custom data set pulled to your specifications and requirements.

Though I appreciate these companies' efforts, I fault the folks at the U.S. Census Bureau for not creating some summary-level files that would be useful to marketers. Yes, data are collected and provided to the public. But it's not provided in a format that even approaches useable. I do, however, applaud them for their intent of making the data available via FTP.

Case Study: E-Commerce Cataloger

I worked on a retail catalog data analysis project, which involved examining the purchase history of customers who bought via mail, telephone, or the Web. The good news is the data funneled into one system rather than being stored in separate systems.

Unfortunately, the data was provided in a manner unfavorable to analysis without extensive data manipulation. Instead of being in one file, the data was in multiple files. Separately, the files yielded no valuable information.

One file contained basic customer information, such as name, address, sex, and age. Another file had a lengthy list of order numbers and the customer identification numbers associated with them. Yet another file held information about all purchased items and order numbers associated with them.

Storing data in separate tables like this allows it to be captured and methodologically stored in the data warehouse. However, in this structure the data can only be used to retrieve all information about one customer in a simple fashion. For example, a telemarketer can quickly extract all the records pertaining to one specific customer from each of these tables. When a customer calls the customer service line, the telemarketer can easily retrieve all the customer's information.

Unfortunately very little else can be done with the data in this structure. If you want to know what the most popular product is in Florida, for example, you'll have to go through four steps:

  1. Extract all records with "FL" as the state field from the customer file.

  2. Extract all orders for Florida customers from the order file.

  3. Match the orders to the item file to find the ordered items.

  4. Sum and rank the counts of each item to determine the most popular item in Florida.

What a cumbersome and time-consuming process! Imagine if you had 10 other states to examine, not to mention any other potential subsets. The analysis will take more time than it is reasonable.

The Solution

In both cases, the data is not designed to end-user requirements. It's designed to store, rather than use, data. Were the Census Bureau to identify some of the data's uses, it could easily provide the data in a format that's more easy to use. The information needed in the retail analysis was fairly basic and very useful -- as well as reusable. The retailer should examine it every month. However, in its current format, the data must be manipulated every time to put it in a useable format.

Data being provided in an nonuseable format occurs at far too many companies. It causes companies to be inefficient and costs them money.

If the provided data is going to be used for a one-time analysis, it isn't worth the effort to make the process more efficient. However, if the analysis is to be conducted regularly, process improvements should be made.

Clearly define how you want to use the data. The data warehouse group can then produce summary tables containing the necessary data, at the required level. Once in place, the summary table can be updated with current information. It can be used by end users, including marketers, without much further assistance from the data warehouse.

Going through this process helps the immediate parties involved in producing the new information and any subsequent groups that want to use similar information. The result is (once again) better information at a reduced cost, as well as happier marketing and data warehouse employees.

As a marketer, what's your relationship with your data source (likely a data warehouse group)? If you can help the data source complete the extra formatting necessary to accomplish your goals, it will prevent multiple instances of reformatting down the line.

The result? Better information at a reduced cost, as well as happier marketing and data warehouse employees.

ClickZ Live New York What's New for 2015?
You spoke, we listened! ClickZ Live New York (Mar 30-Apr 1) is back with a brand new streamlined agenda. Don't miss the latest digital marketing tips, tricks and tools that will make you re-think your strategy and revolutionize your marketing campaigns. Super Saver Rates are available now. Register today!

ABOUT THE AUTHOR

Brian Teasley Brian Teasley is the leader of Teasley, a consultancy that helpsadvertisers, marketers and advertising agencies use data and analysis toimprove their marketing campaigns. Brian has over 14 years experience inengineering and marketing, and has worked for numerous Fortune 100companies. Brian also teaches a marketing course at New York University. Heholds a M.S. degree in Applied Statistics from Iowa State University and aBA in Mathematics and a BA in Mathematics and Statistics from St. OlafCollege.

COMMENTSCommenting policy

comments powered by Disqus

Get the ClickZ Analytics newsletter delivered to you. Subscribe today!

COMMENTS

UPCOMING EVENTS

UPCOMING TRAINING

Featured White Papers

Google My Business Listings Demystified

Google My Business Listings Demystified
To help brands control how they appear online, Google has developed a new offering: Google My Business Locations. This whitepaper helps marketers understand how to use this powerful new tool.

5 Ways to Personalize Beyond the Subject Line

5 Ways to Personalize Beyond the Subject Line
82 percent of shoppers say they would buy more items from a brand if the emails they sent were more personalized. This white paper offer five tactics that will personalize your email beyond the subject line and drive real business growth.

WEBINARS

Resources

Jobs

    • Customer Service Consultant
      Customer Service Consultant (Bonner and Partners) - Delray BeachBonner & Partners: Full-time Customer Service Consultant Position Who we are...
    • Financial Editor
      Financial Editor (Confidential) - DurhamSIX FIGURE EDITORS WANTED: To enforce lofty NEW editing standards. Easy Conditions Unlikely. Promotion and...
    • Information Processing Specialist
      Information Processing Specialist (Agora Inc. ) - BaltimoreInformation Processing Specialist – The IP specialist position ensures the successful...