Getting Your Data Into Shape

  |  August 10, 2004   |  Comments

Is your data in the most conducive format for use? Two case studies to help firm up your data.

What does your company have in common with both the U.S. government and an e-commerce/catalog company? Very likely, you share a problem: Your data is not in the format most conducive to extracting useful information.

Two projects I worked on illustrate the problem and its implications, as well as suggest a beginning to the solution. One project involves data supplied by the U.S. government; the other involves data from an e-commerce/catalog company.

Case Study: U.S. Census Bureau

The U.S. government project involved determining the best target markets for a particular product. The marketers didn't have a house file with information about their best prospects. However, they did have a reasonable idea of their target market's general demographics.

Armed with this knowledge, the marketers investigated the U.S. Census data as a source of demographic knowledge about different U.S. regions. The marketers then compared their target demographics against regional demographics and determined which areas were most appropriate for their product rollout. The U.S. Census is an excellent resource; it's comprehensive and available free of charge.

The marketers went to the U.S. Census Web site, where they found good news and bad. The good news was data are available for free via an FTP download. The bad news was the necessary data were scattered across an innumerable number of different tables and data files, not to mention different file formats.

Converting the available data from ASCII to another database format (Access and SAS instructions are provided) then to a statistical analysis format can be a formidable task. It could involve combining data from over 76 different files!

Yet since this is the land of opportunity, some enterprising entrepreneurs realized marketers probably don't want to go through this process. They completed much of the work for us. GeoLytics is one such company.

This type of data provider offers to put data into a format that's much more useful to marketers. It will provide either one of its CD products that already contain what you need, or a custom data set pulled to your specifications and requirements.

Though I appreciate these companies' efforts, I fault the folks at the U.S. Census Bureau for not creating some summary-level files that would be useful to marketers. Yes, data are collected and provided to the public. But it's not provided in a format that even approaches useable. I do, however, applaud them for their intent of making the data available via FTP.

Case Study: E-Commerce Cataloger

I worked on a retail catalog data analysis project, which involved examining the purchase history of customers who bought via mail, telephone, or the Web. The good news is the data funneled into one system rather than being stored in separate systems.

Unfortunately, the data was provided in a manner unfavorable to analysis without extensive data manipulation. Instead of being in one file, the data was in multiple files. Separately, the files yielded no valuable information.

One file contained basic customer information, such as name, address, sex, and age. Another file had a lengthy list of order numbers and the customer identification numbers associated with them. Yet another file held information about all purchased items and order numbers associated with them.

Storing data in separate tables like this allows it to be captured and methodologically stored in the data warehouse. However, in this structure the data can only be used to retrieve all information about one customer in a simple fashion. For example, a telemarketer can quickly extract all the records pertaining to one specific customer from each of these tables. When a customer calls the customer service line, the telemarketer can easily retrieve all the customer's information.

Unfortunately very little else can be done with the data in this structure. If you want to know what the most popular product is in Florida, for example, you'll have to go through four steps:

  1. Extract all records with "FL" as the state field from the customer file.

  2. Extract all orders for Florida customers from the order file.

  3. Match the orders to the item file to find the ordered items.

  4. Sum and rank the counts of each item to determine the most popular item in Florida.

What a cumbersome and time-consuming process! Imagine if you had 10 other states to examine, not to mention any other potential subsets. The analysis will take more time than it is reasonable.

The Solution

In both cases, the data is not designed to end-user requirements. It's designed to store, rather than use, data. Were the Census Bureau to identify some of the data's uses, it could easily provide the data in a format that's more easy to use. The information needed in the retail analysis was fairly basic and very useful -- as well as reusable. The retailer should examine it every month. However, in its current format, the data must be manipulated every time to put it in a useable format.

Data being provided in an nonuseable format occurs at far too many companies. It causes companies to be inefficient and costs them money.

If the provided data is going to be used for a one-time analysis, it isn't worth the effort to make the process more efficient. However, if the analysis is to be conducted regularly, process improvements should be made.

Clearly define how you want to use the data. The data warehouse group can then produce summary tables containing the necessary data, at the required level. Once in place, the summary table can be updated with current information. It can be used by end users, including marketers, without much further assistance from the data warehouse.

Going through this process helps the immediate parties involved in producing the new information and any subsequent groups that want to use similar information. The result is (once again) better information at a reduced cost, as well as happier marketing and data warehouse employees.

As a marketer, what's your relationship with your data source (likely a data warehouse group)? If you can help the data source complete the extra formatting necessary to accomplish your goals, it will prevent multiple instances of reformatting down the line.

The result? Better information at a reduced cost, as well as happier marketing and data warehouse employees.

ClickZ Live Chicago Join the Industry's Leading eCommerce & Direct Marketing Experts in Chicago
ClickZ Live Chicago (Nov 3-6) will deliver over 50 sessions across 4 days and 10 individual tracks, including Data-Driven Marketing, Social, Mobile, Display, Search and Email. Check out the full agenda and register by Friday, Oct 3 to take advantage of Early Bird Rates!

ABOUT THE AUTHOR

Brian Teasley Brian Teasley is the leader of Teasley, a consultancy that helpsadvertisers, marketers and advertising agencies use data and analysis toimprove their marketing campaigns. Brian has over 14 years experience inengineering and marketing, and has worked for numerous Fortune 100companies. Brian also teaches a marketing course at New York University. Heholds a M.S. degree in Applied Statistics from Iowa State University and aBA in Mathematics and a BA in Mathematics and Statistics from St. OlafCollege.

COMMENTSCommenting policy

comments powered by Disqus

Get the ClickZ Analytics newsletter delivered to you. Subscribe today!

COMMENTS

UPCOMING EVENTS

Featured White Papers

IBM: Social Analytics - The Science Behind Social Media Marketing

IBM Social Analytics: The Science Behind Social Media Marketing
80% of internet users say they prefer to connect with brands via Facebook. 65% of social media users say they use it to learn more about brands, products and services. Learn about how to find more about customers' attitudes, preferences and buying habits from what they say on social media channels.

Marin Software: The Multiplier Effect of Integrating Search & Social Advertising

The Multiplier Effect of Integrating Search & Social Advertising
Latest research reveals 68% higher revenue per conversion for marketers who integrate their search & social advertising. In addition to the research results, this whitepaper also outlines 5 strategies and 15 tactics you can use to better integrate your search and social campaigns.

Resources

Jobs

    • Digital Marketing Analyst
      Digital Marketing Analyst (GovLoop) - Washington D.C.Are you passionate about audience acquisition? Love effective copy and amazingly effective...
    • Product Specialist
      Product Specialist (Agora Inc. ) - BaltimoreDescription: The Product Specialist is hyper-focused on the customer experience and ensures that our...
    • Partnerships Senior Coordinator
      Partnerships Senior Coordinator (Zappos.com, Inc.) - Las VegasZappos IP, Inc. is looking for a Partnerships Senior Coordinator! Why join us? Our...