What Machines Haven't Learned Yet

September 30, 2013

Machine learning is really, really powerful, and it opens up new ways of doing analysis with Big Data. But it cannot act alone.


I had an interesting talk with Gary Angel the other day... which is the same as saying I talked to Gary Angel the other day.

Gary is the partner / principal of the Digital Analytics Center of Excellence at Ernst & Young. An engaging guy with a very large brain.

With more than 20 years of analytics under his belt, Gary knows about analytics and is more than willing to share. So, when the subject turned to machine learning, I tuned in a little tighter.

Gary is not an artificial intelligence guy. In my narrow sense of the term, I wouldn't even call him a data scientist.

Data Scientist: One responsible for understanding and advancing the nature of data, its collection methods, and the algorithms for processing it.
- Jim Sterne's Private Opinion Dictionary

Gary is a business problem solver who happens to use data to get the job done. So I tune into his perspective on machine learning because he's going to base that opinion on years of practical application. He works in the field, not in the lab.

Gary and I agreed that machine learning is really, really powerful, and it opens up new ways of doing analysis with Big Data. But it cannot act alone.

As Ron Kohavi and George H. John put it in their paper, "Wrappers for Feature Subset Selection":

A universal problem that all intelligent agents must face is where to focus their attention. A problem-solving agent must decide which aspects of a problem are relevant, an expert-system designer must decide which features to use in rules, and so forth. Any learning agent must learn from experience, and discriminating between the relevant and irrelevant parts of its experience is a ubiquitous problem.

This is where you come in. You need to be, or find, a subject matter expert who can separate the wheat from the chaff.

Big Data cannot simply gobble up every bit you collect, paw through tens of thousands of variables, and figure out what's important.


As Gary puts it, "Good analysis comes from someone figuring out what the right variables are."

He then recounted an example from a Digital Analytics Association Symposium in Philadelphia about calculating what movies people are most likely to want to see next. The data was collected from set-top boxes and the machine determined that movies beginning with the letter "A" were far more likely to be preferred.

A human knows instantly that this is the result of movies being listed alphabetically and is not a valuable variable for determining the likeability of any given movie. It is not proper fodder for a recommendation engine.

If you're crunching numbers in marketing, do you know if time-of-day is any more predictive of a purchase than geography? Search behavior? Click behavior? Shopping cart population?

This is why people with business smarts will always have a job.

Gary provided another example:

We did a segmentation analysis for an online travel aggregator, looking at purely search behavior data. We found a very interesting segmentation, but we had to put a lot of thought into what that search behavior meant.

If someone did a search, changed the date of the search, and then looked for the same destination, we could infer that they were flexible about dates. If they changed the destination but didn't change the date, we could infer they were flexible about destinations.

We created those as variables in the analysis and that became a very powerful predictor for them.

But that's not inherent in the behaviors, right? An analyst had to figure out that changing those two things was a valuable variable for the analysis.

When we started, the obvious variable was destination: was a traveler going to Las Vegas, for example. But as we thought through the analysis, many additional variables emerged that were even more interesting. It was about how far out they were searching, how many days between searches, when the search was conducted, and what the destination date was. Added to this was whether they changed the search, whether they changed the destination, whether it was a weekend, and whether a weekend was included in the stay. All those kinds of things turned out, not surprisingly, to be very important, but unless you feed them into the machine, you won't get a good analysis.
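To make Gary's point concrete, here is a minimal Python sketch of how an analyst might turn raw search events into the kinds of variables he describes. The `Search` record, its field names, and the `derive_features` helper are hypothetical, invented for illustration; they are not the travel aggregator's schema or the method Gary's team actually used. The weekend definitions (Saturday or Sunday, counted from check-in through check-out) are likewise assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Search:
    """One search event in a visitor's session (hypothetical schema)."""
    searched_on: date   # day the search was run
    destination: str    # destination code, e.g. "LAS"
    check_in: date      # requested arrival date
    nights: int         # requested length of stay


def includes_weekend(check_in: date, nights: int) -> bool:
    """True if any day from check-in through check-out is a Sat or Sun."""
    # date.weekday(): Monday is 0, so Saturday is 5 and Sunday is 6
    return any((check_in + timedelta(days=n)).weekday() >= 5
               for n in range(nights + 1))


def derive_features(searches: list[Search]) -> dict:
    """Derive analyst-defined variables from one session's searches."""
    if not searches:
        raise ValueError("a session needs at least one search")
    first = searches[0]
    date_flexible = False
    destination_flexible = False
    # Compare each search with the one immediately before it
    for prev, cur in zip(searches, searches[1:]):
        same_destination = cur.destination == prev.destination
        same_date = cur.check_in == prev.check_in
        if same_destination and not same_date:
            date_flexible = True          # same place, different date
        if same_date and not same_destination:
            destination_flexible = True   # same date, different place
    return {
        "lead_time_days": (first.check_in - first.searched_on).days,
        "date_flexible": date_flexible,
        "destination_flexible": destination_flexible,
        "searched_on_weekend": first.searched_on.weekday() >= 5,
        "stay_includes_weekend": includes_weekend(first.check_in, first.nights),
    }


# Example session: same destination, check-in moved by a week
session = [
    Search(date(2013, 9, 2), "LAS", date(2013, 10, 4), 2),
    Search(date(2013, 9, 2), "LAS", date(2013, 10, 11), 2),
]
print(derive_features(session))
```

The example session is flagged as date-flexible but not destination-flexible. None of these columns exists in the raw log; each one encodes a judgment about what the behavior means.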

Once the human picks out the high-value variables, the machine will do a great job figuring out which ones are important.
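That hand-off can be sketched as a greedy forward search, one simple strategy in the spirit of the wrapper approach Kohavi and John describe: treat the model as a black box, score candidate subsets of variables, and keep adding whichever variable helps most. The `forward_select` function, the `toy_score` scorer, and its weights are all hypothetical and made up for illustration. A real scorer would train a model on each subset and return something like cross-validated accuracy.

```python
from typing import Callable, Sequence


def forward_select(
    candidates: Sequence[str],
    score: Callable[[list[str]], float],
    min_gain: float = 0.0,
) -> list[str]:
    """Greedily add the variable that most improves score(subset)."""
    selected: list[str] = []
    remaining = list(candidates)
    best = score(selected)
    while remaining:
        # Score every one-variable extension of the current subset
        trial_score, trial_feature = max(
            ((score(selected + [f]), f) for f in remaining),
            key=lambda pair: pair[0],
        )
        if trial_score - best <= min_gain:
            break  # no remaining variable improves the score enough
        selected.append(trial_feature)
        remaining.remove(trial_feature)
        best = trial_score
    return selected


# Placeholder scorer with made-up weights, for illustration only.
TOY_WEIGHTS = {
    "date_flexible": 0.30,
    "destination_flexible": 0.20,
    "lead_time_days": 0.10,
    "noise_column": 0.0,
}


def toy_score(subset: list[str]) -> float:
    """Stand-in for 'fit a model on this subset and validate it'."""
    return sum(TOY_WEIGHTS[f] for f in subset)


chosen = forward_select(list(TOY_WEIGHTS), toy_score)
print(chosen)
```

The search can only rank the candidates it is handed. It has no way to invent a variable like date flexibility, which is why the analyst's work has to come first.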

The lesson is to use your brain for what it's best at (intuition, relevance, reasoning) and then let the machine do what it's best at: calculation, tabulation, enumeration.

Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
- Often attributed to Albert Einstein



Jim Sterne

Jim Sterne is an international consultant who focuses on measuring the value of the Web as a medium for creating and strengthening customer relationships. Sterne has written eight books on using the Internet for marketing, is the founding president and current chairman of the Digital Analytics Association and produces the eMetrics Summit and the Media Analytics Summit.
