What Machines Haven't Learned Yet

September 30, 2013

Machine learning is really, really powerful and it opens up new ways of doing analysis with Big Data. But, it cannot act alone.

I had an interesting talk with Gary Angel the other day... which is the same as saying I talked to Gary Angel the other day.

Gary is the partner / principal of the Digital Analytics Center of Excellence at Ernst & Young. An engaging guy with a very large brain.

With more than 20 years of analytics under his belt, Gary knows about analytics and is more than willing to share. So, when the subject turned to machine learning, I tuned in a little tighter.

Gary is not an artificial intelligence guy. In my narrow sense of the term, I wouldn't even call him a data scientist.

Data Scientist: One responsible for understanding and advancing the nature of data, its collection methods, and the algorithms for processing it.
- Jim Sterne's Private Opinion Dictionary

Gary is a business problem solver who happens to use data to get the job done. So I tune into his perspective on machine learning because he's going to base that opinion on years of practical application. He works in the field, not in the lab.

Gary and I agreed that machine learning is really, really powerful and it opens up new ways of doing analysis with Big Data. But, it cannot act alone.

As Ron Kohavi and George H. John put it in their paper "Wrappers for Feature Subset Selection":

A universal problem that all intelligent agents must face is where to focus their attention. A problem-solving agent must decide which aspects of a problem are relevant, an expert-system designer must decide which features to use in rules, and so forth. Any learning agent must learn from experience, and discriminating between the relevant and irrelevant parts of its experience is a ubiquitous problem.

This is where you come in. You need to be, or find, subject matter experts who can separate the wheat from the chaff.

Big Data cannot gobble up every bit you collect, paw through tens of thousands of variables, and figure out what's important.

As Gary puts it, "Good analysis comes from someone figuring out what the right variables are."

He then recounted an example from a Digital Analytics Association Symposium in Philadelphia about calculating what movies people are most likely to want to see next. The data was collected from set-top boxes and the machine determined that movies beginning with the letter "A" were far more likely to be preferred.

A human knows instantly that this is the result of movies being listed alphabetically and is not a valuable variable for determining the likeability of any given movie. It is not proper fodder for a recommendation engine.

If you're crunching numbers in marketing, do you know if time-of-day is any more predictive of a purchase than geography? Search behavior? Click behavior? Shopping cart population?

This is why people with business smarts will always have a job.

Gary provided another example:

We did a segmentation analysis for an online travel aggregator, looking at purely search behavior data. We found a very interesting segmentation, but we had to put a lot of thought into what that search behavior meant.

If someone did a search, changed the date of the search, and then looked for the same destination, we could infer that they were flexible about dates. If they changed the destination but didn't change the date, we could infer they were flexible about destinations.

We created those as variables in the analysis and that became a very powerful predictor for them.

But that's not inherent in the behaviors, right? An analyst had to figure out that changing those two things was a valuable variable for the analysis.
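
To make that concrete, here is a minimal Python sketch of how an analyst might turn raw search events into those two flexibility variables. The Search record, its field names, and the flexibility_flags function are illustrative assumptions of mine, not the schema or code from Gary's project.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Search:
    """One search event; field names are hypothetical, not a real schema."""
    destination: str
    travel_date: date


def flexibility_flags(searches: list[Search]) -> dict[str, bool]:
    """Derive two session-level variables from consecutive searches.

    date_flexible: the same destination was searched again with a new date.
    destination_flexible: the same date was searched again with a new destination.
    """
    date_flexible = False
    destination_flexible = False
    # Compare each search with the one that came right before it.
    for prev, curr in zip(searches, searches[1:]):
        same_destination = prev.destination == curr.destination
        same_date = prev.travel_date == curr.travel_date
        if same_destination and not same_date:
            date_flexible = True
        if same_date and not same_destination:
            destination_flexible = True
    return {
        "date_flexible": date_flexible,
        "destination_flexible": destination_flexible,
    }


# Invented example session: the visitor keeps Las Vegas but moves the date.
session = [
    Search("Las Vegas", date(2013, 11, 8)),
    Search("Las Vegas", date(2013, 11, 15)),
]
flags = flexibility_flags(session)
```

Nothing in the raw log says "flexible." The two flags exist only because a person decided that a changed date with an unchanged destination means something.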

When we started, the obvious variable was destination: was a traveler going to Las Vegas, for example. But as we thought through the analysis, many additional variables that were even more interesting emerged. It was about how far out they were searching, how many days between the search, when the search was conducted, and what the destination date was. Added to this were whether they changed the search, whether they changed the destination, whether it was a weekend, and whether a weekend was included in the stay. All those kinds of things turned out, not surprisingly, to be very important, but unless you feed them into the machine, you won't get a good analysis.
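
Several of the variables Gary lists are simple date arithmetic once someone thinks to ask for them. The sketch below is one hedged way to compute three of them. The function name, the parameter names, and the choice to count every day from check-in through check-out as part of the stay are my assumptions, not details from the project.

```python
from datetime import date, timedelta


def trip_features(search_date: date, check_in: date, nights: int) -> dict[str, object]:
    """Turn raw dates into candidate variables; all names are hypothetical."""
    # Assumption: the stay covers check-in through check-out, inclusive.
    stay_days = [check_in + timedelta(days=i) for i in range(nights + 1)]
    return {
        # How far out the visitor is searching.
        "lead_time_days": (check_in - search_date).days,
        # date.weekday() returns 5 for Saturday and 6 for Sunday.
        "searched_on_weekend": search_date.weekday() >= 5,
        "stay_includes_weekend": any(d.weekday() >= 5 for d in stay_days),
    }


# Invented example: a search on September 30, 2013 for a two-night stay
# starting November 8, 2013.
features = trip_features(date(2013, 9, 30), date(2013, 11, 8), nights=2)
```

Each of these is a column the machine never sees unless an analyst builds it from the raw timestamps first.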

Once the human picks out the high-value variables, the machine will do a great job figuring out which ones are important.
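
As a rough illustration of the machine's half of the bargain, the sketch below ranks hand-built yes/no variables by how much the booking rate differs when each one is true versus false. This is a deliberately simple comparison of my own choosing, not the wrapper method from the Kohavi and John paper and not what Gary's team ran. The rows are invented toy data and the "booked" outcome column is an assumption.

```python
def booking_rate_gap(rows: list[dict[str, bool]], feature: str, target: str = "booked") -> float:
    """Booking rate when the feature is True minus the rate when it is False."""
    with_feature = [r[target] for r in rows if r[feature]]
    without_feature = [r[target] for r in rows if not r[feature]]
    if not with_feature or not without_feature:
        return 0.0  # the feature never varies, so it cannot separate anything
    return sum(with_feature) / len(with_feature) - sum(without_feature) / len(without_feature)


def rank_features(rows: list[dict[str, bool]], features: list[str], target: str = "booked") -> list[str]:
    """Order features from the largest absolute rate gap to the smallest."""
    gaps = {f: booking_rate_gap(rows, f, target) for f in features}
    return sorted(features, key=lambda f: abs(gaps[f]), reverse=True)


# Invented toy data, for illustration only.
rows = [
    {"date_flexible": True, "searched_on_weekend": True, "booked": True},
    {"date_flexible": True, "searched_on_weekend": False, "booked": True},
    {"date_flexible": False, "searched_on_weekend": True, "booked": False},
    {"date_flexible": False, "searched_on_weekend": False, "booked": False},
]
ranking = rank_features(rows, ["searched_on_weekend", "date_flexible"])
```

The arithmetic is trivial for the machine. What it cannot do is notice that a variable such as date_flexible ought to exist in the first place.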

The lesson is to use your brain for what it's best at (intuition, relevance, reasoning) and then let the machine do what it's best at: calculation, tabulation, enumeration.

Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
- Albert Einstein

ABOUT THE AUTHOR

Jim Sterne

Jim Sterne is an international consultant focused on measuring the value of online marketing for creating and strengthening customer relationships. Sterne has written eight books on using the Internet for marketing, produces the eMetrics Marketing Optimization Summit, and is co-founder and current chairman of the Digital Analytics Association.
