Data Keeps YP Relevant in the Digital World

Before Google became a verb, people would look up information about local businesses in paper-bound devices known as “phone books.” Many of the major phone book companies are long gone, but Yellow Pages, now known as YP, is thriving in the digital world. And big data is a big reason why.

Many of YP’s former competitors treated burgeoning technology as an IT problem and spent as little as possible on it. YP, on the other hand, invested heavily in instrumentation around data. The platform’s capabilities have since evolved to process 6 petabytes of total data, with a terabyte coming in every day.

“If we were still going to be around in five years, we had to get really good at ad serving and the advertising experience,” says chief technology officer Darren Clark. “We had to get really good with consumer behavior and search relevance. We knew we shouldn’t be penny wise and pound foolish.”

If 6 petabytes sounds like a staggering number, that’s because it is – 1 petabyte is enough to store the DNA of every person in the U.S. and then clone them all twice. YP’s daily data, which is processed every 15 minutes, is the equivalent of 2.5 billion ad impressions, clicks, site hits and calls captured on behalf of customers.

YP has dozens of data scientists – statistics and advanced analytics backgrounds are preferred when hiring new ones – who value all data, though some sets are prioritized over others.

“We assume all data is valuable, and we prune and maintain the base of data to keep it at granular levels,” Clark says. “But there is a cap in granularity retention. How long can you keep something that gets one look over a year?

“There’s a lot of stuff we think is more valuable that we keep longer and examine in finer detail. Everything around conversion events, anything that connects businesses and consumers, those are the most valuable things in the platform,” he continues.

Phone calls are one example: who made the call, how long it lasted, what was discussed. That conversion information helps with targeting, much like location, another highly valued data set, especially as mobile adoption and in-app search continue to skyrocket.

“If you’re in your house, we see people searching via the web, doing searches that are more related to the household in nature. They’re less concerned with geography and distance. It’s a factor but not a loud factor,” Clark says. “If we see someone use their device away from the home, what they care most about is distance relevance.”

YP’s data strategy also extends to the 21 million businesses that advertise on the platform. According to Clark, having a better understanding of those businesses allows YP to serve them better – buying the right traffic and monitoring conversion at scale, for example – which subsequently trickles down to consumers.
