Pareto Was Right

It’s striking how often businesses are lulled into a false sense of security by looking at superficial “topline” numbers without investigating what’s really going on at a more granular level. This seems to be especially common in online channels involving registrations or subscriptions.

A story from a business I worked at makes my point. I had recently arrived at the business and was starting to get my head around how it worked. One key metric the business looked at was the number of people who had registered to use the service.

At the time, this metric was growing very quickly. We had more than a million registered users, the graph was going up and to the right, and all was well with the world. Or so it seemed.

I wanted to see what parts of the service people were using the most, so I asked for a data extract showing which service type each registered user had actually used. When I got the data file from the database guy, I thought he had made a mistake. Out of the million-plus registered users we had, the data file he gave me only had details on about 20 percent of the users.

So I told the database guy that there seemed to be a problem with the data and sat down with him to check it out. Sure enough, when we looked at the data more closely, it turned out that a massive chunk of the registered users had never actually used the services they had signed up for. Those who had used them had generally used only one service, once. A very significant proportion of all the activity on the site came from a relatively small proportion of registered users.

This was an “aha” moment. We were tracking the wrong metric. Instead of focusing on the number of “registered” users, we needed to track the number of “active” users.

It was certainly a case of be careful what you measure, because what you measure is what you’ll get. Because the business was focused on measuring registrations, the drive was to generate as many registered users as possible, irrespective of the quality of those registrations and whether they were likely to actually do anything valuable on the site.

Because of that experience, I’m skeptical about reports or claims about the numbers of subscribers, the number of customers or the number of registered users. The reality is likely to be the same pattern of behavior as I found when I started to look in more detail at that business.

Economist Vilfredo Pareto was right: a small share of users tends to account for most of the activity. It’s important to take an in-depth look at the data and understand in detail what the activity levels look like.
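
One rough way to check how concentrated activity is: take the most active slice of registered users and see what share of all activity they account for. The sketch below is only an illustration. The function name `top_share`, the 20 percent cutoff, and the sample figures are all assumptions made up for the example, not data from the business described above.

```python
def top_share(actions_per_user, top_fraction=0.2):
    """Return the share of all activity produced by the most active users.

    actions_per_user maps each registered user to a count of actions;
    users who never did anything should be included with a count of 0.
    """
    counts = sorted(actions_per_user.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Size of the "top" group, with at least one user in it.
    top_n = max(1, round(len(counts) * top_fraction))
    return sum(counts[:top_n]) / total


# Hypothetical data: ten registered users, six of whom never used the site.
actions = {
    "u1": 40, "u2": 25, "u3": 1, "u4": 1, "u5": 0,
    "u6": 0, "u7": 0, "u8": 0, "u9": 0, "u10": 0,
}
share = top_share(actions)
print(f"Top 20% of registered users account for {share:.0%} of activity")
```

Counting the inactive users in the denominator matters here. Leave them out and the picture looks healthier than it is.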

For example, consider a site that relies on user-generated content. Of all the people who have signed up to upload content, how many have actually done so? How many have done it more than once? When was the last time that they did it? How many people have done it in the last 30 days, 60 days, or 90 days?
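
Those questions translate fairly directly into a handful of counts. Here is a minimal sketch, assuming you can get a list of registered user IDs and, for each user, the dates on which they uploaded something. The function name `activity_summary`, the data layout, and the sample users are hypothetical, chosen only to make the example runnable.

```python
from datetime import date, timedelta


def activity_summary(registered, uploads, today, windows=(30, 60, 90)):
    """Summarize upload activity for a set of registered users.

    registered: iterable of user ids
    uploads: dict mapping user id -> list of upload dates
    """
    registered = list(registered)
    # Users who have uploaded at least once, and more than once.
    ever = [u for u in registered if uploads.get(u)]
    repeat = [u for u in ever if len(uploads[u]) > 1]
    # Most recent upload date per active user.
    last_upload = {u: max(uploads[u]) for u in ever}

    summary = {
        "registered": len(registered),
        "ever_uploaded": len(ever),
        "uploaded_more_than_once": len(repeat),
    }
    for days in windows:
        cutoff = today - timedelta(days=days)
        summary[f"active_last_{days}_days"] = sum(
            1 for d in last_upload.values() if d >= cutoff
        )
    return summary


# Hypothetical sample: five registered users, three of whom ever uploaded.
today = date(2009, 9, 1)
registered = ["ann", "bob", "cat", "dan", "eve"]
uploads = {
    "ann": [date(2009, 8, 20), date(2009, 8, 30)],
    "bob": [date(2009, 7, 10)],
    "cat": [date(2009, 1, 5)],
}
summary = activity_summary(registered, uploads, today)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Reported side by side with the registration total, counts like these make it hard to mistake sign-ups for usage.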

These metrics are far more revealing about the health of the business than the superficial topline numbers that are often reported.

Neil is off today. This column was originally published on Sept. 1, 2009 on ClickZ.
