Sampling the Right Customers the Right Way

Last week, I covered the importance of going beyond site traffic statistics to understand why your customers do what they do. Today let’s explore the world of sampling, the key to unlocking the door to a successful research study. Sampling is the process of selecting customers to participate in your research.

Because it’s costly and often impractical to get feedback from every customer or site visitor that you have, you will need to take a sample of your total customer base or of your total site traffic. The way you choose your sample will have a significant impact on the accuracy of your research. A relatively small, well-chosen sample can provide extremely accurate information. On the other hand, a carelessly selected sample, even if it’s based on a large number of customers, can lead to incorrect conclusions and, ultimately, poor decisions.

Whose Feedback Do You Want?

The first step in any sample selection decision is to ask yourself, “Exactly who am I interested in getting feedback from?” This seemingly simple question can often mean the difference between high-quality and low-quality — or even unusable — data.

For example, suppose a manager wants customer feedback on the ease of ordering at her web site. She may define the customer population that she wishes to sample from as all people who place an order.

Another manager may choose to refine the population to include only customers who have ordered at least three times in the past month. This more narrowly defined population not only ensures that all respondents have enough ordering experience to give a true picture of the order process but also limits feedback to a higher-quality segment of customers: repeat buyers.

Once the population has been defined, the next topic to consider is projectability. You will most likely want to be able to project findings beyond the customers who were actually surveyed. Take election polls as an example. A typical election poll is based on a sample of 2,000 people or so. Clearly pollsters aren’t particularly interested in just the votes of 2,000 people; they want to be able to project the results to predict the voting behavior of all 100 million U.S. voters.

Randomness Leads to Accuracy

Can such a small sample accurately represent the whole population?

Yes. In fact, one of the marvels of statistics is that a random sample of 2,000 people produces results that are about equally accurate whether the population is 200,000 or 2 billion.
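
To see why population size barely matters, here is a rough sketch in Python using the standard margin-of-error formula for a proportion from a simple random sample, with the finite population correction. The function name, the 95% confidence level (z = 1.96), and the worst-case proportion of 0.5 are my assumptions for illustration; they are not part of any particular survey tool.

```python
import math

def margin_of_error(n, population, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    Assumes a 95% confidence level (z = 1.96) and the worst case p = 0.5.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    # Finite population correction: shrinks the error slightly when the
    # sample is a noticeable fraction of the whole population.
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc

for population in (200_000, 2_000_000_000):
    moe = margin_of_error(2_000, population)
    print(f"population {population:>13,}: +/- {moe * 100:.2f} percentage points")
```

Under these assumptions, both populations come out at about 2.2 percentage points. The sample size drives the accuracy; the population size hardly shows up.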

Notice the word “random.”

Just about all of the statistical methods you forgot from Statistics 101 are based on the assumption that the sample is randomly chosen from the population. This means that each person has an equal chance of being included in the sample, independent of everyone else. Random samples are the key to collecting projectable data.

Random sampling ensures that you aren’t introducing some unknown and unwanted bias into your results. Imagine that you survey site visitors only between the hours of 10 a.m. and 2 p.m., asking about their satisfaction with the speed of your page downloads. This group of users is clearly not a random sample; it’s limited to a particular slice of time. That four-hour period is likely to have an abnormally high percentage of surfers using a high-speed work connection. You would be missing the cries of pain from evening or weekend surfers who have to use home dial-ups.

Sampling Methods

There are a few ways to implement random sampling on the web. The first way is true random sampling. To carry this out, you need a list of all the customers in your population. This could be, for instance, a list of registered users. Then, using a computer random-number generator, simply pick the number of people you want from your list. This is the 21st-century version of picking names from a hat.
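
Picking names from the hat might look something like the sketch below. The user list is made up for illustration, and the function name and the optional seed (handy for repeating a draw) are my own choices rather than anything prescribed.

```python
import random

def simple_random_sample(customers, sample_size, seed=None):
    """Draw sample_size customers without replacement, each equally likely."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    return rng.sample(customers, sample_size)

# Hypothetical list of registered users standing in for your real customer list.
registered_users = [f"user{i:05d}" for i in range(1, 25_001)]

chosen = simple_random_sample(registered_users, 500, seed=42)
print(len(chosen), "customers selected, for example:", chosen[:3])
```

Because the draw is without replacement, nobody is picked twice, and every registered user has the same chance of landing in the sample.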

Another way to implement random sampling is called traffic-based probability sampling. Suppose your population is all site traffic for a particular month. If you have a steady stream of 25,000 users a month and you want a sample of 500 users, then you give each person a 1 in 50 chance of getting the survey.
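
In code, the 1 in 50 chance could be sketched as a simple coin flip made for each visitor as they arrive. The function name and the simulated month of traffic are assumptions for illustration; a real site would run the check once per visit.

```python
import random

def should_show_survey(rate, rng=random):
    """Return True with probability `rate`, decided independently per visitor."""
    return rng.random() < rate

monthly_visitors = 25_000
target_sample = 500
rate = target_sample / monthly_visitors  # 0.02, that is, 1 in 50

# Simulate one month of traffic to see how many surveys would be offered.
rng = random.Random(7)
invited = sum(should_show_survey(rate, rng) for _ in range(monthly_visitors))
print(f"rate = {rate}, surveys offered this month: {invited}")
```

One thing to keep in mind with this approach: because each visitor is decided independently, the number of surveys offered will land near 500 rather than on it exactly.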

Or you can use systematic random sampling. Systematic random sampling means you survey every nth user, replacing “n” with whatever number gives you the right sample size. With 25,000 users a month and a target of 500, for instance, you would survey every 50th user.
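
Here is a minimal sketch of the every-nth-user idea. I’ve added a random starting point within the first n visitors, which is a common refinement rather than something spelled out above; the function name and the numbered visitor list are also just for illustration.

```python
import random

def systematic_sample(visitors, n, seed=None):
    """Take every nth visitor, starting from a random position in the first n."""
    start = random.Random(seed).randrange(n)  # random start: an added assumption
    return visitors[start::n]

# Hypothetical month of traffic, with visitors numbered in order of arrival.
visitors = list(range(1, 25_001))

sample = systematic_sample(visitors, 50, seed=1)
print(len(sample), "visitors selected, starting with visitor", sample[0])
```

With 25,000 visitors and n set to 50, this always yields exactly 500 people, which is one practical advantage over the coin-flip approach.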

Look Beyond Convenience

A word of caution: Don’t fall back on convenience samples. A convenience sample is a group of people selected because selecting them is easy. A convenience sample might be the first 100 people to visit a site, a group of employees, or paid respondents.

Convenience samples are often unrepresentative for reasons unforeseen by researchers. Research based on convenience samples can be used as a launching pad for identifying issues, but extrapolation based on convenience samples can be dangerous.

We have just scratched the surface of sampling, but I hope I’ve opened your eyes to how much the type of sampling you choose can affect your results. Next time we will cover the actual process of designing surveys and other media for customer feedback.

Happy sampling!
