Shopping for a Software Solution, Part 1: Statistical Analysis

I’m constantly asked, “What software do you use?” and “What software should we purchase?”

Deciding what software to purchase to help analyze customer data often takes more time than it should. I’ve spent many hours evaluating and using software, and sat through countless vendor presentations. So now I’ll try to save you the trouble of doing the same.

There are various analytical software categories. It isn’t possible to cover them all in one column. Some software vendors, such as WebSideStory, have products especially geared toward Internet-related data. Today, let’s focus on traditional analysis software. Next time, we’ll look at reporting, or Internet-focused, applications.

For basic analysis of simple, small data sets (roughly 10,000 records or fewer), you can probably do everything you need with Microsoft Excel. There’s no reason to spend thousands of dollars on software if you can do everything with something much less expensive and more readily available.

For larger data sets or more advanced analysis, you’ll need a more advanced statistical software package, such as SAS or SPSS. (Interestingly, both companies once went by lengthy monikers but abbreviated them to play to a broader market.)

Silicon Graphics (SGI) developed a graphical analysis software package, MineSet, with amazing capabilities. It’s now maintained by Purple Insight, a U.K. company. Another fairly well-known package is Minitab, which many students are exposed to at school.

Regardless of what software you use, you’ll most likely accomplish tasks in the following three areas:

  • Data cleansing, manipulation, and initial exploration. In this stage, the analyst does things such as clean out records that obviously contain “bad data”; check to see if the right data has gone into the right variables (is income accidentally in the address field?); and create new variables (e.g., “month acquired” to analyze customer acquisition by month).
  • Basic analysis. The analyst calculates frequencies and means, builds cross-tabulations (e.g., income by age, or customer response by age and income bracket), and generates basic graphs.
  • Advanced analysis. Cluster analysis (for segmentation), predictive modeling (including regression analysis), and factor analysis are among the algorithms analysts use to develop marketing insights.
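
To make the first two task areas concrete, here is a minimal sketch in plain Python (standard library only). It is not how SPSS, SAS, or MineSet work internally; it only illustrates the kind of steps involved. The records, field names, and validity thresholds are all invented for illustration.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Hypothetical customer records; every field name and value here is an assumption.
records = [
    {"id": 1, "income": 52000, "age": 34, "acquired": date(2004, 1, 15), "responded": True},
    {"id": 2, "income": -1, "age": 41, "acquired": date(2004, 1, 20), "responded": False},  # bad income
    {"id": 3, "income": 87000, "age": 58, "acquired": date(2004, 2, 3), "responded": True},
    {"id": 4, "income": 43000, "age": 29, "acquired": date(2004, 2, 11), "responded": False},
    {"id": 5, "income": 61000, "age": 230, "acquired": date(2004, 2, 27), "responded": True},  # bad age
    {"id": 6, "income": 95000, "age": 62, "acquired": date(2004, 3, 8), "responded": False},
]


def is_clean(rec):
    """Reject records with obviously bad data (thresholds are assumed, not standard)."""
    return rec["income"] >= 0 and 18 <= rec["age"] <= 110


# Data cleansing: keep only plausible records (copied so the raw data stays untouched).
clean = [dict(r) for r in records if is_clean(r)]

# Manipulation: create new variables such as "month acquired" and an age band.
for r in clean:
    r["month_acquired"] = r["acquired"].strftime("%Y-%m")
    r["age_band"] = "under 45" if r["age"] < 45 else "45 and over"

# Basic analysis: a frequency count, a mean, and a simple cross-tabulation.
acquired_by_month = Counter(r["month_acquired"] for r in clean)
mean_income = mean(r["income"] for r in clean)
response_by_age = Counter((r["age_band"], r["responded"]) for r in clean)

print("Customers acquired by month:", dict(acquired_by_month))
print("Mean income:", mean_income)
for (band, responded), count in sorted(response_by_age.items()):
    print(f"{band:12} responded={responded}: {count}")
```

A statistical package does the same things through menus or a few commands, and it scales to far larger files; the advanced techniques in the third bullet are where a dedicated package saves the most effort.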

Whatever software you evaluate, it must be able to accomplish the above tasks satisfactorily. Fortunately, the two most popular statistical packages, SPSS and SAS, do good jobs. If you’re looking mainly for graphical displays of data rather than data cleansing or manipulation, investigate MineSet.

Though SPSS and SAS do good jobs on the above tasks, they differ in ease of use and cost. A typical one-user license for SAS runs around $5,000 per year. If your company needs the SAS Enterprise Miner system (you don’t need it for general analysis and basic reporting), you’ll quickly encounter invoices exceeding $80,000.

SPSS, on the other hand, typically charges a one-time fee of about $3,000. It does offer optional “updates,” which run in the hundreds of dollars, but otherwise there are no additional costs.
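
The gap widens over time because one fee recurs and the other does not. The sketch below works through the arithmetic using the approximate single-user figures quoted above. The $500 update price is an assumption (the only figure given is “hundreds of dollars”), as is the idea of buying an update every year after the first.

```python
def sas_cost(years, annual_license=5000):
    """Cumulative cost of a license that is renewed every year."""
    return annual_license * years


def spss_cost(years, one_time_fee=3000, annual_update=0):
    """Cumulative cost of a one-time fee plus optional updates after year one."""
    return one_time_fee + annual_update * max(years - 1, 0)


# Assumed update price; the article only says updates run in the hundreds of dollars.
ASSUMED_UPDATE_PRICE = 500

for years in (1, 3, 5):
    sas = sas_cost(years)
    spss_plain = spss_cost(years)
    spss_updated = spss_cost(years, annual_update=ASSUMED_UPDATE_PRICE)
    print(f"{years} yr: SAS ${sas:,} | SPSS ${spss_plain:,} | SPSS with updates ${spss_updated:,}")
```

Actual quotes vary by vendor, module, and number of seats, so substitute your own figures before drawing conclusions.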

Both packages can be used via menus (included in SPSS, an extra charge in SAS). However, the SAS menus are pathetic. They’re not practical for beginners, precisely the group who needs them most. The SPSS menus follow the pull-down style. This makes the program easy to use, even for a statistical novice.

If you’re a marketer or analyst who just wants an easy tool to use to help with some analysis, SPSS is the way to go. It costs less, does everything you want, and is easier to use than SAS.

SAS contains an amazing collection of algorithms that can perform practically every statistical function known to humanity. Originally developed with the help of statistical users around the world, who freely wrote and shared code, the SAS library of usable functions is very extensive. It has at least 11 “cluster” algorithms to choose from (SPSS has two). But unless you know a great deal about statistics, you won’t want to learn the differences between 11 cluster algorithms.

If you’re already a SAS operation, there’s no reason (other than cost) to switch to SPSS. But if you’re choosing between the two, go with SPSS. Only interested in graphical features? Then MineSet is probably the best choice.

In part two of this series, we’ll look at the most popular reporting software.
