Testing is great. I love testing; one of the reasons I love e-mail is all the data you get. There’s so much information on how readers are interacting with your message, much more than you get from postal direct mail.
In e-mail testing, many people (myself included) tend to focus on the trees, not the forest. We look at the results of one specific send or, in the case of triggered messages, at the days or weeks it takes to get statistically significant results. But a recent client experience reminded me that it’s important to look at the bigger picture.
For this client, we’re testing welcome messages; the welcome message is sent immediately when a person signs up to receive e-mail. Regular readers of this column will know that a welcome message is a transactional message – which means that it typically garners much higher open rates than a standard promotional message.
The primary goal of the testing was to reduce the spam complaint rate (the list is 100 percent opt-in, but prior to this engagement this metric was at a dangerous level). Secondarily, the client is looking to:
- Decrease their unsubscribe rate.
- Increase customer interaction with the e-mail (read: opens and clicks).
- Maintain or grow the ‘revenue generated per e-mail assumed delivered’ figure.
This is a tall order. But I digress.
We’ve done four tests to date. None of our test versions has bested the control, meaning that the control has been the same throughout. It’s a frustrating situation when nothing is able to beat the control, and that always gnaws at me, but there was more here that was bothering me.
I finally figured it out when I pulled the metrics for just the control version. I can’t share the actual data with you (client confidentiality), but let me share a few of the extremes.
Remember the primary reason we were testing? It was the spam complaint rate, which was at a dangerous level where blacklisting becomes a possibility.
Over the course of testing, the spam complaint rate on the control e-mail, which was the same throughout, fell by half a percentage point, a relative swing of 325 percent. I wish I could tell you that our testing had something to do with this, but I can’t. Since the creative was the same across all sends, external factors must have caused the decrease.
The unsubscribe rate also fluctuated, but just by 43 percent, which was a variance of one percentage point. It’s interesting to note that while spam complaints went consistently down, the unsubscribe rate showed a steady upward trend except in Test 3, where it decreased.
Remember, we’re looking at the same welcome message creative, sent to new subscribers, over a five-month period.
Revenue per e-mail assumed delivered also fluctuated during this period, to the tune of $17, or 107 percent. This metric wasn’t tracked for the first two months, so this fluctuation happened in just a three-month period.
We saw similar variances in:
- Deliverability: 32 percentage points (47 percent).
- Open Rate: 5 percentage points (31 percent).
- Click-through Rate: 4 percentage points (54 percent).
- Click-to-Open Rate: 14 percentage points (27 percent).
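If you want to run the same check on your own control, here’s a minimal Python sketch of the arithmetic behind figures like these. The function name `spread` and the rates in the example are hypothetical, made up for illustration; they are not the client’s data. The sketch reports the swing in percentage points and as a relative percent against both the lowest and the highest value, because the relative figure depends heavily on which baseline you pick.

```python
def spread(rates):
    """Summarize how far one metric moved across sends of the same creative.

    `rates` holds the metric for each send, expressed in percent
    (for example, 2.5 means 2.5%).
    Returns (percentage-point spread, spread as % of the lowest value,
    spread as % of the highest value).
    """
    low, high = min(rates), max(rates)
    if low <= 0:
        raise ValueError("rates must be positive to compute a relative spread")
    points = high - low               # absolute swing, in percentage points
    pct_of_low = points / low * 100   # relative swing, measured against the low
    pct_of_high = points / high * 100 # relative swing, measured against the high
    return points, pct_of_low, pct_of_high


# Hypothetical unsubscribe rates (in percent) for one control over four sends.
control_rates = [2.0, 2.5, 4.0, 3.0]
points, pct_of_low, pct_of_high = spread(control_rates)
print(f"{points:.1f} percentage points")
print(f"{pct_of_low:.0f}% relative to the lowest send")
print(f"{pct_of_high:.0f}% relative to the highest send")
```

The same two-point swing reads as a very different percentage depending on the baseline, so whichever one you choose, state it and use it consistently across metrics.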
So here’s the end game. None of this negates the A/B split test results we got from our testing. But it does negate the results from the first test, where we compared the test results, sent in month three, to the control results from month two; with the control itself moving this much from month to month, a comparison across two different months tells us little about the creative. And it raises the concern that what performs well this month cannot be counted on to do as well next month (see the revenue results for an extreme example).
So what’s the moral of the story? Look at your test results in terms of the forest as well as the trees. If there are large variations in performance on the same creative from send to send, explore what other factors are impacting performance, and work to optimize those as well as your creative to boost your bottom-line results over the long term.
And if you found this article informative, join me at EEC14 and attend my pre-conference workshop; use code SPK50 to save $50 on your registration fee.
Until next time,