Pearl of an Antispam Program


Founded in 1997, ActiveState started out by building what Chris Kraft, ActiveState director of product management, describes as “quality assured open source distributions.” The company used basic programming tools such as Perl, Python, and Tcl, but branched out into building developer tools for open source languages. Today, ActiveState claims to have built working relationships with over 4 million developers through sales channels and online forums.

ActiveState developers decided that Perl was the best open source text processor, so naturally it would be a good foundation for an antispam application. The company launched PerlMX over a year ago and sought out enterprise-class clients as its primary target. Two months ago, ISPs started taking an interest in ActiveState’s antispam applications. As a result, the company is currently working with service providers to make its antispam solution more attractive to ISPs.

Set to debut in mid-November, ActiveState’s PureMessage application uses probability modeling to determine whether a particular email message is or is not spam. The application incorporates other open source antispam initiatives including blacklists and SpamAssassin. Some day, ActiveState hopes to incorporate the SIEVE mail filtering language (IETF RFC 3028) into the application.

The program can recognize basic spam techniques, such as dictionary attacks. It also performs reverse Domain Name System (DNS) lookups on incoming messages. The program examines email messages for spam characteristics, which include:

  • Keywords or phrases, such as “amazing” or “casino”
  • Specific HTML style tags, such as a colored background, forms, and iframes
  • Complex spam patterns such as
    (?:You (?:were sent|have received|are receiving)|You’re receiving).{0,15}(?:message|e-?mail)s? because – if you (?:(?:want|wish|care|prefer) not to |(?:don’t|do not) (?:want|wish|care) to )(?:be contacted again|receive (any)?\s*(?:more|future|further) (?:e?-?mail|messages?|offers|solicitations))
  • Text formats, such as all caps, spaces between all caps, and multiple exclamation points
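The checks in the list above can be sketched in a few lines of Python. This is an illustration only, not ActiveState's code: the pattern names, the function spam_characteristics, the word-length cutoffs, and the simplified opt-out pattern are all assumptions made for the example.

```python
import re

# Hypothetical patterns, loosely modeled on the characteristics listed above.
SPAM_KEYWORDS = re.compile(r"\b(?:amazing|casino)\b", re.IGNORECASE)
SUSPICIOUS_HTML = re.compile(r"<\s*(?:form|iframe)\b|\bbgcolor\s*=", re.IGNORECASE)
# Simplified stand-in for the long opt-out pattern quoted in the list.
OPT_OUT_PHRASE = re.compile(
    r"if you (?:(?:want|wish|care|prefer) not to |(?:don[’']t|do not) (?:want|wish|care) to )"
    r"receive (?:any )?(?:more|future|further) (?:e-?mail|messages?|offers|solicitations)",
    re.IGNORECASE,
)
ALL_CAPS_WORD = re.compile(r"\b[A-Z]{5,}\b")          # e.g. AMAZING
SPACED_CAPS = re.compile(r"\b(?:[A-Z] ){3,}[A-Z]\b")  # e.g. F R E E
MULTIPLE_BANGS = re.compile(r"!{2,}")                 # e.g. !!!


def spam_characteristics(text: str) -> dict:
    """Report which spam-like characteristics appear in a message body."""
    return {
        "keyword": bool(SPAM_KEYWORDS.search(text)),
        "html": bool(SUSPICIOUS_HTML.search(text)),
        "opt_out_phrase": bool(OPT_OUT_PHRASE.search(text)),
        "all_caps": bool(ALL_CAPS_WORD.search(text)),
        "spaced_caps": bool(SPACED_CAPS.search(text)),
        "multiple_exclamations": bool(MULTIPLE_BANGS.search(text)),
    }
```

Each check is deliberately crude. A production filter would need many more patterns and would have to tolerate legitimate mail that happens to trip one of them.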

The application completes a probability assessment that determines whether the email is unsolicited based on the percentage of spam-like characteristics the message exhibits. Next, it reviews settings to determine whether the email is rejected or sent through to the end user. ActiveState claims its antispam product rejects 98 percent of all unsolicited email.
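The scoring step described here can be approximated as follows. This sketch treats every characteristic as equally important, which is a simplifying assumption; the article does not say how PureMessage weights its tests. The function names and the 50 percent default threshold are hypothetical.

```python
def spam_probability(flags: dict) -> float:
    """Return the share of checked characteristics that fired, as a percentage."""
    if not flags:
        return 0.0
    hits = sum(1 for fired in flags.values() if fired)
    return 100.0 * hits / len(flags)


def disposition(probability: float, reject_at: float = 50.0) -> str:
    """Apply an administrator-chosen threshold (assumed here) to the score."""
    return "reject" if probability >= reject_at else "deliver"


# Example: three of four hypothetical checks fired.
example_flags = {"keyword": True, "html": True, "all_caps": False, "bangs": True}
example_score = spam_probability(example_flags)
example_action = disposition(example_score)
```

Keeping the threshold separate from the score mirrors the two-stage flow in the text: the message is scored first, and the site's settings then decide what happens to it.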

A special policy enforcement bundle is capable of forwarding questionable email to a quarantine box and can also enforce corporate email policies or append a legal disclaimer to every outgoing email. Through an OEM agreement with McAfee, the company provides McAfee antivirus to PureMessage users.

Hugh Messenger, senior network administrator at Andalusia, AL-based AlaWeb, said the key advantage of PureMessage is its adaptability.

“Before we adopted PureMessage we used Lyris MailShield, which is a very good product, but it required a lot of hands-on configuration and updating,” Messenger said. “We like PureMessage because we don’t have to keep up with who’s doing the spamming. As long as we have the latest package, it identifies the junk by itself.”

Like most system administrators, Messenger said he takes spam very seriously. But ISPs have to balance antispam technology with the demands of different users.

“We’ve got two vocal classes of users. One group does not want us to touch their mail, and another group assumes that all spam they get is from us,” Messenger said. “We’re an ISP, not a big corporation, and we cannot make sweeping policy decisions about what mail people can and cannot receive. In a corporation you can install a filter and announce that it’s company policy, but when people are paying to get email, you need to listen to their needs.”

To respond to customers’ needs, Messenger set up a special Web page that allows them to determine how the spam filter is used. He said building the form and adding the opt-out feature was relatively easy.

“It’s 317 lines of Perl. Developing it was not too difficult because we already had set up a Web page for customers to manage their accounts,” Messenger said. “It’s not a feature in the current version of the product, although they may add it in future versions.”

AlaWeb customers are allowed to choose from three options: opt out of email filtering entirely, accept all filtering, or ask that each filtered email be forwarded with an altered subject line. Messenger said that the altered subject line shows the spam probability percentage that the filter assigned to that particular email.

“We’ll just prepend something like ’Spam 87%’ in the subject line,” Messenger said. The company calls this feature “subject striping.”
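The three customer options could be modeled roughly like this. AlaWeb's actual form is Perl and is not shown in the article, so everything here is a hypothetical sketch: the names FilterChoice and handle_message, the 50 percent threshold, and the exact prefix format are assumptions.

```python
from enum import Enum


class FilterChoice(Enum):
    OPT_OUT = "opt_out"          # never filter this customer's mail
    FILTER = "filter"            # accept all filtering
    TAG_SUBJECT = "tag_subject"  # deliver flagged mail with an altered subject


def handle_message(subject: str, spam_percent: float,
                   choice: FilterChoice, threshold: float = 50.0):
    """Return (deliver, subject) for one message under the customer's choice."""
    flagged = spam_percent >= threshold
    if choice is FilterChoice.OPT_OUT or not flagged:
        return True, subject
    if choice is FilterChoice.TAG_SUBJECT:
        # Prepend the score, as in the 'Spam 87%' example quoted above.
        return True, f"Spam {round(spam_percent)}% {subject}"
    return False, subject
```

For example, handle_message("Win big", 87.0, FilterChoice.TAG_SUBJECT) delivers the message with the subject "Spam 87% Win big", while the same message under FilterChoice.FILTER is not delivered.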

When a customer complains about not receiving desired email messages, such as a newsletter subscription or other bulk notices, they can add the newsletter to a whitelist and allow the messages to reach their target. But Messenger advises that novice Perl programmers proceed with caution.

“Perl is a powerful language,” Messenger said. “You can get into trouble with any language that makes it easy to do cool things. Nobody should ever work on live servers.”

Messenger is monitoring the success of the program with his own email account.

“I get about 100 spam messages per day, and perhaps six get through,” he said. “I’ve had only one or two false positives in six months. I use the subject striping option.”

AlaWeb uses a pair of Linux boxes, each a dual-processor 1.2 GHz machine with 3GB of RAM, although Messenger is planning to upgrade to 4GB of RAM. He estimates that AlaWeb receives 600,000 email messages each day, of which 40 percent are spam.
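As a back-of-the-envelope check on those figures, the daily volume works out as shown below. The inputs are Messenger's estimates; the even spread of traffic across the day is an assumption made only to get an average rate.

```python
messages_per_day = 600_000
spam_percent = 40

# Integer arithmetic avoids floating-point rounding in the spam count.
spam_per_day = messages_per_day * spam_percent // 100
legitimate_per_day = messages_per_day - spam_per_day

# Average arrival rate, assuming traffic is spread evenly over 24 hours.
seconds_per_day = 24 * 60 * 60
messages_per_second = messages_per_day / seconds_per_day
```

That is 240,000 spam messages and 360,000 legitimate messages per day, or an average of roughly seven messages per second across the two machines.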

“Six months ago, we saw what we feared was the elbow of a massive spam curve,” Messenger said. “We were right. Email traffic doubled, and the percentage that is spam rose from about 25 percent to 40 percent or even 50 percent.”

Messenger is glad to have been prepared for the surge in junk email. He said the best benefit, however, was the positive responses from users.

“We got many thank-you notes from users when we switched to the new product, even from people who did not yet know what we’d changed,” Messenger said.

Pricing and availability
The product is available now. Pricing is based on a one-time, per-server license fee, making it potentially very appealing for use in an ISP environment.
