by Dr. Neal Krawetz
Editor's note: part one of this article series is available here.
The Simple Mail Transfer Protocol was never designed for security. SMTP dates all the way back to a 1973 extension to FTP. [ref 1] In 1973, computer security was not a significant concern, and the Internet architects were not even certain about their implementation of the email protocol. For example, RFC 524 describes the basis for SMTP as a separate protocol. The author included this caveat:
Although the command-set has evolved over time, it appears that people implemented SMTP on the basis of RFC 524 and it was assumed that the bugs, such as security concerns, would be addressed later. Unfortunately, in 2004 the oversights from RFC 524 are still being addressed and SMTP is too popular to replace overnight. Spam is one example of an abuse of the SMTP protocol -- most spam tools are designed to forge email headers, disguise senders, and obscure the origination system.
As a brief review from part one of this article series, current anti-spam solutions fall into four primary categories: filters, reverse lookups, challenges, and cryptography. Each of these solutions offers some relief to the spam problem, but they also have significant limitations. The first article looked at filters and reverse lookup solutions. This second part now focuses on the various types of challenge-based systems and cryptographic solutions. While there are many different aspects to these solutions, this paper only discusses the most common and significant concerns -- this paper is not intended to be a complete listing of implementation options, solutions, and issues.
Spam senders use automated bulk-mailing programs to generate millions of emails per day. Challenges attempt to impede bulk senders by slowing the bulk-mailing process. People who send a few emails at a time should not be significantly impacted. Unfortunately, challenges are only successful when very few people use them. As their popularity increases, they are much more likely to interfere with desirable email than to deter unwanted spam.
There are two main types of challenges: challenge-response and proposed computational challenges.
Challenge-Response (CR) systems maintain a list of permitted senders. Email from a new sender is temporarily held without delivery. The new sender is sent an email that provides a challenge (usually a click on a URL or reply email). After completing the challenge, the new sender is added to the list of permitted senders and the original email is delivered. The belief is that spam senders using fake sender email addresses will never receive the challenge, and spam senders using real email addresses will not be able to reply to all of the challenges. Unfortunately, CR systems have a number of limitations including:
The marketing myth emphasizes two misconceptions: (1) a human must perform the challenge, and (2) these problems are too complex for automated solutions. In truth, most spam senders ignore these CR systems because the systems do not yet account for a large recipient base, not because the challenge is difficult. Many spam senders use valid email addresses for their scams or for validating mailing lists. When CR systems begin to interfere with spam operations, spammers will automate the responses to these challenges.
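The challenge-response flow described above can be sketched in a few lines of Python. This is a minimal illustration only, not the design of any particular CR product: the class name `ChallengeResponseFilter`, its methods, and the use of a random token as the "challenge" are all assumptions made for the example.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class ChallengeResponseFilter:
    """Toy model of a challenge-response (CR) mail filter (hypothetical API)."""
    permitted: set = field(default_factory=set)    # senders who completed a challenge
    held: dict = field(default_factory=dict)       # token -> (sender, [held messages])
    delivered: list = field(default_factory=list)  # (sender, message) pairs

    def receive(self, sender, message):
        """Deliver mail from permitted senders; hold and challenge everyone else.

        Returns None when the message is delivered, or the challenge token
        that a real system would email back to the sender.
        """
        if sender in self.permitted:
            self.delivered.append((sender, message))
            return None
        # Reuse a pending challenge if this sender already has one.
        for token, (pending_sender, messages) in self.held.items():
            if pending_sender == sender:
                messages.append(message)
                return token
        token = secrets.token_urlsafe(16)
        self.held[token] = (sender, [message])
        return token

    def answer(self, token):
        """Complete a challenge: permit the sender and release the held mail."""
        entry = self.held.pop(token, None)
        if entry is None:
            return False  # unknown or already-used token
        sender, messages = entry
        self.permitted.add(sender)
        self.delivered.extend((sender, held_message) for held_message in messages)
        return True
```

Calling `receive()` for a new sender returns a token and delivers nothing; calling `answer()` with that token releases the held mail and permits the sender from then on. Note that nothing in `answer()` proves a human was involved: any program that can read the challenge email can call it, which is the weakness discussed above.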
1.2.2 Computational Challenge
There are many proposed Computational Challenge (CC) systems that attempt to add a "cost" to sending email. Most CC systems use complex algorithms that are intended to take time. For a single user, the time is unlikely to be noticed. But for a bulk mailer such as a spam sender, the small delays add up, making it take too long to send millions of emails. Some examples of proposed CC systems include Hash Cash [ref 2] and Microsoft's Black Penny. [ref 3] Unfortunately, CC systems have their own set of implementation issues that are likely to prevent rapid adoption and unlikely to prevent spam. Examples of these limitations include:
The currently proposed computational challenges are unlikely to be widely adopted -- they do not appear to mitigate the spam problem and do appear to inconvenience legitimate mailers.
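To make the idea of a computational "cost" concrete, the following Python sketch shows a simplified proof-of-work stamp in the spirit of Hash Cash. It is an illustration under stated assumptions, not the actual Hash Cash stamp format: the `recipient:counter` layout, the choice of SHA-256, and all function names here are hypothetical.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint(recipient: str, bits: int) -> str:
    """Search for a stamp whose SHA-256 digest starts with `bits` zero bits.

    The sender pays for this search: on average about 2**bits hashes.
    """
    for counter in count():
        stamp = f"{recipient}:{counter}"  # assumed layout, not the real format
        if leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, recipient: str, bits: int) -> bool:
    """Check a stamp with a single hash; cheap for the recipient."""
    if not stamp.startswith(recipient + ":"):
        return False
    return leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits


def bulk_cost_days(messages: int, seconds_per_stamp: float) -> float:
    """Total minting time in days for one machine sending `messages` emails."""
    return messages * seconds_per_stamp / 86400
```

The asymmetry is the point: `mint()` needs many hash evaluations while `verify()` needs one. With a purely illustrative cost of one second per stamp, `bulk_cost_days(1_000_000, 1.0)` works out to roughly 11.6 days on a single machine, while a person sending a handful of messages would barely notice the delay.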
A few solutions have been proposed that use cryptography to validate the sender. Essentially, these systems use certificates to perform the authentication. Without a proper certificate, a forged email can be readily identified. Some proposed cryptographic solutions include:
The existing mail protocol (SMTP) has no explicit support for cryptographic authentication. Some of these proposed solutions extend SMTP (e.g., S/MIME, PGP/MIME, and AMTP), while others aim to replace the existing mail infrastructure (e.g., MTP). Interestingly, the MTP author mentions "SMTP is more than 20 years old, whereas modern requirements developed within the last 5-10 years. The large number of existing extensions to the syntax and semantics of SMTP show, that pure SMTP doesn't fulfill these requirements and that is too inflexible to be extended without modification of its syntax. [sic]" [ref 5] It could easily be argued that the large number of existing extensions to SMTP demonstrates its flexibility, not inflexibility, and that a completely new mail transport protocol is unnecessary.
When using certificates, such as X.509 or TLS, some type of certificate authority must be available. Unfortunately, if the certificates are stored in DNS then the private keys must be available for validation. (And if a spammer has access to the private keys, then they can generate valid public keys.) Alternately, a central trusted certificate authority (CA) could be used. Unfortunately, email is a distributed system and nobody wants to see a single CA in control of all email. Many solutions even permit multiple CA systems where, for example, the X.509 certificate identifies the validating CA server. This extension is vulnerable to the situation where a spammer runs a private CA server.
When there is no certificate authority, there needs to be some method for distributing keys between the sender and recipient. PGP, for example, requires pre-shared public keys. While this approach is viable for closed networks or close groups of friends, this does not extend well across large groups of individuals, particularly when new contacts may be established between any sender and any recipient. Essentially, pre-shared keys face similar problems to white-list filters: only known and established senders may contact the recipient.
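The white-list-like behavior of pre-shared keys can be shown with a small Python sketch. To stay self-contained it substitutes a shared-secret HMAC from the standard library for PGP's public-key signatures, so it is not how PGP works internally; the `SHARED_KEYS` table, the addresses, and the function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical keys exchanged out of band before any email is sent.
SHARED_KEYS = {"alice@example.com": b"key-agreed-with-alice"}


def sign(sender_key: bytes, body: str) -> str:
    """Compute an authentication tag over the message body."""
    return hmac.new(sender_key, body.encode(), hashlib.sha256).hexdigest()


def classify(sender: str, body: str, tag: str) -> str:
    """Classify a message as authenticated, forged, or from an unknown sender."""
    key = SHARED_KEYS.get(sender)
    if key is None:
        # No key on file: the message cannot be validated at all.
        return "unknown sender"
    expected = sign(key, body)
    if hmac.compare_digest(expected.encode(), tag.encode()):
        return "authenticated"
    return "forged"
```

A message from `alice@example.com` with a correct tag is classified as authenticated and a tampered one as forged, but a first-time correspondent can only ever come back as "unknown sender". That is the same limitation a white-list has: the recipient must already know the sender.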
Unfortunately, these cryptographic solutions are unlikely to stop spam. For example, let us assume that one of these solutions (any one) is globally accepted. These approaches do not validate that the email address is real -- they only validate that the sender had the correct keys for the email. This creates a few issues:
Anti-spam solutions summary
Spam has reached epidemic proportions and people are looking for quick fixes of any kind. There are many existing and proposed anti-spam solutions. While these options are viable in limited circumstances, they all appear to have significant limitations with regard to global acceptance and their ability to prevent spam.
In Part I we saw that spam filters, while being viable options for identifying spam, do not prevent spam and require constant maintenance. Reverse-lookup systems attempt to identify forged senders but restrict email's usability by preventing host-less and vanity domains, and by restricting mobile users' ability to send email from anywhere at any time. In Part II we observed that challenge-response systems are only viable as long as they maintain a low profile, and computational challenges are unlikely to deter spammers. Cryptographic solutions, while accurately identifying forged email, do not easily expand to a global scale.
While many people believe that any anti-spam solution is better than nothing, most of these solutions impede regular users more than they prevent spammers. While some of these proposed options are reported to have effectively stopped spam in limited tests, they do not take into account that spammers adapt their code rapidly, on the order of days or weeks -- a good solution today is unlikely to be a good solution tomorrow.
About the author
Neal Krawetz has a Ph.D. in Computer Science and over 15 years of computer security experience. Dr. Krawetz is considered one of the leading experts in spam research and anti-spam technologies.
[ref 1] RFC 458 (Feb. 20, 1973): Mail retrieval via FTP. RFC 510 (May 30, 1973): Network mailbox addresses (user@system). RFC 524 (June 13, 1973): Branching from FTP to a standalone protocol. RFC 561 (Sept. 5, 1973): Standard mail headers.
[ref 4] The delay is actually more than double due to operating system overhead.
[ref 6] Source: Nua Internet How Many Online http://www.nua.ie/surveys/how_many_online/index.html, 5-February-2004.
This article originally appeared on SecurityFocus.com -- reproduction in whole or in part is not allowed without expressed written consent.