
Nov 26, 2003 02:00 AM


by Laurent Oudot

Abstract

Like most advertising flyers found in postal mailboxes, millions of emails -- now classically referred to as spam -- fill email inboxes around the world every day. Spam can be considered the most annoying form of cyber-pollution, targeting all of us with tons of unsolicited emails. Those emails usually contain advertisements, and spammers are paid to spread as many of them as possible.

Though spam should generally not be considered a real cyber attack, it may be difficult to distinguish between virus-contaminated emails, phishing scams and bothersome ads (those containing tricky JavaScript or specially forged HTML used to track recipients). Moreover, spammers slow down the servers receiving legitimate emails and may cause availability problems. While spammers earn money by annoying people, employees and netsurfers lose time dealing with unsolicited emails -- in some cases, hundreds per day. Companies may lose money too, through lost productivity, bandwidth charges, purchasing blacklists, and so on. Typical solutions against this cyber-plague are to filter incoming emails using content analysis or blacklists, and to fix poorly configured servers.

This paper will evaluate the usefulness of using honeypots to fight spammers. The first part of the article will explain some background information on spam. Then, we will try to understand how honeypots may detect, slow and stop such activities while promoting a clean Internet. Finally we will conclude with some future perspectives.

1.0 Introduction to spammers

1.1 What is Spam?

While spam is the name of a food dish containing "mystery meat" [ref 0], it is also what people call the unsolicited emails they receive on the Internet. The origin of the common use of this name is a Monty Python sketch [ref 1] where the word "spam" becomes so pervasive that you cannot hear anything else (Vikings singing the praises of spam, a waitress repeating spam [ref 2]). The idea is that if Internet users were flooded by spam, nobody would be able to distinguish spam from normal emails. A security platoon could say with humor that the first casualty of spam is innocence.

In this paper, we will use the word spam to describe UBE (Unsolicited Bulk E-mail) and UCE (Unsolicited Commercial E-mail). Examples and logs given in this paper are inspired by real-life events, but they were modified to retain anonymity.

1.2 How spammers work

Spam is sent because it has become a paid form of mass advertising. A spammer's work can be divided into three activities:

Harvest: build a database of targets by finding valid email addresses
Stealth and open proxies: work anonymously while sending ugly emails to their targets
Spam and open relays: find and use servers that agree to relay emails anywhere

We would need a book to describe everything about spam in enough detail, and the Internet is already full of excellent resources on the subject, so let's focus on the important issues.

1.2.1 Email addresses get harvested

The first thing spammers need is an up-to-date list of targets. Many different ways exist to collect thousands and thousands of email addresses on the Internet. When you post messages to Usenet, for example, your address becomes available to simple, automatic programs that look at the headers of every message posted. By saving specific fields (From:, Reply-To:), spammers may easily build huge lists of potential targets. Another source of addresses is poorly configured mailing lists that give out their list of subscribers. A third technique is based again on simple, automatic programs, this time ones crawling Web pages on the Internet. For each HTML Web page found, such a program will check for a mailto: link ("send me an email by clicking here") and will follow the Web links on the page to continue this sort of evil seeking.

Figure 1: Harvesting email addresses

You may also want to read [ref 3] and other documents for more detailed explanations of how email addresses are harvested.
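To see why a plain mailto: link is such an easy target, here is a minimal Python sketch of the extraction step performed on a single page. It is an illustration written for defenders, not code taken from any real spambot; the class name `MailtoCollector`, the helper `collect` and the sample page are assumptions made for this example.

```python
from html.parser import HTMLParser


class MailtoCollector(HTMLParser):
    """Collect mailto: targets and ordinary links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.addresses = []   # addresses found in mailto: links
        self.links = []       # other links a crawler would follow next

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            # Drop the scheme and any "?subject=..." suffix.
            self.addresses.append(href[len("mailto:"):].split("?", 1)[0])
        elif href:
            self.links.append(href)


def collect(page):
    """Return (addresses, links) found in the HTML text of one page."""
    parser = MailtoCollector()
    parser.feed(page)
    return parser.addresses, parser.links


sample_page = (
    '<p><a href="mailto:webmaster@example.org?subject=hi">Send me an email</a> '
    'or read <a href="/about.html">more</a>.</p>'
)
addresses, links = collect(sample_page)
print(addresses)
print(links)
```

Every address published this way can be read with a few lines of code, and every other link gives the crawler a new page to visit.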

1.2.2 Open proxies

Spammers may either connect directly to a remote mail server, or bounce through open proxies. The role of a Web proxy, for example, is to do the job of a Web client on behalf of someone else. When a Web client connects to a proxy, it asks for a Web page somewhere on the Internet. The proxy then fetches this Web page itself and returns the data to the client. In the logs of the remote Web server, we can usually see only the IP address of the proxy that made the Web requests.

An open proxy is a proxy service open to the world for almost any kind of request, allowing anybody to remain anonymous while crawling the net. Such proxies are used a lot in the underground: blackhat people, warez people, etc. Open proxies are also useful to many spammers, because they allow them to stay anonymous while sending their unwanted emails.

Here is an example of a TCP session recorded by snort [ref 4], showing a remote proxy check probably launched by Earthlink. The client connects to the proxy on TCP port 8080 and, instead of asking for a Web page, uses the HTTP CONNECT method to ask for a TCP session with a remote SMTP server (207.69.200.120). The rest of this TCP session is SMTP, sent directly to the SMTP server (HELO, MAIL FROM, RCPT TO, DATA, QUIT).

 $ cat /var/log/snort/192.168.1.66/SESSION\:8080-4072
 CONNECT 207.69.200.120:25 HTTP/1.0
 HELO [217.128.a.b]
 MAIL FROM:<openrelay@abuse.earthlink.net>
 RCPT TO:<spaminator@abuse.earthlink.net>
 DATA
 Message-ID: <36af800461754252ab1107386a9cd8eb@openrelay@abuse.earthlink.net>
 To: <spaminator@abuse.earthlink.net>
 Subject: Open HTTP CONNECT Proxy
 X-Mailer: Proxycheck v0.45
 This is a test of third-party relay by open proxy.
 These tests are conducted by the EarthLink Abuse Department.
 EarthLink, by policy, blocks such systems as they are discovered.
 Proxycheck-Type: http
 Proxycheck-Address: 217.128.a.b 36af800461754252ab1107386a9cd8eb
 Proxycheck-Port: 8080
 Proxycheck-Protocol: HTTP CONNECT
 This test was performed with the proxycheck program. For further
 information see <http://www.corpit.ru/mjt/proxycheck.html/>
 .
 QUIT
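The first line of that session is what gives the game away: a CONNECT request whose target is TCP port 25. As a rough sketch, a proxy operator or a honeypot could spot such requests as follows. The function names `parse_connect` and `is_smtp_tunnel` are hypothetical and the parsing is deliberately simplistic.

```python
def parse_connect(request_line):
    """Return (host, port) for an HTTP CONNECT request line, else None."""
    parts = request_line.split()
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        return None
    host, sep, port = parts[1].rpartition(":")
    if not sep or not port.isdigit():
        return None
    return host, int(port)


def is_smtp_tunnel(request_line):
    """True when the client asks the proxy to open a tunnel to TCP port 25."""
    target = parse_connect(request_line)
    return target is not None and target[1] == 25


print(parse_connect("CONNECT 207.69.200.120:25 HTTP/1.0"))
print(is_smtp_tunnel("GET http://www.example.org/ HTTP/1.0"))
```

A CONNECT request aimed at another proxy port (3128, for instance) would not be flagged by this check, although it is just as interesting to a honeypot owner.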

Using a proxy server is an efficient way for a spammer to stay anonymous. As proxy owners may keep logs, spammers may fear that their IP address could be recorded (in the remote proxy log). Usually, spammers bet that badly configured proxies don't keep logs. Their fear of logs is why they sometimes use chains of proxies to improve their odds -- they connect to an open proxy server (TCP session), then ask it to connect to another known open proxy server (CONNECT a.b.c.d:3128), etc. For example:

Figure 2: Open relays and spammers

The longer the chain, the stealthier they become, but they lose time, because every additional bounce adds its own delay.

1.2.3 Open relays

An open relay (which is sometimes called an insecure relay or a third-party relay) is a Mail Transfer Agent (MTA) that accepts third-party relays of e-mail messages even though they are not destined for its domain. As they forward emails that are neither to nor from a local user, open relays are used by spammers to route large volumes of unsolicited emails.

Such a poorly configured MTA lends its system and network resources to the remote abuser who is getting paid to send out spam. Usually, an organization that unwittingly relays spam ends up blacklisted on international lists (RBL, etc.). That annoys internal users, because they can no longer use their own email properly. A large ISP that gets blacklisted would probably lose clients and money.
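The rule that separates a correctly configured MTA from an open relay can be written down in a few lines. The sketch below is only a model of the decision, not the configuration of any particular mail server; `relay_decision`, `domain_of` and the example domains are assumptions.

```python
def domain_of(address):
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def relay_decision(client_is_local, recipient, local_domains):
    """Decide what a correctly configured MTA does with one recipient.

    client_is_local: True when the connecting client belongs to the
    organization (trusted network or authenticated user).
    """
    if domain_of(recipient) in local_domains:
        return "deliver"   # mail for one of our own users
    if client_is_local:
        return "relay"     # mail from one of our own users to the outside
    return "reject"        # neither to nor from a local user: third-party relay


LOCAL = {"example.org"}
print(relay_decision(False, "alice@example.org", LOCAL))
print(relay_decision(True, "bob@example.net", LOCAL))
print(relay_decision(False, "bob@example.net", LOCAL))
```

An open relay is simply a server that answers "relay" in the third case as well.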

2.0 Honeypots versus spammers

To quote the leader of the Honeynet Project, Lance Spitzner [ref 5], a honeypot is an information system resource whose value lies in unauthorized or illicit use of that resource.

In this chapter, we will see whether honeypot technologies can be used in the following cases:

when spammers come to your Web site to steal email addresses and transform them into future targets;
when spammers try to connect to your proxy servers and try to bounce elsewhere by abusing your services;
when spammers inject SMTP traffic to your email servers in order to send unsolicited emails through you.

2.1 Honeypots and harvesting

One of the first phases of a spammer's work is the harvesting of email addresses. Here we will focus on harvesting through Web pages, which may be the easiest case to solve for those trying to defend against spam. Honeypots in the classical sense do not really apply to this phase, but some effective techniques come close to the idea. This is the concept: while spammers browse the Web, if they read Web pages with fake email addresses, they will feed their database with invalid targets. Purists may say that this is not exactly a honeypot, so let's say it's like adding a spoonful of honey to your Web pages.

During automatic harvesting of valid email addresses on the Web, spammers may sometimes be recognized by the tools they use, by checking the User-Agent field sent by their client [ref 6]. Some people have decided either to block specific User-Agent values known to be used by spammers, or to transparently redirect those Web clients to fake Web pages containing tons of fake email addresses. The trouble is that it's very easy for spammers to change the User-Agent. So those same people defending against spam then decided to create Web links on their pages that are invisible to a human reader (e.g. white characters on a white background) but visible to a spambot following every link found in the HTML source. The Web page behind such a link waits for spambots and dynamically creates fake email addresses to fool the spammers.
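As a rough sketch, the User-Agent test boils down to a substring match. The names in `SUSPECT_AGENTS` are illustrative examples only, and `choose_page` is a hypothetical helper; a real list has to be maintained from your own server logs.

```python
# Illustrative substrings only; maintain the real list from your own logs.
SUSPECT_AGENTS = ("emailsiphon", "emailcollector", "emailwolf")


def choose_page(user_agent):
    """Return which page to serve: the real one, or the decoy full of fake addresses."""
    agent = (user_agent or "").lower()
    if any(name in agent for name in SUSPECT_AGENTS):
        return "decoy"
    return "real"


print(choose_page("EmailSiphon"))
print(choose_page("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
```

A harvester that simply sends a browser-like User-Agent passes this test, which is exactly the weakness that the hidden links are meant to compensate for.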

One idea is to create tons of fake addresses. A good example is a piece of freeware called Wpoison [ref 7]. This CGI script, once added to your Web site, generates fake email addresses that look like real ones. A live demo can be tried on this Web site [ref 8].
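The general idea of such a poisoning page can be sketched in Python as follows. This is not how Wpoison itself is implemented; the functions `fake_addresses` and `decoy_page`, and the reserved `example.invalid` domain, are assumptions chosen so that the generated addresses can never reach a real mailbox.

```python
import random
import string


def fake_addresses(count, seed=None, domain="example.invalid"):
    """Generate plausible-looking but undeliverable email addresses."""
    rng = random.Random(seed)
    result = []
    for _ in range(count):
        length = rng.randint(5, 10)
        local = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        result.append(f"{local}@{domain}")
    return result


def decoy_page(count, seed=None):
    """Render the addresses as mailto: links, one per line."""
    return "\n".join(
        f'<a href="mailto:{addr}">{addr}</a>' for addr in fake_addresses(count, seed)
    )


print(decoy_page(3, seed=1))
```

Because the page is generated on each request, a spambot that keeps following links keeps filling its database with garbage.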

Another technique could be to create a fake address containing specifically chosen information. The day this email address is used as a target of spam, the owner will be able to determine the IP used by the spammer.

 <?php
 // PHP example taken from the frenchhoneynet Web site
 // replace by your domain, add recipients filtering on your MTA (mimedefang...)
 // $_SERVER['REMOTE_ADDR'] holds the IP address of the current Web client
 echo '<a href="mailto:' . $_SERVER['REMOTE_ADDR'] . '_' . date('y-m-j')
     . '-spamming@frenchhoneynet.org" title="There is no spoon">For stupid spambots</a>';
 ?>

This script will dynamically generate a mailto: link, containing a fake email address with the IP of the current Web client and the date. For example:

 <a href=mailto:80.13.aa.bb_03-11-17-spamming@frenchhoneynet.org>...

If the Web client is a spambot, it will add 80.13.aa.bb_03-11-17-spamming@frenchhoneynet.org to its database of potential targets. Now suppose that a spammer uses this database. He will probably send an email to this virtual address.

The mail server administrator can then filter incoming emails by looking at the recipients (on the MTA, or possibly in the MUA [Mail User Agent]). If you receive an email destined to 80.13.aa.bb_03-11-17-spamming@frenchhoneynet.org, then you know that 80.13.aa.bb is the IP address that collected the address on November 17, 2003. More than that, you know that this IP address was a source of spam harvesting.

 # Example of simple recipient filtering with MIMEDefang [http://www.mimedefang.org/]
 # Will filter incoming email containing a recipient address in the form
 # of those created by the preceding PHP example.
 sub filter_recipient {
     my ($recipient, $sender, $ip, $hostname, $first, $helo) = @_;
     # \@ keeps Perl from interpolating @frenchhoneynet as an array
     if ($recipient =~ /^<?.*-spamming\@frenchhoneynet\.org>?$/i) {
         return ("REJECT", "Spamming activity");
     }
     return ("CONTINUE", "ok");
 }
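Rejecting the message is only half of the benefit; the recipient address itself can be decoded to recover who harvested it and when. A minimal sketch, assuming the local part follows the layout produced by the PHP example (IP address, underscore, two-digit year, month, day, then "-spamming"); the function name `decode_recipient` is hypothetical and the IP address used below is a documentation address.

```python
import datetime
import re

# Layout assumed from the PHP example: <ip>_<yy>-<mm>-<d>-spamming@<domain>
TAG = re.compile(
    r"^<?(?P<ip>\d{1,3}(?:\.\d{1,3}){3})_"
    r"(?P<yy>\d{2})-(?P<mm>\d{2})-(?P<dd>\d{1,2})-spamming@",
    re.IGNORECASE,
)


def decode_recipient(recipient):
    """Return (harvester_ip, harvest_date), or None if the address is not tagged."""
    match = TAG.match(recipient)
    if match is None:
        return None
    try:
        harvest_date = datetime.date(
            2000 + int(match["yy"]), int(match["mm"]), int(match["dd"])
        )
    except ValueError:
        return None   # looks like a tag but does not contain a valid date
    return match["ip"], harvest_date


print(decode_recipient("<192.0.2.10_03-11-17-spamming@frenchhoneynet.org>"))
```

Logging this pair for every rejected message gives a list of harvesting sources, together with the delay between harvest and first use.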

Though those techniques seem interesting, they will only work with stupid spambots, which are probably not the ones used by skilled spammers. The more sophisticated spammers may use open proxies to crawl the net, in which case the dynamically created email address will only reveal the proxy, and the spammer will keep his anonymity.

2.2 Honeypots and open proxies

One of the main paths used by spammers to reach mail servers goes through open proxies that accept and freely transmit requests. Those open proxies act as screens for the spammers hiding behind them.

So, would it be so difficult to set up a fake open proxy in a honeypot? No, and that's what we are going to look at.

By looking at your firewall logs, you'll probably notice attempts to access TCP ports like:

1080 socks proxy server
3128 squid proxy server
8080 web caching service
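Once the log entries are reduced to (source IP, destination port) pairs, spotting proxy hunters is a simple counting exercise. The sketch below assumes that reduction has already been done by whatever parser fits your firewall's log format; `proxy_probes` and the sample events are made up for illustration.

```python
from collections import Counter

PROXY_PORTS = {1080: "socks", 3128: "squid", 8080: "web cache"}


def proxy_probes(events):
    """Count connection attempts per (source, service) for the proxy ports.

    events: iterable of (source_ip, destination_port) pairs, already
    extracted from the firewall log.
    """
    counts = Counter()
    for source_ip, port in events:
        if port in PROXY_PORTS:
            counts[(source_ip, PROXY_PORTS[port])] += 1
    return counts


events = [
    ("198.51.100.7", 8080),
    ("198.51.100.7", 3128),
    ("198.51.100.7", 8080),
    ("203.0.113.9", 22),
]
for (source_ip, service), hits in sorted(proxy_probes(events).items()):
    print(source_ip, service, hits)
```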

Many basement-dwelling people "courageously" hiding behind their monitor, and using tools they don't understand, will scan the net to map all interesting services. Some of them share their findings in public lists of proxies on the Internet (just use Google and search for things like "open proxies list"). After connecting to a TCP port that answers, sending a few packets is enough to find out whether the proxy is open or not (will it accept the request and go anywhere?).

What if we set up some honeypots that answer positively to incoming requests? We'll be able to fool some spammers.

My favorite honeypot, made by Niels Provos, is called Honeyd [ref 9]. To create a fake relay server, simulating open proxies and an open mail relay, you could use a configuration file such as this one:

 create relay
 set relay personality "OpenBSD 2.9-stable"
 add relay tcp port 25 "sh /usr/local/share/honeyd/scripts/sendmail.sh $ipsrc $sport $ipdst $dport"
 add relay tcp port 3128 "sh /usr/local/share/honeyd/scripts/squid.sh $ipsrc $sport $ipdst $dport"
 add relay tcp port 8080 "sh /usr/local/share/honeyd/scripts/proxy.sh $ipsrc $sport $ipdst $dport"
 set relay default tcp action block
 set relay default udp action block
 bind 192.168.1.66 relay

This asks Honeyd to simulate an OpenBSD 2.9 computer with the IP address 192.168.1.66 and three open TCP ports: 25, 3128 and 8080. For each incoming connection to one of those ports, Honeyd launches the appropriate fake service (sendmail.sh, squid.sh, proxy.sh). To see what the spammers sent, those services just have to read data from STDIN. To reply to the spammers, they just have to write data to STDOUT (like a classical inetd service).
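To give an idea of what such a service script does, here is a minimal sketch of a fake SMTP dialogue written in Python rather than shell. It is not the sendmail.sh script shipped with Honeyd; the function `serve`, the banner and the reply texts are assumptions, and nothing is ever delivered.

```python
import io


def serve(reader, writer, hostname="mail.example.org"):
    """Fake just enough SMTP to keep a client talking; nothing is delivered."""

    def reply(text):
        writer.write(text + "\r\n")
        writer.flush()

    reply(f"220 {hostname} ESMTP")
    in_data = False
    for raw in reader:
        line = raw.rstrip("\r\n")
        if in_data:
            if line == ".":                      # end of the message body
                in_data = False
                reply("250 Message accepted")    # accepted, then silently dropped
            continue
        verb = line.split(" ", 1)[0].upper()
        if verb in ("HELO", "EHLO", "MAIL", "RCPT", "RSET", "NOOP"):
            reply("250 OK")
        elif verb == "DATA":
            in_data = True
            reply("354 End data with <CR><LF>.<CR><LF>")
        elif verb == "QUIT":
            reply("221 Bye")
            break
        else:
            reply("500 Command unrecognized")


# Replay a short session from memory instead of a live connection.
session = io.StringIO(
    "HELO test\r\nMAIL FROM:<a@example.org>\r\nRCPT TO:<b@example.net>\r\n"
    "DATA\r\nhello\r\n.\r\nQUIT\r\n"
)
output = io.StringIO()
serve(session, output)
print(output.getvalue())
```

Launched by Honeyd, the same function would be called as `serve(sys.stdin, sys.stdout)`, since the daemon connects the client to the script's standard input and output.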

To fool the remote spammer, we'll have to simulate part or all of the dialogue.

As an interesting proof of concept, we will look at a tool called Bubblegum Proxypot [ref 10], which is a sharp, small tool. Its only goal is to fool active spammers by simulating an open proxy. Unlike Honeyd, it cannot simulate anything else (Honeyd may be used to simulate anything you need), it cannot change its IP stack behavior, etc. Though it's a simpler tool, it quickly teaches us many things about spammers.

Depending on his skill, the spammer will either simply check that the proxy is open, or go further and check that it is working properly. Remember that the spammer's goal is to make money, so spammers cannot afford to lose much time sending thousands of emails out for nothing. On my temporary honeypots, I saw both of the above behaviors.

With Proxypot, you can choose one of three possible configurations to fool the spammers:

smtp1: the whole SMTP connection is faked.
- Pros: no outbound SMTP traffic is needed, so it will save your network bandwidth.
- Cons: this will only fool novices, and you'll have to choose the kind of SMTP server to simulate. If the spammer connects to the proxy and asks to go to a Sendmail server while you are faking a Qmail server, he may notice that it is a honeypot.
smtp2: connect to the real SMTP server, read its 220 banner and maybe issue a HELP command to find out what kind of server it is, then hang up and use that information to fake a more convincing SMTP session.
- Pros: if the spammer knows the version of the targeted email server, he will believe this is the real one, and you won't have much of a fingerprinting problem.
- Cons: this will generate outbound traffic. You have to be sure of the software used, to avoid being used as either a real spam relay or a hack relay. If the spammer targets an SMTP server he owns, for example for his first email, he will notice that the SMTP session he sees through the proxy is not the same as the one reaching his mail server.
smtp3: connect to the real SMTP server and pass through all recognized commands except DATA and EXPN. RCPT and VRFY are rate-limited.
- Pros: this is the extreme simulation and it's almost impossible to do better, because using DATA properly would deliver the email, and this is something you want to avoid.
- Cons: as with every simulator, a spammer may discover that this is not a real one, and fingerprinting possibilities will still exist.
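The smtp3 policy can be pictured as a small decision function applied to every command the spammer sends. The following sketch is not Proxypot's actual code: the class `CommandFilter`, the three verdicts and the two-second interval are assumptions, and the list of recognized commands is only a plausible subset of SMTP.

```python
import time

BLOCKED = {"DATA", "EXPN"}          # never forwarded to the real server
RATE_LIMITED = {"RCPT", "VRFY"}     # forwarded, but slowly
RECOGNIZED = {"HELO", "EHLO", "MAIL", "RCPT", "VRFY", "EXPN",
              "DATA", "RSET", "NOOP", "HELP", "QUIT"}


class CommandFilter:
    """Classify each SMTP command as 'forward', 'fake' or 'delay'."""

    def __init__(self, min_interval=2.0, clock=time.monotonic):
        self.min_interval = min_interval   # seconds between limited commands (assumed)
        self.clock = clock
        self.last_limited = None

    def decide(self, line):
        verb = line.strip().split(" ", 1)[0].upper()
        if verb not in RECOGNIZED or verb in BLOCKED:
            return "fake"                  # answer locally, send nothing out
        if verb in RATE_LIMITED:
            now = self.clock()
            too_soon = (self.last_limited is not None
                        and now - self.last_limited < self.min_interval)
            self.last_limited = now
            return "delay" if too_soon else "forward"
        return "forward"


command_filter = CommandFilter()
for command in ("HELO test", "RCPT TO:<a@example.net>",
                "RCPT TO:<b@example.net>", "DATA"):
    print(command.split(" ", 1)[0], command_filter.decide(command))
```

The "fake" verdict for DATA is what keeps the honeypot from becoming a real relay: the message body is swallowed by the proxy and never reaches the genuine mail server.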

I personally used the smtp2 option and received thousands of spam messages through it. [Continue to Part 2]

Credits

Thanks to Niels Provos for his ideas and reviewing.

About the Author
Laurent OUDOT is a computer security engineer employed by the Commissariat a l'Energie Atomique in France. In his spare time, he is a member of the Rstack team with other security addicts. Concerning honeypots, Laurent is an active member of the French Honeynet Project, which is part of the Honeynet Alliance.


References for Part 1

[ref 0] Spam food

[ref 1] Monty Python, The SPAM sketch

[ref 2] The Infamous Monty Python Spam Skit, in streaming RealVideo

[ref 3] Uri Raz, How do spammers harvest email addresses?

[ref 4] Snort Intrusion Detection System

[ref 5] Lance Spitzner, "Honeypots, tracking the hackers", 2002

[ref 6] http://diveintomark.org/archives/2003/02/26/how_to_block_spambots_ban_spybots_and_tell_unwanted_robots_to_go_to_hell

[ref 7] Wpoison, a CGI to annoy harvesters with spam bots

[ref 8] Live demo of Wpoison

[ref 9] Niels Provos, Honeyd the daemon to build honeypots

[ref 10] Proxypot, a fake proxy daemon to fool spammers


This article originally appeared on SecurityFocus.com -- reproduction in whole or in part is not allowed without expressed written consent.

Abstract

Like most advertising flyers found in postal mailboxes, millions of emails -- now classically referred to as spam -- fill email inboxes around the world everyday. Spam can be considered as the most annoying cyber-pollution that targets all of us with tons of unsolicited emails. Those emails usually contain advertisements and spammers are paid to spread as many of them as possible.

Though spam should generally not be considered a real cyber attack, it may be difficult to distinguish between virus-contaminated emails, phishing scams and bothersome ads (those containing tricky JavaScript or specific forged HTML used to track them). Moreover, spammers slow the servers receiving legitimate emails and may cause availability problems. While spammers earn money by embarrassing people, employees and netsurfers lose time by receiving unsolicited emails -- in some cases, hundreds per day. Companies may lose money too, through lost productivity, bandwidth charges, purchasing blacklists, and so on. Typical solutions against this cyber-plague may be to filter emails received by using content analysis or blacklists, and to fix poorly configured servers.

This paper will evaluate the usefulness of using honeypots to fight spammers. The first part of the article will explain some background information on spam. Then, we will try to understand how honeypots may detect, slow and stop such activities while promoting a clean Internet. Finally we will conclude with some future perspectives.

1.0 Introduction to spammers

1.1 What is Spam?

While spam is the name of a food dish containing "mystery meat" [ref 0], this is also what people call their unsolicited emails received on the Internet. The origin of the common use of this name is a Monty Python sketch [ref 1] where the word "spam" become so present that you cannot hear anything else (Vikings singing the praises of spam, waitress repeating spam [ref 2]). The idea is that if Internet users were just flooded by spam, nobody would be able to distinguish spam from normal emails. A security platoon could say with humor that the first casualty of spam is innocence.

In this paper, we will use the word spam to describe UBE (Unsolicited Bulk E-mail) and UCE (Unsolicited Commercial E-mail). Examples and logs given in this paper are inspired by real-life events, but they were modified to retain anonymity.

1.2 How spammers work

The spam is sent by spammers because it has become a paid activity of cyber mass advertisement. Spammers' work can be cut into different categories:

Harvest: build a database of targets by finding valid email addresses
Stealth and open proxies: work anonymously while sending ugly emails to their targets
Spam and open relays: find and use servers that accept to relay emails anywhere

We would need a book to describe everything to do with spam in enough detail, and the Internet is ever full of excellent resources that talk about this already, so let's focus on the important issues.

1.2.1 Email addresses get harvested

The first need for spammers is to get an updated list of targets. Many different ways exist to collect thousands and thousands of email addresses on Internet. When you send emails to UseNet, for example, your address will be available to simple, automatic programs that are looking at the headers of every message posted. By saving specific fields (From:, Reply-To:), spammers may easily build huge lists of potential targets. Another example of harvesting addresses may be through the use of poorly configured mailing lists that give out the list of its subscribers. A third technique is based again on simple, automatic programs, this time ones crawling Web pages on Internet. For each HTML Web page found, such a program will check for a mailto: link ("send me an email by clicking here") and will follow the Web links proposed to continue this sort of evil seeking.

Figure 1: Harvesting email addresses

You may also want to read [ref 3] other documents to get more detailed explanations about the harvesting of email addresses.

1.2.2 Open proxies

Spammers may either directly connect to a remote mail server, or bounce through open proxies. For example, the role of a Web proxy is to do the job of a Web client for someone else. When a Web client connects to a proxy, he asks for a Web page somewhere on Internet. The proxy will then grab this Web page by itself, and will return the obtained data to the client. In the logs of the remote Web server, usually we can only see the IP address of the proxy who did the Web requests.

An open proxy is a proxy service opened to the world for almost any kind of request, allowing anybody to remain anonymous while crawling the net. Such proxies are used a lot in the underground: blackhat people, warez people, etc. Open proxies are also useful for many spammers, because they will be able to stay anonymous while sending their unwanted emails.

Here is an example of a TCP session recorded by snort [ref 4], showing a remote proxy check probably launched by Earthlink. The client connects to the proxy on TCP port 8080, and doesn't ask for a Web page but instead for a TCP session initialized with a remote SMTP server (207.69.200.120) owing to the HTTP CONNECT function. The rest of this TCP session is SMTP, directly sent to the SMTP server (HELO, MAIL FROM, RCPT TO, DATA, QUIT).

 $ cat /var/log/snort/192.168.1.66/SESSION\:8080-4072 CONNECT 207.69.200.120:25 HTTP/1.0 HELO [217.128.a.b] MAIL FROM:<openrelay@abuse.earthlink.net> RCPT TO:<spaminator@abuse.earthlink.net> DATA Message-ID: <36af800461754252ab1107386a9cd8eb@openrelay@abuse.earthlink.net> To: <spaminator@abuse.earthlink.net> Subject: Open HTTP CONNECT Proxy X-Mailer: Proxycheck v0.45 This is a test of third-party relay by open proxy. These tests are conducted by the EarthLink Abuse Department. EarthLink, by policy, blocks such systems as they are discovered. Proxycheck-Type: http Proxycheck-Address: 217.128.a.b 36af800461754252ab1107386a9cd8eb Proxycheck-Port: 8080 Proxycheck-Protocol: HTTP CONNECT This test was performed with the proxycheck program. For further information see <http://www.corpit.ru/mjt/proxycheck.html/> . QUIT

Using a proxy server is quite efficient for a spammer to have anonymity. As proxy owners may have logs, spammers may fear that their IP address could be recorded (remote proxy log). Usually, spammers bet that badly configured proxies don't have logs. Their fear of logs is why sometimes they use chains of proxies to increase their luck -- they connect to an open proxy server (TCP Session), then ask it to connect to another known open proxy server (CONNECT a.b.c.d:3128), etc. For example:

Figure 2: Open relays and spammers

The longer the chain, the stealthier they become, but they will lose time as multiple bounces will result in multiple delays added.

1.2.3 Open relays

An open relay (which is sometimes called an insecure relay or a third-party relay) is a Mail Transfer Agent (MTA) that accepts third-party relays of e-mail messages even though they are not destined for its domain. As they forward emails that are neither to nor from a local user, open relays are used by spammers to route large volumes of unsolicited emails.

Such a poorly configured MTA lends its system and network resources to the remote abuser who is getting paid to send out spam. Usually, an organization that unwittingly relays spam may be blacklisted on international lists (RBL, etc). That would annoy internal users because they couldn't use their own email properly. A big ISP sadly blacklisted would probably lose clients and money.

2.0 Honeypots versus spammers

To quote the leader of the Honeynet Project, Lance Spitzner [ref 5], a honeypot is an information system resource whose value lies in unauthorized or illicit use of that resource.

In this chapter, we will see if it's possible to use honeypots technologies in the following cases:

when spammers come to your Web site to steal email addresses and transform them into future targets;
when spammers try to connect to your proxy servers and try to bounce elsewhere by abusing your services;
when spammers inject SMTP traffic to your email servers in order to send unsolicited emails through you.

2.1 Honeypots and harvesting

One of the first phases of a spammer is the harvesting of email addresses. Here we will focus on the harvesting through Web pages, which may be the easiest case to solve for those trying to defend against spam. Without saying that honeypots can fool spammers during this phase, there are some efficient techniques that don't exactly correspond to the definition of classical honeypots. This is the concept: while spammers browse the Web, if they read Web pages with fake email addresses, they will feed their database with invalid targets. Purists may say that this is not exactly a honeypot, so let's say it's like adding one spoon of honey on your Web pages.

During automatic harvesting of valid email addresses on the Web, spammers may sometimes be recognized because of the tools they use by checking the User-Agent field sent by their browser [ref 6]. Some people have decided to either block a specific User-Agent known to be used by spammers, or transparently redirect those Web clients to fake Web pages containing tons of fake email addresses. The trouble is that it's very easy for spammers to change the User-Agent. So those same people defending against spam then decided to create Web links on their pages that would be invisible for a human reader (e.g. white characters on a white background) but visible for a spambot following every link read in the HTML source. Such a Web page waiting for Spam bots will dynamically create fake email addresses to fool the spammers.

One idea could be to create tons of fake addresses. There is a quite good example of a piece of freeware called Wpoison [ref 7]. This CGI script added to your Web site will generate fake email addresses looking like real ones. A live demo can be tried on this Web site [ref 8].

Another technique could be to create a fake address containing specifically chosen information. The day this email address is used as a target of spam, the owner will be able to determine the IP used by the spammer.

 <? // PHP example taken from the frenchhoneynet Web site // replace by your domain, add recipients filtering on your MTA (mimedefang...) echo '<a href="mailto:'.$REMOTE_ADDR.'_'.date('y-m-j').'-spamming@frenchhoneynet.org"  title="There is no spoon">For stupid spambots'; ?>

This script will dynamically generate a mailto: link, containing a fake email address with the IP of the current Web client and the date. For example:

 <a href=mailto:80.13.aa.bb_03-11-17-spamming@frenchhoneynet.org>...

If the Web client is a spambot, it will add 80.13.aa.bb_03-11-17-spamming@frenchhoneynet.org in the database of potential targets. Now we suppose that a spammer uses this database. He will probably send an email to this virtual address.

Then the mail server administrator can filter incoming emails by looking at the recipients (on your MTA or eventually on your MUA [Mail User Agent]). If you receive an email destined to 80.13.aa.bb_03-11-17-spamming@frenchhoneynet.org, then you surely know that 80.13.aa.bb is the IP address that was used on November 17, 2003. And more than that, you know that this address was a spam harvesting source.

 # Example of a simple recipient filtering with Mimedefang http://www.mimedefang.org/] # Will filter incoming email containing a recipient address in the form # of those created by the latter PHP example. sub filter_recipient { 	my ($recipient, $sender, $ip, $hostname, $first, $helo) = @_; 	if($recipient =~ /^<.*-spamming@frenchhoneynet\.org>?$/i) 	{ return ("REJECT", "Spamming activity"); } 	return ("CONTINUE", "ok"); }

Though those techniques seem to be interesting, they will only work with stupid spambots, ones which are probably not used by skilled spammers. The more sophisticated spammers may use open proxies to crawl the net, and the dynamically created email address will just help with finding such proxies and the spammer will keep his anonymity.

2.2 Honeypots and open proxies

One of the main paths used by spammers to reach mail servers is going through open proxies that accept and freely transmit requests. Those open proxies play the role of screeners for the spammers that hide beyond them.

So, would it be so difficult to set up a fake open proxy in a honeypot ? No, and that's what were are going to look at.

By looking at your firewalls logs, you'll probably notice attempts to access TCP ports like :

1080 socks proxy server
3128 squid proxy server
8080 web caching service

Many basement-dwelling people "courageously" hiding behind their monitor, and using tools they don't understand, will scan the net to map all interesting services. Some of them share their information in public lists of proxies on the Internet (just use Google and search for things like "open proxies list"). By connecting to the answering TCP ports, sending a few packets may help to understand if the proxy is open or not (will it accept and go anywhere?).

What if we setup some honeypots that will answer positively to incoming requests? We'll be able to fool some spammers.

My favorite honeypot, made by Niels Provos, is called Honeyd [ref 9]. To create a fake relay server, simulating open proxies and an open mail relay, you could use such a configuration file :

 create relay  set relay personality "OpenBSD 2.9-stable" add relay tcp port 25 "sh /usr/local/share/honeyd/scripts/sendmail.sh $ipsrc $sport $ipdst $dport" add relay tcp port 3128 "sh /usr/local/share/honeyd/scripts/squid.sh $ipsrc $sport $ipdst $dport" add relay tcp port 8080 "sh /usr/local/share/honeyd/scripts/proxy.sh $ipsrc $sport $ipdst $dport" set relay default tcp action block set relay default udp action block bind 192.168.1.66 relay

This will ask Honeyd to simulate an OpenBSD 2.9 computer with the IP 192.168.1.66 and three open TCP ports: 25, 3128 and 8080. For each incoming request to those ports, Honeyd will launch the appropriate fake service (sendmail.sh, squid.sh, proxy.sh). If those services want to see what was sent by spammers, they just have to read data from STDIN. To reply to the spammers, they just have to write data to STDOUT (like a classic inetd-launched process).
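The STDIN/STDOUT convention is easy to try out. The sketch below is a hypothetical Python stand-in for such a service, not a copy of any script shipped with Honeyd: it prints an SMTP banner, answers every command with a canned reply, and keeps what the client sent. The banner text and reply strings are assumptions chosen for illustration.

```python
import io

# Hypothetical banner; a real deployment would mimic a specific mail server.
BANNER = "220 relay.example.com ESMTP\r\n"


def serve(inp, out):
    """Run a fake SMTP dialogue; return the list of lines the client sent."""
    seen = []
    out.write(BANNER)
    out.flush()
    in_data = False
    for raw in inp:
        line = raw.rstrip("\r\n")
        seen.append(line)
        if in_data:
            # Swallow the message body until the lone "." terminator.
            if line == ".":
                in_data = False
                out.write("250 OK: queued\r\n")
        else:
            verb = line.split(" ", 1)[0].upper()
            if verb == "QUIT":
                out.write("221 Bye\r\n")
                out.flush()
                break
            if verb == "DATA":
                in_data = True
                out.write("354 End data with <CR><LF>.<CR><LF>\r\n")
            else:
                out.write("250 OK\r\n")
        out.flush()
    return seen


# Simulated session: in-memory streams stand in for STDIN and STDOUT.
client = io.StringIO("HELO spammer\r\nMAIL FROM:<a@example.com>\r\nQUIT\r\n")
replies = io.StringIO()
print(serve(client, replies))
print(replies.getvalue())
```

Launched by an inetd-style parent, the same function would be called as serve(sys.stdin, sys.stdout). Nothing is delivered anywhere; the message is only captured.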

To fool the remote spammer, we'll have to simulate part or all of the discussion.

As an interesting proof of concept, we will look at a small, sharp tool called Bubblegum Proxypot [ref 10]. Its only goal is to fool active spammers by simulating an open proxy. Unlike Honeyd, it cannot simulate anything else (Honeyd may be used to simulate anything you need), it cannot change its IP stack behavior, and so on. Though it's a simpler tool, it lets us quickly learn many things from spammers.

Depending on his skill, the spammer will either simply check that the proxy is open, or perhaps try to see if it is working properly. Remember that the spammer's goal is to make money. Thus spammers cannot afford to lose much time sending thousands of emails out for nothing. On my temporary honeypots, I saw both of the above behaviors.

With Proxypot, you can choose one of three possible configurations to fool the spammers:

smtp1: the whole SMTP connection is faked.
- Pros: no SMTP outbound traffic is needed, so it will save your network bandwidth.
- Cons: this will only fool novices and you'll have to choose the kind of SMTP server to simulate. If the spammer connects to the proxy and asks to go to a Sendmail server while you are faking a Qmail server, he may notice that it is a honeypot.
smtp2: connect to the real SMTP server, read its 220 banner and maybe issue a HELP command to find out what kind of server it is, then hang up and use that information to fake a more convincing SMTP session.
- Pros: if the spammer knows the version of the targeted email server, he will believe this is the real one and you won't have much of a fingerprinting problem.
- Cons: this will generate outbound traffic. You have to be sure of the software used, to avoid being used as either a real spam relay or a hack relay. If the spammer targets an SMTP server he owns, for example for his first email, he will notice that the SMTP session he sees through the proxy is not the same as the one reaching his mail server.
smtp3: connect to the real SMTP server and pass through all recognized commands except DATA and EXPN. RCPT and VRFY are rate-limited.
- Pros: this is the extreme simulation and it's almost impossible to do better, because using DATA properly would deliver the email and this is something you want to avoid.
- Cons: as with every simulator, a spammer may discover that this is not a real server, and fingerprinting possibilities will still exist.
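The smtp3 rule set can be pictured as a small per-command decision. The sketch below is not Proxypot's code: the set of "recognized" commands, the limit of five RCPT/VRFY commands per minute, and all names are assumptions made for illustration. Only the rules themselves (never pass DATA or EXPN, rate-limit RCPT and VRFY, pass the rest) come from the description above.

```python
# Assumed set of commands the proxy "recognizes".
RECOGNIZED = {"HELO", "EHLO", "MAIL", "RCPT", "VRFY", "RSET",
              "NOOP", "HELP", "QUIT", "DATA", "EXPN"}
NEVER_FORWARDED = {"DATA", "EXPN"}   # answered locally, never sent on
RATE_LIMITED = {"RCPT", "VRFY"}      # forwarded, but only so often


class RateLimiter:
    """Allow at most `limit` events per `window` seconds (hypothetical values)."""

    def __init__(self, limit=5, window=60.0):
        self.limit = limit
        self.window = window
        self.stamps = []

    def allow(self, now):
        # Forget events that have fallen out of the sliding window.
        self.stamps = [t for t in self.stamps if now - t < self.window]
        if len(self.stamps) >= self.limit:
            return False
        self.stamps.append(now)
        return True


def decide(line, limiter, now):
    """Classify one client line as 'forward', 'fake', 'throttle' or 'reject'."""
    verb = line.strip().split(" ", 1)[0].upper()
    if verb not in RECOGNIZED:
        return "reject"
    if verb in NEVER_FORWARDED:
        return "fake"
    if verb in RATE_LIMITED and not limiter.allow(now):
        return "throttle"
    return "forward"


limiter = RateLimiter(limit=2, window=60.0)
for t, cmd in enumerate(["HELO x", "RCPT TO:<a@example.com>",
                         "RCPT TO:<b@example.com>", "VRFY c", "DATA"]):
    print(cmd, "->", decide(cmd, limiter, float(t)))
```

Passing the timestamp in explicitly keeps the limiter deterministic, which makes the policy easy to check without waiting for real time to elapse.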

I personally used the smtp2 option and received thousands of spam messages through it. [Continue to Part 2]

Credits

Thanks to Niels Provos for his ideas and reviewing.

About the Author
Laurent OUDOT is a computer security engineer employed by the Commissariat a l'Energie Atomique in France. In his spare time, he is a member of the Rstack team with other security addicts. Concerning honeypots, Laurent is an active member of the French Honeynet Project, which is part of the Honeynet Alliance.

References for Part 1

[ref 0] Spam food

[ref 1] Monty Python, The SPAM sketch

[ref 2] The Infamous Monty Python Spam Skit, in streaming RealVideo

[ref 3] Uri Raz, How do spammers harvest email addresses?

[ref 4] Snort Intrusion Detection System

[ref 5] Lance Spitzner, "Honeypots, tracking the hackers", 2002

[ref 6] http://diveintomark.org/archives/2003/02/26/how_to_block_spambots_ban_spybots_and_tell_unwanted_robots_to_go_to_hell

[ref 7] Wpoison, a CGI to annoy harvesters with spam bots

[ref 8] Live demo of Wpoison

[ref 9] Niels Provos, Honeyd the daemon to build honeypots