Spam by Category
Background
Spam is created in a variety of styles and complexities. Some spam is plain text with a URL; some is cluttered with images and/or attachments. Some contains very little text, perhaps only a URL. And, of course, spam is distributed in a variety of languages. It is also common for spam to contain “Bayes poison” (random text that has been haphazardly scraped from websites and added to messages to “pollute” the spam with words bearing no relation to the intent of the spam message itself). Bayes poison is used to thwart spam filters that typically try to identify spam based on a database of words that are frequently repeated in spam messages.
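The effect of Bayes poison on a word-based filter can be illustrated with a minimal sketch. It assumes a naive Bayes style filter that combines per-word spam probabilities in log-odds space; the word table, its probability values, and the function name `spam_score` are hypothetical and chosen only for illustration, not taken from any real filter.

```python
import math

# Hypothetical per-word spam probabilities P(spam | word).
# These values are illustrative assumptions, not measured data.
WORD_SPAM_PROB = {
    "pills": 0.95,
    "discount": 0.85,
    "meeting": 0.10,
    "garden": 0.20,
    "weather": 0.20,
    "invoice": 0.15,
}

def spam_score(text, default=0.5):
    """Combine per-word probabilities in log-odds space (naive Bayes style).

    Words missing from the table get a neutral probability of 0.5,
    which contributes nothing to the score.
    """
    log_odds = 0.0
    for word in text.lower().split():
        p = WORD_SPAM_PROB.get(word, default)
        log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))

plain = "discount pills"
# The same message padded with unrelated words that rarely occur in spam.
poisoned = plain + " meeting garden weather invoice"

print(round(spam_score(plain), 3))
print(round(spam_score(poisoned), 3))
```

In this sketch the padded message scores well below the unpadded one, because each unrelated word pulls the combined odds toward "not spam". Real filters use larger vocabularies and additional signals, so the size of the effect will differ.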
Any automated process to classify spam into one of the categories listed below would need to overcome this randomness issue. For example, the word “watch” may appear in the random text included in a pharmaceutical spam message, making it hard to decide whether the message belongs in the pharmaceutical category or in the watches/jewelry category. Another challenge occurs when a pharmaceutical spam message contains no obvious pharmaceutical-related words, but only an image and a URL.
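Both challenges can be reproduced with a simple keyword matcher. The following is a minimal sketch, assuming a classifier that assigns a category whenever any of its keywords appears in the message; the keyword sets, the sample messages, and the function name `candidate_categories` are hypothetical.

```python
import re

# Hypothetical keyword sets for two categories (illustrative only).
CATEGORY_KEYWORDS = {
    "Pharmaceutical": {"pills", "pharmacy", "prescription"},
    "Watches/Jewelry": {"watch", "watches", "jewelry"},
}

def candidate_categories(text):
    """Return every category whose keywords appear in the message text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(cat for cat, kws in CATEGORY_KEYWORDS.items() if words & kws)

# Pharmaceutical spam padded with random text that happens to contain "watch".
poisoned = "Cheap pills online. Watch the rain over the hills."
# Spam that carries only a link to an image, with no descriptive words.
url_only = "http://example.invalid/offer.png"

print(candidate_categories(poisoned))  # ['Pharmaceutical', 'Watches/Jewelry']
print(candidate_categories(url_only))  # []
```

The first message matches two categories and the second matches none, so a keyword-only approach cannot settle either case without further context.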
Spammers attempt to get their messages through to the recipients without revealing too many clues that the message is spam. Any such clues found in the plain text content of the email can be examined using automated anti-spam techniques. A common way to overcome automated techniques is by using random text, but an equally effective way is to include very little in the way of extra text in the spam and to instead include a URL in the body of the message.
Spam detection services often resist classifying spam into different categories because it is difficult to do (for the reasons above) and because the purpose of spam detection is usually to determine whether the message is spam and to block it, rather than to identify its subject matter. The most accurate way to overcome the ambiguity that automated techniques face is to have someone classify unknown spam manually. This process is time-consuming, but an analyst can read the message, understand the context of the email, view images, follow URLs, and view websites in order to build a fuller picture of the spam message.
Methodology
Once per month, several thousand random spam samples are collected and classified by Symantec.cloud into one of the following categories:
- Diet/Weight Loss
- Jobs/Money Mules
- Mobile Phones
- Unsolicited Newsletters
NB. These percentages represent the overall average of the monthly percentages for each category during the year, and as such the overall total for all categories will not equate to 100%.
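The averaging described in the note above can be sketched as follows. This is a minimal illustration, assuming each month's sample is a list of manually assigned category labels; the sample data and the function names `monthly_percentages` and `yearly_average` are hypothetical and do not reflect the actual Symantec.cloud process.

```python
from collections import Counter

def monthly_percentages(labels):
    """Percentage of one month's classified samples falling in each category."""
    counts = Counter(labels)
    total = len(labels)
    return {cat: 100.0 * n / total for cat, n in counts.items()}

def yearly_average(months):
    """Average each category's monthly percentage, weighting every month equally."""
    categories = {cat for month in months for cat in month}
    return {
        cat: sum(month.get(cat, 0.0) for month in months) / len(months)
        for cat in categories
    }

# Hypothetical labelled samples for two months (tiny, for illustration only).
jan = monthly_percentages(["Pharmaceutical"] * 3 + ["Mobile Phones"] * 1)
feb = monthly_percentages(["Pharmaceutical"] * 1 + ["Mobile Phones"] * 1)

avg = yearly_average([jan, feb])
print(avg["Pharmaceutical"])  # 62.5
print(avg["Mobile Phones"])   # 37.5
```

Averaging monthly percentages gives each month equal weight regardless of how many samples it contained, so the result can differ from the percentage computed over all samples pooled together (here 4 of 6 samples, about 66.7 percent, are pharmaceutical).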
- Pharmaceutical products still dominate, although to a lesser extent than in previous years. Approximately two fifths (39.6%) of all spam in 2011 was related to pharmaceutical products, a fall of 34.4 percentage points compared with 2010. This was in large part a result of the disruption of the Rustock botnet, as pharmaceutical spam accounted for the majority of Rustock’s spam output.
- The disruption of the Rustock botnet in March 2011 had a major impact on the decline in pharmaceutical spam, although other botnets were also involved in distributing pharmaceutical spam in 2011, including Grum, Cutwail, and Donbot.
- A category with a low percentage still means millions of spam messages. Although it is difficult to be certain what the true volume of spam in circulation is at any given time, Symantec estimates that approximately 42.1 billion spam emails were sent globally each day in 2011. A category that represents 0.5 percent of spam therefore equates to more than 210 million spam emails in a single day.
- Spam related to Watches/Jewelry, Sexual/Dating, Casino/Gambling, Unsolicited Newsletters and Scams/Fraud all increased. Particularly notable is the increase in the Sexual/Dating category, which rose by 12.1 percentage points compared with 2010. These are often email messages inviting the recipient to connect to the scammer through instant messaging, or through a URL hyperlink, after which they are typically invited to a pay-per-view adult-content webcam site. Often any IM conversation would be handled by a bot responder, or by a person working in a low-pay, offshore call center.
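The volume figures quoted in the bullets above can be checked with a short calculation. This sketch uses only the estimates stated in the text (42.1 billion spam emails per day, a 0.5 percent category share, and the 39.6 percent pharmaceutical share); the function name `daily_volume` is a hypothetical helper, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Symantec's estimate of global spam emails sent per day in 2011 (from the text).
SPAM_PER_DAY = 42_100_000_000

def daily_volume(share_percent, total=SPAM_PER_DAY):
    """Daily message count for a category with the given percentage share.

    share_percent is passed as a string so it converts to an exact fraction.
    """
    return int(Fraction(share_percent) * total / 100)

print(daily_volume("0.5"))   # 210500000
print(daily_volume("39.6"))  # 16671600000
```

A 0.5 percent share corresponds to about 210.5 million messages per day, consistent with the "more than 210 million" figure, while the 39.6 percent pharmaceutical share corresponds to roughly 16.7 billion messages per day under the same estimate.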