Not all Reputation Technologies are Created Equal
Have you ever noticed how movies tend to come in waves? A few years ago it seemed like every action movie had a space theme; then the following year the big new movies featured some kind of natural disaster. This past summer it seemed like every other movie was in 3-D. Technology, as we all know, has waves too, and the security industry is no different. For example, recently there has been a lot of talk about reputation-based security and suddenly it seems like every vendor is claiming to have some type of reputation technology. But, not all technologies are created equal, so I thought I’d take a few minutes to look at what makes Symantec’s reputation-based technology so very different.
Why is a new approach needed?
Two fairly recent trends have had a negative impact on the effectiveness of traditional approaches to security. First, many of today’s threats are highly polymorphic: they hide easily because nearly every instance of the threat is slightly different from its predecessor. Second, most threats today are delivered via Web-based attacks. The Web makes it easy to distribute polymorphic threat variants widely, and together these two techniques push the limits of conventional defenses. To put this in perspective, consider that Symantec has gone from generating dozens of new virus definitions per day ten years ago to generating an average of more than 15,000 new virus definitions each day so far in 2009. Other security vendors are no different. Not only has the volume of signatures become a potential resource drain on the client, but each individual signature also contributes far less to a user’s overall protection. Clearly something better is needed.
What does Symantec mean by Reputation-based security?
More than three years in the making, Symantec’s reputation-based technology uses “the wisdom of crowds” connected to cloud-based intelligence to identify malware in an entirely new way beyond traditional signatures and behavior-based detection. To do this, reputation-based technology gathers application data for both potentially good and bad programs from multiple sources, including:
• Anonymous data contributed by tens of millions of Norton Community Watch members
• Data provided by software publishers
• Anonymous data contributed by enterprise customers in a data collection program tailored to large enterprises
The data we collect is continually imported and fed into a reputation engine, where dozens of attributes for each file, such as file age, file download source, digital signature, and file prevalence, are combined using a statistical reputation algorithm to determine a file’s safety reputation. This allows Symantec to produce a security reputation rating for every software file ever encountered by every participating Symantec user, all without ever having to scan the file itself.
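To make the idea of combining attributes concrete, here is a minimal sketch in Python. It is not Symantec’s algorithm, which is not described here. It assumes a simple logistic score over four of the attributes named above, and the `FileAttributes` type, the `reputation_score` function, and every weight are hypothetical values chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class FileAttributes:
    """Hypothetical per-file attributes, named after those mentioned in the text."""
    age_days: float          # days since the file was first reported
    prevalence: int          # number of participating machines reporting the file
    is_signed: bool          # whether the file carries a valid digital signature
    source_is_trusted: bool  # whether the download source has a good track record


def reputation_score(f: FileAttributes) -> float:
    """Return a value in (0, 1); higher means more likely to be safe.

    The weights are illustrative assumptions, not values from any real product.
    """
    z = -3.0                               # start out suspicious of unknown files
    z += 0.8 * math.log1p(f.age_days)      # older files earn more trust
    z += 0.6 * math.log1p(f.prevalence)    # widely seen files earn more trust
    z += 1.5 if f.is_signed else 0.0
    z += 1.0 if f.source_is_trusted else 0.0
    return 1.0 / (1.0 + math.exp(-z))      # squash the sum into (0, 1)


# A long-established, widely used, signed application.
established = FileAttributes(age_days=900, prevalence=2_000_000,
                             is_signed=True, source_is_trusted=True)

# A brand new, unsigned file seen on only three machines.
brand_new = FileAttributes(age_days=0, prevalence=3,
                           is_signed=False, source_is_trusted=False)

print(f"established: {reputation_score(established):.3f}")
print(f"brand new:   {reputation_score(brand_new):.3f}")
```

Note that the score is computed entirely from metadata about the file. The file’s contents are never inspected, which is the property the paragraph above describes.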
So how is this different from what competitors are offering?
Many other vendors are currently talking about reputation as a novel approach, but on closer examination, it doesn’t add up to much. In fact, it doesn’t look like they’re really talking about reputation at all, and it certainly doesn’t look like they’ve addressed the real problem of those polymorphic malicious files. We commonly see two techniques being talked about a lot in the same sentence as reputation: “The cloud” and “website reputation.” Let’s look at each for a moment and consider why they don’t really address the problem.
The cloud. Other vendors talk about shifting security “into the cloud,” but they typically mean nothing more than hosted malware signatures. Instead of transporting all their signatures down to each individual user’s machine, security vendors host them in the cloud. But there’s a problem. To write a traditional signature, a security vendor needs to see the original piece of malware. Thanks to server-side polymorphic techniques, however, malware today typically spreads in very low numbers, meaning that most malware variants will never be discovered by these vendors. (Our studies have shown that the majority of new malware variants only ever exist on fewer than 50 machines worldwide.) And if the vendor can’t discover and obtain a copy of a malware sample, they obviously can’t analyze it and write a signature for that file. So, while users may have to download fewer virus signatures to their computers, their overall protection is still limited to the samples the security vendor has actually discovered, and that has not changed. Here the cloud doesn’t help much: it speeds up access to signatures, but it doesn’t provide any new security capabilities.
Website reputation. So what about looking at the source of a software file, that is, the website it came from? Perhaps that can tell us something? When some vendors talk about reputation, they’ve taken just this approach. They’ve built systems that examine a site’s track record of serving up either good or bad files. Here again, though, there’s a problem. While it is true that many websites are set up solely to deliver malicious files to unsuspecting users, attackers today are making a fundamental shift toward using legitimate sites as the delivery mechanism for new malware. This is problematic for website reputation systems because no security vendor wants to tarnish a reputable, legitimate website with a bad reputation, and so the proliferation of malware continues.
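The limitation of hosted signatures described under “the cloud” above can be sketched in a few lines of Python. This is a deliberate simplification: it assumes a signature is nothing more than a SHA-256 hash of the file, whereas real signatures are more sophisticated. The payload bytes, the `cloud_signatures` set, and the function names are hypothetical.

```python
import hashlib


def signature(payload: bytes) -> str:
    """Simplified stand-in for a malware signature: a SHA-256 hash of the file."""
    return hashlib.sha256(payload).hexdigest()


# The one sample the vendor has actually obtained and analyzed (hypothetical bytes).
known_sample = b"malicious-payload" + bytes([0])

# Signatures hosted "in the cloud" rather than on each user's machine.
cloud_signatures = {signature(known_sample)}


def is_known_malware(payload: bytes) -> bool:
    """Look the file up in the hosted signature set."""
    return signature(payload) in cloud_signatures


# Server-side polymorphism: each download differs from the last by a single byte.
variants = [b"malicious-payload" + bytes([i]) for i in range(1, 6)]

print(is_known_malware(known_sample))            # the analyzed sample is caught
print([is_known_malware(v) for v in variants])   # the unseen variants are not
```

Moving the lookup to the cloud changes where the signature set lives, not what it contains. A variant the vendor has never seen still produces a signature that is not in the set.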
Reputation-based technology is more than just another technology wave.
Reputation-based, predictive technology solves many of the problems of today’s security landscape by creating an environment where Symantec never actually has to examine and analyze a malicious file in order to protect users from it. This tackles the polymorphic threat problem head-on. Reputation-based technology is able to predict the likelihood of a brand new, never-before-seen file being either good or bad, simply by looking at its attributes. This greatly increases the speed at which a file can be assessed and makes it a much more robust, long-term solution for today’s climate of micro-distributed malware.
We’ve been working on this brand new approach for the last three years. Data collection began in mid-2007 as part of our then-new, voluntary Norton Community Watch program. We first started to use this data in mid-2008 in our Norton 2009 products, in which a feature known as Norton Insight used the data to construct whitelists of known good applications (such applications don’t need to be scanned repeatedly on our customers’ machines, which leads to much faster security scanning). Just a month ago we reached a new milestone with the launch of our 2010 Norton products, which use reputation-based technology to block totally new and never-before-seen malicious files. We believe this technology is not just part of another wave, but rather defines an entirely new way of protecting our customers’ machines.