by Jon Gary
This article is about breaking things. We discuss testing methodologies for your network. Even if you don't test your own network, be rest assured that someone is doing it for you. Every minute of every day malicious programs are pitted against commercial network applications. While this war rages on, throw in a few battalions of idiotic users - not malicious, but certainly not doing what you expect either.
Regardless of what you think, IDS systems are not perfect, nor will they ever be. Don't get us wrong, it is better to have an IDS system than to not have one - it lowers your risk. But, as an IS Engineer or a Software Engineer you should be aware of the limitations of any IDS system you build or deploy.
Furthermore, you should be aware of the limitations in your network application(s). Your "network application" is all of the software that collectively runs your network services - and you'd be surprised at some of the places where it can break. Not just denial of service attacks, but also buffer overflows and bad reactions to unexpected input. Software testing is the only clear solution - without awareness of the weak points, how can you possibly mitigate risk?
"Bad IDS, no doughnut"
Here in the lab, we test a lot of software. A good example is network-based IDS. In the lab, we have built software to test and exploit a number of IDS de-synch problems. We have twisted packets into knots so thick that no IDS on the planet could possibly hack through them. History has shown that IDS Systems can be 'de-synchronized'. In short, 'de-synchronized' means the IDS stops working. The IDS fails to see the attack. A few years ago, a commonly used technique was to fragment your packet streams. This worked well, but today most IDS software can deal with this simple obfuscation.
After an easy to use kernel module became available a few years ago - attackers have been able to automatically 'de-synchronize' all of their TCP/IP traffic. Not until it was a widespread phenomenon did IDS vendors scramble to fix their products.
Testing and validation is very important - and this doesn't mean running a set of known signatures past an IDS and seeing if it fires. We will discuss several basic methodologies for testing any network system, not just IDS.
Where IDS and Content-Filters will fail
Using IDS as an example, in the best case, a company could purchase one of every commercial IDS and they would have the time to correlate the responses from all of them. Of course - they might have to deal with more false positives, increasing the time investment. Or, they could outsource this mundane task to a third-party monitoring company.
There are many solutions deployed for Intrusion Detection, from simple log watching to elaborate host-based syscall analyzers. The largest industry seems to be network-based signature sniffers - and these are popular because they are so simple to deploy - no host-based issues, no integration - it just plugs into a promiscuous port and starts singing along. It's a 'point solution' - which means low cost of installation. It sounds good to our boss when we tell them we have deployed the latest and greatest IDS - it has good CYA factor. But - there is a looming cloud on the horizon for both Network IDS and Firewalls.
The fact is that more and more traffic is encrypted every day. IDS systems are going to be hard pressed to keep up - more processing to decrypt traffic - no longer a point solution because the IDS needs to use keys to decrypt traffic. The IDS becomes part of your infrastructure at that point - no longer just 'an extra bolt in the rack'.
Firewalls are effected by a slightly different phenomenon - the fact that software vendors have simply had it. Application developers hate firewalls. The solution? Move all of your application traffic to port 80. Look at Microsoft: RPC/DCOM represents a very important direction for Microsoft - so much in fact that Microsoft has now made DCOM available over an HTTP interface. Firewalls are daily becoming more and more useless as application vendors account for port filtering by simply moving their services onto port 80 - a port that surely will work over most firewalls.
Firewalls that filter ports are simply rendered useless, leaving the firewall vendors in the position of providing 'content-based' security - in other words - filtering application level requests. Vendors in this space have a huge obstacle to overcome if their filtering and intrusion detection is based on signatures. The crux of the problem is that content-filters have all the same problems that network IDS systems have had in the past - they can be de-synchronized. This is a classic problem which started with virus detection and has never rested since.
History has already shown us that IDS systems fall prey to a complexity problem - that is, TCP/IP is HARD. It's hard to follow, it's hard to reconstruct - and most of all - it's hard to predict. TCP/IP is on everything - and vendors are dumping all sorts of proprietary stacks on their devices and software. All this means is more chance that a network-based IDS can be de-synced from it's target.
The Testing Strategy
Idea #1: Two Systems can never be exact copies of one another.
'De-synching' an IDS system is an old idea - first introduced many years ago. For many years hackers and security professionals have been aware of this potential problem with network IDS. It goes something like this--> Attacker inserts extra characters into a transaction - characters which are invalid and will not be processed by the target. However, due to the difficulties in reconstructing the TCP/IP data - the network IDS accepts the characters and therefore fails to see what is really going on.
What does it mean to insert or inject into an IDS? How about commands that indicate the reset of a connection? Or, characters that obfuscate the actual transaction - converting an uber-elite 'GET /cgi-bin' into something more like a 'GET /cgi-bleatin' request (you think I'm kidding don't you?). Additional data can cause the header of a packet to appear corrupt, or render an application layer message impossible to decode.
In the same vein, the attacker can cause the IDS system to miss packets that are valid and processed normally by the target. The IDS doesn't fire an alarm because the IDS simply didn't get the packet. The most obvious example of this sort of attack is the leverage of IP Fragmentation and Overlapping TCP Segments. A few years ago, this method was highly effective against network IDS systems. Even today, these old school techniques can be effective.
Now stop and think for a minute - the stranger the packets that we send, the more an IDS system has to work to reconstruct them. For every exception - the 'perfect' IDS has to make the assumption that a packet could have been dropped by the target. The IDS must maintain two states for every packet - accepted and dropped. For every packet, this doubles the number of states that the IDS must follow. For every duplicate or replay packet, the IDS must account for 1/2 again the current load. For an IDS of this nature to even watch for a 10 character signature would require a great deal of memory. Assuming single byte packets, the IDS would need 10K of memory just to watch this one session. If my signature is 11 bytes long, the figure doubles to 20K - exponentially increasing in size for each byte thereafter! This is compounded by the fact that the smaller the packets are, the more possible permutations of state. As we halve the size of each packet, we double the required size of the IDS buffers. Clearly this is impossible, considering both that the signatures are most likely larger than 10 characters, and that there are close to 80 web hits a seconds going over our network. What's the point? There is no such thing as a 'perfect' IDS.
Idea #2: Bury it deep enough and no one will find the treasure.
Take all of these facts one step further. Get crazy and start embedding multiple layers of de-synchronization within one another! For example, a malicious individual could send a malformed cgi POST request in an attempt to get
We have designed patterns which exercise a variety of TCP and IP headers. A good place to start is to examine the source code of your favorite TCP/IP stack and note all the points where a packet might be dropped or cause a connection to reset. In IP Fragments, we can play with checksums and options. Depending on the target, packets may be dropped if the packet is source routed. IP Options can be invalidated, set to wrong lengths or values. In some networks, the IP fragment can be sent larger than the MRU - and we can selectively set the DF (Dont Fragment) bit to cause some packets to be dropped.
We can also monkey around with TCP Flag options. In at least one test, we simply cycle through every possible TCP flag combination - including the reserved bits. We can find out if the system is validating sequence numbers on reset packets, for instance. Flag tests alone can generate thousands of packets when combined with other permutations.
We can combine invalid and valid overlapping IP fragments with encapsulated invalid and valid TCP segments. Within these, we can encode malformed cgi requests. On top of this, we can replay valid parts of the TCP stream. In some cases, the target will accept the replay, and in some cases it will be dropped. By alerting the TTL in these packets, some of them might never reach the target system, but easily pass by the IDS system. Remember, we are running all of these obfuscations at the same time over the same stream. Even assuming that the IDS can reconstruct a valid request from all of this, we have created a significant load on the CPU.
It would be fair at this point to mention that a bunch of small IP fragments ripping around network is, by itself, enough to warrant an IDS alarm. Although some of these obfuscations work wonders, they also look strange. And, if your an attacker, looking strange is the last thing you want. So, some IDS systems will alert you when strange constructions of traffic are noticed. However, not all obfuscations of this nature are noticed, especially when we start talking about application layer requests and TCP segments.
To make matters worse, protocols layered above TCP/IP have their own sequencing issues. In the case of SMB/RPC, there are many factors, including UID/MID/TID/FID values - an IDS wishing to correctly asses file access and other windows-networking activity must take-on another protocol. It must account for a whole new book of invalid possibilities. And, given a protocol like SMB, which has even more exceptions to the rule than TCP/IP - on top of being closed-source, will leave IDS vendors in the dust. Why is this important? An auditing system can sniff all SMB TRANSACT requests in an attempt to see who is looking at what on the network. But, spicing this up with some invalid Session packets might very well make the auditing system 'forget' what sessions are important.
Another layer of de-synchronization can be added by using HTTPS. For the IDS to reconstruct a packet sent of HTTPS, it has to have the session key. It is unlikely that the IDS will have this key, but even if it does, a significant amount of CPU time would be used in the process. In addition, all of the segmenting and fragmentation de-synchronization methods could then be applied to the encrypted HTTPS packets, so that if the IDS did not reconstruct the request _perfectly_ it would not even be able to decrypt the request. The vast number of eCommerce web sites using HTTPS for secure transactions run a large number of CGI, ASP, and CFM scripts that are not being monitored by the IDS systems they use. One good example of this is the recent Openhack.com hacking event sponsored by eWEEK. The IDS logs were full of reams of attack attempts, port scans, and the like; but the first compromise that actually worked? Nope, the winning exploit did not even appear in the IDS log. Why? Because it was an exploit of a CGI program running over HTTPS.
Conclusion: Multiple layers of de-synchronization, including additional protocols, application layer obfuscations, and encryption, will be the downfall of network-based intrusion detection.
Idea #3: Cross reference the output with the input (i.e., log files).
De-synch applies to host based IDS as well. When people think of de-synching an IDS system, they usually think of TCP/IP problems and network based IDS. De-synch can also occur on host-based IDS - especially those that watch log files. Log file entries are created based on an interpretation - therefore a possibility exists that the interpretation is wrong. And, remember that network based ID systems make reports using log files also - so they are not immune from log file issues.
We set-up a traffic pattern to perform a few thousand GET requests to the IIS server - each one slightly different. For each request, we compare the resulting log-file entry to the original request. We also cross-check the filesystem request against the original request. Discrepancies soon arise.
For example, in the lab we have discovered that the Windows 2000 IIS server makes the following blunder:
is logged as:
The filesystem access is for:
is logged as:
The filesystem access is for:
We have identified a problem - the log file is reporting the WRONG filename in one of our requests. The requests are for two separate and distinct files on the system, yet they are logged the same. The potential exists that someone could covertly hide data on a website and access it using 'malformed' requests. Network IDS systems may falter on the interpretation as well, and host-based log watchers will certainly see the wrong data.
Conclusion: By cross referencing the output (i.e., log files) with what we input, we discover inconsistencies.
Idea #4: Look for 'stripped' data - additional data that doesn't effect the validity of a request
Again, by using a special pattern, we generated a variety of requests using special characters and relative paths. It became quickly apparent that the Win2K IIS server was stripping out huge sections of our requests. This is very interesting - so we tested against a Redhat/Apache server also. The results were the same for both web servers.
"GET /dir1/dir2/../dir2/../dir2/../dir2/../dir2/../dir2/test.html" 'Redundant traversing' is a valid request "GET /dir1/dir2/this_directory_is_invalid/../test.html" This request has an invalid path, yet it still works
Using our traffic patterns we quickly discovered that the relative path trick only works one level deep, that is to say "GET /dir1/dir2/this_directory_is_invalid/so_is_this_one/../../test.html" doesn't work.
The potential exists here that an IDS system may not identify the correct request because what it sees on the wire is not the same as what the filesystem sees on the target.
By the same token, the following pattern also results in a valid request:
The slash characters are extraneous to the request.
Idea #5: Use invalid filename characters
Making requests with invalid characters can reveal bugs in cgi-programs and content-filters - strange characters always present the possibility of mangling parsers. We ran several patterns in the lab that stuffed invalid characters down the pipe and analyzed the results.
The following characters are invalid as part of a filename, yet are passed directly to filesystem calls under Win2k:
By simply permuting all combinations of these characters in GET requests, we were able to determine that many of them can be used in a valid request.
These are valid web requests:
GET "/dir1/dir2/\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\/../test.html" GET "/dir1/dir2/:|:|:|:|:|:|:|:|:|/../test.html" GET "/dir1/<><><<>><><||""""""""""""""""/../test.html"
What this means is that a variety of characters, known to be invalid in filenames, can still be inserted into a GET request without effecting the validity of the request.
The only filename character that does not work appears to be the '*' asterisk. However, when making a request using an asterisk, we noticed a huge spike in processor and file activity. Suddenly we had found a potential denial-of-service problem. Yet another outcome of simple software testing.
Conclusion: Even invalid characters can be used, if they are filtered out before they hit the target subsystem (i.e., the filesystem).
Idea #6. Exploiting NTFS case sensitivity
"test.txt" and "TEST.txt" can both be created on an NTFS volume (however, not by the shell). One way to do this is to create the files on a UNIX Samba server and copy them to the NT system. This brings up a problem, because IIS always accesses files as all capitals. However, the shell will always return the first one found (which is usually the first one created). So, 'notepadtest.txt' or 'notepad TEST.txt' will both return the same file, but it may not be the same file that IIS is serving.
Conclusion: By promoting the assumption that the filesystem is case-insensitive, Windows NT has induced an atmosphere where security mistakes can be made. Did your IDS drink the blue coolade too?
Idea #7: Use Alternate Character Encoding or Equivalent Characters
Make application-layer requests using a variety of encoding methods. For HTTP, an obvious example is to encode characters using their %nn equivalents. "../" can be encoded as %2E%2E%2F - another layer of parsing intelligence that a content-filter or IDS would need to recognize. The problem gets worse when you consider that a variety of character encoding methods are available to GET and POST requests. If you think about it - your CGI programs are the most vulnerable points in your web site. The mechanisms which accept user-input are always vulnerable to exploitation. A POST request, for instance, can be encoded in a number of character sets. Device-in-the-middle parsers have a complexity problem similar to that of the TCP/IP issues of old. The device in the middle is not the target parser - and potentially can trip over characters.
Changing what you use as a delimiter character can be interesting. A delimiter is very important and an IDS parser will likely depend on the delimiter. If an equivalent delimiter character can be inserted instead of the default, the IDS or Proxy may fail. Ideas that come to mind are ',' -vs- ';' and ''' -vs- '"'. Using a simple pattern you can create many thousands of combinations of these until an acceptable delimiter is found.
Conclusion: Equivalent requests can be made using entirely different character streams. Does your IDS follow?
Idea #8: Add/Remove trailing meta-characters
The task of designing these attack patterns is fun - you get to think about the abstract ideas that promote and drive the discovery of new exploits. Another useful trick is to add and remove meta-characters from the beginning or end of 'fields'. In the lab we discovered that simply adding and removing a slash from a filename induces quite different behavior. In some cases, the request is not processed. In some cases, using a slash causes a much higher load than not using one.
Conclusion: The end of field is a good place to try and confuse a parser.
Idea #9: Inserting invalid characters into fields
Many attacks take place over HTTP today and it will only get worse. Agents or processes at the terminal end of GET/POST requests are handling user-supplied data. IDS systems and content filters must watch the fields and parameters being passed to CGI scripts to sniff out attacks. Consider that many attacks involve requests with large relative paths. Here in the lab we built a pattern to test just this problem. We store a whole set of possible CGI field names and pass them along with large relative paths:
GET /cgi-bin/test.cgi?MXT=student&SEM=MRC&DIV=0&LOG=any& DAC=10&DEE=2000&FILE=/../../../../../../../../../../ ../WINNT/SYSTEM32/REPAIR/SAM._
It became apparent that it didn't matter how far out we escaped the path, once you hit the root of the filesystem you stay there. Simply by sending a large enough path, we are almost assured to reach the root of the drive.
A test pattern can simply run a dictionary against the parameter list, inserting thousands of name/value pairs and watching the results. If the script understands what we send, it crafts an SQL statement and ships it off to the SQL server. At other times, the script fails if an invalid parameter is found. The IDS doesn't know this and throws a false positive. It is also possible the IDS will fail to recognize the attack when certain fields or characters are present. By inserting sets of meta-characters into each field we may confuse the IDS, crash the script, or cause a nonstandard SQL statement to be built.
Conclusion: Try every permutation. Be consistent and try everything.
Idea #10: Sensitive Dependance on certain Fields
Generally, you will find that certain fields are very important to the application request. We call these 'sensitive' fields. These fields are usually responsible for malloc's or offset calculation. Most denial of service attacks leverage sensitive fields to cause more resource usage. For example, a large buffer is sent that is very close to the upper limit. Think 'Ping of Death'. As we approach a border in the system (the buffer upper limit), the chances of a failure increase. We can cause 100% CPU usage by driving an intrinsically large buffer allocation - or an infinite loop (or equivalent). At the edges of buffers we can flush out off-by-one errors. At the edges of numerical ranges we can discover roll-overs. Y2K was a perfect example of this, and is the most widely known bug in the world.
Furthermore, by simply generating all of these requests from many source IP's, you can simulate DDOS attacks on your own servers and IDS systems. The load can be substantial when a sensitive field is leveraged.
Conclusion: Sensitive fields like hang out near the 'borders' (i.e., borders of ranges, borders of fields, borders of buffers... etc).
The recent Openhack.com incident brings up an interesting issue with current IDS systems. Most commonly, the IDS system is watching for known exploits to be executed. These are often the same known exploits that are tested for by commercial security scanners. For example, ISS sells both a security scanner and an IDS system. It is not unreasonable to assume that the vulnerabilities detected by their security scanner correspond to the attacks that their IDS system detects.
If a site administrator runs a security scanner on the network, and resolves all issues detected by it, the vast majority of attacks that an IDS detects will be failed attack attempts, while the ones that succeed will likely be detected by neither the scanner nor the IDS. Therefore, while the IDS may detect 98 or 99% of attack attempts, it is the 1 or 2% of attacks that are not detected that are particularly troubling.
The Openhack.com test site was compromised by exploiting a previously unknown exploit in the shopping cart software, something that is very unlikely to be detected by an IDS system (even if it wasn't tunneled through HTTPS), because none of their signatures match the attack. The PCWeek hacking challenge in the fall of 1999 fell in almost exactly the same manner. The hacker, JFS, exploited a previously unknown hole in the classified ads CGI script. This type of attack is most disturbing, because, in the case of the Openhack.com server, it was locked down by a professional security consultant. It is even less likely that a known exploit will succeed. The attacks that do succeed will usually be crafted by an individual who has examined the software on the system and found a new hole.
The methods and software used in this article has been developed into a package called 'Hailstorm'. The software runs under Windows NT/2000 and generates spoofed TCP/IP transactions using a custom kernel driver. If your interested in beta-testing or evaluating Hailstorm, it can be downloaded from our lab's web site at http://www.clicktosecure.com. The package is available for free and will include source code.
Greg Hoglund is a software engineer and researcher. Hoglund's primary focus is kernel-based intrusion detection/prevention. Starting several years ago in the "security scanner" market, Hoglund has moved on through a variety of research positions and is now a founder of Click To Secure, Inc., a lab-based company devoted to software reliability testing. Other projects include rootkit.com, a windows NT kernel-mode rootkit project, and speaking/training at various security conferences. He wrote a chapter on buffer-overflows recently published in 'Hack Proofing your Network' (published by Syngress).
This article originally appeared on SecurityFocus.com -- reproduction in whole or in part is not allowed without expressed written consent.