Firstly, please let me clarify that this article refers to “Vulnerability Scanning”. I wouldn’t want you to read this with the wrong expectations.
Secondly, I need to point out that the information in this article helps only in certain situations. These situations, which I will clarify later, are NOT recommended as an initial design, nor are they supported (when using the Symantec technologies, and perhaps others). The information helps in the unfortunate event that an architecture changes and cost prohibits repositioning the vulnerability scanning (VS) solution, leaving it in a less than ideal place (behind firewalls and other security devices).
Right, now that’s out of the way… Vulnerability scanning! I love vulnerability scanners. Over the past several years they have come on in leaps and bounds in helping us automate the time-consuming task of finding known vulnerabilities. The discovery of assets, enumeration of ports and services, and evaluation (and consequently the marrying of known vulnerabilities against service versions) have been bolstered with web scanning, database scanning, policy compliance scanning and several other scanning methods that improve our visibility of risk within organisations. We are even presented with the opportunity for unsafe scanning, which allows vulnerability confirmation by exploitation to improve the accuracy of information. However, this article is not here to describe the ins and outs of vulnerability scanners, but to help with a particular architecture and design situation.
Vulnerability scanners are most accurate, and provide the best results, when their connections are not blocked or interfered with by other security technologies. Firewalls and IDP systems are designed to block unwanted and unnecessary traffic from entering (and leaving) particular areas of a network. It is quite unusual, therefore, to design vulnerability scanners to sit on the ‘wrong’ side of these devices. However, in some situations it is unavoidable and, in my particular example, it was unforeseen.
Before I continue, I must mention (again) that most vendors won’t offer support in these types of situations (including Symantec). There are technologies at play that vendors can’t support and that can stop the scanning process from functioning in the way it was designed. So why do I continue? As part of the TSS (Technical Sales and Services) team at Symantec, I am a techie at heart. I find it personally rewarding to be as innovative and helpful as possible in ‘techie’ situations. I worked something out and wanted to share it, just in case anyone was having the same problem. I thrive on techie knowledge and understanding. However, I digress. So, on to the situation…
A VS solution was put in place to monitor several areas of a network. These were logically separated areas, and each area sat on a boundary with a firewall device. The original solution included scan engines in the relevant areas, to avoid scanning through the perimeter devices. A change in network architecture meant that these logical areas were subsequently sub-divided, so traffic from one segment to another had to go through a routing device. For budget reasons, these routing devices were the perimeter firewalls. So, not only were the firewalls responsible for filtering the traffic between particular areas, they were now also acting as routing devices for smaller segments of those networks. I know, I asked about layer 3 switching capability too, but it was not available. What can one do?!
So the problem now was this: when a scan was run against a segment, a greater number of entries in the firewall’s state table were being used to route the traffic. As a brief overview, state tables provide (amongst other functions) a way in which a firewall can monitor and manage connections between two devices. They support a simple network rule that allows Computer A to connect to Computer B on a given port (or ports). Because the state is maintained, we do not need to write the reverse rules for these communications. This is HUGELY beneficial for the firewall admin teams.
Now, during the asset discovery and service discovery phases of the scan, we connect to one, if not many, ports on a destination asset at the same time. For the purposes of speed, we typically connect using only the SYN part of the three-way handshake for stateful protocols like TCP. What does this mean? Well, for every port on every computer we connect to (through the firewall), the firewall maintains a state entry. Because we focus on speed (and usually there are no firewalls in the way), we send only the SYN and do not respond to the SYN/ACK. Whether or not we get that SYN/ACK tells us if the port is open, and we move on swiftly. This means that many half-open connections are left in the state table of the poor firewall we are scanning through.
Now, firewalls have also advanced in both function and performance. However, state tables can only be filled so far before they are exhausted. This exhaustion is the same effect that a SYN flood, a common form of DoS or DDoS (Distributed Denial of Service) attack, sets out to cause. What’s the last thing we want to do when testing and trying to improve security? Yep, that’s right, we don’t really want to take down segments of the network or firewalls in the process.
So here was my result: a formula that allowed a company to determine how many devices and ports could be scanned at the same time without exceeding a defined number of connections through a firewall. This defined number of concurrent connections could be based on what the firewall could handle (a maximum), or on a number defined by a policy. For example: “There must be no more than ‘x’ ports open, at any one time, by the vulnerability scanner through this firewall.” In my particular case, it was the latter.
Now, firewalls aren’t stupid. They know when something’s not being used and can close open ports and remove entries in state tables after a period of time. So this is something I needed to take into consideration. Also, vulnerability scanners today have a plethora of performance tuning options, including ‘Packets per second’, ‘Delay between scanning hosts’, ‘Connection time outs’ and ‘Number of retries per port’, to name but a few. If the scanner you’re using doesn’t, then perhaps take a look at Symantec Control Compliance Suite Vulnerability Manager; it does! (OK, that is the first and last sales reference in this article, I promise.) So, all considered, here is the formula… drum roll please…
C = Maximum number of concurrent connections allowed on a firewall
RT = Time (seconds) after which a stale connection is reset
CP = Number of concurrent ports scanned on a target system
TP = Time (seconds) taken to scan and complete 1 port on a target system
S = Number of targets that can be scanned at the same time
S = ((C / RT) / (CP / TP)), rounded down to a whole number
In my particular case, the firewall was not allowed to hold more than 10,000 open ports at any given time and it reset stale ports after 15 seconds. We were scanning 23 ports concurrently per host, and it took 3 seconds to scan each port. Let’s work that out…
((10,000 / 15) / (23 / 3)) = S
(666.67 / 7.67) = S
S = 86.9
S = 86 (rounded down to whole hosts)
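
For anyone who would rather script the calculation, here is a minimal Python sketch of the same arithmetic. The function and parameter names are my own for illustration and do not come from any scanner or firewall product. It rearranges (C / RT) / (CP / TP) into (C × TP) / (RT × CP) so that the rounding down happens in one step.

```python
def max_concurrent_hosts(max_connections, stale_timeout_s,
                         ports_per_host, seconds_per_port):
    """Return S, the number of hosts that can be scanned at once.

    Hypothetical helper: S = (C / RT) / (CP / TP), rounded down.
    """
    values = (max_connections, stale_timeout_s, ports_per_host, seconds_per_port)
    if min(values) <= 0:
        raise ValueError("all inputs must be positive")
    # (C / RT) / (CP / TP) is the same as (C * TP) / (RT * CP).
    # Floor division rounds down, because part of a host cannot be scanned.
    return int((max_connections * seconds_per_port)
               // (stale_timeout_s * ports_per_host))


# The figures from the worked example: C=10,000, RT=15, CP=23, TP=3.
hosts = max_concurrent_hosts(10_000, 15, 23, 3)
print(hosts)  # 86
```

Rounding down rather than to the nearest whole number is deliberate: 87 hosts would push the total just over the 10,000 limit.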
So, we are scanning 86 hosts at a time, each with 23 ports being scanned at a time and taking 3 seconds to scan each port. Therefore, roughly 659.33 new ports would be opened every second (86 × 23 / 3). After 15 seconds, 9,890 ports would be open before the firewall started closing down stale ports, resulting in no more than 10,000 ports being open at any given time.
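
To sanity-check that steady state, the sketch below steps through a scan one second at a time. It is a deliberately simple model built on assumptions of mine, not a description of how any particular firewall behaves: every probe leaves one state entry, entries are created at an even rate, and each entry is removed exactly when it reaches the stale timeout. The function name is hypothetical.

```python
from collections import deque


def simulate_peak_entries(hosts, ports_per_host, seconds_per_port,
                          stale_timeout_s, duration_s):
    """Return the highest number of state entries seen during the run."""
    # New state-table entries created by the scanner each second.
    new_per_second = hosts * ports_per_host / seconds_per_port
    # One slot per second of age; once the window is full, the oldest
    # slot drops out, which models the firewall resetting stale entries.
    window = deque(maxlen=stale_timeout_s)
    peak = 0.0
    for _ in range(duration_s):
        window.append(new_per_second)
        peak = max(peak, sum(window))
    return peak


# 86 hosts, 23 ports each, 3 seconds per port, 15 second stale timeout.
print(round(simulate_peak_entries(86, 23, 3, 15, 60)))  # 9890
```

Under the same assumptions, running it with 87 hosts gives a peak of about 10,005 entries, which is why the result of the formula is rounded down.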
This equation can also be re-arranged to work out how many ports to scan per host if you are scanning ‘S’ number of hosts, which might better suit some situations.
Vulnerability scanners also have their own limits on how many hosts and ports can be scanned at any given time. With those limits in place, the firewall limit might never be reached by the scanner’s concurrent connections, but it’s always nice to be sure.
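
As a sketch of that re-arrangement, solving the same relationship for CP gives CP = (C × TP) / (RT × S), again rounded down. The Python below is illustrative only, and the function name is my own.

```python
def max_ports_per_host(max_connections, stale_timeout_s,
                       concurrent_hosts, seconds_per_port):
    """Return CP, the number of ports to scan concurrently per host.

    Hypothetical helper: CP = (C * TP) / (RT * S), rounded down.
    """
    values = (max_connections, stale_timeout_s, concurrent_hosts, seconds_per_port)
    if min(values) <= 0:
        raise ValueError("all inputs must be positive")
    return int((max_connections * seconds_per_port)
               // (stale_timeout_s * concurrent_hosts))


# Same firewall limits as before, but scanning 100 hosts at a time.
print(max_ports_per_host(10_000, 15, 100, 3))  # 20
```

Feeding the earlier answer of 86 hosts back in returns 23 ports per host, which matches the original worked example.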
If anyone else has run into similar situations and devised a work-around, I’d love to hear about it. We can share ‘techie’ thoughts on helping to fix these awkward situations together. Thank you to anyone who made it to the end of the article without falling asleep.