Waledac, Part 2: Its Bootstraps and Armor
In a previous post I provided an overview of W32.Waledac’s functionalities, tactics, origin, and connections. This time, I will discuss more on the bootstrap mechanisms and armoring techniques used by Waledac in order to sustain and protect itself.
When a Waledac executable is installed, it turns the compromised system into a zombie and acts as an agent for the botnet. It creates a window named fhfhkjfhwefkwj and registers itself with a class name jfkljfilfj23fi32io. As a self-starting mechanism, it also adds any of the following entry in the registry so that it can run whenever Windows starts:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\”PromoReg” = “[PATH TO EXECUTABLE]”
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\”PromoReg” = “[PATH TO EXECUTABLE]”
W32.Waledac also uses the registry to store configuration data such as IDS and IP list data. Under the HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion registry key, this malware creates four more values: FWDone, LastCommandId, MyID, and RList.
• FWDone — a state flag that is set to the hexadecimal value ”74 72 75 65” (which equates to the ASCII string ”true”) when the malware achieves connectivity.
• LastCommandId — stores the hexadecimal equivalent of the ID number of the last download-or-update command that the malware received and executed. This command will be explained in further detail later.
• MyID — contains a 16-byte hexadecimal value that is randomly generated at node startup to uniquely identify itself in the botnet.
• RList — stores an obfuscated IP address list that is used by the bot to maintain its connectivity to the botnet. When de-obfuscated, the RList value is an XML-formatted list that has been compressed using Bzip2 and encrypted using AES. Figure 1 below illustrates how the original XML IP list is encoded into the RList value we see in the registry:
Figure 1. An illustration of how an IP list is encoded before being stored into the RList registry data
The data in the RList registry value is initially populated from a list of IP addresses hardcoded into the binary, and may vary per infection. Each entry in the hardcoded list has two components: a node ID and the corresponding IP address.
To construct RList, the malware randomly picks a node from the hardcoded list and constructs an XML-formatted list out of the node ID and IP address data, a Unix time stamp, and port element. Next, it compresses the XML list with bzip2 and then encrypts it with the Advanced Encryption Standard (AES) cipher. The output is stored into the RList registry value. Afterwards, it randomly picks another node from the hardcoded list, converts it to XML format again, prefixes this new node into the previous XML list, and then changes the value in between the localtime tags into the current time using a Unix time format. After the new XML node IP list is generated, bzip2 and AES are again applied and the new value replaces the previous value stored in RList. These steps will be repeated all over again until the list is full. The IP list in RList can contain up to a maximum number of 500 nodes in it. This maximum number is retained even when the list is updated. When a full list is updated, old nodes are simply replaced with newer nodes—so, consequently, the last entries in the list are usually the first ones to go. As an example, a five-node IP XML list is shown in the snapshot in Figure 2:
Figure 2. A sample of a five-node IP list
Botnet topology and bootstrapping
A system compromised by a bot agent is usually referred to as a zombie. There are mainly two types of zombies in the W32.Waledac botnet, and they are determined when the zombie bootstraps to the botnet. If a zombie is publicly accessible and has very good bandwidth, the more it is likely to act as a HTTP and DNS proxy server for the botnet. I will refer to this type as a relay-node for the rest of this discussion. In contrast, I will refer to a non-proxy zombie as a slave-node.
Relay-nodes basically act as an intermediary between the slave-nodes and the master command-and-control (C&C) servers, as well as for each other. With the help of these relay-nodes, Waledac is able to implement blind proxy redirection as another armoring tactic to protect the botnet from full enumeration and complete shutdown. Figure 3 depicts a sample illustration of Waledac’s network topology.
What is more, the hardcoded IP list embedded in a W32.Waledac binary is actually a list of relay-nodes that is used by a new node to bootstrap to the network. Given that a relay-node is also a compromised system, it is likely to be taken offline from the botnet when the infection is discovered and removed. For this reason, Waledac has to replace the list that it embeds on its binaries after a certain time, since offline relay-nodes will obviously be useless in bootstrapping a new node into the botnet. Essentially, Waledac uses redundancy to ensure that in case of failure, there is still at least one alternative path available from one point to another.
Figure 3. A sample illustration of Waledac's network topology
Node IP list updates
A Waledac node updates its peer IP list in two methods. So in the event that one method fails, there is still one other method to use. The first method involves an IP list exchange with another node. For example, a Waledac slave-node randomly selects 100 relay-nodes from its locally stored list and then attempts to connect with one of the relay-nodes. Once a connection is established, the relay-node constructs a list of 100 relay-nodes in its own stored list, and exchanges it with the slave-node. After they receive the list, they update their locally stored list by replacing older entries with newer ones.
In the second method, a Waledac slave-node updates its IP list by connecting to a hardcoded Waledac domain after 10 minutes and fetching a list of active relay-nodes using a GET request for an index.php file. The list received typically contains up to a maximum of 500 entries. After receiving the list, the client-node would once again update its locally stored list. This hardcoded domain usually turns out to be a fast-fluxed domain. So in a short period of time, the website host can resolve to multiple IP addresses. There is a good possibility that these hosts are compromised systems as well.
Bot agent binary updates
Yes, Waledac is also capable of updating the bot agents installed in its nodes. This process is initiated when a node receives a command to download an updated version of Waledac. After the download is finished, the node executes the binary and terminates itself so that it can be replaced by the newer version.
At present, the binary updates we have observed being used by Waledac arrive bundled with a JPEG file. What appears to be a normal JPEG file at first turns out to be a Trojan. Embedded in the JPEG file is an encrypted binary, and one of those binaries being distributed by Waledac this way is an actual update of itself. This will be discussed in more detail in the next blog post.
As mentioned in the previous section, Waledac makes use of fast-flux hosting for its domains. Meaning, in a short period of time, one Waledac domain can resolve to multiple hosts that are most likely just acting as proxies. A fast-flux DNS system makes it harder to track the source and as a consequence, it is not easy to completely shut down. Evidently it is meant as another defense mechanism for the Waledac network.
Figure 4, below, depicts the work of fast-flux service networking. It shows the results of DNS dig queries for one of Waledac’s domains during its “dirty bomb” campaign back in March 2009. Notice that in spite of the queries being launched only four seconds apart, the returned IP address for the host in the second query is already different from that of the first query.
Figure 4. Outputs of dig queries launched only about four seconds apart for the same fast-fluxed Waledac domain
Similar to what Peacomm (a.k.a. Storm) did during its run, Waledac regularly repacks the executable binaries housed in its malicious domains and consequently achieves server-side polymorphism. In its bid to evade file-based detections it changes the multilayered encryptions and protections that are wrapped around its executable. Shown in Figure 5 is a sample before-and-after scenario illustrating Waledac’s server-side polymorphism. The initial snapshot in Figure 5 shows a first look at a Waledac binary that was downloaded from one of the malicious domains used in the recent 4th of July campaign. When another download is initiated after around ten minutes, a repacked version of the binary is received this time (as shown in the next snapshot of Figure 5). Notice that there are differences in both samples already, even after only a few minutes apart of being downloaded.
Figure 5. A demonstration of Waledac's server-side polymorphism. Both samples were downloaded only about ten minutes apart.
In this section I will discuss the communication protocol used by Waledac to control its botnet. The nodes in the W32.Waledac network use HTTP requests and responses to communicate between peers, which is mostly to update spam campaigns and to download updates or components.
To establish a connection with a relay-node, a slave-node opens a random local port on the compromised system and tries to connect to port 80 of the remote relay-node. Before message data is sent to the relay-node, it goes through at least four transformations to obfuscate the original data. Hence, anybody that sees the HTTP packets on the wire will not be able to figure out the messages without prior knowledge.
In communicating with other nodes, this malware uses HTTP POST and GET messages. The data of the POST and GET requests is encrypted, while the request header values are plaintext. An example of an HTTP POST request and response exchanged between a slave-node and relay-node is shown in Figure 6. The top part is the request portion, while the bottom part is the response portion. As you can see, the encrypted message data can be seen in the a URI parameter of the POST request, and is followed by the b URI parameter. On the contrary, the response to the POST request does not use these URIs.
Figure 6. A sample of an obfuscated Waledac message exchange
Waledac node message format
The plaintext message format used by W32.Waledac follows an XML structure. The root element is lm so all W32.Waledac messages are enclosed with this root start-tag and root end-tag. A message contains several elements containing data relevant to a specific task executed by the node, such as configuration data, commands, or stolen information. Table 1 contains just some of the major elements used by Waledac.
Table 1. Some examples of Waledac's major XML elements
Meanwhile, a sample snapshot of an XML-formatted Waledac message is displayed in Figure 7. To find out more about the different types of messages exchanged between nodes, watch out for the next blog post in this series.
Figure 7. A de-obfuscated version of a sample Waledac message sent by a slave-node
Waledac message encoding
As mentioned earlier, a Waledac node message goes through several transformation stages before it is transported. This is most likely an attempt to secure botnet data and thwart potential eavesdroppers.
In the first stage, the XML form of a message is compressed with bzip2 before it is encrypted. The compression mainly helps to minimize the size of the message.
After a message is packed with bzip2, Waledac encrypts the compressed form with the Advanced Encryption Standard (AES), which is a symmetric block cipher algorithm. At the time of writing, Waledac specifically uses Cipher Block Chaining (CBC) mode and a total of three 128-bit (16-byte) keys for its AES encryption.
Basically, after the key schedule routine is applied to a key, it is expanded into an array of 44 word-sized values. The expanded keys are then saved into heap memory to be later used for encryption. Conversely, Waledac also pre-computes the inverse of each expanded key and saves them near the same area in heap memory so they can be used for decryption later.
Base64 encoding and some character substitutions
Waledac uses a slight modification of base64 encoding to transport data over the network. After the message is transformed using standard base64 encoding, the malware further transforms the encoded message by substituting some characters and removing any padding at the end. It searches for plus characters (+) in the base64 output and replaces them with minus (-). Next, it searches for forward slashes (/) and replaces them with underscores (_). Finally, it searches for equal signs (=), which are normally found at the end of a padded base64 form, and replaces them with underscores (_). Later on, the trailing underscores (_) at the end are discarded, which in effect removes the original padding that base64 added.
RSA keys and x.509 certificates
To establish a connection, a slave-node generates a new 1024-bit RSA public and private key pair during its initialization. It then uses the public key to create an x.509 or self-signed certificate (see Figure 8 for an example) and uses it to establish a session with another node. The newly generated self-signed certificate is set to have a validity period of one year, while the subject field is set to C=UK, CN=OpenSSL Group. When a connection is successfully established, the slave-node receives a base64-encoded, RSA-encrypted session key that it then uses to AES-encrypt ensuing messages.
Figure 8. An illustration of a self-signed certificate generated by W32.Waledac
Waledac message types
There are two types of C&C messages that can be seen in the Waledac botnet. For the rest of this discussion, I will refer to them as the following:
• Task message — contains data related to a task that the bot executes; has a prefixed task header
• IP list message — contains a list of IP addresses that Waledac uses to update its list of nodes or servers; has no prefixed header
C&C messages are sent using the HTTP protocol, mostly through POST. However, a node may also use GET to receive updates or to download components. The headers of the HTTP requests use a Referrer and/or User-Agent string impersonating the Mozilla browser.
These pertain to messages that are related to routines or tasks that are carried out by a node. At the time of writing, there are nine types of tasks classified by the first byte in the header of encrypted task messages. Table 2 shows the current possible values. Each value corresponds to a specific task type, which in turn has different message contents. These tasks will be discussed in more detail in the next installment of this blog series.
Table 2. A list of TaskID types
The creds task actually did not exist until the later variants of Waledac. Before that, Waledac only had the first eight task types. The earliest sample I saw that had the added creds task was received around February 2009. It is possible that it may have started earlier than that though. As an illustration, please take a look at Figure 9 below. The routine jump table at the top came from one of the samples we received in late December 2008. In contrast, the jump table shown at the bottom of Figure 9 came from a sample we received in late February 2009. Through this, you will notice that the creds task was not a functionality of Waledac until the later variants.
Figure 9. A comparison of jump tables used by Waledac variants
If you are wondering about the “unknown command” indicated in the jump tables in Figure 9, this merely refers to Waledac’s bootstrap routine of exchanging and updating its IP list. It is the one responsible for generating the IP list messages.
Moving back to a task message, one thing that differentiates this type from an IP list message is the presence of a header prefixed to a variable-sized AES-encrypted data, right before it is transformed with base64 encoding. Thus, the task message body will have the following format:
byTaskID (BYTE) dwDecSize (DWORD) taskData (Variable size)
The value dwDecSize will be in network byte-order and specifically refers to the decrypted size of the message attached to the header. Figure 10 shows an illustration of a sample task message containing a prefixed task header.
Figure 10. A sample message with a task header
A task message POST request poses as a PNG or HTM file that is randomly named. The encrypted task message data is contained in a URI parameter named “a”. If the sender of the task message is a slave-node, a URI parameter called “b” with a value of “AAAAAA” is appended. This value is just a base64-encoded equivalent of a null dword. Meanwhile, if the sender is a relay-node, the URI parameter value for “b” would be its base64-encoded IP address.
On the other hand, the POST response message data will also be obfuscated but it will not contain the URI parameters seen in the POST request. The HTTP header of the POST response passed down by the relay-node also suggests that it came from an nginx server.
IP list messages
Unlike the task messages, an IP List message data does not contain URI parameters or a prefixed header. Furthermore, an IP list exchange by HTTP POST has two kinds, depending on the request type. It may either contain a list of relay-nodes or a list of proxied C&C servers. A POST header containing a “X-Request-Kind-Code: nodes” line is a typical request used by a slave-node to exchange an IP list of relay-nodes with one of the relay-nodes in its locally stored list. On the other hand, a POST header containing a “X-Request-Kind-Code: servers “ line is a request used by a relay-node to exchange an IP list of proxied C&C servers with another relay-node.
Obfuscation process flowcharts
As an overview, Figure 11 below shows flowcharts illustrating how Waledac typically transforms the messages used by its nodes to communicate with each other.
Figure 11. Waledac’s message transformation process
Summary and Conclusion
In this blog post, I talked about Waledac’s bootstrapping and armoring capabilities. A new node bootstraps to the botnet by using a hardcoded list of relay-nodes and also by fetching an updated list from a fast-fluxed domain. It tries to sustain its connection to the botnet by constantly updating its peer list. Waledac is also capable of replacing its bot agents with newer versions that have additional or improved functionalities. In the meantime, Waledac armors itself with server-side polymorphism, a fast-flux network, blind proxy redirection, and obfuscated communication. Since these techniques takes a considerable time to implement, they show that a lot of effort was put into making the Waledac botnet resistant to complete tracking and shutdown.
In my next blog post, I will talk more about the different types of task messages that were introduced in this blog post. I will also discuss some of Waledac’s functionalities in more detail, such as how it downloads components or updates, how it constructs its spam, and how it steals information. So, please watch out for the next installment of this blog series.