Detecting and Removing Malicious Code
by Matthew Tanase
last updated July 22, 2002
Has it happened yet? The phone call, the e-mail, the page, or maybe you discovered it yourself. Something wasn't right: sluggish performance, too much network activity, a missing file. After a little investigating, the realization - you've been cracked. If this isn't familiar to you yet, odds are it will be in the future. Crackers have access to countless variations of malicious code: automated rootkits, trojans, viruses and specific exploits, all designed to breach your security. Detecting and removing these programs can be a daunting task, with little room for wasted time or error. In this article, I'll explain techniques readers can use to get a cracked system back on-line and keep the break-in from happening again.
Backups and Evidence
Before we get started, there are two items to address. First, I cannot stress enough how important it is to back up your systems. If you haven't rebuilt a cracked box, lost months' worth of data, or spent precious downtime reconfiguring, you might overlook this procedure. However, standard practice in incident handling calls for a format and reinstall during system recovery. The reason? It's difficult to quickly determine the depth of a security breach. Was it a simple, automated exploit designed to replace the default Web page? Or a complex attack built to load your system with dozens of trojan files and back doors? Regular backups turn the nightmare of a cracked system into a mere inconvenience, since you can wipe the disk and restore the backup with minimal downtime.
The second issue deals with the formalities of forensics and incident handling. This article will outline general techniques for detecting and removing malicious code, regardless of OS or exploit. The goal is to help readers understand the core concepts required to get a system up and running again - i.e., what to do when a fresh rebuild is not an option. If you or your organization plan to formally investigate the incident, certain procedures must be followed. An image of the cracked system must be created immediately. Why? Most attacks leave multiple fingerprints: logs, modified files, access times. Any sort of activity on a compromised machine can taint this evidence, much like a physical crime scene. Fortunately, the digital nature of our work allows us to create multiple copies for future inspection. The lesson? Before proceeding with any part of the recovery, remove your device from the network and create a replica of the system as you found it. And as you proceed, carefully log your work. The log itself will become evidence of how you conducted the investigation.
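One way to show later that a working copy still matches the image you captured is to record a cryptographic hash of each file in your work log. The following is a minimal sketch in Python, not a forensic procedure in itself; the function names are hypothetical and the choice of SHA-256 is an assumption, not something the original text prescribes.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that a large disk image does not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original_path, copy_path):
    """True when both files produce the same digest."""
    return sha256_of_file(original_path) == sha256_of_file(copy_path)
```

Writing the digest of the original image into the log before any analysis starts gives you a fixed reference point: any copy that later produces a different digest has been altered.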
Something's Amiss - Detecting Malicious Code
Although some exploits are designed to announce their success (or taunt their victim), most remain quiet. As you can imagine, they stay silent for a reason: so they can conduct nefarious activity surreptitiously. Therefore, malicious code is difficult to detect. In the well-known international cracker story The Cuckoo's Egg, Clifford Stoll tells how his journey began with an obscure accounting error of just 75 cents!
In my experience, the telltale signs of a problem are often just small anomalies. For example, a client once told me that nobody could log in via the console. I logged in remotely with no problems. Realizing that the console log-in utilizes a different executable, I discovered that it had been replaced with a trojan several days earlier. Since most users connected remotely, no one had noticed. If you have a feeling that something's wrong, investigate. Unfortunately, you're probably correct.
What Do the Logs Say?
Let's assume something has triggered your alarm. Where do you start? If you can, take the system off-line. This ensures that no further damage can take place. Unfortunately, you won't always have this luxury. Some machines are too important to take down on a whim. However, as soon as you do confirm an incident, take the system off-line immediately to contain the damage.
The investigation itself should begin with a thorough examination of log files. Specifically, you'll want to look at the log-in records, system events and administrative actions. Unfortunately, there's no signature to search for, just unusual activity: strange log-in times, unplanned reboots, failed log-in attempts and unscheduled administrative procedures, to name a few. For greater depth, try reviewing application logs such as those created by FTP, mail, Web and database daemons. These files tend to capture greater detail.
Now the bad news. Many intruders modify, clean or completely destroy log files. If you haven't implemented a remote log server, there's often nothing left to inspect. The review process should not be overlooked though; it's very difficult to completely mask all traces of an exploit. Secondary logs, such as the application specific files, may catch stray events. The location of such files varies between systems, as opposed to default logs, so they're often left untouched.
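Counting failed log-in attempts per source address is one simple way to surface the unusual activity described above. The sketch below assumes OpenSSH-style syslog lines ("Failed password for ... from ..."); other daemons and systems use different formats, so treat the pattern and the function name as illustrative assumptions.

```python
import re
from collections import Counter

# Assumed format: OpenSSH-style messages as written to syslog.
# Adjust the pattern for other daemons or log layouts.
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")


def failed_logins_by_source(lines):
    """Count failed log-in attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Example with made-up log lines (addresses from the documentation range).
sample = [
    "Jul 22 03:14:07 host sshd[1234]: Failed password for root from 192.0.2.10 port 4022 ssh2",
    "Jul 22 03:14:09 host sshd[1234]: Failed password for invalid user admin from 192.0.2.10 port 4023 ssh2",
    "Jul 22 09:01:44 host sshd[2201]: Accepted password for matt from 192.0.2.77 port 5121 ssh2",
]
for source, count in failed_logins_by_source(sample).most_common():
    print(source, count)
```

A burst of failures from one address, especially at an odd hour, is a reasonable place to start reading the surrounding entries by hand.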
How's it Behaving?
The next step in the analysis process is to observe the system's behavior. The goal, again, is to discover anomalies or changes in the running configuration of the system. Start by looking at what processes are running. As a general rule, you should know why each job runs on a specific machine. When a security situation arises, it should be easy to spot what has changed. Stats to observe include CPU utilization, how frequently a job runs, and the owner of the process. One caveat is that an exploit can hide or rename a running process. As with every other step in the examination, investigation and research are required. Crackers rarely make it easy on us, but with a trained eye, some patience and the proper tools, we can find their mistakes.
Output from Unix command top:
  8:54pm  up 40 min,  2 users,  load average: 0.00, 0.01, 0.04
48 processes: 45 sleeping, 3 running, 0 zombie, 0 stopped
CPU states:  1.3% user,  0.9% system,  0.0% nice, 97.6% idle
Mem:   254200K av,  249128K used,    5072K free,       0K shrd,   57964K buff
Swap:  522104K av,     960K used,  521144K free                   39936K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1281 matt      18   0 11984  11M 10144 R     1.1  4.5   0:00 kdeinit
  825 root      16   0 83820  17M  3584 R     0.7  6.8   1:17 X
 1306 matt      11   0  1028 1028   836 R     0.1  0.4   0:00 top
    1 root       8   0   476  476   420 S     0.0  0.1   0:00 init
    2 root       9   0     0    0     0 SW    0.0  0.0   0:00 keventd
    3 root      19  19     0    0     0 SWN   0.0  0.0   0:00 ksoftirqd_CPU0
    4 root       9   0     0    0     0 SW    0.0  0.0   0:00 kswapd
    5 root       9   0     0    0     0 SW    0.0  0.0   0:00 bdflush
    6 root       9   0     0    0     0 SW    0.0  0.0   0:00 kupdated
    8 root       9   0     0    0     0 SW    0.0  0.0   0:00 khubd
    9 root       9   0     0    0     0 SW    0.0  0.0   0:00 kjournald
  135 root       9   0     0    0     0 SW    0.0  0.0   0:00 kjournald
  460 root       9   0   560  560   472 S     0.0  0.2   0:00 syslogd
  465 root       9   0   444  444   384 S     0.0  0.1   0:00 klogd
  545 root       8   0   636  636   480 S     0.0  0.2   0:00 cardmgr
  677 root       9   0  1816 1816  1300 S     0.0  0.7   0:00 sendmail
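If you keep a list of the processes that belong on a machine, comparing it against the current process table can be automated. The sketch below assumes each input line holds a user name followed by a command name, for example as printed by "ps -eo user=,comm="; the function name and the sample baseline are hypothetical.

```python
def unexpected_processes(ps_lines, baseline):
    """Return (user, command) pairs that are running but absent from the baseline.

    ps_lines: iterable of "user command" strings.
    baseline: set of (user, command) tuples known to be legitimate.
    """
    found = []
    for line in ps_lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        pair = (parts[0], parts[1].strip())
        if pair not in baseline:
            found.append(pair)
    return found


# Hypothetical baseline and process listing.
known_good = {("root", "init"), ("root", "syslogd"), ("root", "sendmail")}
running = ["root init", "root syslogd", "root sendmail", "nobody bindshell"]
print(unexpected_processes(running, known_good))
```

Keep the caveat above in mind: a rootkit that hides or renames a process will defeat any check that relies on the compromised machine's own process listing, so a clean result here proves nothing by itself.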
The most crucial step in the detection process is an analysis of the system's network behavior. This is especially true if your investigation hasn't led to a solid confirmation of an exploit. Many cracks install new services or open ports on victimized machines. Your cracked server might be used as a spam relay, or maybe it's awaiting instructions to participate in a distributed denial of service attack. So what can you do? Start by listing the services your device should be running, then run a portscan on it. Are there any differences? No unexpected ports should be open.
Output from portscan tool Nmap:
[/home/matt]$ nmap -sT -v -v -v localhost

Starting nmap V. 2.54BETA31 ( www.insecure.org/nmap/ )
Warning: You are not root -- using TCP pingscan rather than ICMP
Host localhost.localdomain (127.0.0.1) appears to be up ... good.
Initiating Connect() Scan against localhost.localdomain (127.0.0.1)
Adding open port 25/tcp
Adding open port 6000/tcp
The Connect() Scan took 0 seconds to scan 1554 ports.
Interesting ports on localhost.localdomain (127.0.0.1):
(The 1552 ports scanned but not shown below are in state: closed)
Port       State       Service
25/tcp     open        smtp
6000/tcp   open        X11
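The comparison between expected and observed ports can be sketched in a few lines of Python. This is a bare TCP connect check, far less capable than Nmap, and the function names are hypothetical; the expected ports 25 and 6000 come from the scan above, while 31337 is a made-up value standing in for a back door.

```python
import socket


def open_tcp_ports(host, ports, timeout=0.5):
    """Return the ports on host that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


def unexpected_ports(open_ports, expected_ports):
    """Return open ports that are not on the list of expected services."""
    return sorted(set(open_ports) - set(expected_ports))


# Example with hypothetical scan results: 31337 was never supposed to be open.
print(unexpected_ports([25, 6000, 31337], expected_ports=[25, 6000]))
```

Run the check from a second, trusted machine where possible, since a compromised host may not report its own listeners honestly, and only scan systems you are responsible for.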
Next, try running a packet sniffer. Filter normal traffic (e.g., DNS queries, HTTP, etc.) from the output using a customized query. Again, you're looking for anything unusual. Keep in mind that you might not see traffic right away. Some exploits send bursts of data periodically, so you will have to monitor the output over time. I once encountered a cracked machine that sent out an unsolicited ICMP echo reply only a few times each day. It took several hours of detailed monitoring and a complex filter to catch such an event.
Simple tcpdump filter designed to ignore standard traffic:
[root@localhost matt]# /usr/sbin/tcpdump -elXnvvv -i eth0 dst port not 22 and dst port not 80 and dst port not 53
Another tool to employ during the investigation is netstat. It can be used to display a list of active network connections. Use it with Unix tools such as lsof and fuser (or fport for Windows) to map which programs are using network ports.
[/home/matt]$ netstat -l
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 *:printer               *:*                     LISTEN
tcp        0      0 *:x11                   *:*                     LISTEN
tcp        0      0 *:ssh                   *:*                     LISTEN
tcp        0      0 localhost.localdom:smtp *:*                     LISTEN
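On Linux, the same list of listening sockets can be read from /proc/net/tcp, which is handy when you want to script the comparison. The sketch below is an assumption-laden illustration: it handles IPv4 only, assumes a little-endian host such as x86 (the kernel prints addresses in host byte order), and uses a hypothetical function name and made-up sample data.

```python
import socket
import struct

TCP_LISTEN = "0A"  # state code Linux uses for a listening TCP socket


def listening_sockets(proc_net_tcp_text):
    """Parse the text of /proc/net/tcp and return (ip, port, inode) per listener."""
    results = []
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # The header row and non-listening sockets fail the state check.
        if len(fields) < 10 or fields[3] != TCP_LISTEN:
            continue
        hex_ip, hex_port = fields[1].split(":")
        # Assumes a little-endian host: 0100007F decodes to 127.0.0.1.
        ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
        results.append((ip, int(hex_port, 16), int(fields[9])))
    return results


# Made-up sample in the /proc/net/tcp layout.
sample = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1
   1: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12346 1
   2: 0100007F:0016 0100007F:C350 01 00000000:00000000 00:00000000 00000000   500        0 12399 1
"""
for ip, port, inode in listening_sockets(sample):
    print(ip, port, inode)
```

The inode column is what tools such as lsof use to tie a socket to the process that owns it, so an unfamiliar listener found here is the starting point for the mapping described above.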
As far as the filesystem is concerned, an exploit can only do three things: add, remove or modify files. What's likely to be removed? The logs, as we discussed above, or everything if the exploit is extra nasty. Files added to the system are normally tools such as sniffers or viruses that may be used to carry out post-crack activities. You might even discover that your machine has become a place to stash files, for instance as an FTP warez server. But what you really need to be concerned with is what has been modified. This is the most challenging part of the recovery process. A cracker can change the system in countless ways: new accounts, modified passwords, tweaked registries, trojaned files and more. In short, back doors galore, and the reason most experts recommend reinstalling from scratch.
Where to begin? Critical system files. Everything related to accounts or access privileges deserves special attention. If these core files can't be restored, you have to verify that they haven't been modified.
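As one concrete example of checking account files, a Unix passwd file can be scanned for extra superuser accounts and empty password fields. This is only a sketch: it assumes the traditional seven-field, colon-separated passwd layout, the function name is hypothetical, and it checks just two of the many ways an account database can be tampered with.

```python
def suspicious_accounts(passwd_text, allowed_root_names=("root",)):
    """Flag unexpected UID 0 accounts and accounts with an empty password field."""
    findings = []
    for line in passwd_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 7:
            findings.append((line, "malformed entry"))
            continue
        name, password, uid = fields[0], fields[1], fields[2]
        if uid == "0" and name not in allowed_root_names:
            findings.append((name, "UID 0 but not an expected root account"))
        if password == "":
            findings.append((name, "empty password field"))
    return findings


# Made-up passwd content for illustration.
sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "matt:x:500:500::/home/matt:/bin/bash\n"
    "toor:x:0:0::/tmp:/bin/sh\n"
    "guest::501:501::/home/guest:/bin/sh\n"
)
for name, reason in suspicious_accounts(sample):
    print(name, "-", reason)
```

A clean result does not clear the file; comparing it against a known-good backup copy is still the stronger check.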
Next, employ tools such as the Unix find command or those found in external forensic kits to search a filesystem for recent changes. These programs look at three specific attributes: access time, status change time and data modification time. You can use such utilities to generate a list of what has been modified by the malicious code on your system. Review, replace or restore everything on this list. If any system executables are tainted, they must be replaced with originals from the system installation disks. Modified binaries are likely trojans used to open a back door.
Output of Unix find command searching for files modified in /etc in the last 10 days (-mtime counts in days; with GNU find, -mmin -10 would match the last 10 minutes):
[root@localhost matt]# find /etc -mtime -10
/etc
/etc/sysconfig/hwconf
/etc/mtab
/etc/mail/statistics
/etc/dhcpc
/etc/dhcpc/dhcpcd-eth1.cache
/etc/dhcpc/dhcpcd-eth1.info
/etc/dhcpc/dhcpcd-eth1.info.old
/etc/aliases.db
/etc/adjtime
/etc/pcmcia
/etc/pcmcia/backup_wireless.opts
/etc/pcmcia/wireless.opts
/etc/pcmcia/1/etc/kde/kdm
/etc/kde/kdm/kdmsts
/etc/ioctl.save
/etc/.aumixrc
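The same kind of search can be scripted when you want to look at the modification time and the status change time together. The following is a rough Python sketch, not a replacement for find or a forensic kit; the function name and its parameters are hypothetical.

```python
import os
import time


def recently_changed(root, max_age_seconds, now=None):
    """Yield (path, reason) for entries under root with a recent mtime or ctime."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symbolic links
            except OSError:
                continue  # entry vanished or is unreadable
            if info.st_mtime >= cutoff:
                yield path, "modified"
            elif info.st_ctime >= cutoff:
                yield path, "status changed"
```

For example, list(recently_changed("/etc", 10 * 24 * 3600)) would cover roughly the same 10-day window as the command above. On Unix, a file whose modification time looks old but whose status change time is recent deserves a second look, because resetting the modification time itself updates the status change time. As with every check run on the compromised machine, the results are only as trustworthy as the system reporting them.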
Any new files introduced to the system by the code should be removed, but not destroyed. They can help identify exactly what was done on your machine. Security experts often post a forensic analysis and point-by-point list of how popular exploits affect a target. If you find files commonly included in malicious code packages, you might be able to take advantage of such reports.
Before Going Back On-line
There are a few steps left before a system can be cleared to go back on-line. Make sure all patches, at both the system and application level, are up-to-date on machines under your control. Next, change the administrative passwords on the recovered machine, in case they were compromised. Additionally, you'll want to temporarily remove any host-based authentication measures you have in place for this system.
As a final step in the recovery process, ensure that you have multiple monitoring methods active. Although everything will soon be operational, the monitoring process is just beginning. You need to keep a close eye on the target to prevent a back door from being activated. This can be accomplished by repeating the log and network analysis steps from the detection process, outlined earlier. Consider implementing tighter security measures as well, or using some of the techniques introduced in the next section.
Hasn't Happened Yet?
If you're reading this article and are not in a state of panic, consider yourself lucky. There are several steps you can take to ease the recovery process, or prevent it from taking place altogether.
First, look into implementing an intrusion detection system. Such systems are often just as helpful in reconstructing attacks as they are in preventing them. You could use the IDS logs to pinpoint exactly how your system was cracked and what sort of code was introduced. An IDS can also notify you immediately of incidents that would otherwise have gone unnoticed.
Integrity checkers are another helpful set of tools. Programs like Tripwire and AIDE record cryptographic hashes of important files on your system. They report changes to these files based on user-defined rulesets. Such programs allow an admin to quickly assess the damage and modifications to a filesystem after an attack. These applications are the single most effective tools that can be used in the recovery process because they verify which files have been altered by malicious code.
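The core idea behind an integrity checker can be illustrated in a few lines of Python. This is a toy sketch of the general technique, not how Tripwire or AIDE are implemented; the function names are hypothetical, and a real tool also tracks permissions, ownership and other attributes.

```python
import hashlib
import os


def hash_file(path):
    """Return the SHA-256 hex digest of a (reasonably small) file."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def build_baseline(paths):
    """Record a digest for every path while the system is known to be clean."""
    return {path: hash_file(path) for path in paths}


def compare_to_baseline(baseline):
    """Re-hash each baselined path and return (changed, missing) lists."""
    changed, missing = [], []
    for path, old_digest in sorted(baseline.items()):
        if not os.path.exists(path):
            missing.append(path)
        elif hash_file(path) != old_digest:
            changed.append(path)
    return changed, missing
```

The baseline is only useful if the attacker cannot rewrite it, so store it off the machine or on read-only media, and build it before the system goes on-line.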
Most importantly, create regular backups - both full and partial. This simple process will save you hours of downtime at some point in the future.
I'd like to hope you will never play the role of victim, but the odds are against that. In this undesirable situation, you're wearing multiple hats. As the system administrator, you want to get everything back on-line. As the security engineer, you want to learn what happened and prevent it from occurring again, anywhere on the network. Finally, as the incident handler, you have an obligation to properly investigate the crime and document your findings. If you keep these three distinct roles and their respective goals in mind, the process will be a success.
Drive and File Analysis
Matthew Tanase, CISSP, is President of Qaddisin, a network security company based in St. Louis. His company provides consulting services for several organizations. Additionally, he produces The Security Blog, a daily Weblog dedicated to security.
Forensics Tools for Unix - Detailed usage examples of Unix forensics
Securing Linux with AIDE - AIDE usage examples.
This article originally appeared on SecurityFocus.com -- reproduction in whole or in part is not allowed without expressed written consent.