Malware in Linux: Rootkits, introduction and classification

Malware in general, and rootkits in particular, can work just as well on a Linux operating system as on Windows. Security in Microsoft systems has improved noticeably since Windows XP, so security failings alone cannot explain why so much more malware exists for that platform.
However, from the viewpoint of somebody creating malware, Windows will always be a much more attractive target than Linux, because it is far more common and more inter-compatible, has more users and is present virtually everywhere in the world. Even so, the use of Linux is growing, and in recent years the "Internet of Things" (IoT) phenomenon has led to a rapid increase in the number of Internet-connected devices that run free software, Linux in particular.
This means greater opportunities and a larger attack surface for malware creators to exploit. As an illustration, in 2014 the IT security company ESET published a report on Operation Windigo, a malicious-code campaign that affected more than 25,000 servers around the world, a good number of them running Linux.
The year 2014 was especially interesting, as it also saw the discovery of serious vulnerabilities such as Heartbleed and Shellshock, which affected Linux/Unix systems too and which have without doubt been used to compromise servers. It is clear that malware does not live by Windows alone.
This paper will cover rootkits in Linux, their types and characteristics, and will give some examples by way of illustration.
Compromising a System: Rootkits
What is a rootkit, and how does it work?
A rootkit should be understood as a "tool" that can act independently, or in tandem with any other sort of malicious code, with the main aim of hiding its activity from users and system administrators. Understanding how a rootkit works is very useful, since it helps in detecting the presence of such undesirable elements.
Basically, the chief purpose of a rootkit is to hide information about processes, network connections, files, and so forth. In addition, it may incorporate other functions such as a back door to allow permanent access to the system or a key-logger to intercept and record strokes on the keyboard.
Types of Rootkit. User Space and Kernel Space
In general an operating system will have two areas of memory that are carefully differentiated: user space and kernel space. In a typical architecture, these spaces are located in "security rings" with differing levels of privileges.
Security Rings in an Operating System
User space is where applications are executed in a controlled environment that has no direct access to resources (memory, disk space, peripherals and so on). Instead, these must be requested through system calls. Kernel space is where resources of all kinds are managed, and code running there has total access to them, as this is where the core of the operating system is executed.
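As a minimal sketch of how a user-space program asks the kernel for something, the fragment below obtains the process ID through a raw system call instead of the usual library wrapper. The function name pid_via_raw_syscall is hypothetical and chosen only for illustration; syscall() and SYS_getpid are the standard Linux/glibc interfaces.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: asks the kernel for the process ID directly.
 * syscall() switches from user space to kernel space, the kernel does the
 * work, and the result is handed back to the caller. */
long pid_via_raw_syscall(void)
{
    return syscall(SYS_getpid);
}
```

The value returned should match what the ordinary getpid() wrapper reports, since both end up in the same kernel service.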
Keeping this distinction in mind, it is possible to distinguish two main varieties of rootkit: those in user space and those in kernel space. Of course, in view of the different levels of privilege the two spaces have, a rootkit in the kernel will be much more advanced, powerful and hard to detect than a rootkit in user space.
Rootkits in User Space
This kind of rootkit executes in user space with the same standing as applications and other binaries. It usually replaces legitimate system executables with substitutes that have been modified so that the information they report is altered as required. The main binaries targeted by rootkits in order to hide and operate include the following:
- Files: ls, df, stat, du, find, lsof, lsattr, chattr, sync…
- Connections: ip, route, netstat, lsof, nc, iptables, arp…
- Processes: ps, top, pidof, kill, lsof…
- Tasks: crontab, at…
- Logs: syslogd, rsyslogd…
- Access: sshd, login, telnetd, inetd, passwd, last, lastlog, su, sudo, who, w, runlevel…
Replacing binaries is rather crude and easily detected, so it is often complemented with other techniques that modify the execution of binaries on the fly by intercepting calls to libraries. Naturally, this is possible only if the binary being manipulated is not statically linked, since a static binary does not call external libraries.
Such a technique allows the rootkit to interfere at run time with binaries that make use of dynamic libraries. In this case, library calls are intercepted through the environment variables LD_PRELOAD and LD_LIBRARY_PATH, or through the cache (/etc/ld.so.cache) and configuration (/etc/ld.so.preload) files, so as to preload altered libraries or divert calls to them.
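Whether a given binary can be affected by library interception at all depends on whether it asks for the dynamic loader. The following is a rough sketch of such a check, under stated assumptions: it handles only 64-bit ELF files with the same byte order as the host, and the function name elf_requests_loader is hypothetical. It looks for a PT_INTERP program header, which names the loader that would honour LD_PRELOAD.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the 64-bit ELF file at `path` requests a dynamic loader
 * (has a PT_INTERP program header), 0 if it does not, and -1 if the file
 * cannot be read or is not a 64-bit ELF file. */
int elf_requests_loader(const char *path)
{
    FILE *f = fopen(path, "rb");
    Elf64_Ehdr eh;
    int result = 0;

    if (f == NULL)
        return -1;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64) {
        fclose(f);
        return -1;
    }
    /* Walk the program header table looking for PT_INTERP. */
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        long off = (long)eh.e_phoff + (long)i * eh.e_phentsize;

        if (fseek(f, off, SEEK_SET) != 0 ||
            fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;
            break;
        }
        if (ph.p_type == PT_INTERP) {
            result = 1;
            break;
        }
    }
    fclose(f);
    return result;
}
```

A result of 0 suggests a statically linked binary, which preloading cannot reach; a result of 1 means the dynamic loader is involved and library interception is at least conceivable.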
An Example of Intercepting Library Calls with LD_PRELOAD.
The executable program ls is dynamically linked, and as it runs it calls the function strlen, which is declared in the header string.h and implemented in the C library:
Call to Library on Executing the Command "ls" (Part of Output)
The idea is to get in the way of the call to the library that contains the function and modify it (hooking the call). One of the possibilities among those described is to use the environment variable LD_PRELOAD. To do this, a new library is created that defines its own strlen function, and LD_PRELOAD is used to make sure that the executable program finds this definition before it reaches the original library:
Program Defining a Fake strlen Function
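The original listing is not reproduced here, so what follows is only a hypothetical reconstruction of what a file such as mi_ls.c might contain, not the author's code. As an assumption made for illustration, this version merely announces once on standard error that the call was intercepted and then returns the correct length; a real rootkit would alter the result instead.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical replacement for strlen, meant to be preloaded.
 * It reports the interception once and then behaves like the original. */
size_t strlen(const char *s)
{
    static const char banner[] = "[mi_ls.so] strlen intercepted\n";
    static int announced = 0;
    /* volatile keeps an optimising compiler from turning the loop below
     * back into a call to strlen, which would recurse into this function. */
    const volatile char *p = s;
    size_t n = 0;

    if (!announced) {
        ssize_t ignored;

        announced = 1;
        /* write() is used rather than stdio, which may itself call strlen. */
        ignored = write(STDERR_FILENO, banner, sizeof banner - 1);
        (void)ignored;
    }
    while (p[n] != '\0')
        n++;
    return n;
}
```

Whether a particular ls build actually reaches strlen through the dynamic linker depends on how it was compiled, so the banner may not appear on every system.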
This is compiled as a dynamic library:
gcc -shared -fPIC -Wall -o mi_ls.so mi_ls.c
The effectiveness of the hook can be seen immediately when the ls command is executed, as it is clear that the library preloaded through LD_PRELOAD has been loaded:
Effect of Intercepting a Dynamic Library and Replacing the Function strlen
The call to the library invoking strlen has been intercepted and the output of the ls command has been modified.
To conclude, such rootkits are very easily implemented, but have the downside that they are relatively simple to detect with basic checks: verifying the integrity of binaries (hash values), inspecting the environment variables that change the library search path, and monitoring symbolic links, paths and the configuration of dynamic libraries, among others.
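One of the basic checks just mentioned, looking at the preload configuration, can be sketched as follows. This is a simplified illustration rather than a complete detector, and all three function names are hypothetical. It reports whether LD_PRELOAD is set in the current environment or whether /etc/ld.so.preload exists with some content, either of which deserves a closer look on a system where no preloading is expected.

```c
#include <stdlib.h>
#include <sys/stat.h>

/* Returns 1 if a preload variable value is present and not empty. */
int preload_value_is_set(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* Returns 1 if `path` exists and has a size greater than zero. */
int file_is_nonempty(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 && st.st_size > 0;
}

/* Returns 1 if either preload mechanism appears to be configured. */
int preload_hooks_configured(void)
{
    return preload_value_is_set(getenv("LD_PRELOAD")) ||
           file_is_nonempty("/etc/ld.so.preload");
}
```

A positive result is not proof of compromise, since preloading has legitimate uses; it only indicates that the listed libraries should be examined. A check running inside an already hooked process could also be deceived, which is why integrity checks are best run from trusted media.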
Moreover, the number of binaries to be modified, and the need for specific compilation for each version of the kernel, pose a considerable problem to overcome in programming this sort of rootkit. When the technique involves attacking dynamically linked binaries, the preloading of dynamic libraries is limited by security measures in the kernel and the dynamic loader that prevent such libraries from being loaded if they pose a security problem, for instance when the target program runs with raised privileges (set-user-ID or set-group-ID programs).
All this has caused rootkits in user space to become obsolete compared with more advanced varieties that take advantage of kernel space.
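The restriction on preloading for privileged programs can be observed from inside a process. On Linux the kernel passes an AT_SECURE flag in the auxiliary vector, and when it is non-zero the dynamic loader runs in secure-execution mode and ignores or restricts variables such as LD_PRELOAD. The sketch below simply reads that flag; the function name running_in_secure_mode is hypothetical.

```c
#include <elf.h>
#include <sys/auxv.h>

/* Returns 1 if the kernel marked this process for secure execution
 * (for example a set-user-ID program), 0 otherwise. */
int running_in_secure_mode(void)
{
    return getauxval(AT_SECURE) != 0;
}
```

An ordinary unprivileged program is expected to see 0 here, which is exactly the case in which a preloaded library would be honoured.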
Rootkits in Kernel Space
Without any doubt, the most attractive aim for an attacker, once a system has been accessed and super-user privileges obtained, is to install a rootkit in kernel mode and thus ensure total control of the system together with better concealment. Once it is integrated into the system, detecting it becomes much more complicated, because it operates at a higher level of privilege, which allows it to modify not just the binaries themselves but the very functions and calls of the operating system.
For this purpose, the design of such rootkits must be much more elaborate, since they work in kernel space and interact with fundamental system functions and calls. Hence, any defect in their operation might trigger a failure in the kernel with fatal consequences.
Bootkits
Bootkits go a step further, adding boot-time functions to rootkits and affecting the firmware of systems and the boot sectors of disks. In this way, the intrinsic difficulty of detecting them is compounded by their resistance to removal. This is the most advanced and persistent kind of rootkit.
One recent example is Thunderstrike, a bootkit for MacBooks that modifies the Extensible Firmware Interface (EFI) boot firmware, making it persist even after re-installation of the operating system, and indeed after disk replacement. It was presented by the researcher Trammell Hudson at the Chaos Communication Congress held in Germany in December 2014.