
Botnet detection via analysis of flow records

Posted on 12/04/2014, by David Cantón (INCIBE)

Another technique used for the detection of botnets is the analysis of network traffic flow records. This technique analyses traffic from a more abstract point of view, without actually inspecting the data itself. Basically, the idea is to characterize communications starting from data such as source and destination IP addresses, communication ports, traffic volume or session duration. With all this data, a flow-record analysis system can detect statistically anomalous behaviour or behaviour patterns that reveal botnets or other types of threats.
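To make the idea of a flow record concrete, here is a minimal sketch that groups packet headers into flows and keeps only metadata. The `Flow` class, the `aggregate_flows` function and the sample addresses are illustrative assumptions, not part of any real tool.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """Metadata summary of one unidirectional conversation (no payload)."""
    first_seen: float
    last_seen: float
    packets: int = 0
    byte_count: int = 0

    @property
    def duration(self) -> float:
        return self.last_seen - self.first_seen


def aggregate_flows(packets):
    """Group packet headers into flows keyed by the usual 5-tuple.

    Each packet is a tuple:
    (timestamp, src_ip, dst_ip, src_port, dst_port, protocol, size_in_bytes)
    """
    flows = {}
    for ts, src, dst, sport, dport, proto, size in packets:
        key = (src, dst, sport, dport, proto)
        flow = flows.get(key)
        if flow is None:
            flow = flows[key] = Flow(first_seen=ts, last_seen=ts)
        flow.first_seen = min(flow.first_seen, ts)
        flow.last_seen = max(flow.last_seen, ts)
        flow.packets += 1
        flow.byte_count += size
    return flows


# Hypothetical packet headers (documentation IP ranges).
sample_packets = [
    (0.0, "10.0.0.5", "203.0.113.7", 49152, 6667, "tcp", 120),
    (1.5, "10.0.0.5", "203.0.113.7", 49152, 6667, "tcp", 80),
    (2.0, "10.0.0.9", "198.51.100.2", 50000, 443, "tcp", 1500),
]
flows = aggregate_flows(sample_packets)
for key, flow in flows.items():
    print(key, flow.packets, flow.byte_count, flow.duration)
```

Only addresses, ports, counters and timestamps survive the aggregation, which is why the stored volume is so much smaller than a full packet capture.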

 


- NetFlow 5 data module (source: Cisco Systems) -

As this technique doesn’t analyse the payload of packets, it reduces the amount of information that must be stored and examined. This avoids problems that affect other techniques, such as:

  • Legal problems: as private information isn’t inspected, there are no conflicts with laws on privacy of communications.
  • Scaling problems: real-time processing of the large volumes of data generated in medium-sized and large networks tends to be a very complex problem that requires considerable resources. As the amount of data being handled is reduced, this problem also diminishes.

However, this type of analysis doesn’t have direct access to evidence of malicious behaviour, given that it doesn’t analyse the data itself. The technique is limited to the analysis of the communications’ metadata, but it is still quite effective at identifying threats in a network.

NetFlow

The network protocol that has become the de facto standard for the analysis of network flow records is NetFlow. NetFlow is an open protocol (RFC 3954) designed by Cisco Systems and subsequently adopted by other manufacturers such as Juniper, Huawei or 3Com/HP. This protocol enables the collection of information about a network’s IP traffic for subsequent analysis. Routers and switches that support it not only forward traffic through the network but also send metadata about that traffic (IPs, ports…) to a specialized collection server. This server carries out the corresponding analysis, looking for anomalous traffic patterns, communications on ports that are in theory filtered, connections to malicious IPs, excessive volumes of a certain type of traffic, sessions that go on for too long… any of which can indicate that a network or some of its members have been compromised.
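As an illustration of what the collection server receives, the following sketch decodes an export datagram. It assumes the fixed NetFlow version 5 layout (a 24-byte header followed by 48-byte flow records); RFC 3954 describes the template-based version 9, which needs a different parser. The function name `parse_netflow_v5` and the sample values are hypothetical.

```python
import socket
import struct

# Assumed NetFlow v5 layout: 24-byte header, then `count` 48-byte records.
HEADER = struct.Struct("!HHIIIIBBH")
RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")


def parse_netflow_v5(datagram: bytes):
    """Return the flow records of one v5 export datagram as dicts."""
    version, count, *_ = HEADER.unpack_from(datagram, 0)
    if version != 5:
        raise ValueError(f"expected NetFlow v5, got version {version}")
    if len(datagram) < HEADER.size + count * RECORD.size:
        raise ValueError("datagram shorter than its record count implies")

    flows = []
    for i in range(count):
        offset = HEADER.size + i * RECORD.size
        (src, dst, _nexthop, _in_if, _out_if, pkts, octets, first, last,
         sport, dport, _tcp_flags, proto, _tos, _src_as, _dst_as,
         _src_mask, _dst_mask) = RECORD.unpack_from(datagram, offset)
        flows.append({
            "src": socket.inet_ntoa(src),
            "dst": socket.inet_ntoa(dst),
            "src_port": sport,
            "dst_port": dport,
            "protocol": proto,
            "packets": pkts,
            "bytes": octets,
            # first/last are exporter uptime values in milliseconds
            "duration_ms": last - first,
        })
    return flows


# Build a synthetic one-record datagram to exercise the parser.
header = HEADER.pack(5, 1, 1000, 1700000000, 0, 1, 0, 0, 0)
record = RECORD.pack(
    socket.inet_aton("10.0.0.5"), socket.inet_aton("203.0.113.7"),
    socket.inet_aton("0.0.0.0"), 1, 2, 10, 1200, 500, 900,
    49152, 6667, 0x18, 6, 0, 0, 0, 24, 24,
)
datagram = header + record
flows = parse_netflow_v5(datagram)
print(flows)
```

In a real deployment the datagrams would arrive over UDP from the exporting routers; the synthetic datagram here only keeps the example self-contained.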

 


- Example of NetFlow architecture (source: Cisco) -

The analysis of network traffic flows can detect both unknown threats, by spotting anomalous behaviour such as that of an unidentified botnet, and previously identified threats, such as known IPs of a botnet’s central C&C.
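The two cases can be sketched side by side. The example below is only an illustration under simple assumptions: known threats are matched against a hypothetical blacklist `KNOWN_CC_IPS`, and unknown threats are approximated by a z-score test on a host's traffic volume, which is just one of many possible anomaly measures.

```python
import statistics

# Hypothetical blacklist of known C&C addresses (documentation IP range).
KNOWN_CC_IPS = {"203.0.113.7"}


def known_threat_hits(flows, blacklist=KNOWN_CC_IPS):
    """Known threats: flows whose destination is a blacklisted address."""
    return [f for f in flows if f["dst"] in blacklist]


def is_anomalous(history, current, threshold=3.0):
    """Unknown threats: flag a value far above the host's own baseline.

    `history` holds past measurements (for example, bytes sent per hour);
    `current` is flagged when it lies more than `threshold` sample
    standard deviations above the historical mean.
    """
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return current > mean
    return (current - mean) / stdev > threshold


flows = [
    {"src": "10.0.0.5", "dst": "203.0.113.7"},
    {"src": "10.0.0.9", "dst": "198.51.100.2"},
]
hits = known_threat_hits(flows)
print("blacklist hits:", hits)

baseline = [100, 110, 95, 105, 90]  # hypothetical KB sent per hour
print("400 anomalous?", is_anomalous(baseline, 400))
print("108 anomalous?", is_anomalous(baseline, 108))
```

The blacklist check needs prior knowledge of the threat, whereas the statistical check needs only a baseline of normal behaviour, at the cost of possible false positives.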

Another advantage of this analysis technique is that the reduced metadata from communications can be stored, which makes it possible to carry out retrospective analyses of threats or attacks, with the aim of studying the system’s response and improving the existing counter-measures. On the other hand, this technique generates considerable traffic overhead, since the routing devices send all the metadata from communications to the server responsible for storing and processing this information.

Below we look at some publications and studies that develop this technique for the detection of botnets:

  • Using machine learning techniques to identify botnet traffic (Livadas, C.; Walsh, R.; Lapsley, D.; Strayer, T.) explained, as far back as 2006, how to use machine learning methods to detect IRC botnet traffic. The proposed system analyses network traffic statistically, first separating IRC traffic from non-IRC traffic and then determining which IRC traffic belongs to a botnet. To do this, it analyses, among other data, the number of bytes per packet and its variance over time.
  • DISCLOSURE: Detecting Botnet Command and Control Servers Through Large-Scale NetFlow Analysis (Bilge, Leyla; Balzarotti, Davide; Robertson, William; Kirda, Engin; Kruegel, Christopher) proposes a system for detecting botnet C&C servers by analysing NetFlow records; the analysis is based on flow sizes, client access patterns and temporal behaviour.


- DISCLOSURE architecture -
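The kinds of features mentioned in these two studies (bytes per packet and their variance, flow sizes, client access patterns, timing) can be derived directly from flow records. The sketch below is not the feature set of either paper; the function `server_features` and its field names are assumptions chosen only to show the general approach.

```python
import statistics
from collections import defaultdict


def server_features(flows):
    """Compute simple per-server statistics from flow records.

    Each flow is a dict with the keys:
    start (seconds), src, dst, bytes, packets.
    """
    by_server = defaultdict(list)
    for flow in flows:
        by_server[flow["dst"]].append(flow)

    features = {}
    for server, group in by_server.items():
        group.sort(key=lambda f: f["start"])
        sizes = [f["bytes"] for f in group]
        gaps = [b["start"] - a["start"] for a, b in zip(group, group[1:])]
        features[server] = {
            # flow-size features
            "mean_flow_bytes": statistics.mean(sizes),
            "flow_bytes_variance": statistics.pvariance(sizes),
            "mean_bytes_per_packet":
                sum(sizes) / sum(f["packets"] for f in group),
            # client access pattern
            "unique_clients": len({f["src"] for f in group}),
            # temporal behaviour (None when there is a single flow)
            "mean_gap_s": statistics.mean(gaps) if gaps else None,
        }
    return features


# Hypothetical flows: three small, evenly spaced flows to one server.
sample = [
    {"start": 0, "src": "10.0.0.5", "dst": "203.0.113.7",
     "bytes": 200, "packets": 2},
    {"start": 60, "src": "10.0.0.6", "dst": "203.0.113.7",
     "bytes": 200, "packets": 2},
    {"start": 120, "src": "10.0.0.5", "dst": "203.0.113.7",
     "bytes": 200, "packets": 2},
    {"start": 30, "src": "10.0.0.9", "dst": "198.51.100.2",
     "bytes": 90000, "packets": 60},
]
features = server_features(sample)
for server, values in features.items():
    print(server, values)
```

Feature vectors like these would then be fed to a classifier; identical flow sizes and regular gaps, as in the first server here, are the sort of machine-like regularity such systems look for.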

  • A real application is the Cisco Cyber Threat Defense Solution. This system analyses the information obtained through NetFlow to identify suspicious network traffic patterns. Besides analysing patterns, it incorporates other methods for detecting threats:
      • Blacklists: detection of botnets by checking the network’s connections against multiple sources of information that list IP addresses of possible threats.
      • Monitoring of very short, periodic, unidirectional connections to destinations outside the monitored network. These types of connections could be a sign of malicious activity.
      • Backtracking: when a new C&C server is discovered, the stored historical archive is searched for past communications with that server.
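Two of these methods, periodic-connection monitoring and backtracking, lend themselves to a short sketch. This is not how the Cisco product works internally; `looks_like_beacon`, `backtrack` and their thresholds are hypothetical, and the periodicity test only checks how regular the gaps between connections are (it does not check duration or direction).

```python
import statistics


def looks_like_beacon(start_times, max_jitter=0.1, min_connections=4):
    """Return True when connections recur at nearly constant intervals.

    Regularity is measured as the coefficient of variation of the gaps
    between consecutive connections (standard deviation / mean).
    """
    if len(start_times) < min_connections:
        return False
    times = sorted(start_times)
    gaps = [b - a for a, b in zip(times, times[1:])]
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        return False
    return statistics.pstdev(gaps) / mean_gap <= max_jitter


def backtrack(history, new_cc_ip):
    """Return internal hosts that contacted `new_cc_ip` in stored flows."""
    return sorted({f["src"] for f in history if f["dst"] == new_cc_ip})


# Hypothetical connection start times, in seconds.
periodic = [0, 300, 601, 899, 1200]   # roughly every five minutes
irregular = [0, 20, 400, 410, 1500]   # human-like, bursty access
print(looks_like_beacon(periodic), looks_like_beacon(irregular))

# Hypothetical historical archive of flow records.
history = [
    {"src": "10.0.0.5", "dst": "203.0.113.7"},
    {"src": "10.0.0.9", "dst": "198.51.100.2"},
    {"src": "10.0.0.6", "dst": "203.0.113.7"},
]
affected = backtrack(history, "203.0.113.7")
print(affected)
```

Backtracking depends entirely on how long the flow archive is retained, which is one practical reason for storing only the reduced metadata.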

This detection technique is, of course, easy to evade if it is the only measure used to detect botnets. Techniques such as Domain Generation Algorithms (DGA) or covert channels are used to avoid detection by making the connections of these threats go unnoticed. Therefore, this technique shouldn’t be the only one used in a system and must be complemented by others such as packet analysis, log analysis or DNS analysis.