Monitoring of security in industry and critical infrastructure is essential for conducting business. The following well-known statement attributed to Lord Kelvin, an important British physicist and mathematician, describes a situation that is perfectly applicable to industrial control systems:
What is not defined cannot be measured. What is not measured cannot be improved. What is not improved always gets worse.
The monitoring of security in environments with SCADA systems differs from the monitoring of IT systems for the following reasons:
- Events tend to be grouped into very few alerts that may be significant for the operator.
- The monitoring cannot hinder the first requirement of these systems: availability. As such, monitoring must not be intrusive.
- Each alert makes the operator responsible for a situation that they must understand and act on accordingly.
- The choice of metrics and definition of indicators must be in line with the business’ fundamental requirements: the availability and continuity of operation.
Metrics and indicators are mechanisms used to monitor a system. There are different standards or best practices that establish and support them:
- In MAGERIT, the calculation of risk provides an indicator that supports decisions through explicit comparison with a defined risk threshold. The risk value reflects the relationship between vulnerability and impact and, therefore, between assets and threats.
- CERT-RMM (CERT Resilience Management Model) is a model that establishes indicators to measure the maturity of the resilience of critical systems. These indicators are a specific representation of capability levels.
- In the field of critical infrastructure there are various aggregate indicators, some of which relate to security. An example is the matrix developed by the Center for Security Studies in Zurich. The UK's CPNI also describes best practices for defining metrics in the area of change management.
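As a rough illustration of the MAGERIT-style comparison against a risk threshold mentioned above, the following sketch scores asset/threat pairs and flags those above the threshold. The formula (impact multiplied by likelihood), the scales, the threshold value and every name are assumptions made for illustration; MAGERIT defines its own scales and procedures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatScenario:
    asset: str         # hypothetical asset name
    threat: str        # hypothetical threat name
    impact: float      # assumed scale: 0 (none) to 10 (critical)
    likelihood: float  # assumed scale: 0.0 (never) to 1.0 (certain)


def risk(scenario: ThreatScenario) -> float:
    """Risk as impact weighted by likelihood (an illustrative formula)."""
    return scenario.impact * scenario.likelihood


def exceeds_threshold(scenario: ThreatScenario, risk_threshold: float) -> bool:
    """The indicator: True when the scenario calls for a treatment decision."""
    return risk(scenario) > risk_threshold


scenarios = [
    ThreatScenario("historian server", "malware infection", impact=6, likelihood=0.5),
    ThreatScenario("PLC network", "misconfigured firewall rule", impact=9, likelihood=0.5),
]
flagged = [s.asset for s in scenarios if exceeds_threshold(s, risk_threshold=4.0)]
print(flagged)
```

The value of the explicit threshold is that the decision becomes repeatable: two analysts looking at the same scores reach the same conclusion.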
Monitoring and data collection
The monitoring process broadly follows these steps:
- Collection of information and data
Correct collection of information is key to having quality data for the subsequent analysis. These data are normally collected by dedicated sensors, implemented in hardware or software. One commonly used method is port mirroring, also known as a SPAN port. With this mechanism, a copy of the traffic passing through one or more switch ports is sent to a dedicated port, where it is inspected. However, port mirroring can overload the network elements and delay packet transmission, so in industrial networks it is advisable to use network TAP devices instead: they allow traffic to be examined in a completely passive manner, without placing extra load on network elements that perform other tasks.
An example of architecture with hardware sensors
To supplement the information obtained from the network traffic it is necessary to collect all data from the operating system and other events generated by the rest of the network’s security systems, such as intrusion detection systems, firewalls, etc.
The collection of information is very important but it is necessary to bear in mind that for monitoring to be effective the system must be capable of processing all of the information collected. If this is not the case, packets or information revealing anomalous or malicious behaviour may be lost.
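The capacity problem described above can be made visible with a simple counter. The sketch below is only a toy model, assuming a fixed-size buffer between a sensor and the analysis stage; the class name, the capacity and the record format are hypothetical.

```python
from collections import deque


class BoundedCollector:
    """Fixed-capacity buffer between a sensor and the analysis stage."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffer: deque = deque()
        self.dropped = 0  # records lost because the analyser fell behind

    def offer(self, record: str) -> bool:
        """Store a record, or count it as dropped when the buffer is full."""
        if len(self.buffer) >= self.capacity:
            self.dropped += 1
            return False
        self.buffer.append(record)
        return True

    def drain(self, max_items: int) -> list:
        """Hand up to max_items records to the analysis stage."""
        count = min(max_items, len(self.buffer))
        return [self.buffer.popleft() for _ in range(count)]


collector = BoundedCollector(capacity=3)
for i in range(5):  # a burst of 5 records arrives before anything is drained
    collector.offer(f"packet-{i}")
processed = collector.drain(max_items=10)
print(processed, collector.dropped)
```

Tracking the number of dropped records as a metric in its own right tells the operator when the monitoring system itself can no longer be trusted to have seen everything.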
The collection is, therefore, an indispensable element for subsequently analysing all of the information and determining whether the state observed is normal or anomalous and, in the latter case, activating an alarm mechanism.
There are various detection methods, but they can essentially be grouped into signature-based and anomaly-based mechanisms.
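To make the difference between the two families concrete, here is a minimal sketch of each. The signature list, the baseline traffic rates and the z-score limit are illustrative assumptions, not values taken from any real product.

```python
from statistics import mean, stdev

# Hypothetical signatures: substrings that mark known-bad payloads.
SIGNATURES = ("cmd.exe", "DROP TABLE")


def signature_match(payload: str) -> bool:
    """Signature-based detection: flag payloads containing a known pattern."""
    return any(sig in payload for sig in SIGNATURES)


def is_anomalous(value: float, baseline: list, max_z: float = 3.0) -> bool:
    """Anomaly-based detection: flag values far from the learned baseline."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > max_z


# Assumed baseline: packets per second observed during normal operation.
baseline_rate = [100, 102, 98, 101, 99, 100, 103, 97]

print(signature_match("GET /index.html"))
print(signature_match("run cmd.exe /c whoami"))
print(is_anomalous(101, baseline_rate))
print(is_anomalous(250, baseline_rate))
```

Signature matching only recognises what is already known, whereas the baseline comparison can flag unknown behaviour at the cost of false positives when normal operation changes.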
Monitoring in the industrial environment
Industrial control systems are increasingly being integrated into other networks, allowing flexible exchange of data and improved remote operability. Remote locations can be connected to corporate facilities or offices and these, in turn, can be connected to the central control system.
The connections between industrial control systems and other networks help to improve productivity but, on the other hand, they can introduce elements that are susceptible to vulnerabilities.
For example, vulnerabilities associated with network segmentation, such as configuration errors, could allow intrusions into the system or anomalies in traffic routing. These vulnerabilities can be exploited, intentionally or unintentionally, by individuals or groups inside or outside the organisation, and they therefore pose a risk to it.
Industrial control system processes may provide real-time notifications of potential and apparent problems; however, both the dependence on commercial off-the-shelf (COTS) systems and the move towards standardised connections (TCP/IP) may expose them to malicious cyber intrusions.
As with the IT sector, to achieve an acceptable level of security in industrial control systems, it is necessary to measure the state of cybersecurity and compare the current level of security with appropriate reference points.
Monitoring strategies: Concepts, metrics, indicators and management
Monitoring relies on four key concepts: events, metrics, indicators and alerts.
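One way to picture how these four concepts relate is as a small pipeline: events are raw observations, a metric is computed from them, an indicator judges the metric against a reference value, and an alert is raised only when the operator needs to act. These working definitions, the event kinds, the reference value and all names in the sketch below are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    source: str  # hypothetical device name
    kind: str    # hypothetical event kind: "login_ok" or "login_failed"


def failed_access_pct(events: list) -> float:
    """Metric: percentage of access attempts that failed."""
    attempts = [e for e in events if e.kind in ("login_ok", "login_failed")]
    if not attempts:
        return 0.0
    failed = sum(1 for e in attempts if e.kind == "login_failed")
    return 100.0 * failed / len(attempts)


def evaluate_indicator(metric_value: float, reference: float) -> str:
    """Indicator: the metric judged against an agreed reference value."""
    return "degraded" if metric_value > reference else "normal"


def raise_alert(state: str, metric_value: float) -> Optional[str]:
    """Alert: a message for the operator, only when action is needed."""
    if state == "normal":
        return None
    return f"Failed access rate at {metric_value:.1f}%: review access control"


events = [
    Event("hmi-1", "login_ok"),
    Event("hmi-1", "login_ok"),
    Event("eng-ws", "login_failed"),
    Event("hmi-2", "login_ok"),
]
value = failed_access_pct(events)
state = evaluate_indicator(value, reference=5.0)
print(raise_alert(state, value))
```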
The identification of all of these concepts in industrial control systems may become difficult. The US initiative I3P has carried out various studies on metrics that help to define the criteria through which to measure the state of cybersecurity.
The metrics must measure the values of an asset that, following a risk analysis, is considered to be potentially vulnerable and, therefore, exploitable.
As a reference example, the US organisation SANDIA, which specialises in matters of technological security, establishes and structures metrics in the following categories:
- Mission: metrics that ensure the mission of the system and company. Examples:
  - Availability: % of business continuity.
    - Indicator: 88% in the regional electricity sector.
- Process: assists in achieving the mission’s objectives. Examples:
  - Access control: % of erroneous access to semi-critical systems.
    - Indicator: 0.3%
- Control: Its measurement constitutes a specific attribute. It is part of the process. Example:
  - Encryption strength: cryptographic power in line with the system’s performance.
As this structure shows, metrics are easiest to obtain from the bottom up, that is, from control to mission. Control metrics are relatively simple to define and obtain because they focus on objective data from a system, or part of a system, that can be obtained almost directly.
As metrics are defined for higher levels, events from different sources must be combined and the accuracy of the metrics improved. This helps with the governance of the system.
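The bottom-up construction can be sketched with the availability metric: a simple per-source measurement is obtained first and then combined across sources into a higher-level figure. The source names, the one-day observation period and the uptime values are hypothetical, and weighting by observed time is only one possible way to combine sources.

```python
def availability_pct(up_minutes: float, total_minutes: float) -> float:
    """Low-level metric for one source: share of time it was available."""
    return 100.0 * up_minutes / total_minutes


def mission_availability_pct(sources: dict) -> float:
    """Higher-level metric: availability over all sources combined.

    sources maps a source name to a (up_minutes, total_minutes) tuple.
    """
    total_up = sum(up for up, _ in sources.values())
    total = sum(total for _, total in sources.values())
    return 100.0 * total_up / total


# Assumed data for one day (1440 minutes) at two hypothetical sites.
one_day = {
    "substation-a": (1440, 1440),
    "substation-b": (1080, 1440),
}
per_source = {
    name: availability_pct(up, total) for name, (up, total) in one_day.items()
}
print(per_source)
print(mission_availability_pct(one_day))
```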
Various mechanisms (heuristics, evolutionary computation, fuzzy logic, etc.) can be applied to correlate events and provide a meaningful view, so that the operator can evaluate the state of the system at any given moment.
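As a minimal example of the simplest of these mechanisms, a heuristic, the sketch below correlates events by asset and time: an asset is flagged only when at least two different sources report on it within a short window. The event format, the 60-second window and the two-source rule are assumptions chosen for illustration; evolutionary or fuzzy approaches would replace this fixed rule with a learned or graded one.

```python
from collections import defaultdict


def correlate(events: list, window_s: int = 60, min_sources: int = 2) -> list:
    """Return assets reported by several distinct sources close in time.

    events is a list of (timestamp_s, source, asset) tuples.
    """
    by_asset = defaultdict(list)
    for ts, source, asset in events:
        by_asset[asset].append((ts, source))

    flagged = []
    for asset, items in by_asset.items():
        items.sort()
        # Slide a window starting at each event and count distinct sources.
        for i, (start, _) in enumerate(items):
            sources = {src for ts, src in items[i:] if ts - start <= window_s}
            if len(sources) >= min_sources:
                flagged.append(asset)
                break
    return sorted(flagged)


events = [
    (0, "firewall", "plc-1"),
    (30, "ids", "plc-1"),       # second source within 60 s: correlated
    (10, "firewall", "hmi-2"),
    (20, "firewall", "hmi-2"),  # same source again: not enough
    (500, "ids", "hmi-2"),      # second source, but far outside the window
]
print(correlate(events))
```

Five raw events are reduced to a single item for the operator, which is the kind of grouping into very few significant alerts that industrial environments require.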
Widely used metrics and indicators
The following table shows some essential indicators for industrial processes according to SANDIA’s ASRM structure:
Management and use of alerts
It is essential to coordinate the deployment of security alerts with the individuals in charge of operation in the control centre, particularly at a major operator or in critical infrastructure. The deployment of a new alert involves new responsibilities that the operator must understand completely. The operator must understand that the monitoring system may be susceptible to new impacts, and they must therefore know both the prevention plans and the contingency plans in the event of a threat or attack.
The deployment of a new alert is in itself an extra responsibility for the operator, who must look out for the alert. As such, many operators are not receptive to new alerts, since their responsibility is expanding and, moreover, these alerts usually provide information about aspects that are not well-known by them.
This aspect is essential to bear in mind and it must be accepted by the whole organisation, from directors to engineers (and its impact must also be known by the relevant insurers).
Monitoring solutions: SIEMs in industry and their impact
SIEMs (security information and event management systems) are very common monitoring solutions in IT systems and are now aiming to gain a foothold in the industrial field. Until very recently, the introduction of security monitoring elements has been relatively intrusive in industrial environments, since they generate a lot of noise in the elements of the environment and, as such, affect the essential requirement of availability.
In a typical industrial network segmentation, as explained in the article The evolution of network systems in control systems, SIEMs could only monitor elements with high processing capacity, since in deeper segments these solutions may consume too many resources.
As such, and in order not to interfere with the system’s functioning, monitoring must be passive or, if active, it must extract event information from copies obtained passively from its elements.
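As an illustration of extracting events from a passively obtained copy, the sketch below decodes frames that are assumed to have already been captured (for example through a TAP) and reports only write operations. Modbus/TCP is used purely as an example protocol and is not named in this article; the frames are hand-built samples and the function names are hypothetical. Nothing is ever sent to the monitored devices.

```python
import struct

# Modbus function codes that write to the device: write single coil (5),
# write single register (6), write multiple coils (15) and registers (16).
WRITE_FUNCTION_CODES = {5, 6, 15, 16}


def parse_modbus_tcp(frame: bytes):
    """Decode the MBAP header and function code of one Modbus/TCP frame.

    Returns None when the frame is too short or is not Modbus
    (the protocol identifier must be 0).
    """
    if len(frame) < 8:
        return None
    transaction_id, protocol_id, length, unit_id, function_code = struct.unpack(
        ">HHHBB", frame[:8]
    )
    if protocol_id != 0:
        return None
    return {
        "transaction_id": transaction_id,
        "unit_id": unit_id,
        "function_code": function_code,
    }


def write_events(frames: list) -> list:
    """Keep only the frames that attempt to change a value on a device."""
    parsed = (parse_modbus_tcp(f) for f in frames)
    return [p for p in parsed if p and p["function_code"] in WRITE_FUNCTION_CODES]


captured = [
    bytes.fromhex("000100000006010300000002"),  # read holding registers
    bytes.fromhex("000200000006010600010003"),  # write single register
]
print(write_events(captured))
```

Because the analysis runs entirely on the copy, its processing cost falls on the monitoring host rather than on the controllers.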
Current SIEMs from the IT world tend to adapt these strategies to industrial paradigms.
Monitoring: better secure…
One of the best-known cases, and one with the greatest impact on the industrial world, was Stuxnet. Stuxnet was a multi-vector worm that took advantage of several zero-day vulnerabilities and marked a turning point in awareness of the importance of security in industrial control systems. One of the vectors used in the attack consisted of modifying a process value that was part of a metric. The real value was altered, but an apparently correct value was still sent to the control centre, where it fed the related indicator. An appropriate monitoring system that managed these events could have allowed the threat to be identified quickly and accurately.
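A check of the kind that could expose such a discrepancy is sketched below: the values reported to the control centre are compared with values observed independently, for instance from a passive network copy. The tag names, readings and tolerance are hypothetical, and the sketch assumes that an independent observation exists, which was not necessarily the case in the Stuxnet incident.

```python
def find_mismatches(reported: dict, observed: dict, tolerance: float) -> list:
    """Return tags whose reported value differs from the independent one.

    reported: values as shown to the control centre, keyed by tag.
    observed: values obtained independently, keyed by tag.
    """
    mismatches = []
    for tag, reported_value in reported.items():
        if tag not in observed:
            continue  # no independent reading, so nothing to compare
        if abs(reported_value - observed[tag]) > tolerance:
            mismatches.append(tag)
    return sorted(mismatches)


reported = {"TT-101": 70.0, "TT-102": 65.0}
observed = {"TT-101": 70.4, "TT-102": 93.0}
print(find_mismatches(reported, observed, tolerance=2.0))
```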
There is no doubt that security monitoring will become a flagship product, fed by countless adapters and aggregators, which will make it an essential tool, and not only for cybersecurity.