The Simple Cyber Governance Program OT security incident management process

This material contains proprietary information and is copyrighted by Langner, Inc. Using the content of this document in production environments, in the process of planning or designing operations technology systems and architectures, in product development or any other commercial activity requires a commercial SCGP license. A commercial SCGP license may also allow you to modify SCGP documents and share them individually, apart from the full SCGP document set. You can also download a full copy of the Simple Cyber Governance Program documents in PDF format for a thorough evaluation.
Copyrighted Material

© 2018 Langner, Inc.

Content

0      Introduction

0.1       Scope and Intended Audience

0.2       The Role of Cyber Security Incident Management in SCGP

0.3       Understanding OT Security Incident Management

0.4       Revision Notes

1      Cyber Security Incident Response Capability

1.1       Cyber Security Incident Response Personnel and Relevant External Parties

1.2       Technical Cyber Security Incident Management Requirements

1.3       Cyber Security Incident Response Prerequisites

1.4       Training and Exercises

2      Cyber Security Incident Detection and Assessment

2.1       Cyber Security Incident Identification, Validation and Assessment

2.2       Cyber Security Incident Prioritization

2.3       Cyber Security Incident Notification

2.4       Mobilization of Response Forces

3      Cyber Security Incident Response

3.1       Predictive Analysis of Potential Incident Response Side Effects

3.2       Cyber Security Incident Containment

3.3       Eradication and Recovery

4      Post-Incident Procedures

4.1       Post-Recovery Notification

4.2       In-Depth Forensic Analysis

4.3       Cyber Security Incident Response Review and Documentation

5      Cyber Security Incident Management Using the OT-BASE Asset Management System

5.1       Extending the OT-BASE User Database for Incident Management

5.2       Storing Contingency Plans and Backup Files

5.3       Detecting and Reporting Cyber Security Incidents

5.4       Assessing an Incident

5.5       Cyber Security Incident Response

5.6       Closing or Canceling an Incident

 

0    Introduction

0.1        Scope and Intended Audience

Cyber Security Incident Management in SCGP is the organized capability to minimize the adverse effects of potential and actual cyber security incidents in the most efficient way possible by applying predetermined procedures coordinated and executed by a competent incident response team.

A cyber security incident in the context of SCGP is defined as a threatened or actual violation of authorized OT system or network configuration and usage. Cyber security incidents do not necessarily have a malicious cause. The majority of cyber-physical incidents on record appear to have their roots in accidental circumstances or in people acting with the best intentions but in violation of policy (for example, reconfiguring OT systems without authorization). Random equipment failure, however, is not regarded as a security incident because a random event does not violate anything.

This document is written for cyber security incident responders. In industrial environments, the respective stakeholders extend far beyond OT and IT experts, including, for example, operators, physical security, and emergency response forces.

0.2        The Role of Cyber Security Incident Management in SCGP

Along with Workforce and Contractor Management and Asset and Configuration Management, Cyber Security Incident Management is one of the major activity groups in SCGP. Unlike the former two, where activities are executed on an almost continuous basis, Incident Management is event driven. As the name implies, it is triggered by cyber security incidents, and in the absence of such incidents, little needs to be done. If the activities within Workforce and Contractor Management and Asset and Configuration Management are executed thoroughly, the chance of incidents occurring is minimized. Nevertheless, an organization needs to be prepared for what is bound to happen over time with increasing digital complexity. Cyber Security Incident Management activities may also be required by regulation. Finally, unlike other SCGP activities, incident response capability requires 24/7 availability and therefore places higher resource requirements on the organization.

Where Cyber Security Incident Management is not required by regulation, the organization may choose to focus its initial SCGP implementation efforts on Workforce and Contractor Management and Asset and Configuration Management. Doing so develops a solid cyber security capability, which is also a prerequisite for establishing effective Incident Management. Conversely, focusing on Cyber Security Incident Management in order to save the effort required for the other two activity areas, on the assumption that a good incident response capability makes up for lacking capabilities elsewhere, is a dangerous misconception.

0.3        Understanding OT Security Incident Management

0.3.1        Security Incident Management Objectives

  • Verify and understand security incidents
  • Minimize the potential safety impacts of security incidents
  • Make operators aware of potential consequences of security incidents
  • Minimize the business impact of security incidents
  • Coordinate with all forces that might be required to mitigate the incident, including external forces
  • Prevent future security incidents based on lessons learned from incidents
  • Prosecute illegal activity
  • Improve overall OT security and security incident response.

0.3.2        Specific Challenges for OT Security Incident Management

Incident Management in the OT space has different characteristics when compared to office IT:

  • Physical consequences cannot be remedied by restoring from backup tapes.
  • A simple reboot or disconnect from the network may not be possible during production.
  • Cyber incidents in OT may have safety impacts.
  • The more serious security incidents cannot be assessed and mitigated without the help of operators and engineers.
  • Cyber-physical events occur in real time and may therefore require immediate response by operators and incident managers alike in order to prevent or minimize physical harm.
  • Even more than in IT, flawed incident response may make matters worse.

0.3.3        Structure of OT Security Incident Response Activities

OT incident response activities are structured in the following groups which are also reflected in the various chapters of this document:

  1. Develop and maintain a cyber security incident response capability. Without such a capability, there is no basis for the following procedures.
  2. Detect and assess cyber security incidents. If an anomaly, suspected or reported incident is not an actual cyber security incident, go to 1.
  3. Respond to cyber security incidents.
  4. Perform post-incident procedures and go to 1.
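
The cycle above can be read as a small state machine. The following is a minimal sketch of that reading; the names `Phase` and `next_phase` are illustrative assumptions and not part of SCGP.

```python
from enum import Enum


class Phase(Enum):
    CAPABILITY = 1     # 1. develop and maintain response capability
    DETECTION = 2      # 2. detect and assess incidents
    RESPONSE = 3       # 3. respond to incidents
    POST_INCIDENT = 4  # 4. post-incident procedures


def next_phase(current: Phase, is_actual_incident: bool = True) -> Phase:
    """Return the phase that follows `current` in the cycle described above."""
    if current is Phase.CAPABILITY:
        return Phase.DETECTION
    if current is Phase.DETECTION:
        # An anomaly that is not an actual incident returns to step 1.
        return Phase.RESPONSE if is_actual_incident else Phase.CAPABILITY
    if current is Phase.RESPONSE:
        return Phase.POST_INCIDENT
    # Post-incident procedures always lead back to step 1.
    return Phase.CAPABILITY
```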

0.4        Revision Notes

0.4.1        Changes from RIPE-17 to SCGP-18

Changed product name from RIPE to SCGP

Changed title from Incident Management to Cyber Security Incident Management

Changed copyright details

Changed incident management to cyber security incident management throughout the document

Added a procedure to check for other systems with the same vulnerability in section 2.1.6

Added a chapter on cyber security incident management using the OT-BASE Asset Management System

0.4.2        Changes from RIPE-16 to RIPE-17

Replaced myRIPE with OT-BASE (new product name)

Deleted paragraph on secure file sharing in section 1.3.5

0.4.3        Changes from RIPE-15 to RIPE-16

New document in RIPE 16

1    Cyber Security Incident Response Capability

1.1        Cyber Security Incident Response Personnel and Relevant External Parties

Cyber security incident response is a team effort including a variety of stakeholders for all but the most trivial incidents. An important part of cyber incident response capability is that all of those stakeholders are identified and prepared to contribute to Cyber Security Incident Management according to their roles and responsibilities. The following sections identify the major stakeholders in incident response.

1.1.1        Incident Response Team

OT Support Center: The core incident response team for a specific case is always recruited from the OT Support Center. The OT Support Center is responsible for managing and supervising incident response activities, and acting as first responders.

Decentral SCGP Support Group: On-site members of the SCGP Support Group at the affected facility (see SCGP Workforce Management) support the OT Support Center in responding to cyber incidents.

Engineering: In the OT domain, engineering expertise is required to assess the impact of a cyber incident and its mitigation. It is recommended to name at least one senior engineer who acts as a permanent member of the central incident response team.

Operations: For severe cyber incidents, operations will need to participate in incident response. It is recommended to name at least one senior operator who acts as a permanent member of the central incident response team.

1.1.2        Relevant Other Internal Parties

Management: Severe cyber security incidents require management decisions such as when and how to inform the public, or about the need to escalate incident response by allocating more resources or informing law enforcement.

Physical security: Some security incident response procedures require the help of physical security personnel.

IT department / IT security: Security incidents relating to network intrusions or malware infections may be addressed more effectively with the help of the IT department or IT security experts. There may also be a protocol in place that triggers specific action on the IT side if malicious activity is discovered in control systems and networks.

Human resources department: Cyber security incidents that involve staff members or contractors may require the assistance of the HR department.

Public relations department: Severe cyber security incidents or incidents that are already recognized by the public may call for media briefings. The public relations department has the responsibility to prepare and execute such briefings.

Legal department: Malicious security incidents may raise the question of involving law enforcement and/or litigation. Such activities are handled by the legal department.

1.1.3        External Parties

External incident response experts: Most asset owners will not be able to develop and maintain high skill sets for cyber-physical incident analysis and response internally, making the help of specialized experts essential. As is customary in the IT domain, it is recommended to contract appropriate service providers on a retainer basis to assure timely help in the event of a severe cyber incident that exceeds the organization’s capabilities.

Vendors and contractors: Incident response may require technical assistance from vendors and contractors. It is recommended to identify a focal point for incident response at the organization’s major vendors and contractors.

Internet service providers: Malicious traffic that hits the organization from the Internet (e.g. at remote access points) may require the assistance of the respective Internet service provider.

Network and IT service providers: Where network and IT services are outsourced, it is important to assure timely help from the respective service providers in the event that incident response requires re-configuration of network gear or servers.

Government entities: Cyber incidents may require notification of government entities such as regulators or CERTs.

Law enforcement: Malicious cyber attacks may call for the help of law enforcement.

Media: The organization’s PR department should identify a set of relevant media outlets and contact information for the event that a cyber incident requires notification of the public.

Peers: Large-scale cyber incidents will most likely affect peers as well. A network of relevant peers along with contact information is helpful for coordinated incident response and media strategy.

Customers: Cyber security incidents may affect customers in various ways, for example in the form of delayed shipments or impaired product quality. Customers with an online connection to the organization may also play a role as potential sources of malicious network traffic, and may themselves become infected by malware spreading from the organization.

Suppliers: Just like customers, suppliers with an online connection to the organization may play a role as potential sources of malicious network traffic, and as potentially becoming infected by malware spreading from the organization.

1.2        Technical Cyber Security Incident Management Requirements

Cyber security incidents in the OT domain cannot be managed professionally just by using word of mouth or pen and paper. They require a modern software application similar to what is standard in IT environments. The respective software application should support the following functions:

  • Logging of cyber security incidents
  • Assignment and identification of roles and responsibilities for incident management
  • Notification of incident response stakeholders
  • Workflow for cyber security incident management
  • Reporting capabilities about cyber incidents.

1.3        Cyber Security Incident Response Prerequisites

1.3.1        Reporting Channels

Almost every person on the plant floor may observe or suspect a cyber security incident. This includes contractors. To facilitate easy reporting of suspected or observed cyber incidents, a dedicated list of contact persons with contact information should be communicated.

1.3.2        Contractual Prerequisites

More complex cyber security incidents will require the help of external experts. Starting to identify and contract such experts only after an incident is detected might delay incident response by weeks. It is therefore recommended to contract external experts in advance under a retainer agreement that assures short-term availability when an incident occurs.

1.3.3        Useful System Model and Technical Documentation

Incident response capability is severely limited if incident responders don’t have useful technical documentation in the form of a complete and accurate System Inventory along with network and data flow diagrams. The situation would be similar to physical security trying to chase down attackers who have already infiltrated the facility, without having floor plans or intimate knowledge of rooms, stairways, and elevators.

In SCGP, the recommended way to give all members of the incident response team access to vital system documentation is to grant them access to the OT-BASE OT Management System, where all relevant configuration and network data is accessible online.

1.3.4        Contingency Plans

Procedures for restoring compromised or otherwise dysfunctional OT systems can and should be developed ahead of time, since the major types of system disruptions are predictable. A contingency plan outlining a detailed procedure on how to restore an impaired system should exist for critical systems.

The worst time to develop a contingency plan is after the fact, when incident responders may have to deal with a host of other problems and under time pressure. It is therefore recommended to develop and store contingency plans for critical systems during times of normal operation. In a perfect world, such contingency plans are part of the system delivery or commissioning and are developed by vendors and integrators. The SCGP System Procurement guideline contains respective criteria for system procurement.

It should be noted that contingency plans are not just needed for Cyber Security Incident Management but also for unscheduled maintenance in case systems become impaired for reasons that are not related to cyber security, for example due to equipment failure. The recommended practice is that OT maintenance and incident responders access the same library of OT contingency plans, maintained by the OT Support Center.

1.3.5        Secure Communications

Managing some more delicate incidents will require the exchange of sensitive material with outside parties such as external experts. Therefore, the capability to securely communicate with such parties should be established before the need to use it arises.

1.3.6        Access Rights to De-central Facilities and Systems

All but the most trivial cyber incidents cannot be addressed properly by sitting behind a desk and communicating with other members of the incident response team by email. On-site visits will be necessary. Such on-site visits imply physical and logical access rights that might take weeks to arrange, especially if they include requirements for specific safety trainings, security clearances etc. It is therefore recommended to assure in advance that all necessary access rights are available for the incident response team and its potential external experts.

1.3.7        Designated Meeting Facilities

Every on-site incident response team needs a base camp, but meeting rooms can be hard to find. Therefore, an arrangement should be in place that the incident response team has priority for allocating suitable physical meeting space.

1.3.8        List of Cleared Analytic and Forensic Equipment

During an incident investigation the incident response team will need to use special tools such as network sniffers, cameras etc. which might not be allowed in the facility during normal operations. It is recommended to assure the required access rights and procedures beforehand so that no undue delay occurs when the tools are needed urgently.

1.3.9        Command Protocol

To some extent incident response may require decision making capabilities that conflict with existing policies. For example, the incident response team may decide to disconnect systems from networks, to shut down links to outside parties, to command assistance from physical security, or to access personal data, etc.

To assure optimum performance of the incident response team, it is recommended to define a specific command protocol that lists the various authorities of the incident response team and the expected cooperation of other stakeholders. An incident response team that has no authority to command technical changes of system and network configuration, or to direct operators and physical security, will hardly be effective.

1.4        Training and Exercises

Until now, sophisticated real-world cyber-physical attacks have occurred with very low frequency. As a result, an incident response team will have little hands-on experience in responding to such incidents. Training and drills therefore become an important part of cyber incident preparedness, just as they are commonplace in physical security.

The SCGP Training Curriculum contains exercises and training modules for incident responders, including tabletop exercises for management.

2    Cyber Security Incident Detection and Assessment

2.1        Cyber Security Incident Identification, Validation and Assessment

The purpose of this activity is to analyze incident indicators with respect to what is relevant for developing a response strategy, not to start a detailed forensic analysis (which may take too long to finish and thus delay incident response).

The analysis shall be documented and shared among the incident response team. Incident analysis may be updated as incident response progresses.

2.1.1        Incident Logging

Procedure: Create a new entry in the incident log.

The log entry should specify the identity of the person that creates the incident log entry, date and time, and a brief description of the incident’s symptoms.
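
A log entry with the three required items could be represented as follows. This is a minimal sketch only; the class name `IncidentLogEntry`, its field names, and the helper `new_log_entry` are assumptions made for illustration and do not reflect any particular incident management product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class IncidentLogEntry:
    created_by: str  # identity of the person creating the entry
    symptoms: str    # brief description of the incident's symptoms
    # Date and time of entry creation, recorded in UTC (an assumption).
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def new_log_entry(created_by: str, symptoms: str) -> IncidentLogEntry:
    """Create a log entry, rejecting entries that lack a required item."""
    if not created_by.strip() or not symptoms.strip():
        raise ValueError("created_by and symptoms are both required")
    return IncidentLogEntry(created_by.strip(), symptoms.strip())
```

Making the entry immutable (`frozen=True`) is a design choice of this sketch: later findings would be appended as new records rather than overwriting the initial report.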

2.1.2        Incident Status

Procedure: Specify if the incident is threatened, suspected, reported, or observed.

  • Threatened cyber security incidents are announcements by a group or individual of the intent to misuse systems or data of the organization for malicious purposes. Threat actors are usually found among hacktivists, organized crime, disgruntled insiders, and even nation states or their proxies.
  • Suspected cyber security incidents are a set of indicators (such as “funny” behavior of digital components) that cannot be explained by general hardware or software problems and thus may point to a cyber incident. The task of the OT Support Center then is to determine if those indicators point to an actual cyber incident or to something else, such as equipment failure.
  • Reported cyber security incidents and events are observations by users that may indicate a cyber incident and call for further investigation by the OT Support Center.
  • Observed cyber security incidents are positively identified incidents that must be followed up by incident response.

2.1.3        Incident Cause

Procedure: Specify if the cause of the incident is malicious or not.

If the cause of the security incident is malicious:

  • Specify if the attack is ongoing or done
  • Specify if the attack is autonomous or controlled (by insider or from the outside)
  • Specify if the attack is blended with a physical attack (either preparatory or actual)

2.1.4        Incident Category

Procedure: Specify the type of the incident by associating it with postulated cyber incident categories.

Postulated incident categories are:

  • Malicious breach of software configuration integrity
  • Presence of unauthorized, apparently non-malicious software
  • Infection with known malware
  • Security-relevant system misconfiguration (e.g. deactivated AV, deactivated Whitelisting, default passwords on firewalls etc.)
  • Stolen or otherwise missing equipment (laptops, desktops, network gear etc.)
  • Physical destruction or unauthorized configuration change of digital components or cabling (vandalism)
  • Unauthorized hardware in the control network.
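
The classification steps in sections 2.1.2 to 2.1.4 lend themselves to a fixed vocabulary. The sketch below shows one possible encoding; all type and member names are illustrative assumptions, and treating only observed incidents as mandatory response cases is this sketch's reading of section 2.1.2.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    THREATENED = "threatened"
    SUSPECTED = "suspected"
    REPORTED = "reported"
    OBSERVED = "observed"


class Category(Enum):
    CONFIG_INTEGRITY_BREACH = "malicious breach of software configuration integrity"
    UNAUTHORIZED_SOFTWARE = "unauthorized, apparently non-malicious software"
    KNOWN_MALWARE = "infection with known malware"
    MISCONFIGURATION = "security-relevant system misconfiguration"
    MISSING_EQUIPMENT = "stolen or otherwise missing equipment"
    PHYSICAL_TAMPERING = "physical destruction or unauthorized change"
    UNAUTHORIZED_HARDWARE = "unauthorized hardware in the control network"


@dataclass
class AttackDetails:
    ongoing: bool                # False means the attack is done
    controlled: bool             # False means the attack is autonomous
    blended_with_physical: bool  # preparatory or actual physical attack


@dataclass
class Classification:
    status: Status
    category: Category
    attack: Optional[AttackDetails] = None  # set only for malicious causes

    @property
    def malicious(self) -> bool:
        return self.attack is not None

    def requires_response(self) -> bool:
        # Observed incidents must be followed up by incident response.
        return self.status is Status.OBSERVED
```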

2.1.5        Identification of Affected Systems

Procedure: Produce a list of positively identified compromised systems, along with location, device ID, operating system, IP address, network association, device category.

Procedure: Produce a list of systems depending on the compromised system that are or may become impaired in functionality or reliability.

2.1.6        Identification of Systems at Risk

Cyber security incidents are often characterized by dynamic developments such as rapid network flooding and spreading across network zones. It is therefore not enough to identify systems that are positively compromised; systems that are at risk of becoming compromised due to the same cause or source must be identified as well. A listing of systems at risk should be produced by predictive analysis. The OT-BASE OT Management System makes this procedure straightforward and quick.

Procedure: If a known vulnerability was exploited, check which other systems are affected by the same vulnerability, and assess if they are or may become affected by the incident.

Procedure: Identify similar systems (using the same application software, operating system version etc.), both in the location compromised and in other locations, and assess if they are or may become affected by the incident.

Procedure: Identify systems in the same network zone and assess if they are or may become affected by the incident.

Procedure: Identify systems at the same location (cabinet, room etc.) and assess if they are or may become affected by the incident.

Procedure: Identify systems commissioned/maintained by the same person or company, both in the location compromised and in other locations, and assess if they are or may become affected by the incident.

Procedure: Identify systems that are or can become affected by the compromised systems based on legitimate use cases and trust chains.
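
Most of the procedures above amount to filtering the System Inventory by attributes shared with the compromised system. The sketch below illustrates that idea on an in-memory list; the `Asset` fields and the function `systems_at_risk` are hypothetical and do not represent the OT-BASE data model or API. "Similar system" is reduced here to an identical operating system version, and the last procedure (use cases and trust chains) is not covered because it requires data flow information.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Asset:
    device_id: str
    location: str       # cabinet, room etc.
    network_zone: str
    os_version: str
    maintainer: str     # person or company that commissioned/maintains it
    vulnerabilities: FrozenSet[str] = field(default_factory=frozenset)


def systems_at_risk(
    compromised: Asset,
    inventory: List[Asset],
    exploited_vuln: Optional[str] = None,
) -> Dict[str, List[str]]:
    """Map device IDs to the reasons they share exposure with `compromised`."""
    at_risk: Dict[str, List[str]] = {}
    for asset in inventory:
        if asset.device_id == compromised.device_id:
            continue
        reasons = []
        if exploited_vuln and exploited_vuln in asset.vulnerabilities:
            reasons.append("same vulnerability")
        if asset.os_version == compromised.os_version:
            reasons.append("similar system")
        if asset.network_zone == compromised.network_zone:
            reasons.append("same network zone")
        if asset.location == compromised.location:
            reasons.append("same location")
        if asset.maintainer == compromised.maintainer:
            reasons.append("same maintainer")
        if reasons:
            at_risk[asset.device_id] = reasons
    return at_risk
```

The result is a candidate list only: each listed system still has to be assessed as to whether it is or may become affected by the incident.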

2.2        Cyber Security Incident Prioritization

The next step in the incident response procedure is to assess the cost of the incident’s consequences. For anything but the most trivial incidents, this usually cannot be done by the OT Support Center alone because it requires operational and engineering expertise. For this reason, the incident response team needs to consult with engineers and operators.

2.2.1        Impact Severity

Procedure: Assess the observed or expected impact severity of the incident with respect to the following attributes:

  • Physical impact: Specify if a physical impact is observed or expected with respect to a) process functionality, b) potential damage of equipment, c) product integrity, d) environmental damage.
  • Safety impact: Specify if a safety impact of the incident can be ruled out.
  • Operational impact: Specify if an operational impact (loss of view, loss of control, malicious control) is observed or expected.
  • Business impact: Specify the observed or expected business impact of the incident on the scale minor/relevant/major/critical.

As with other incident characteristics, this assessment may be updated during the course of progressing incident analysis.

2.2.2        Urgency of Response

Procedure: Specify the urgency of incident response based on the assessments and analytics of the previous steps, allowing for the possibility that some questions may remain unanswered, on the following scale:

  • emergency: all incident response forces must be allocated immediately
  • urgent: incident response must begin within eight hours
  • regular: incident response must begin within one week
  • routine: incident response can be delayed until a scheduled maintenance event, such as an operational outage
  • discretionary: incident response can be delayed until workload permits responding to the incident
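
The urgency scale translates directly into the latest time at which response must begin. The following is a minimal sketch; the names `Urgency` and `response_deadline` are illustrative assumptions, and levels without a fixed time limit (routine, discretionary) yield no deadline.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Urgency(Enum):
    EMERGENCY = "emergency"
    URGENT = "urgent"
    REGULAR = "regular"
    ROUTINE = "routine"
    DISCRETIONARY = "discretionary"


# Maximum delay before response must begin, per the scale above.
_MAX_DELAY = {
    Urgency.EMERGENCY: timedelta(0),     # immediately
    Urgency.URGENT: timedelta(hours=8),
    Urgency.REGULAR: timedelta(weeks=1),
}


def response_deadline(urgency: Urgency, assessed_at: datetime) -> Optional[datetime]:
    """Return the latest start time for response, or None if not time-bound."""
    delay = _MAX_DELAY.get(urgency)
    return None if delay is None else assessed_at + delay
```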

2.2.3        Required Resources to Respond to the Incident

Procedure: Specify if existing decision making authority is sufficient or if incident response decisions must be escalated in the chain of command.

Procedure: Specify if help outside of the OT Support Center (but inside the company) is required, such as physical security, legal department, HR department etc. to appropriately respond to the incident.

Procedure: Specify if external help (external experts, law enforcement, civil defense etc.) is required to appropriately respond to the incident.

2.3        Cyber Security Incident Notification

Procedure: Based on the assessments and analytics of the preceding steps, notify stakeholders, observing any needs for confidentiality and secure communications:

  • Internal entities: System owners, IT department, management, operations, engineering, physical security, HR department, stakeholders at other sites which might be affected by the incident.
  • External entities: Regulator or other relevant government entity, contractors, vendors, external experts, law enforcement, customers, suppliers.

Notifications should include all available details of the incident and the nature of the requested help.

2.4        Mobilization of Response Forces

Many cyber security incidents cannot be addressed properly by sitting behind a desk and communicating with other members of the incident response team by email. Physical presence is often necessary for complex incidents, which implies the need for travel for central and external incident responders.

Procedure: Based on assessments and analytics of the previous steps:

  • Specify which stakeholders should be mobilized for on-site assistance
  • Schedule first response on-site activities
  • Coordinate scheduling with external parties and on-site staff.

3    Cyber Security Incident Response

3.1        Predictive Analysis of Potential Incident Response Side Effects

Cyber security incident response often requires significant modification, reconfiguration or even replacement of systems and components and may therefore have a significant impact on the process. A flawed incident response strategy can even make things worse. Therefore, technical modifications must be planned and assessed carefully before implementation. This step will usually require close cooperation of engineering and operations.

Procedure: Determine the impact if compromised systems are isolated from the network or other means of digital communication (e.g. fieldbus, serial point-to-point).

Procedure: Determine the impact if compromised systems are shut down and rebooted.

Procedure: Determine the ability to manually operate and control the affected systems.

Procedure: Determine if a process outage is required for incident response. An example would be an incident that has a safety impact, or an impact on emergency preparedness systems.

Procedure: Based on the assessment reflecting the steps listed above, arrive at an incident response strategy that is communicated to and scheduled with relevant stakeholders, especially with engineering and operations.

3.2        Cyber Security Incident Containment

3.2.1        Perimeter Lockdown

If the incident involves self-replicating malware or is ongoing and controlled by external threat actors:

Procedure:

  • Notify respective stakeholders about imminent perimeter lockdown, time permitting.
  • Shut down legitimate remote access links (such as VPN) to external parties such as vendors and contractors if possible without causing unacceptable effects on the process.
  • Shut down legitimate remote access links (such as VPN) to internal parties such as other sites.
  • Shut down conduits to business networks.

If the attack is ongoing and controlled by external threat actors but does not exploit legitimate remote access links:

Procedure:

  • Search for and disable any rogue WLAN or radio access points.
  • Search for and disable any rogue dial-in modems.
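
The two conditions above can be expressed as a simple selection of actions. This is a sketch under one reading of the text, namely that both procedures apply when an ongoing, externally controlled attack does not exploit legitimate remote access links; the function name `lockdown_actions` and its parameters are hypothetical.

```python
from typing import List


def lockdown_actions(
    self_replicating_malware: bool,
    ongoing_external_control: bool,
    exploits_legitimate_remote_access: bool,
) -> List[str]:
    """Return the perimeter lockdown steps that apply to the incident."""
    actions: List[str] = []
    if self_replicating_malware or ongoing_external_control:
        actions += [
            "notify stakeholders of imminent lockdown (time permitting)",
            "shut down remote access links to external parties "
            "(if process effects are acceptable)",
            "shut down remote access links to internal parties",
            "shut down conduits to business networks",
        ]
    if ongoing_external_control and not exploits_legitimate_remote_access:
        actions += [
            "search for and disable rogue WLAN or radio access points",
            "search for and disable rogue dial-in modems",
        ]
    return actions
```

The function only selects steps. Whether a given link can be shut down without unacceptable effects on the process remains a decision for the incident response team together with engineering and operations.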

3.2.2        Suspension of On-Site Contractor Activity

During severe cyber attacks, on-site external personnel maintaining OT systems create a risk of interfering with incident response activities and should therefore be ordered to leave the facility until the incident is contained.

An exception to the following procedures may apply when on-site external OT personnel have already been identified as part of the incident response team and are needed for the incident at hand.

Procedure:

  • Notify physical security of temporary cancellation of external engineering access to respective site areas.
  • Suspend all activity by external parties working on site on digital configuration and maintenance (external engineers).
  • Disconnect all external devices and media from OT systems and networks and store them for further investigation until they can be ruled out as a factor in the incident.
  • Log the identities of external personnel during the time of incident.

3.2.3        Isolate Affected Systems

Procedure: Isolate affected systems from the network if possible without causing unacceptable effects on the process.

3.3        Eradication and Recovery

Eradication and recovery should follow the procedures below, which are broken down by postulated cyber incident category.

3.3.1        Malicious Breach of Software Configuration Integrity

Incident category examples: Targeted and semi-targeted cyber attacks that often show characteristics of Advanced Persistent Threats.

Procedure:

  • Retain the compromised configuration and associated forensic evidence (either by replacing the compromised systems with spares or by producing a disk and memory dump of the compromised systems)
  • Assure that backups are not compromised
  • Restore authorized software configuration by executing the contingency plans for the affected systems
  • Test the restored systems
  • Put the restored systems back online and test them in the networked environment
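
One way to support the backup check in the procedure above is to compare each backup file against a cryptographic digest that was recorded when the backup was taken. The following Python sketch illustrates the idea. It assumes that such reference digests exist and are stored separately from the backups; the function names are hypothetical and not part of SCGP or any product.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches_reference(path: Path, reference_digest: str) -> bool:
    """True if the backup's digest equals the digest recorded at backup time.

    Assumption: reference_digest was recorded when the backup was created
    and is kept in a location the attacker could not modify.
    """
    return sha256_of_file(path) == reference_digest.strip().lower()
```

A matching digest only shows that the file has not changed since the digest was recorded. It does not show that the system was uncompromised when the backup was taken, so the age of the backup relative to the suspected start of the attack still needs to be considered.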

3.3.2        Presence of Unauthorized Software, Apparently Non-Malicious

Incident category examples: Unauthorized software applications such as media players installed by legitimate system users.

Procedure:

  • Determine the source of the unauthorized software
  • If it is not clear whether the unauthorized software is malicious, retain a copy for forensic purposes
  • Remove the unauthorized software
  • Assure that the authorized configuration is restored
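
Checking that the authorized configuration is restored can be reduced to comparing the software found on a system against its authorized baseline. The Python sketch below shows this as a set comparison. The product names and the way the two lists are obtained are assumptions for illustration only.

```python
def compare_to_baseline(installed, authorized):
    """Compare installed software against the authorized baseline.

    Both arguments are iterables of product names (hypothetical format).
    Returns a pair: (unauthorized extras, authorized items that are missing).
    """
    installed_set = set(installed)
    authorized_set = set(authorized)
    unauthorized = sorted(installed_set - authorized_set)
    missing = sorted(authorized_set - installed_set)
    return unauthorized, missing


def configuration_restored(installed, authorized):
    """True if nothing unauthorized remains and nothing authorized is missing."""
    unauthorized, missing = compare_to_baseline(installed, authorized)
    return not unauthorized and not missing
```

For example, a workstation with an authorized baseline of an HMI runtime and an engineering tool, on which a media player is also found, would yield the media player as the only unauthorized item and an empty list of missing items.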

3.3.3        Infection with Known Malware

Incident category examples: Infection with botnet clients such as Conficker, often by phishing email or USB stick.

Procedure:

  • Determine the source of the infection (email, USB stick, network)
  • Apply the procedure recommended by the antivirus vendor to eradicate the malware
  • Where practical, apply any security patches that mitigate the vulnerability exploited by the malware, both on the affected devices and on devices that may become affected
  • Restore the system configuration from backups if required, executing the contingency plans for the affected systems, after verifying that the backup isn’t compromised
  • Test if the system functions properly
  • Put the system back online
  • Review policies, procedures, and system architecture

3.3.4        Security-Relevant System Misconfiguration

Incident category examples: Deactivated antivirus software, deactivated whitelisting, default passwords on firewalls, etc.

Procedure:

  • Identify the cause of the misconfiguration
  • Restore the authorized configuration
  • Review access rights for the affected location and systems
  • Change passwords
  • Analyze the effect of the security-relevant system misconfiguration

3.3.5        Stolen or Otherwise Missing Equipment

Incident category examples: Stolen monitors, network gear, or laptops.

Procedure:

  • Replace missing equipment with spare parts
  • Restore the original configuration by executing the contingency plans for the affected systems
  • Update the System Inventory to reflect new serial numbers etc.
  • Review physical access rights for the affected location
  • Notify law enforcement if appropriate

3.3.6        Physical Destruction of Digital Components or Cabling

Incident category examples: Vandalism, such as intentionally cut network cables.

Procedure:

  • Replace damaged equipment with spare parts
  • Restore the original configuration by executing the contingency plans for the affected systems
  • Update the System Inventory to reflect new serial numbers etc.
  • Review physical access rights for the affected location

3.3.7        Unauthorized Hardware in the Control Network

Incident category examples: Computers, PLCs, wireless access points etc. identified by network scan, roaming signal scan, or walkdown inspection.

Procedure:

  • Confirm that the hardware component is unauthorized (e.g. by checking the System Inventory)
  • Isolate hardware from the network
  • Investigate the source of the hardware
  • Store unauthorized hardware for forensic analysis or for reclaim by legitimate owner
  • Analyze the effect of the unauthorized hardware on other systems, e.g. breach of configuration integrity

4    Post-Incident Procedures

The following procedures are executed after the cyber security incident is no longer a factor and normal operation has been resumed.

4.1        Post-Recovery Notification

Procedure: Notify all stakeholders that the incident is remedied.

4.2        In-Depth Forensic Analysis

If necessary or useful, an in-depth forensic analysis can be conducted to better understand incident characteristics that were not required for coordinating and executing incident response. In-depth forensic analysis will most likely involve external experts, and possibly law enforcement. Since in-depth forensic analysis can take a long time, it is recommended to start the effort only after the incident is no longer a factor.

4.3        Cyber Security Incident Response Review and Documentation

A major part of post-incident procedures is to document the incident and the efforts taken to remedy it, along with lessons learned and an assessment of the cost caused by the incident and incident response. The review should address the following questions:

  • What would have prevented the incident? (Example: Presence of appropriate security patches, more rigid network segregation)
  • Could the incident have been predicted? (Example: Presence of software products and configurations with known vulnerabilities — see SCGP Vulnerability Management)
  • Which procedural and configuration changes are helpful to prevent incidents like this?
  • What would have made incident response more effective?

The documentation should be produced as a brief written report that then enters the annual performance evaluation and improvement activities of SCGP.

5    Cyber Security Incident Management Using the OT-BASE Asset Management System

5.1        Extending the OT-BASE User Database for Incident Management

Incident responders don’t necessarily play a role in daily plant operations, especially when they are external parties such as consultants or experts working for a law enforcement agency or regulatory body. Nevertheless, you want their contact details and areas of responsibility readily available when needed. The OT-BASE workforce database is a good place to store information about these individuals. You may also grant incident responders access to OT-BASE, which can help external experts assess the incident and plan and execute a mitigation strategy.

5.2        Storing Contingency Plans and Backup Files

One way to have contingency plans and backup files ready is to store them in OT-BASE along with the respective device, product, or network data. Since you want to be as efficient as possible here, check whether a generic contingency plan can be created for a specific product. If it can, store the document file as an attachment for that product in the Inventory/Software/Products or Inventory/Hardware/Products area. In this area, you could also store master images for software products and firmware.

For anything device-specific, use the device entries. They are also a good place to store backup images for individual devices. For example, project files for a PLC can be attached to the device profile along with a description of the replacement procedure.

5.3        Detecting and Reporting Cyber Security Incidents

5.3.1        Auto-detection of cyber security incidents

OT-BASE can automatically detect cyber security incidents that are characterized by unauthorized configuration changes, provided that OT-BASE configuration collectors are used and the organization follows the change management procedure mandated in SCGP. Whenever OT-BASE detects a configuration change on an OT device for which no change case exists, the change is automatically flagged as a cyber security incident.
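
The rule described above (a configuration change without a matching change case is treated as an incident) can be illustrated with a short Python sketch. This is not OT-BASE code and does not reflect its internal data model. The record types, field names, and the idea that a change case covers a time window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ConfigChange:          # hypothetical record of a detected change
    device_id: str
    detected_at: datetime
    summary: str


@dataclass(frozen=True)
class ChangeCase:            # hypothetical approved change case
    device_id: str
    valid_from: datetime     # assumed: change cases cover a time window
    valid_until: datetime


def is_authorized(change, cases):
    """True if some change case covers this device at the time of the change."""
    return any(
        case.device_id == change.device_id
        and case.valid_from <= change.detected_at <= case.valid_until
        for case in cases
    )


def flag_incidents(changes, cases):
    """Return the detected changes that no change case accounts for."""
    return [change for change in changes if not is_authorized(change, cases)]
```

The sketch makes the dependency on change management visible: if change cases are not opened consistently, every legitimate change would be flagged, and the flag would lose its value as an incident indicator.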

Another type of cyber security incident that is automatically detected by OT-BASE is the presence of “unknown” devices on the network.
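
Conceptually, detecting unknown devices means comparing what is observed on the network with what is recorded in the inventory. The Python sketch below shows this comparison using MAC addresses as the identifier. The choice of identifier and the input format are assumptions; the sketch does not describe how OT-BASE performs the detection.

```python
def normalize_mac(mac: str) -> str:
    """Normalize a MAC address to lowercase, colon-separated form."""
    return mac.strip().lower().replace("-", ":")


def unknown_devices(observed, inventory):
    """Return observed MAC addresses that do not appear in the inventory.

    observed:  MAC addresses seen on the network (hypothetical input)
    inventory: MAC addresses recorded in the system inventory
    """
    known = {normalize_mac(mac) for mac in inventory}
    seen = {normalize_mac(mac) for mac in observed}
    return sorted(seen - known)
```

Any address returned by such a comparison is a candidate for the procedure for unauthorized hardware in the control network, starting with confirmation against the System Inventory.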

5.3.2        Reporting a cyber security incident

OT-BASE gives users the opportunity to report suspected or observed cyber security incidents by using a simple online dialog in the Workflow/Incidents section. The user can enter a short description of the incident, and log any number of comments in the incident log.

5.3.3        Specifying affected devices

To specify affected devices, the user can choose from the list of all devices for which he or she is responsible, according to the user settings in the workforce management. Based on the selection of affected devices, the stakeholders section is updated automatically, so that the user and everybody else in incident management can see whom to contact if need be.

5.4        Assessing an Incident

5.4.1        Specifying incident characteristics

The next tab in the incident dialog allows the user to define incident characteristics. Note that these characteristics may be overwritten in the process of incident response by an expert incident responder.

Based on the settings, a priority is calculated automatically by OT-BASE.

5.4.2        Exploring systems at risk (1): Systems with similar characteristics, and dependent systems

The tab “Devices at risk” gives you a quick overview of systems that share similar properties such as location, network, or application software. It also shows devices that are dependent on the devices affected by the incident.

Exploring systems at risk may result in modifying the assessment of the incident, for example in elevating urgency if a large number of systems, or systems with critical process/safety function, are at risk.
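
The selection logic behind such an overview can be sketched in a few lines of Python. The device model below (name, location, network, software, dependencies) is a hypothetical simplification, and the matching rules are assumptions; the sketch is meant to show the idea, not the way OT-BASE implements the tab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:                          # hypothetical, simplified device record
    name: str
    location: str
    network: str
    software: frozenset = frozenset()
    depends_on: frozenset = frozenset()  # names of devices this one relies on


def devices_at_risk(affected, inventory):
    """Map each at-risk device name to the reasons it was selected."""
    affected_names = {device.name for device in affected}
    at_risk = {}
    for device in inventory:
        if device.name in affected_names:
            continue  # already part of the incident
        reasons = []
        for hit in affected:
            if device.location == hit.location:
                reasons.append(f"same location as {hit.name}")
            if device.network == hit.network:
                reasons.append(f"same network as {hit.name}")
            if device.software & hit.software:
                reasons.append(f"shares software with {hit.name}")
            if hit.name in device.depends_on:
                reasons.append(f"depends on {hit.name}")
        if reasons:
            at_risk[device.name] = reasons
    return at_risk
```

Keeping the reasons alongside each device helps with the reassessment described above, because a device that depends on an affected controller may warrant a different urgency than one that merely shares a location with it.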

5.4.3        Exploring systems at risk (2): Systems with the same vulnerability

If an incident has been shown to be caused by an exploit of a known vulnerability, other systems at risk can be identified easily. You can check quickly which systems are affected by a given CVE by simply inputting the CVE ID in the quick search field on top of the window (right next to the OT-BASE product name). The CVE profile lists devices affected by the vulnerability, and also shows if the vulnerability for each device is fixed or not.
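
The kind of lookup described above can be pictured as a filter over per-device vulnerability records. The following Python sketch uses a hypothetical record type and placeholder CVE IDs; it does not represent the OT-BASE data model or its search function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VulnerabilityRecord:   # hypothetical per-device vulnerability entry
    device: str
    cve_id: str
    fixed: bool


def devices_affected_by(cve_id, records):
    """Return {device name: fixed?} for all devices listed under a CVE ID."""
    wanted = cve_id.strip().upper()
    return {
        record.device: record.fixed
        for record in records
        if record.cve_id.upper() == wanted
    }


def still_exposed(cve_id, records):
    """Return the devices for which the vulnerability is not yet fixed."""
    affected = devices_affected_by(cve_id, records)
    return sorted(name for name, fixed in affected.items() if not fixed)
```

In an incident, the devices that are still exposed are the natural first candidates for patching or isolation.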

5.5        Cyber Security Incident Response

Responding to a cyber security incident will often involve a configuration change of the affected devices, and of devices at risk. Change cases for such devices should be opened in the Workflow/Changes area.

5.6        Closing or Canceling an Incident

Once an incident is remediated, or remediation is no longer necessary, the incident can be closed or canceled by using the appropriate buttons in the edit incident dialog box.