1. Technology has improved but is far from perfect
In order to check which known vulnerabilities affect your installed base, you first need to know exactly what you actually have: Which software products and versions, security patches, which hardware models, firmware versions, and so on.
If you have a solid OT asset inventory in place, such as OT-BASE, this is pretty much taken care of.
You could now easily identify which of your systems are affected by a new vulnerability that you have just learned about from an ICS-CERT advisory, the media, or other sources. However, a purely anecdotal approach to vulnerability management doesn’t get you very far.
What you really want is a higher degree of automation, where your OT asset management system informs you right away about any and all new vulnerabilities that have been discovered for your systems. You want automatic links to the CVE database.
While this sounds easy in theory, it’s difficult in practice. The reason is a conceptual one. The US Government has put a lot of thought into automating the vulnerability management process, but the implementation falls short of expectations. Here is why.
In theory, you match a given product, let’s say a specific operating system version, with its known vulnerabilities (CVEs) by attaching a Common Platform Enumeration name, or CPE, to it. A CPE is a structured identifier, somewhat like a URL, that links products with CVEs. Every CVE includes a CPE list to identify affected products. Unfortunately, you don’t get the CPE from the vendor, and you sure as hell cannot simply read the CPE from the product itself.
So where does the CPE come from? Either from yourself, from a software algorithm, or from a service provider. Contrary to what you might think, generating the “right” CPE from the product information that you have is anything but a simple table lookup. It’s fuzzy matching at best. And there is no way that a heuristic algorithm could verify that the CPE it has generated is actually correct; human intervention is required for validation.
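To make the matching problem concrete, here is a minimal Python sketch of a heuristic that proposes a CPE for an inventory entry. The three dictionary entries, the function names, and the 0.9 review threshold are assumptions made up for illustration; a real implementation would work against the full official CPE dictionary and would be considerably more involved. The result carries a needs_review flag because a high similarity score does not prove that the match is correct.

```python
from difflib import SequenceMatcher

# Illustrative entries in CPE 2.3 formatted-string notation.
# They are hand-written for this sketch, not copied from the official dictionary.
CPE_DICTIONARY = [
    "cpe:2.3:o:microsoft:windows_7:-:sp1:*:*:*:*:*:*",
    "cpe:2.3:o:microsoft:windows_10:1809:*:*:*:*:*:*:*",
    "cpe:2.3:o:microsoft:windows_xp:-:sp3:*:*:*:*:*:*",
]


def normalize(text):
    """Lowercase, turn underscores into spaces, collapse whitespace."""
    return " ".join(text.lower().replace("_", " ").split())


def cpe_label(cpe):
    """Build a comparable label from vendor, product, version and update.

    Simplification: splits on ':' and ignores escaped colons.
    """
    parts = cpe.split(":")
    vendor, product, version, update = parts[3:7]
    tokens = [vendor, product]
    tokens += [p for p in (version, update) if p not in ("*", "-")]
    return normalize(" ".join(tokens))


def suggest_cpe(inventory_name, dictionary=CPE_DICTIONARY, review_threshold=0.9):
    """Return the most similar CPE plus a flag that asks for human validation."""
    query = normalize(inventory_name)
    scored = [
        (SequenceMatcher(None, query, cpe_label(cpe)).ratio(), cpe)
        for cpe in dictionary
    ]
    score, best = max(scored)
    return {
        "cpe": best,
        "score": round(score, 2),
        "needs_review": score < review_threshold,
    }


if __name__ == "__main__":
    # The product name as the inventory reports it rarely matches the CPE wording.
    print(suggest_cpe("Microsoft Windows 7 SP1"))
    print(suggest_cpe("Windows 7 Professional Service Pack 1"))
```

The second call shows the typical case: the inventory reports a perfectly reasonable product name, yet the heuristic can only offer a low-confidence guess that a human has to confirm or correct.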
The remedy that vendors (including ourselves) have chosen to address this problem is a curation process where humans check the validity of CPEs and apply manual corrections if needed. Even that is not guaranteed to deliver 100% accuracy, and without such accuracy, users will have to deal with both false positives and false negatives. (For a more in-depth elaboration of the problem, check out this paper.)
2. You won’t get anywhere without a well-defined mitigation process
For the sake of our discussion, let’s assume that all CPE/CVE validity problems are solved and you are looking at hundreds of known vulnerabilities for your installed base. Now what?
Unlike in IT, you can’t get rid of these vulnerabilities by rolling out patches automatically and being done before lunch. You can’t reboot industrial control systems just like that. Many of these systems run 24/7, and nobody in their right mind would interrupt production in order to install a security patch or new firmware version.
And it gets worse. In OT, every single patch or firmware update needs to be checked for compatibility with installed software applications and libraries. Installing a patch or new firmware is a configuration change that can have negative side effects, including downtime. In many engineering departments, such a configuration change is subject to a change management process, and rightly so.
Checking patch compatibility can be more difficult than you might think, because even in 2019, vendors haven’t yet adopted a standardized format for publishing patch approvals. Some use their web site, others send out PDF documents by email, yet others simply tell you to not install any patch if you don’t want to lose warranty.
You need a well-defined process to regularly identify CVE urgency, patch compatibility, and time to patch. As another part of this mitigation process, you need to define procedures for patch testing, rollout, rollback, and audit.
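As a rough illustration of the first part of such a process, the following Python sketch ranks open findings by the three criteria just mentioned. The Finding fields, the ordering rule, and the sample data are assumptions for this example and are not taken from IEC 62443-2-3 or any other standard; the CVE identifiers are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    asset: str
    cve: str
    cvss: float            # severity score, 0.0 to 10.0
    exposed: bool          # reachable from outside the process network
    patch_approved: bool   # vendor has approved the patch for this configuration
    next_window_days: int  # days until the next maintenance window


def triage(findings):
    """Split findings into patchable and blocked, and rank the patchable ones.

    Assumed ranking rule: exposed systems first, then higher severity,
    then the earliest maintenance window.
    """
    actionable = [f for f in findings if f.patch_approved]
    blocked = [f for f in findings if not f.patch_approved]
    actionable.sort(key=lambda f: (not f.exposed, -f.cvss, f.next_window_days))
    return actionable, blocked


if __name__ == "__main__":
    findings = [
        Finding("plc-01", "CVE-0000-0001", 7.5, False, True, 90),
        Finding("jump-01", "CVE-0000-0002", 6.1, True, True, 14),
        Finding("hmi-03", "CVE-0000-0003", 9.8, False, False, 30),
    ]
    actionable, blocked = triage(findings)
    for f in actionable:
        print("patch:", f.asset, f.cve)
    for f in blocked:
        print("needs compensating control or vendor approval:", f.asset, f.cve)
```

Note that the finding with the highest severity score ends up in the blocked list: without a vendor-approved patch, urgency alone does not make a system patchable.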
You don’t have to define such a process from scratch because standards are available. The most important of these standards is IEC 62443-2-3, from which the image below is taken. We can’t provide a link to the full document because you have to pay for a license. However, the license fee is low (around $100), and it’s money well spent if you want to go for a systematic patch process. As a free alternative, check out this document from DHS, but note that it is a bit outdated (2008).
3. Patching ICS is the most expensive OT security activity under the sun
Once you have the technology to identify vulnerabilities, and a process for dealing with them, the next step is to provide the resources required to execute the process. Start with the question of who is actually supposed to assess patches and firmware updates, test them, and ultimately deploy them. The burden usually lies with the engineering department.
Does your engineering department have enough resources for executing the patch management process? Most likely not, and you would have to be a fool to assume that all that patching and updating could be done on the side. And this is where cost becomes a significant factor.
Unfortunately, vulnerability management is a recurring activity. Every single week, new vulnerabilities are published. It never ends. And with any additional piece of OT equipment that you install, the amount of effort increases proportionally.
When you want to get an idea of the cost involved in patch and firmware management, you need to consider the lifetime of your OT assets. Let’s assume an average of 15 years, and you’re looking at 15 years of patching. Many organizations try to lower cost by patching only twice, or even once, per year. But then you might not actually gain the security improvement that you had hoped for.
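A back-of-the-envelope calculation shows how quickly the effort adds up. The Python sketch below multiplies the factors just discussed; the asset count, the hours per patch cycle, and the function name are assumptions chosen for illustration, not measured values.

```python
def lifetime_patch_hours(assets, cycles_per_year, hours_per_asset_per_cycle,
                         lifetime_years=15):
    """Engineering hours spent on patching over the lifetime of the installed base.

    Assumes the effort scales linearly with the number of assets and patch cycles.
    """
    return assets * cycles_per_year * hours_per_asset_per_cycle * lifetime_years


if __name__ == "__main__":
    # Hypothetical plant: 200 assets, 1.5 hours per asset per patch cycle
    # (assessment, testing, deployment).
    print(lifetime_patch_hours(200, 2, 1.5))   # twice a year -> 9000.0
    print(lifetime_patch_hours(200, 12, 1.5))  # monthly      -> 54000.0
```

Under these assumptions, even the twice-a-year regime consumes 9,000 engineering hours over 15 years, and monthly patching costs six times as much.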
15 years is a long time by IT standards. Within this timeframe you’re guaranteed to run into another problem: End of product support. You can’t patch systems for which patches are no longer available. As an example, support for Windows 7 ends in January 2020. If patching is your security control of choice, you will need to start updating all these Windows 7 boxes rather soon. (Just admit that you were thinking about all your Windows XP and Server 2003 systems right now.)
Will all the applications presently running on those boxes be available for Windows 10? Will any update be free of charge? Will the whole package require more potent hardware? What about the labor involved to actually make it happen for the dozens, hundreds, or thousands of these systems that you have?
The bottom line is, patch execution in OT comes with some rather tough challenges. Being able to identify new vulnerabilities in real time is one thing. Implementing and executing a smooth patching regime is a completely different thing.
4. No matter how much you patch, it won’t make you secure
Sorry to break this to you: Even if all your OT systems were fully patched, and you were always running the latest firmware versions on your PLCs, RTUs, protective relays, frequency converters and so on, you would still remain highly vulnerable.
How can that be?
The dirty little secret known to anyone in the OT security space is: ICS systems, designs, and technologies alike are insecure by design. This does not apply only to “legacy” products; it applies to the latest and greatest as well. We’re talking about vulnerabilities for which there will never be a patch, because they are considered legitimate features of the technology or product architecture.
As an example, consider the fact that all current industrial protocols and products lack basic authentication features. That means that once an intruder has found his way into a process network, the PLCs and actuators in this network can be manipulated comparatively easily without exploiting a single unpatched vulnerability.
Let’s be more specific. If you follow CVE feeds, you may have noticed that PLCs periodically get new CVEs because yet another software vulnerability, such as cross-site scripting, has been discovered in their embedded web server. Exploiting such a vulnerability can result in a DoS situation, among other things.
What the CVE doesn’t tell you is that even after mitigation, that same PLC can be DoSed easily by simply using legitimate product features.
Armed with this knowledge, are you still convinced that it’s worth the effort to install that new firmware version with the HTTP bug fix on all affected PLCs?
Let’s add another goodie. The majority of ICS lack basic network resiliency. They don’t react well to unexpected, oddly formed (though completely standards-compliant) network packets. You don’t need to be a super hacker to DoS a manufacturing plant; an aggressive Nmap scan is usually sufficient.
Talking about super hackers: Even the best of the best exploited only legitimate product features when designing the cyber-physical super weapon of all time, a.k.a. the first version of Stuxnet. It didn’t use a single zero-day, just legitimate product and design features.
5. Pillars of an efficient OT patch strategy
Should we now throw patch and firmware management under the bus? Not at all. The lesson to be learned is that given the poor cost/benefit ratio, you should not be over-eager to patch as many OT systems as possible. Quite the contrary, your goal should be to limit patching and firmware updates to those systems where better options are not available.
This is what we consider a rational OT patch strategy that addresses cost/benefit and feasibility. Here is some concrete guidance on how to make it happen.
Prioritize: Patching holds more benefits for exposed systems, such as those in a DMZ, for jump servers, and rendezvous servers. Since we’re talking mostly about IT systems here, chances are that patching and updating is possible without involving scarce engineering time.
Standardize: Define standard configurations and reference architectures. For example, whenever possible use a common configuration for engineering stations, HMIs, operator stations, and so on. These standard configurations can include security patches and software/firmware versions. In a modern OT asset management system such as OT-BASE, standard configurations can be expressed as baselines which can be audited and updated quickly.
Prevent: You don’t need to patch what you don’t have. In a typical OT environment we find lots of highly vulnerable software packages that have no good reason to be there in the first place. A perfect example is Adobe Flash Player, which is notorious for being plagued with critical vulnerabilities. Uninstall Flash, and all of a sudden you have much less to patch! This approach is known as system hardening, and is supported by the baseline function in OT-BASE.
Compensate: Use compensating mitigation where you can. The top candidate is application whitelisting. Whitelisting software doesn’t remove all those nasty vulnerabilities, but even if an exploit manages to load rogue software on a whitelisted computer, it won’t be executed. Another benefit is that whitelisting also prevents the execution of non-malicious, yet still unauthorized software. Best of all, whitelisting is virtually maintenance-free, depending on the update cycles of your OT software infrastructure.
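The baseline idea behind the Standardize and Prevent items above can be sketched in a few lines of Python. This is a generic illustration, not the OT-BASE baseline function; the package names, the versions, and the function name are made up for the example.

```python
def audit_against_baseline(installed, baseline):
    """Compare installed software (name -> version) with a standard configuration.

    Returns packages that should not be there, packages that are missing,
    and packages whose version differs from the baseline.
    """
    installed_names = set(installed)
    baseline_names = set(baseline)
    return {
        "unauthorized": sorted(installed_names - baseline_names),
        "missing": sorted(baseline_names - installed_names),
        "version_drift": sorted(
            name for name in installed_names & baseline_names
            if installed[name] != baseline[name]
        ),
    }


if __name__ == "__main__":
    # Hypothetical standard configuration for an operator station.
    baseline = {"hmi_runtime": "4.2", "opc_server": "3.1"}
    # What the asset inventory reports for one particular station.
    installed = {"hmi_runtime": "4.1", "opc_server": "3.1", "flash_player": "32.0"}
    print(audit_against_baseline(installed, baseline))
```

Everything in the unauthorized list is a candidate for removal rather than patching, which is exactly the point of hardening: each removed package is one less source of future CVEs.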
Patch and firmware management in OT is such a consequential, heavy-weight activity that it warrants a thorough strategy. Patching OT systems is anything but a no-brainer, due to
– inaccuracies in matching installed product versions with CVEs
– required engineering resources
– long system lifecycles that call for platform updates
– little payoff for systems that are insecure by design.
A systematic patch strategy limits patching and firmware updates to situations where patching offers a better cost/benefit ratio than the alternative strategies. Following the guidelines above, supported by a powerful OT asset management system, will help you establish an efficient OT vulnerability management process.