In this article written by a vendor of a security monitoring solution, the main argument (often echoed in other publications and at various tradeshows) is that patching OT systems is hard. The author argues that since it is hard, we should turn to other methods to improve security. His theory is to turn on an alerting technology like his and pair the alarms it raises with a security incident response team. In other words, just accept that patching is hard, give up, and pour more money into learning earlier (maybe?) and responding more vigorously.
However, this conclusion (patching is hard so don’t bother) is faulty for a couple of reasons. Not to mention, he also glosses over a very big factor that significantly complicates any response or remediation that should be considered.
The first reason this is dangerous advice is that you simply cannot ignore patching. You must do what you can, when you can, and when patching is not an option, you move to plan B, C, and D. To do nothing means that any cyber-related incident that makes its way into your environment will produce maximum damage. This sounds a lot like the M&M defense from 20 years ago: the idea that your security solution should be hard and crunchy on the outside, but soft and chewy on the inside.
The second reason this advice is faulty is the assumption that OT security staff (for incident response or patching) are easy to find and deploy! This is not true – in fact, one of the ICS security incidents referenced in the article points out that while a patch was available and ready for install, there were no ICS experts available to oversee the patch installation! If we can’t free our security experts to deploy known, pre-event protection as part of a proactive patch management program, why do we think we can find the budget for a full incident response team after the fact when it’s too late?
Finally, the missing piece from this argument is that the challenges of OT/ICS patch management are further exacerbated by the quantity and complexity of assets and architecture in an OT network. To be clear, when a new patch or vulnerability is released, most organizations struggle to understand how many assets are in scope and where they are. But that same level of insight and the same asset profiles will be required of any incident response team for it to be effective. Even if you follow his advice and ignore patching, you cannot avoid building a robust, contextual inventory (the basis for patching!) as a foundation of your OT cyber security program.
So, what should we do? First and foremost, we must try to patch. There are three things that a mature ICS patch management program must include to be successful:
- Real-time, contextual inventory
- Automation of remediation (both patch files and ad-hoc protections)
- Identification and application of compensating controls
Real-time contextual inventory for patch management
Most OT environments use scan-based patching tools like WSUS/SCCM, which are pretty standard but offer little insight into what assets we have and how they are configured. What is really needed are robust asset profiles with their operational context included. What do I mean by this? Asset IP, model, OS, etc. provide only a cursory listing of what might be in scope for the latest patch. More valuable is the operational context, such as asset criticality to safe operations, asset location, and asset owner, which lets us properly contextualize emerging risk, because not all OT assets are created equal. So why not first protect the critical systems, or identify suitable testing systems (that reflect critical field systems), and strategically reduce risk?
And while we are building these asset profiles, we need to include as much information as possible about the assets beyond IP, MAC address, and OS version: software installed, users/accounts, ports, services, registry settings, least privilege controls, AV, whitelisting, backup status, etc. These information sources significantly increase our ability to accurately prioritize and strategize our actions when new risk emerges. Want proof? Take a look at the usual analysis flow offered below. Where will you get the data to answer the questions at the various stages? Tribal knowledge? Instinct? Why not data?
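To make the idea of an asset profile with operational context concrete, here is a minimal sketch in Python. The `AssetProfile` class, its field names, and the sample inventory are all hypothetical illustrations, not the schema of any particular product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AssetProfile:
    """Hypothetical asset record; field names are illustrative only."""
    hostname: str
    ip: str
    os_version: str
    # Operational context
    criticality: str  # "high", "medium", or "low" impact on safe operations
    location: str
    owner: str
    # Endpoint detail beyond IP and OS
    installed_software: List[str] = field(default_factory=list)
    open_ports: List[int] = field(default_factory=list)
    accounts: List[str] = field(default_factory=list)
    whitelisting_enabled: bool = False
    last_backup_ok: Optional[bool] = None  # None means backup status unknown


def exposed_critical_assets(inventory, port):
    """Return high-criticality assets that listen on the given port."""
    return [a for a in inventory
            if a.criticality == "high" and port in a.open_ports]


# Made-up sample data.
inventory = [
    AssetProfile("hmi-01", "10.0.1.10", "Windows 7 SP1", "high", "Plant A", "ops",
                 open_ports=[3389, 445], last_backup_ok=False),
    AssetProfile("hist-02", "10.0.2.20", "Windows Server 2016", "low", "Plant B", "it",
                 open_ports=[1433], whitelisting_enabled=True, last_backup_ok=True),
]

for asset in exposed_critical_assets(inventory, 3389):
    print(asset.hostname, asset.location)
```

With records like these, a question such as "which critical systems expose remote desktop?" becomes a query against data rather than a matter of tribal knowledge.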
Automate remediation for software patching
Another software patching challenge is preparing and deploying patches (or compensating controls) to the endpoints. One of the most time-consuming tasks in OT patch management is the prep work. It typically includes identifying target systems, scanning them first, configuring the patch deployment, pushing the patch, re-scanning to determine success, and troubleshooting when deployments fail.
But what if, for example, the next time a risk like BlueKeep appeared, you could pre-load your files on the target systems to prepare yourself for the next steps? You and your smaller, more agile OT security team could strategically plan which industrial systems you roll patch updates to first, second, and third based on any number of factors in your robust asset profiles, like asset location or criticality.
Taking it a step further, imagine if the patch management technology did not require a scan first, but rather had already mapped the patch to in-scope assets, and as you installed the patches (either remotely for low risk or in person for high risk), those tasks verified their success and reflected that progress in your global dashboard.
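The scan-free mapping described above could look roughly like the following Python sketch. It assumes the inventory already records each asset's OS and criticality. The patch record, the `build_tasks`, `record_result`, and `progress` functions, and all sample values are hypothetical.

```python
from collections import Counter

# Illustrative inventory rows; in practice these would come from asset profiles.
INVENTORY = [
    {"host": "hmi-01", "os": "Windows 7", "criticality": "high"},
    {"host": "eng-03", "os": "Windows 7", "criticality": "low"},
    {"host": "hist-02", "os": "Windows Server 2016", "criticality": "low"},
]

# Hypothetical patch record listing the OS versions it applies to.
PATCH = {
    "id": "EXAMPLE-PATCH-001",
    "affected_os": {"Windows 7", "Windows Server 2008 R2"},
}


def build_tasks(inventory, patch):
    """Map the patch to in-scope assets from inventory data, with no scan step."""
    tasks = []
    for asset in inventory:
        if asset["os"] in patch["affected_os"]:
            # High-risk assets get an in-person install; the rest go remote.
            method = "in-person" if asset["criticality"] == "high" else "remote"
            tasks.append({"host": asset["host"], "method": method,
                          "status": "pending"})
    return tasks


def record_result(tasks, host, succeeded):
    """Update a task's status once its install has been verified."""
    for task in tasks:
        if task["host"] == host:
            task["status"] = "installed" if succeeded else "failed"


def progress(tasks):
    """Summarize task statuses, as a dashboard might."""
    return dict(Counter(task["status"] for task in tasks))


tasks = build_tasks(INVENTORY, PATCH)
record_result(tasks, "eng-03", succeeded=True)
print(progress(tasks))
```

The point of the design is that scope comes from data you already hold, so the only per-asset work left is the install and its verification.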
For all of your high-risk assets that you cannot or don’t want to patch right now, you could instead create a port, service, or user/account change as an ad-hoc compensating control. So for a vulnerability such as BlueKeep, you could disable remote desktop or the guest account. This approach immediately and significantly reduces current risk and also allows more time to prepare for the eventual patch. This brings me to the ‘fall back’ actions for when patching isn’t an option: compensating controls.
What are compensating controls?
Compensating controls are simply actions and security settings you can and should deploy in lieu of (or, better, as well as) patching. They are typically deployed proactively (where possible), but can also be deployed during an event or as temporary measures of protection, such as disabling remote desktop while you patch for BlueKeep, which I expand upon in the case study at the end of this blog.
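As a rough sketch of how temporary controls might be chosen for an asset that will not be patched yet, consider the Python example below. The vulnerability record, the service and account labels, and the `plan_controls` function are assumptions made for illustration. The code only builds a plan; actually applying the changes is platform-specific and is left out.

```python
# Hypothetical description of what a vulnerability depends on.
# The service and account labels are illustrative, not real system names.
VULN = {
    "name": "BlueKeep",
    "service": "remote-desktop",
    "risky_accounts": ["guest"],
}


def plan_controls(asset, vuln):
    """List temporary controls for an asset that is not being patched yet."""
    controls = []
    # Only propose disabling the service where it is actually running.
    if vuln["service"] in asset["services"]:
        controls.append(f"disable service: {vuln['service']}")
    # Only propose disabling accounts that are actually enabled.
    for account in vuln["risky_accounts"]:
        if account in asset["enabled_accounts"]:
            controls.append(f"disable account: {account}")
    return controls


# Made-up asset record.
asset = {
    "host": "hmi-01",
    "services": ["remote-desktop", "historian-agent"],
    "enabled_accounts": ["operator", "guest"],
}

for control in plan_controls(asset, VULN):
    print(asset["host"], "->", control)
```

Because the plan is derived from the asset's recorded services and accounts, assets that never ran the vulnerable service produce an empty plan and need no change.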
Identify and apply compensating controls in OT security
Compensating controls take many forms, from application whitelisting to keeping antivirus up to date. But in this case, I want to focus on ICS endpoint management as a key supporting component of OT patch management.
Compensating controls can and should be used both proactively and situationally. It would be no surprise to anyone in OT cyber security to discover dormant admin accounts and unnecessary or unused software installed on endpoints. It is also no secret that best practice system hardening principles are not nearly as universal as we would like.
To truly protect our OT systems, we also need to harden our prized possessions. A real-time, robust asset profile allows industrial organizations to accurately and efficiently clear out the low hanging fruit (i.e., dormant users, unnecessary software, and system hardening parameters) to significantly reduce the attack surface.
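A minimal Python sketch of finding that low-hanging fruit from inventory data follows. It assumes the asset profile records last-login dates and installed software. The 90-day idle threshold, the function names, and the sample data are illustrative assumptions, not recommendations.

```python
from datetime import date, timedelta


def dormant_accounts(last_logins, today, max_idle_days=90):
    """Return accounts with no login within max_idle_days.

    A last login of None means the account has never been used.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        name for name, last in last_logins.items()
        if last is None or last < cutoff
    )


def unapproved_software(installed, approved):
    """Return installed packages that are not on the approved list."""
    return sorted(set(installed) - set(approved))


# Made-up data for one endpoint.
last_logins = {
    "operator": date(2024, 5, 28),
    "old_admin": date(2022, 1, 15),
    "vendor_temp": None,
}
installed = ["HMI Runtime", "PDF Reader", "Media Player"]
approved = ["HMI Runtime", "PDF Reader"]

today = date(2024, 6, 1)
print(dormant_accounts(last_logins, today))       # ['old_admin', 'vendor_temp']
print(unapproved_software(installed, approved))   # ['Media Player']
```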
In the unfortunate event that we have an emerging threat (like BlueKeep), adding temporary compensating controls is doable. A quick case study to highlight my point:
- BlueKeep vulnerability is released.
- The central team immediately disables remote desktop on all field assets and emails the field team, requiring specific requests on a system-by-system basis for remote desktop service to be re-enabled during the risk period.
- The central team pre-loads patch files on all in-scope assets: no action, just preparation.
- The central team convenes to decide the most reasonable plan of action based on asset criticality, location, and presence or absence of compensating controls (e.g., a critical risk on a high impact asset that failed its last backup goes to the top of the list. A low impact asset with whitelisting in effect and a recent good full backup can likely wait).
- Patching rollout begins, and progress is updated live in global reporting.
- Where necessary, OT technicians are at the console overseeing the patch deployment.
- The schedule and communication of this need are fully planned and prioritized by the data used by the central team.
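The prioritization step in the case study above can be sketched as a simple scoring function in Python. The weights, field names, and sample assets are assumptions chosen to reproduce the ordering described (a high impact asset with a failed backup first, a low impact asset with whitelisting and a good backup last). A real program would tune them to its own risk model.

```python
# Illustrative weights; these numbers are assumptions, not a standard.
IMPACT_WEIGHT = {"high": 3, "medium": 2, "low": 1}


def priority_score(asset):
    """Higher score means patch sooner."""
    score = IMPACT_WEIGHT[asset["impact"]] * 10
    if not asset["last_backup_ok"]:
        score += 5  # no good backup to fall back on
    if not asset["whitelisting"]:
        score += 3  # no compensating control in effect
    return score


def patch_order(assets):
    """Return assets sorted from most to least urgent."""
    return sorted(assets, key=priority_score, reverse=True)


# Made-up assets.
assets = [
    {"host": "hist-02", "impact": "low", "last_backup_ok": True, "whitelisting": True},
    {"host": "hmi-01", "impact": "high", "last_backup_ok": False, "whitelisting": False},
    {"host": "eng-03", "impact": "medium", "last_backup_ok": True, "whitelisting": False},
]

for asset in patch_order(assets):
    print(asset["host"], priority_score(asset))
```

With this sample data the order is hmi-01 (38), eng-03 (23), hist-02 (10), which matches the intuition in the case study.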
This is how OT patch management should be handled. And more and more organizations are starting to put this type of program in place.
Be proactive with compensating controls
ICS patch management is hard, yes, but simply giving up is not a good response either. With a bit of foresight, it is easy to provide the three most powerful tools for easier and more effective patching and/or compensating controls: insight, context, and action. Insight shows you what you have, how it is configured, and how important it is to you. Context lets you prioritize: it tells you where to attempt to patch first, and how and where to apply compensating controls. Action allows you to correct, protect, deflect, etc. To rely only on monitoring is to admit you expect a fire and hope that buying more smoke detectors might minimize the damage. Which approach do you think your organization would prefer?