Those of you who know me know I try to take a very pragmatic stance when examining what a vulnerability means and whether an OT system is truly at risk. The truth is, identifying vulnerabilities takes several steps from flaw-to-exploit and exploit-to-owned, which can paint a complicated picture.
I can see asset owners and even IT crowds saying, “Please no more vulnerabilities, my swamp is overflowing,” which is a fair statement, especially in an industry entirely focused on generating vulnerabilities and CVEs (Common Vulnerabilities and Exposures).
ICS vulnerabilities are frequently announced, but many undisclosed vulnerabilities remain in a manufacturer’s product portfolio. Many of these are critical in that they could enable a malicious entity to take control of a device, or they may apply to swaths of devices operating in your environments.
For this reason, it is best to know as much as you can about your OT assets; the more detailed the better (with reasonable expectations). And there is a clear need to think about managing the risk of the environment in a holistic way, rather than simply relying on pure “scan and treat” vulnerability management programs. My colleagues and I at Verve have written a lot on this need for a different approach to OT risk management, and this is just one more piece of evidence for it.
Three problems arise from devices with undisclosed, known vulnerabilities:
- What do I have in my asset inventory? What can I learn about it?
- What can I infer from the data?
- From what I infer, what actions can I take, and how?
For the first problem, the cyber security technology and strategies for asset inventory have been demonstrated as feasible and are ready to use. The approach becomes an interrogative process: communicate with a device and retrieve useful information from it. Except it is never that trivial; expertise is needed to safely profile OT assets with minimal risk of disruption.
The second problem, making inferences from the data, sounds relatively straightforward. If you gain details about a system and match them to a list of vulnerabilities or other error/flaw/weakness catalogues, you could find an unnoticed vulnerability that also appears in another product. In theory, code can be developed to match similar flaws across other devices running the same embedded OS. Conveniently, this allows solutions to leverage an existing primary vulnerability data feed such as the National Vulnerability Database.
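As a minimal sketch of that matching step, the Python below compares asset details against a small local list of vulnerability records. Everything in it is an assumption for illustration: the `Asset` and `VulnRecord` shapes, the asset names, and the `EXAMPLE-0001` identifier are hypothetical, and a real feed such as the NVD has a far richer schema than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    os_name: str
    os_version: str


@dataclass(frozen=True)
class VulnRecord:
    vuln_id: str                  # placeholder identifier, not a real CVE
    product: str
    affected_versions: frozenset  # exact version strings known to be affected


def match_assets(assets, records):
    """Return {asset name: [vuln ids]} where product and version both match."""
    findings = {}
    for asset in assets:
        hits = [
            r.vuln_id
            for r in records
            if r.product.lower() == asset.os_name.lower()
            and asset.os_version in r.affected_versions
        ]
        if hits:
            findings[asset.name] = hits
    return findings


# Hypothetical inventory and catalogue entries.
assets = [
    Asset("plc-01", "ABC", "123"),
    Asset("rtu-07", "ABC", "124"),
]
records = [
    VulnRecord("EXAMPLE-0001", "abc", frozenset({"122", "123"})),
]

findings = match_assets(assets, records)
print(findings)
```

Exact string matching on versions is the weakest part of this sketch; real version ranges and product naming inconsistencies are where most of the effort goes.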
Mapping similar vulnerabilities to devices using existing vulnerability feeds is solved then, right? No.
The end solution might rely on purely human effort to map known/unknown vulnerabilities, or it might be an opportunity for real machine learning technology, but neither really matters when asset owners or cyber security researchers are not looking at a data source that hints at an undocumented/undiscovered flaw. Yes, other sources of vulnerability information exist outside of the CVE/NVD/CERT world, and they exist in the form of:
- Per-vendor cybersecurity portals
- Differing CVE advisory portals/lists
- Product release notes and updates
- Sibling products from the same vendor
- Products re-badged or OEM-ized for different vendors that share a common underlying platform
- Components contained within a product (a software BOM would certainly help here)
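The sibling and re-badged cases in particular lend themselves to a simple lookup. The sketch below, in Python, propagates an advisory published for one product to every other product known to share its platform. The product names, platform labels, and advisory identifier are all hypothetical, and the product-to-platform mapping is exactly the piece of knowledge that is hard to obtain in practice.

```python
from collections import defaultdict

# Hypothetical mapping of product -> underlying platform.
PLATFORM_OF = {
    "VendorA-Controller": "platform-x",
    "VendorB-Gateway": "platform-x",   # re-badged on the same platform
    "VendorA-Switch": "platform-y",
}

# Hypothetical advisories, keyed by the product they were published for.
ADVISORIES = {"VendorA-Controller": ["ADV-EXAMPLE-1"]}


def inherited_advisories(platform_of, advisories):
    """Return {product: [advisory ids]} that a product may inherit from siblings."""
    by_platform = defaultdict(set)
    for product, ids in advisories.items():
        platform = platform_of.get(product)
        if platform is not None:
            by_platform[platform].update(ids)

    result = {}
    for product, platform in platform_of.items():
        own = set(advisories.get(product, []))
        inherited = by_platform.get(platform, set()) - own
        if inherited:
            result[product] = sorted(inherited)
    return result


candidates = inherited_advisories(PLATFORM_OF, ADVISORIES)
print(candidates)
```

A hit here is only a candidate for investigation: sharing a platform does not prove the sibling product contains the same flaw.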
Aggregating all of these data sources for ease of use would be an interesting enabler. However, we can manage two of the points today:
- For example, I am about to upgrade a device’s firmware and I read the release notes: they say a flaw in Y software was fixed by correcting a buffer overflow.
- And if I know that the device is running version 123 of the ABC operating system, I can say it has <whatever> buried inside of it.
For the first point, we might limit our reading to only the OT devices we have today (assuming we can do so efficiently) and locate all supporting resources and documentation to find language that suggests a potential flaw. There are many assumptions in this paradigm, and it does not scale well.
However, sometimes human effort and casual reading of product documentation will surprise you with undocumented vulnerabilities. Here is an example of one I found the other day:
<version>
<date>
Notes:
– Fixed buffer overflow in password variable
– Fixed timing bug
– Fixed roll over counter bug
– Properly track variable for outgoing packet. This increases…
This example is not creative marketing, and it’s not the only one I’ve recently found. To summarize, this device seems to have had some trivially exploitable vulnerabilities: a user-accessible field that might allow unintended code to be loaded into the device, and potential opportunities to disrupt system resources, leverage erroneous data, etc. The key here is that it took me almost no effort to discover.
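That kind of casual reading can be partly assisted by a keyword scan. The following Python is a rough sketch that tags release-note lines containing language suggestive of a security flaw. The pattern list and category names are my own assumptions, not an established taxonomy, and the sample text is the set of notes quoted above with the version and date omitted.

```python
import re

# Assumed, non-exhaustive patterns for flaw-suggesting language.
FLAW_PATTERNS = {
    "memory corruption": re.compile(r"\b(buffer|stack|heap)\s+overflow\b", re.I),
    "timing": re.compile(r"\b(timing|race)\b", re.I),
    "counter/integer": re.compile(r"\b(roll\s*over|integer overflow|wrap)\b", re.I),
    "credential handling": re.compile(r"\b(password|credential|authentication)\b", re.I),
}


def flag_release_notes(text):
    """Return [(note line, [matching categories])] for lines that hint at a flaw."""
    flagged = []
    for line in text.splitlines():
        # Drop list markers (hyphen or en dash) and surrounding whitespace.
        entry = line.strip().lstrip("-\u2013 ").strip()
        if not entry:
            continue
        tags = [tag for tag, pattern in FLAW_PATTERNS.items() if pattern.search(entry)]
        if tags:
            flagged.append((entry, tags))
    return flagged


notes = """\
Notes:
- Fixed buffer overflow in password variable
- Fixed timing bug
- Fixed roll over counter bug
- Properly track variable for outgoing packet.
"""

flagged = flag_release_notes(notes)
for entry, tags in flagged:
    print(f"{entry}  ->  {', '.join(tags)}")
```

A scan like this only surfaces candidates for a human to read; it will miss flaws described in vaguer language and will flag harmless lines that happen to mention a password.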
Here’s another example: What if I did not fully understand my devices and did not know they were running a common embedded operating system called VxWorks, or a runtime called CodeSys? Or how about a cascading vulnerability that appears when I enable a feature?
- To find out if a device is running a particular operating system, we could interrogate it actively with an agentless approach (passive often will not leak much), fingerprint it, or reverse engineer it. The latter is time consuming, but interrogation over the network is quite feasible using Verve Security Center.
- To determine an undisclosed risk, we could look at secure deployment guideline documents for hints at things hidden in or inherited by a solution’s architecture, but this takes a breadth of tribal knowledge.
This last point brings up another intriguing discovery that was documented in a device’s secure deployment guidelines:
“In version 123, URLs to communicate to the device may be specified in a specific implementation of XML. Nodes contained could be referred to by an IP address or hostname, and as such, take care when defending against DNS spoofing, or XML injection.”
At this point, my insides begin to pucker up as XML often has numerous dependencies and parsing challenges, but also because the device has a channel that is subject to indirect risks requiring compensating controls.
Neither I nor the manufacturer may be able to change the architecture, and these could be inherited vulnerabilities, but I can take the necessary precautions to minimize the attack vectors that reach vulnerable devices or services. For example, I could secure my OT system physically, monitor for changes or activity, and segment and firewall these devices into appropriate zones/conduits.
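To make the zones/conduits idea a little more concrete, here is a small Python sketch that checks whether an observed network flow fits a declared set of conduits. The zone names, subnets, and allowed conduits are invented for illustration, and a real deployment would enforce this in firewalls rather than in a script; the sketch only shows the shape of the rule set.

```python
import ipaddress

# Hypothetical zones, each defined by a subnet.
ZONES = {
    "control": ipaddress.ip_network("10.10.1.0/24"),
    "dmz": ipaddress.ip_network("10.10.50.0/24"),
    "enterprise": ipaddress.ip_network("10.20.0.0/16"),
}

# Hypothetical conduits: (source zone, destination zone, destination port).
ALLOWED_CONDUITS = {
    ("dmz", "control", 443),
    ("enterprise", "dmz", 443),
}


def zone_of(ip):
    """Return the name of the zone containing ip, or None if it is unzoned."""
    addr = ipaddress.ip_address(ip)
    for name, network in ZONES.items():
        if addr in network:
            return name
    return None


def flow_allowed(src_ip, dst_ip, dst_port):
    """Allow traffic within a zone or over a declared conduit; deny everything else."""
    src_zone, dst_zone = zone_of(src_ip), zone_of(dst_ip)
    if src_zone is None or dst_zone is None:
        return False
    if src_zone == dst_zone:
        return True
    return (src_zone, dst_zone, dst_port) in ALLOWED_CONDUITS


print(flow_allowed("10.10.50.9", "10.10.1.5", 443))  # DMZ to control over a declared conduit
print(flow_allowed("10.20.3.4", "10.10.1.5", 443))   # enterprise directly to control
```

Run against observed traffic, a check like this highlights flows that bypass the intended path, such as the enterprise network reaching a controller directly instead of going through the DMZ.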
The third challenge is deciding what actions to take when you find vulnerabilities embedded in a product that does not have a feasible fix. This question comes back to your risk appetite, especially when replacement with a secure alternative is not possible.
Apply compensating controls:
- Gather and aggregate all detailed information for your deployed assets, as well as those that may be in storage inventory. Cross reference this information with purchase information and specific documents that relate to an asset’s part-number(s).
- Record all risks and affected systems in a register for continuous monitoring as part of the organization’s risk management program.
- Define a Cyber Security Management System (CSMS) that is appropriate for your organization. In the realm of Operational Technology, the ISA/IEC 62443 standards are a great starting point for examining and classifying Systems under Consideration (SuC) and exploring cyber risks and threats. Another approach could be to use Cyber Process Hazards Analysis (Cyber-PHA) as a general direction for OT cyber security too.
- Define a Vulnerability Management program that is in alignment with your organization’s equivalent of a CSMS, has verified end-to-end processes, and appropriate resource training.
- Define all organizational policies, procedures, standards, and guidelines as they relate to vulnerability management, CSMS and risk management functions.
- Implement or develop an automated vulnerability information gathering process that also covers aspects such as determining when products reach end of life or newer versions are released.
- Patch and update systems according to risk, vulnerabilities, function, and stability.
- Harden configurations, disable unnecessary features/services, and protect appropriately either from user access or network communications.
- Replace systems with a higher risk exposure level, and if that is not possible, implement barriers and compensating controls to monitor and protect systems from direct access (network or physical) so these types of vulnerabilities cannot be successfully exploited by malicious parties.
- Monitor systems for atypical behavior (either from themselves or from adjacent systems) and for physical access.
- Be proactive in following product updates, releases, and vendor product portals.
- Continually review and test processes/guidance, and train for incident response.
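Several of the steps above depend on having risks recorded in a form that can be ranked. As a minimal sketch, the Python below models a risk register entry and orders entries by a simple likelihood-times-impact score. The 1 to 5 scales, the scoring formula, and the sample entries are assumptions made for illustration; your organization’s CSMS would define its own.

```python
from dataclasses import dataclass, field


@dataclass
class RiskEntry:
    asset: str
    description: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (expected)
    impact: int      # assumed scale: 1 (negligible) to 5 (catastrophic)
    controls: list = field(default_factory=list)  # compensating controls in place

    @property
    def score(self) -> int:
        # Simple product; a real program may weight or adjust for controls.
        return self.likelihood * self.impact


def prioritize(register):
    """Return entries ordered from highest to lowest score."""
    return sorted(register, key=lambda entry: entry.score, reverse=True)


# Hypothetical entries.
register = [
    RiskEntry("hmi-02", "End-of-life OS, no patches available", 3, 4, ["segmented zone"]),
    RiskEntry("plc-01", "Buffer overflow noted in release notes", 4, 5),
    RiskEntry("switch-09", "Unused web service enabled", 2, 2),
]

ranked = prioritize(register)
for entry in ranked:
    print(f"{entry.score:>2}  {entry.asset}: {entry.description}")
```

Keeping compensating controls on the same record makes it easier to see which high-scoring entries still have nothing protecting them.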
Applying all of these steps does not entirely eliminate the risk or prevent an event from occurring, but used together, they will minimize the impact of what could have been a catastrophic cyber-enabled event in your organization. The devil is in the details, and some of those details are both obvious and reasonably mitigated with best practices.