One major topic of significant contention is the concept of IT/OT convergence. In the cybersecurity industry, some propose that since there is no single, agreed-upon definition for the term, using it simply blurs the nuances of OT cybersecurity, and therefore it should not be used. I understand this point of view, but feel the opposite is actually true. Using an ill-defined term (or at least one without a consensus definition) stimulates debate, and I'd argue greater awareness and discussion is needed in OT security. So maybe the old adage that no press is bad press holds true here too?
At this year's Black Hat event, it was suggested that:
“Top cybersecurity professionals are increasingly anticipating major cyber breaches in the future – the percentage of respondents who believe that a U.S. critical infrastructure breach will take place in the next two years has spiked nearly 10% since 2018, to 77%. As online attacks become more prevalent in both politics and war, professionals fear vulnerabilities in U.S. defenses could lead to significant danger. Only 21% believe that government and private industry are prepared to respond to an attack on U.S. critical infrastructure.”
Only 21% are prepared? Whether you believe that statistic or not, and whether you think IT/OT convergence is a 'thing' or not, one plain truth in this space still holds: more and more (IT) technology is coming to the operational environment. If we don't start adopting IT practices and mindsets, we will be left with an increasingly impossible job.
BlueKeep Vulnerability Case Study
Let's assess the details of a very specific, real-world OT team I have had the pleasure of working with and their very demonstrable struggles with OT security adoption. This is a hardworking, often over-tasked, under-supported group of dedicated, loyal operational support professionals. They were given a corporate directive to remediate the BlueKeep remote desktop vulnerability (CVE-2019-0708) on their assets. The team supports over 12,000 OT assets in hundreds of physical locations across a wide geography, with a very diverse mix of asset types, vintages and criticality levels. They are supported across that geography by a number of local, third-party consultants and systems integrators.
This team also lacks a common or current inventory of their assets. There are no consistently or universally deployed corrective or remedial tools, and responsibilities and maintenance functions are spread across a matrixed organization. The resulting BlueKeep removal effort was intensely manual and error prone, required significant communication, coordination and planning, and effectively redirected about a half dozen staff to this effort (instead of their 'day jobs') for over six weeks. [Read more about them here in how their removal of BlueKeep was contrasted to an automated approach.]
What I was hinting at then, but am explicitly calling out now, is the need to adopt some very basic but very pervasive IT security practices: most notably, the development and maintenance of a configuration management database (CMDB), the introduction of automated practices (or at least the ability to automate bulk and repetitive actions), and the empowerment and enablement of centralized services. Each of these is outlined in this article.
CMDB vs. Asset Inventory
Much is made of the need for a CMDB, but not nearly as much of what it should look like, how it should be used and who should host it. In reality, the intended purpose of a CMDB (configuration management) is not what OT needs most, which is asset management. But there are some very real benefits for an OT team in at least populating a corporate CMDB with its data. The biggest challenge in our hypothetical case study is that the OT team starts every such 'emergency' task by first trying to figure out what they have and where it is. It is no secret that every major regulatory or best practice guideline asks the owner/operator to start with an inventory. NERC CIP-002, the NIST CSF Identify function and CSC 20 Controls 1 and 2 all say that the most logical starting point is an inventory, to help operators understand the sheer volume of the task at hand. And if done correctly, additional asset data like criticality and the presence or absence of compensating controls is well understood, which can help drive priority as well.
More importantly, an accurate and well understood (read: shared with the right people) asset inventory can help avoid mistakes too. We see a number of OT environments struggling to assign ‘ownership’ to an otherwise unknown asset list. In some cases, a centralized IT team folds OT assets into IT patch and reboot cycles with disastrous results. At a minimum, we see unknown assets simply ‘orphaned’ in a ‘no-man’s land’ of no ownership, therefore no patching, protection or support.
If our friends were to adopt a proactive, automated approach to developing and maintaining their own asset database (for configuration OR asset management) they would see multiple benefits including:
- Increased accuracy of the scope, support, and maintenance required
- Significant time savings in reporting, tracking, remediation, etc.
- An audit trail or ‘empirical evidence’ of what they have, where and how it is configured
- Automated detection and investigation of asset status, health and risk
- Ability to monitor subnets for adds, moves, and changes to the environment
Whether it is simply knowing what you have and who it belongs to as a starting point, or proactive maintenance of asset lifecycles for health and security, an OT version of automated asset inventory is a huge step forward.
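To make the last of those benefits concrete, here is a minimal sketch of how an automated inventory can surface adds, moves, and changes between scans. The asset IDs, field names, and values are hypothetical illustrations, not any real tool's schema; the point is only that a snapshot diff turns "what changed on my network?" from a manual investigation into a trivial computation.

```python
def diff_inventory(previous, current):
    """Compare two inventory snapshots keyed by asset ID.

    Each snapshot maps asset_id -> {"ip": ..., "site": ..., "criticality": ...}.
    Returns the assets added, removed, or changed between scans.
    """
    added = {aid: current[aid] for aid in current.keys() - previous.keys()}
    removed = {aid: previous[aid] for aid in previous.keys() - current.keys()}
    changed = {
        aid: {"before": previous[aid], "after": current[aid]}
        for aid in previous.keys() & current.keys()
        if previous[aid] != current[aid]
    }
    return added, removed, changed


# Yesterday's and today's (hypothetical) scan results.
yesterday = {
    "HMI-001": {"ip": "10.1.1.10", "site": "Plant A", "criticality": "high"},
    "PLC-007": {"ip": "10.1.1.20", "site": "Plant A", "criticality": "high"},
}
today = {
    "HMI-001": {"ip": "10.1.1.11", "site": "Plant A", "criticality": "high"},  # IP moved
    "PLC-007": {"ip": "10.1.1.20", "site": "Plant A", "criticality": "high"},
    "ENG-042": {"ip": "10.1.2.30", "site": "Plant B", "criticality": "medium"},  # new asset
}

added, removed, changed = diff_inventory(yesterday, today)
```

Run against real scan data on a schedule, even something this simple would flag the 'orphaned' and unknown assets described above the moment they appear, rather than during the next emergency.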
Automation of Vulnerability Remediation
I touched upon it in the inventory section, but the only real answer to the huge imbalance between the number of OT staff and the number of IP-enabled assets they support is to begin adopting automated mechanisms and tool sets. Too often we see a huge reluctance to allow 'non-OEM-approved' agents or technologies on assets.
An article on how IT security teams are over-tasked and under-supported illustrates the reality that without automation, a number of significant costs are paid by the owner/operator such as:
- Diversion of skilled resources to low level security research and planning
- Huge consumption of time in the execution of security tasks
- Larger number of human errors and mistakes in execution of remediation
Is this true? Let’s look at our case study as compared to each of the three above:
- Diversion of skilled resources
- The client took six otherwise fully engaged people as the primary team and leveraged dozens of boots on the ground to first plan, then execute, the rollout of the BlueKeep patch to their thousands of assets. This is a very real and avoidable result.
- Huge consumption of time
- This team started the initial process with a 20-person, 90 minute planning session which resulted in the decision to break the team down into three focus areas and another series of planning sessions. (They did not have an automated inventory for prioritization nor did they have tools to automate the patch rollout.) In this case, the team did burn significant hours in the deployment of this patch.
- Larger number of errors
- This one too is true. The larger team and the smaller, focused teams connected daily to share lessons learned around deployment tasks, challenges and oversights. Even with this knowledge sharing, the individual teams repeated each other's mistakes in testing, communication, and troubleshooting. Overall, this cost is very real as well.
Although this specific example is about a very particular, out-of-band security patch, the same principles hold true for any OT security practice. From deploying new security tools like whitelisting to day-to-day patching and system-health tuning practices, automated insight and the ability to take action is a huge opportunity for operators hoping to simply keep up.
And for those looking to be proactive, a robust vulnerability management program, with its fundamental requirement to regularly poll and profile assets and then take action, only amplifies the need for OT automation as a matter of survival.
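As a rough sketch of what automated triage replaces in the manual planning sessions described above, the snippet below filters an inventory for assets that could be susceptible to BlueKeep and orders them by criticality. The inventory records and criticality scheme are hypothetical; the affected-OS list follows Microsoft's published advisory for CVE-2019-0708, which covers Windows 7, Server 2008/2008 R2 and older systems.

```python
# OS versions affected by BlueKeep (CVE-2019-0708), per Microsoft's advisory.
BLUEKEEP_AFFECTED = {
    "Windows XP", "Windows Server 2003", "Windows Vista",
    "Windows 7", "Windows Server 2008", "Windows Server 2008 R2",
}

# Hypothetical criticality scheme: lower number sorts first.
CRITICALITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def triage(assets):
    """Return assets potentially exposed to BlueKeep, highest criticality first."""
    exposed = [
        a for a in assets
        if a["os"] in BLUEKEEP_AFFECTED and a.get("rdp_enabled", False)
    ]
    return sorted(exposed, key=lambda a: CRITICALITY_ORDER[a["criticality"]])


# A tiny, hypothetical slice of the 12,000-asset inventory.
inventory = [
    {"id": "HMI-001", "os": "Windows 7", "rdp_enabled": True, "criticality": "medium"},
    {"id": "ENG-042", "os": "Windows 10", "rdp_enabled": True, "criticality": "low"},
    {"id": "SRV-003", "os": "Windows Server 2008 R2", "rdp_enabled": True, "criticality": "high"},
    {"id": "PLC-007", "os": "VxWorks", "rdp_enabled": False, "criticality": "high"},
]

worklist = triage(inventory)
```

The 20-person, 90-minute planning session in our case study was, in large part, performing this filter-and-prioritize step by hand. With an accurate inventory, it is a query.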
Centralization of Services
Perhaps the most troubling trend for many OT types in the convergence of IT and OT is the emergence of central services, or commodity skill sets housed in a remote location or data center. I tend to agree with many of the arguments against letting an IT team hold the reins in this type of scenario, but I can't completely discount the growing need for an OT version of this trend.
A simple Google search will show there’s no shortage of predictions or trends depicting the huge lack of skilled resources in this space. To grow your own OT security experts takes time and money, assuming you can lure enough people to this pursuit. (Remember the ‘automation’ section above? Try pitching to a new security or OT graduate that they are going to spend the first X years of their career manually tracking down and patching aging legacy equipment, and see how your employee retention numbers plummet!)
The alternatives for OT teams are to either let IT take this space over or to do nothing. Unfortunately, the need for greater protection and for getting things done rules out standing pat. The biggest challenge for organizations struggling with this topic is that very few realize that the IT mindset (that IP-enabled assets are homogeneous, or should be, and can therefore be treated with bulk, automated, uniform processes) does not hold true in OT. But to assume OT cannot automate anything is to miss the point too. There is a middle ground.
With that being said, our OT types can either let IT run with it or get out in front of this problem and set up their own OT-safe central services model. At Verve, we like to call it the 'Think Global, Act Local' model, meaning central or specialized oversight, analysis and even the first stage of remediation (with automation!) are coupled with, and tempered by, last-mile OT oversight.
In reality, it is a blend of centralized monitoring and automation where possible, with very much OT-driven last-mile oversight of actual asset management and manipulation. It is not truly central, scalable, IT-style support, but it is a vast improvement and is truly the only way for OT security practices to scale.
BlueKeep Case Study Revisited
The reason I called our case study a hypothetical one is that this client has the challenges described here but has not yet adopted all of the suggestions laid out in this article. The resulting improvements are still being developed and are not yet fully realized by this specific client.
The next time a BlueKeep-type challenge shows up, our operational friends would experience a much different scenario. A centralized, automated database immediately provides insight into the true depth and breadth of the security risk at hand. Inventory tools that include 360-degree views of each asset (e.g. asset criticality or compensating controls) would provide empirical evidence and therefore salient content for drafting and executing a remediation plan.
The automation inherent in our proposed solution means the thousands of assets in scope would be prepared for remediation from a central, automated toolset, ensuring maximum efficiency in deployment, minimal error in execution and a high degree of insight into the status of low-level security tasks.
Finally, real-time status of all risks, tasks and resulting security improvements on an asset-by-asset basis in a centralized reporting console means the coordination, communication, execution and ultimately the success of the task are ensured.
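A centralized reporting console of the kind just described is, at its core, a rollup of per-asset task status by site. The sketch below illustrates the idea; the status values, record layout, and site names are hypothetical placeholders, not a description of any specific product.

```python
from collections import Counter


def status_rollup(tasks):
    """Summarize per-site remediation status for a central console view."""
    rollup = {}
    for task in tasks:
        rollup.setdefault(task["site"], Counter())[task["status"]] += 1
    return rollup


# Hypothetical per-asset remediation records fed in from the field.
tasks = [
    {"asset": "HMI-001", "site": "Plant A", "status": "patched"},
    {"asset": "SRV-003", "site": "Plant A", "status": "pending"},
    {"asset": "ENG-042", "site": "Plant B", "status": "patched"},
]

rollup = status_rollup(tasks)
```

Even this trivial aggregation replaces the daily coordination calls from the case study: instead of teams verbally reporting progress, every stakeholder reads the same live numbers.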
The Need for IT/OT Convergence
OT needs to look more like IT in its protection and containment of IP-enabled assets. The truth is that the OT world looks too much like IT to ignore real-world trends such as increased risk, decreased skills and staff, and a growing lack of patience for a 'set it and forget it' mentality. Any operational company should look at IT and OT not as opposing, competing entities, but as two sides of a similar (not identical) coin. OT looks like IT, but treating it like IT is not that straightforward. OT teams can't look to industry analysts for proper tool selection, since most analysts don't have an OT tool ranking or category. The only safe and successful path forward is a negotiated set of IT tools deployed with a heavy dose of OT oversight in the selection, deployment and maintenance phases of any security program.