One of the biggest challenges organizations face is developing a robust business plan for OT cyber security that creates the right momentum, focus, and budget required to truly make measurable progress against these threats.

 

 

Session Transcription

Introduction

Greg Orloff: Good morning, good afternoon, good evening, everyone. Once again, thank you for joining us here today, and welcome to our next session at ICS Cybersecurity IoT. This is “Leveraging Attack Surface Management Tactics to Improve ICS Security,” brought to you by IoT World and sponsored by Verve.

So I wanted to point out that a large part of the value of these sessions is the opportunity to engage with the speakers live. We have two ways that you can do this this morning, this afternoon, or this evening. We have a Q&A window, and we also have the chat window. So please post your questions specifically in the Q&A window here on the Zoom platform because that’s where we’re going to look for them, making it much easier for us to find those and facilitate answering them as we move forward.

While you’re getting acquainted with the Zoom platform, drop us a note in the chat window and let us know where you’re logging in from and what brought you to the session today. For those joining us for the first time, please allow me to take a moment to tell you a little bit about our organization. IoT World is a digital media outlet and a leading authority on industrial IoT, bringing the latest industrial Internet of Things and ICS cybersecurity virtual conferences and content to a global community of over 284,000 decision-makers. IoT World focuses on delivering daily insights on the industrial Internet of Things across multiple industry verticals. Online, IoT World creates a series of virtual events that draw thousands of delegates.

In this session and Q&A, we’re going to discuss an approach to generate an organization-wide commitment and engagement to successfully combat the ever-evolving and morphing world of ICS and OT cybersecurity, from business plan to execution. Joining us live today to lead this discovery session is Mr. John Livingston. John is the CEO of [unintelligible], and I could go on and on for hours about the accolades and accomplishments John has under his belt, and his experience consulting with McKinsey and Company, board memberships, initiatives, and success stories with [unintelligible]. But we have a lot to unpack in the next 15 minutes or so. I’ll let you check out his bio on IoT World, as well as his LinkedIn profile for a deeper dive. So without further ado, John, the floor is yours.

Overview of Attack Surface Management in IoT and OT

John Livingston: Thanks, Greg, and welcome, everyone, wherever you are around the world today. I’m excited to have some time to share some thoughts around attack surface management as well as to engage in some Q&A, hopefully. As Greg mentioned, there’s a variety of ways to do that, but that Q&A box at the top is probably the most effective way. So diving in.

So one of the big themes around cyber these days is moving beyond traditional vulnerability management or risk management to really thinking about it as an attack surface and what are all of the various potential entry points to my environment from attackers that might have an impact on us. The reality is, as we look at that space of attack surface management, IoT or OT, or ICS, pick your favorite phrase, and I recognize there are slight differences in all of those, but the industrial environments, call it, have also been left out of these ASM issues. So that’s what we’re going to talk about today just by way of introduction.

I’m the CEO of Verve, as Greg mentioned. I’ve been with the firm for about six years. Prior to that, I was a senior partner at McKinsey and Company, and a lot of the work I was doing was helping clients move through these digital transformation journeys. As part of that, I continued to see the cybersecurity issues that would arise, and so I met the founder of Verve about six or seven years ago and joined him in this journey.

Just a quick background on Verve and where some of the discussion today comes from is our 30 years of control systems experience. The company started back in 1993 as a control systems integrator designing systems for power, water, steel, paper, etc. About 15 years ago, our clients, mostly in the power industry originally, had NERC CIP requirements that they had to achieve, and so the company built the first version of what we call The Verve Security Center, specifically focused on how to achieve security management in the sensitive ICS OT environments. Over the past 15 years, we’ve been continuing to expand on that platform, improve it, and expand into different industries. So now we touch on pretty much every industrial category, pharma, med device, chemicals, pulp and paper, et cetera. We also continue to provide a range of services to our clients that align with that, and some of what we’ll talk about today comes from some of those services engagements where we see clients going through this challenge of how do I think about an overall program to protect the IoT in my environment?

So we’re going to talk about three things. First of all, what is this thing called attack surface management? One more cybersecurity buzzword. In fact, in this case, we think there’s some relevance to it and some benefit to using that term to talk about that, and why is it relevant for OT or IoT? Second, we’ll spend a little bit of time just talking about some of the challenges that come from trying to apply this in these kinds of environments. And then finally, we’ll break down a set of steps that we’ve seen successful in actually making it work and adding overall to your security risk posture within the IoT space.

So first of all, let’s define this term, right? You know, with all the new phrases, attack surface, zero trust, we’ve got all these things. It’s helpful to start with at least how we define this. The way we define it is the continuous discovery, collection, assessment, classification, then prioritization, remediation, and monitoring of all of your assets, be it OT, IoT, etc.

It differs from traditional vulnerability management in that it takes, quote, unquote, the attacker view, i.e., how does an attacker look at this environment? And where are they likely to begin an attack and then move through that environment to best disrupt, steal data, or whatever it is that is their mission. 

If you do it correctly, it allows the organization to prioritize their most critical exposures, not just on some CVE with the CVSS score, but on what is the true potential. We’ll talk more about this, but this is absolutely critical in our view to move from traditional vulnerability identification in OT to something more like what’s defined here, where we are looking not just at a vulnerability, but more broadly, the potential impact of an attack on that environment.

Six Key Elements of Attack Surface Management

John Livingston: So if we break that definition down a little bit, there are basically six key elements, and we could just parse the definition into those pieces. 

Number one, discovery, the ability to see, quote unquote, all the corners of the world that is your attack surface. A lot of times people will talk about this relative to the cloud, which obviously is very, very important. We’re focused more on the OT and ICS side of this, and that includes the discovery of unknown assets, unknown activities, unknown software, etc. 

Secondly, assessment: understanding the risk of that asset based on what we refer to as a 360-degree view. So not just the vulnerability data, like the CVSS score, but what are the user and account risks? What’s the network access risk? Do we have software that doesn’t need to be there that creates risk, even if it’s not, quote unquote, vulnerable? TeamViewer is the most logical example. Then missing patches, insecure configs, etc. So, a 360 assessment.

Next is context. So this is where we put in, what is the criticality of that device? What’s the usage of that device or asset within the operating process of a plant, or a substation, or a refinery, et cetera? 

Then prioritizing, okay, so, we know we’ve discovered the assets, we’ve assessed the risks, we’ve put them in context of how critical is it, and then we can start to think about prioritization. 

How do we best think about prioritizing the remediation of these risks, looking at it from an attacker’s perspective? That eventually gets you to a risk score, where you can say this asset has this risk score and that asset has that risk score, based on that 360-degree view.

Fifth is remediation, so we actually have to go fix it. And that remediation could be a range of different things. It could include network protections, it could include patching, it could include hardening, but essentially the remediation plan. 

And then finally, once it’s remediated, maintaining and monitoring that security status on an ongoing basis, not just the next time I do an assessment in a year, but in real time being able to monitor and review those threat vectors to see if I have closed those attack surfaces.

So these are the basic six elements of what we think about as attack surface management.

Threats to OT and ICS are Increasing

John Livingston: The reality is that as we think about these OT systems or IoT systems, they are under what we refer to as an AIR RAID. It’s an acronym that we use based on what we’re seeing in the market and what we’re hearing from customers.

So number one, attackers, right? ICS or OT is becoming a much greater target, in large part because of commercial attacks, commercial hackers. We’ll come back to some data on that. But obviously attacks are going up.

Second, IT is now becoming a greater part of OT systems. We’re doing remote operations, we’re putting virtualization in. We’ve got IoT devices coming in. Sometimes we don’t even know about them, etc.

Regulatory pressures are increasing. TSA already has put out the pipeline regulations in the United States. We’ve got NERC CIP being basically applied right out of the box to Chile. We’ve got the Kingdom of Saudi Arabia applying their own OT cyber standards. Europe, etc. And so regulation is starting to increase.

Fourth is resource constraints. With all of these vulnerabilities and growing risks, people are just constrained; we don’t have enough bodies. So we’ll talk about that a little bit more.

Access, so obviously a lot more remote access. COVID really lit the fuse here. And more and more people are moving to more remote controls.

Insurance. This is a huge issue. In fact, we’re doing an entire webinar of our own later this month on insurance. This has become a massive challenge for many of our clients and industrial organizations. How do we make sure we have the right information, the right proof of controls on the OT side so that we can get insurance?

And then finally, directors, right? They are starting to breathe down the necks of the CSOs and others to make sure that OT security is in place. So we call this the AIR RAID: basically the seven factors that are really driving urgency around improving attack surface management in OT.

Just a couple of data points on that. So this is from the most recent SANS survey, which was completed at the end of 2021. And you can see that the degree of risk in OT has increased substantially. And then on the right-hand side here, looking at the threat vectors, ransomware has moved from basically not even in the top five a couple of years ago to being the number one thing.

This is my point that no longer is IoT security worried just about, quote unquote, the nation-state or just about, you know, some partner in your supply chain. It is now mostly about that ransomware, that commercial attacker, who has a totally different mindset around their attack paths and what they’re trying to do. And as a result of that, we need to think, number one, about the increasing threats, but also about how to protect against these attacks that are essentially increasing.

We also, at the same time, have a significant increase in the number of published vulnerabilities and the risk of those published vulnerabilities.

So, this is some data from the last couple of years on the total ICS advisories coming out of ICS-CERT. And what you see is obviously, you know, kind of a 30% increase year over year and roughly a 40% increase in the number of CVEs. And then the actual risk score, and we can debate CVSS, and lots of people will say, “Oh, it’s not relevant,” and all the rest of that, but it’s the best thing we’ve got right now as a standard to measure how risky these vulnerabilities are. And that has increased significantly over the past couple of years.

The point in all of this is not to get into individual numbers: okay, is it 354 or 353? The point is, this is an increase in the number of vulnerabilities that we have to address in the IoT space.

And what’s happening is more and more of these threats are moving from IT into OT. So they implant ransomware, they gain access, and that ransomware then exploits vulnerabilities. It traverses those weak networks, weak ACLs, unpatched systems. And then once it moves to OT, it can spread like wildfire because in most cases, those OT systems are not applying the same attack surface management that might be applied in IT. And so these threats from attackers and, frankly, the number of vulnerabilities that have been discovered mean that these threats can move even more quickly.

You know, what is attack surface management? It is those six things, and it’s relevant for IoT because we are living in a time where we have an increasing number of attacks, an increasing number of vulnerabilities, and we don’t have the resources to deal with it.

So, okay, if it’s relevant, if we want to do attack surface management, we want to be able to prioritize and focus our resources on those that are most critical.

Challenges in Attack Surface Management

John Livingston: What are our challenges? So, let me go back to those six key steps and talk about what the challenges are in each one of those, right? Because there are so many.

Number one, on discovery: There are complex network architectures in these IoT or OT environments. There are devices sitting within a rack, behind that rack. You know, you may have a network coming out the back of a PLC to another switch with another set of devices underneath that switch, which don’t come back up. We’ve got NATs, we’ve got unmanaged switches. You know, we’ve got misconfigured network devices. So even if we see the flows that are occurring today, it may be that there’s a rule in that firewall that allows other flows to go through at certain times that we just don’t see. And in many cases, software, users, and configs are hard to get. Right, in IT, that’s somewhat easy: we scan the device, we pull that stuff back. In OT, sometimes that’s very difficult to do. So that’s the first challenge.

The second challenge is the assessment. So if we don’t have that direct access to the asset with all of that information from a discovery point of view, it’s very difficult to get that 360-degree view that I talked about.

On the criticality front, the context: In many cases, organizations haven’t actually defined asset criticality. Yes, they kind of know it; the operator may know it. But at a corporate level, at an enterprise level, there’s not a sense of, yes, these are our critical crown jewels, and if someone gets to these, these are the impacts that could happen.

Fourth, on the prioritization front: How do you make a prioritization if you don’t really know the risks? And so the prioritization, without some of that information above, starts to fall down.

Fifth, and perhaps the most challenging here, is what do you actually do about it? Right, I’ve got outages, outage schedules. There are risks of patching. We’ve all heard the stories of somebody who pushed a patch and took the plant down or, you know, bricked an HMI or something like that. The same goes for changing configs. You know, frankly, doing any sort of network segmentation takes a lot of effort. So, a remediation challenge. And then finally, maintenance: monitoring these systems on an ongoing basis for new vulnerabilities, new patches, new threats also creates challenges.

Limitations of Traditional IT Approaches in OT Context

John Livingston: So anyway, a range of challenges in applying ASM to IoT. And if you actually look, there’s a survey [unintelligible] around the state of ASM security, and what you find is, you know, the number one thing on their list was lack of visibility of certain technologies.

So you know, having a view of how to deal with these technologies is a challenge. We know OT is different, right? 70% of these assets are not going to be Windows devices. So the traditional IT approaches often don’t work. You know, if you scan a PLC, and we even heard this from a client the other day, they were finding that unless you were being really targeted in those communications, you’re tripping the plant or breaking PLCs, etc. And you’ve got very old devices, old OT device kernels. You know, these are different environments. Traditional IT ASM is probably not going to fly.

We talk about discovery, right? A lot of people have gone to the “well, we’ll grab packets off of the network” approach, right? If we have the packets, we can know. Well, when you actually dive into that in detail, what you find is the packets don’t tell you in a lot of cases what you really need to know. You’ll know what’s a Windows box, probably. But that doesn’t necessarily tell you what patches, even if you know the OS version, have been deployed. It certainly won’t tell you what policies are on that box, what all the users are, what applications are there, what’s the purpose of the box, etc. You don’t have the conduit rule set.

So, you know, again, the challenge of the way that we’ve sort of thought about doing this is trying to use packets because that way it doesn’t disrupt the network, but oftentimes it doesn’t give you what you need. 

And then the number of risks is enormous, right? Which one are you going to start with? Are we going to start with the shared passwords? Are we going to start with, you know, third-party remote access? Are we going to start with all of our old devices that are out of support? Are we going to start with, you know, the dual-homed NICs that are frankly designed in by many of our OEMs to allow their control systems to operate? And you can go down the list, right? I mean, when we go into these environments, and we will show you a little bit of this, we do hundreds of assessments every year using our software, and the risks are enormous, and where to start is a huge challenge.

So, okay, that’s what attack surface management is and why it’s relevant, but there are some hard things about it. So for the rest of our time together, we’re going to talk about, okay, so what? How do we actually do this? And this is based on our experience. I’m sure there are lots of other experiences of folks implementing this, but it’s based on what we’ve seen work and be effective in OT.

Utilizing a Platform Approach for a Comprehensive Risk View

John Livingston: So we’ve talked about these six elements, we’re going to essentially walk through these six elements to talk about how can we achieve the six requirements of attack surface management in the IoT space. 

So how do we do this? Again, you can use probably lots of different approaches. What this is going to be based on is the way we do it with our solution. You don’t have to use this solution. It’s just the way that we reflect on it. 

What the Verve solution does is it brings together a bunch of different information and capabilities into one platform, so that you can see that 360-degree view of risk. So that starts with that visibility piece, that discovery, the asset inventory.

And then by getting that deep asset inventory, we then overlay things like vulnerabilities on that. We know the full patch history and the patch status of that device, as well as new patches when they emerge. We can view all of the configuration settings on those devices against full CIS or whatever standards, DISA STIG, etc., to be able to evaluate whether we’ve got hardened configs or not. We pull back all the users and the accounts so that we can know if there are dormant users, if there are password settings that are inappropriate, etc.

We pull back any of the anti-malware protections that may already exist. So great, you know, when we bought this system, the vendor put in McAfee. Terrific. When was the last time it was updated? Do we have the updated signatures? We have whitelisting? Okay, is it in lockdown, and how much is whitelisted in there? And intrusion detection: are we able to monitor those logs, the flows, etc.? 

So we can identify, wait a minute, we’re actually seeing patterns of traffic or we’re seeing behaviors in the system that may be a threat already in progress. And then finally, the ability to assess our ability to respond to a threat. What about our backups? Do we have a backup? Is it recent? Etc.

And so by using a platform approach, we can see those 360-degree views and start to make progress on prioritizing. So again, just to highlight it, this is how we do it, whether it’s Verve or something else. Figuring out a way to get this data into one platform is key, in our view, to being able to make ASM work.
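To make the "one platform" idea concrete, here is a minimal Python sketch of a single merged asset record that a 360-degree view could be built on. It is an illustration only, not the Verve data model: the `AssetRecord` fields, the `findings` helper, the 30-day staleness threshold, and the sample values are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AssetRecord:
    """One asset with several data sources merged into a single record."""
    name: str
    open_cves: List[str] = field(default_factory=list)
    missing_patches: List[str] = field(default_factory=list)
    failed_config_checks: List[str] = field(default_factory=list)
    dormant_accounts: List[str] = field(default_factory=list)
    av_signature_date: Optional[date] = None   # None = no anti-malware found
    whitelisting_locked_down: bool = False
    last_backup: Optional[date] = None         # None = no backup found


def findings(asset: AssetRecord, today: date, max_age_days: int = 30) -> List[str]:
    """List every category of exposure on the asset, not only its CVEs."""
    out: List[str] = []
    if asset.open_cves:
        out.append(f"{len(asset.open_cves)} open CVEs")
    if asset.missing_patches:
        out.append(f"{len(asset.missing_patches)} missing patches")
    if asset.failed_config_checks:
        out.append(f"{len(asset.failed_config_checks)} failed config checks")
    if asset.dormant_accounts:
        out.append(f"{len(asset.dormant_accounts)} dormant accounts")
    if asset.av_signature_date is None:
        out.append("no anti-malware")
    elif (today - asset.av_signature_date).days > max_age_days:
        out.append("stale anti-malware signatures")
    if not asset.whitelisting_locked_down:
        out.append("whitelisting not in lockdown")
    if asset.last_backup is None:
        out.append("no backup")
    elif (today - asset.last_backup).days > max_age_days:
        out.append("stale backup")
    return out


# Hypothetical HMI: one CVE, one dormant account, old signatures, no backup.
hmi = AssetRecord(
    name="HMI-01",
    open_cves=["CVE-2017-0144"],
    dormant_accounts=["operator_old"],
    av_signature_date=date(2021, 1, 15),
)
report = findings(hmi, today=date(2022, 6, 1))
print(report)
```

The point of the sketch is that the record can report exposures even for an asset with no open CVEs at all, which a vulnerability-only view would show as clean.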

Importance of Detailed Asset Inventory

John Livingston: So visibility, discovery: it is the foundation of all of this, right? You have to have an accurate asset inventory of your devices. It has to include an accurate software inventory, not just the OS versions, but all the applications, the firmware, etc. Updated connectivity flows and the rules. Okay, great, I’ve got a firewall. What are the rules in those firewalls, the physical port connections, etc.? That discovery gives us the ability to start to assess risk trade-offs. It also allows us to avoid any false starts. 

We see a lot of times where a client has jumped into network segmentation, for instance, and they come back four months later without much progress, a year later without much progress, because they don’t know what they’re trying to segment. They don’t actually have visibility into what’s there to know what should be segmented, what assets are there, particularly when you get into these more complex network environments. 

So anyway, starting with a way to get at that, okay, so how do you do that? Right, you got to get down to the detail. Essentially being able to get all the way down to the bottom of the ocean, so to speak. And not rely just on what you know, at the surface level, so to speak. 

So, getting down to every piece of installed software, all of the user-defined risks, the password settings, which are a huge issue. You can have a great view of your vulnerabilities, but if all your passwords are standard, or the default passwords, and the accounts are default accounts, you know, people don’t need a vulnerability to make a dent.

Anyway, so getting that deep, deep visibility. 

Debunking Common Myths about Network Traffic, Software Agents, and Active Communication

John Livingston: So there are some myths here, and if we’re going to make ASM work, we have to question the myths versus the reality. So the first is the network traffic one that we talked about before. The second is that software agents are not safe, right? You just can’t use them. 

The reality is that that is not true. Agents can be deployed if designed in an OT context and designed for ICS, and if they’re tested and proven over time. The third is that active communication causes disruption. You know, we can’t do anything to communicate with our embedded devices; it will cause a disruption. 

The reality is that is not true. Operators are communicating with these devices every day. And if done with the native protocols and the right approach, you can do this with absolutely no impact to operations. 

And then lastly, that flows and packets are enough to understand architecture. In fact, understanding the actual rules, the configs, etc., in those networking devices is critical. To really get at ASM, we’re going to have to question some of those myths that have been out there, and we’re going to have to push to say, okay, we’re going to try some things that are a little different from what has been accepted wisdom, so to speak, in the past.

Okay, so that’s on the Discovery front.

Evaluating Vulnerabilities, Configurations, and User Accounts

John Livingston: Now we get into, what about the assessment? So this is back to the 360-degree view. When we think about how you assess the risk of an environment, we’ve got to know the vulnerabilities, certainly, right? We’ve got to know the patches that are missing. But we also have to look at what the configurations on these devices are and whether they are insecure, both the rule sets as well as things like, you know, what are my password settings?

And, you know, do I lock out? Do I have all of the core configuration settings on the device? Then user and account access risks. So again, as I mentioned before, I may be fully patched, but I have a whole bunch of admin accounts where, you know, the password is “password.” 

So getting a view of that, getting a view of the anti-malware status, getting a view of my network protections and firewalls, backup status, and then finally, that asset criticality piece, so that we can really build a view of the full risk that I can then, in our future steps, start to prioritize against.

So obviously, CVEs are a big deal, right? This is just a simple dashboard of, you know, “Hey, okay, we need to know our CVEs, and we need to know them by different asset types.” So, you know, desktops versus network versus PLCs versus whatever. We need to know them by severity, recognizing that the CVSS score is not a perfect thing. Absolutely not. But it’s a starting point for us to work from, right? We need to have an overlay of exploitability, etc.

But the short answer is we’re going to have a lot of these, right? When you go in the first time to look at ASM in OT, you’re going to have literally thousands of vulnerabilities. We then need to look at configurations. So, you know, what we’ve done is we’ve taken things like CIS and DISA STIG, and we’ve brought it down to about 150 to 200 different checks, so that we can see whether a device is compliant with those checks, and we can start to understand, from a configuration point of view, are we as secure as we want to be, and then eventually be able to remediate those configuration risks.
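As a rough illustration of what checking a device against a reduced baseline could look like, here is a small Python sketch. The three checks, their IDs, and their thresholds are invented for the example; they are not taken from CIS, DISA STIG, or the Verve check set.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical baseline: (check id, setting name, predicate the value must satisfy).
Check = Tuple[str, str, Callable[[Any], bool]]

BASELINE: List[Check] = [
    ("PW-01", "min_password_length", lambda v: v is not None and v >= 12),
    ("PW-02", "lockout_threshold", lambda v: v is not None and 0 < v <= 5),
    ("SVC-01", "remote_desktop_enabled", lambda v: v is False),
]


def evaluate(settings: Dict[str, Any], baseline: List[Check] = BASELINE) -> Dict[str, bool]:
    """Return pass/fail per check; a setting that was never collected fails."""
    return {cid: bool(pred(settings.get(key))) for cid, key, pred in baseline}


def compliance_pct(results: Dict[str, bool]) -> float:
    """Share of checks passed, as a percentage."""
    return 100.0 * sum(results.values()) / len(results)


# Settings as they might be pulled back from one device (sample values).
device = {"min_password_length": 8, "lockout_threshold": 5, "remote_desktop_enabled": False}
results = evaluate(device)
pct = compliance_pct(results)
print(results, round(pct, 1))
```

Treating an uncollected setting as a failure is a deliberate choice in this sketch: in OT, "we could not read it" is usually closer to a risk than to a pass.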

And then, as I’ve mentioned many times in this presentation so far, account management. Most of the time, in our experience, these things are not tied into a central Active Directory server. And so we have multiple ADs, and nobody’s really managing them. And so what you end up with is tons of local accounts, or accounts that are possibly managed by a local AD. They’re inactive; people haven’t used them in six months. They’ve got expired passwords, or dated is maybe the better phrase, passwords over 180 days old. We’ve got administrator accounts. You know, you would assume this, but by getting that data out, you can get at specifically where they are and what risks they pose. And then obviously, getting at that network architecture risk, right?
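The account conditions described above (inactive for about six months, passwords older than 180 days, administrator rights) can be sketched as a simple flagging function. The `Account` fields, the flag names, and the sample accounts below are assumptions made for illustration; the 180-day thresholds follow the figures mentioned in the talk.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Account:
    host: str
    user: str
    is_admin: bool
    last_logon: Optional[date]   # None = never logged on
    password_set: date


def account_flags(acct: Account, today: date,
                  inactive_days: int = 180, max_password_age: int = 180) -> List[str]:
    """Return the risk flags that apply to one local or domain account."""
    flags: List[str] = []
    if acct.last_logon is None or (today - acct.last_logon).days > inactive_days:
        flags.append("inactive")
    if (today - acct.password_set).days > max_password_age:
        flags.append("dated password")
    if acct.is_admin:
        flags.append("administrator")
    return flags


today = date(2022, 6, 1)
stale_admin = Account("HMI-01", "svc_backup", True, date(2021, 9, 1), date(2020, 1, 10))
active_user = Account("EWS-02", "jsmith", False, date(2022, 5, 20), date(2022, 3, 1))

stale_flags = account_flags(stale_admin, today)
active_flags = account_flags(active_user, today)
print(stale_flags, active_flags)
```

Keeping the host name on every record matters here because, as the speaker notes, these accounts are often local, so the useful answer is which box they live on.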

Understanding Network Architecture for ASM

John Livingston: So, am I seeing remote access coming into the environment? Do I want to actually look into the rules of those networking devices? Are there actual rules separating out VLANs, etc.? You know, do I have wireless routing around the firewalls, right? So there’s some 5G device sitting in the back of a rack that nobody really knew about, that’s connected directly to the internet, and I’ve now breached whatever security perimeter I put in place, etc.

Developing a Scoring System for Risk Prioritization

John Livingston: So anyway, from an assessment point of view, getting our hands around all of these different things is critical. And then finally, starting to think about context. So what is the criticality of these various devices and processes, so that we can start to put together a prioritization? Criticality is key. So, you know, is it a safety system? Does it control critical processes? 

You know, does it access other parts of the network? And then basically going through and saying, “Okay, it’s a certain asset tied to a certain criticality. What are the risks?” That allows us to put a score on it, right, an actual number that we can all agree on. Our goal is, how are we going to reduce that unmitigated score? And then obviously, we might have mitigating controls that could adjust that score, based on, you know, what we’re doing to protect that device. 
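The talk does not give a scoring formula, so the following Python sketch is only one plausible shape for it: a base risk scaled by asset criticality to get the unmitigated score, then reduced by each mitigating control. The weights, control names, and reduction factors are all hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical weights; the talk describes the idea but not the numbers.
CRITICALITY_WEIGHT: Dict[str, float] = {"low": 0.5, "medium": 1.0, "high": 2.0}
CONTROL_REDUCTION: Dict[str, float] = {
    "whitelisting_lockdown": 0.5,   # removes half of the remaining score
    "segmented": 0.25,
    "recent_backup": 0.1,
}


def unmitigated_score(base_risk: float, criticality: str) -> float:
    """Risk before any compensating controls are considered."""
    return base_risk * CRITICALITY_WEIGHT[criticality]


def mitigated_score(base_risk: float, criticality: str, controls: Iterable[str]) -> float:
    """Apply each known control as a multiplicative reduction."""
    score = unmitigated_score(base_risk, criticality)
    for control in controls:
        score *= 1.0 - CONTROL_REDUCTION.get(control, 0.0)
    return score


raw = unmitigated_score(40.0, "high")
adjusted = mitigated_score(40.0, "high", ["whitelisting_lockdown", "segmented"])
print(raw, adjusted)
```

Reporting both numbers side by side keeps the distinction the speaker draws: the unmitigated score is what you are trying to drive down, and the mitigated score shows how much the existing controls already buy you.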

So having that context piece is critical. And you may not have it, right? Some organizations don’t know this. And so there are ways to either manually do it, or frankly, you can use the data that you gather back from the system to create a heuristic to allow you to build the criticality. And so as we get into that, how do you prioritize?

So, you know, I can start to look and say, what’s the asset impact category? Right, so, you know, is it a high impact? Is it a medium impact, etc.? And what’s the overall risk score? Let’s put it in our dashboards so we know exactly how many of our assets are at critical risk based on that model that I laid out on the prior page. And then we can start to prioritize, right?

So first, let’s prioritize CVEs that have a high number of exploits, right? These are just some of the more famous ones, like NotPetya and WannaCry. But, you know, we integrate with things like the CISA exploit lists, right? So that we can go from, quote, unquote, CVSS as the number one dynamic to, okay, now we’re going to overlay exploit realities. 

We’re then going to break down into critical assets. So we’re going to take that criticality, score the context and dive down into just those assets.

And then finally, diving into things that don’t have compensating controls. So perhaps I’ve got a bunch of vulnerabilities, and those devices are critical, but some of those devices have whitelisting enabled, and that whitelisting is in lockdown, and it’s tight. Okay, I may then prioritize other devices that don’t have any compensating control ahead of them. And if you do this, you can then start to get down to, all right, I’ve got 308 vulnerabilities in this particular context. They’re critical vulnerabilities on a critical device, with no compensating controls such as application whitelisting, and there’s no backup. So those are ones that I really, really, really want to make sure I address. And you can think of other lenses that you would put on this. This is just one example.
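The narrowing described above (known-exploited CVEs, then critical assets, then assets with no whitelisting lockdown and no backup) can be sketched as a three-step filter. The `Finding` fields, the asset names, and the placeholder CVE IDs are invented for the example; in practice the `known_exploited` set would come from a source such as the CISA exploit list the speaker mentions.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Finding:
    asset: str
    cve: str
    asset_critical: bool
    whitelisting_locked_down: bool
    has_backup: bool


def prioritize(findings: List[Finding], known_exploited: Set[str]) -> List[Finding]:
    """Narrow a long vulnerability list down to the top remediation tier."""
    exploited = [f for f in findings if f.cve in known_exploited]
    on_critical = [f for f in exploited if f.asset_critical]
    uncompensated = [
        f for f in on_critical
        if not f.whitelisting_locked_down and not f.has_backup
    ]
    return uncompensated


# Placeholder IDs, not real CVEs.
known_exploited = {"CVE-0000-0001", "CVE-0000-0002"}
all_findings = [
    Finding("SIS-01", "CVE-0000-0001", True, False, False),    # stays in the top tier
    Finding("HMI-01", "CVE-0000-0001", True, True, False),     # whitelisting in lockdown
    Finding("PLC-07", "CVE-0000-0003", True, False, False),    # no known exploit
    Finding("KIOSK-03", "CVE-0000-0002", False, False, False), # asset not critical
    Finding("EWS-02", "CVE-0000-0002", True, False, True),     # has a backup
]
top_tier = prioritize(all_findings, known_exploited)
print([f.asset for f in top_tier])
```

This is just one lens, as the speaker says; swapping the order of the filters or adding others (exposure to remote access, for instance) is a matter of adding another list comprehension.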

But the idea is, back to the platform notion, having all of this data in one platform allows you to then trade off and look at these various elements of risk. So we can start to prioritize, and then start to model attack paths. Okay, so now I know the risks, but what are the attack paths that might come through? So I know that an HMI or controller is exposed to some form of external access. I’ve then got dual NICs on that HMI that route around the VLAN that’s supposed to be segmenting my network. And there’s that dormant account sitting on that dual-NIC HMI. An attacker could take that password that’s on the dark web, come in, and then move through my unpatched systems to spread throughout the system.
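One simple way to model an attack path like the one just described is as a reachability graph searched breadth-first. The sketch below is an assumption about how such modeling could be done, not a description of the Verve product; the node names and edges are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical reachability graph: an edge means an attacker on the first node
# can reach the second (through a dual NIC, a weak ACL, a dormant account, ...).
EDGES: Dict[str, List[str]] = {
    "internet": ["remote-access-gw"],
    "remote-access-gw": ["HMI-01"],
    "HMI-01": ["PLC-07", "EWS-02"],   # dual-homed HMI bypasses the VLAN boundary
    "EWS-02": ["PLC-07", "SIS-01"],
    "PLC-07": [],
    "SIS-01": [],
}


def shortest_attack_path(edges: Dict[str, List[str]], start: str,
                         target: str) -> Optional[List[str]]:
    """Breadth-first search: the fewest-hop path from start to target, if any."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_attack_path(EDGES, "internet", "SIS-01")
print(path)

# Model a fix: the HMI's second NIC is removed, so it no longer reaches onward.
hardened = dict(EDGES, **{"HMI-01": []})
path_after_fix = shortest_attack_path(hardened, "internet", "SIS-01")
print(path_after_fix)
```

Re-running the search against a modified graph is a cheap way to see which single remediation cuts the most paths to the critical assets.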

And so we can model out these attack paths to then say, alright, based on those risks that I have, what are the likely paths that an attacker is going to take into the environment? And then eventually prioritize. This is a very busy chart to understand, but it’s just trying to get at the visual of this, right? There’s a whole bunch of remediating actions that I could prioritize. I can remove TeamViewer; those are things I can do tomorrow. And then there are things that may take me a while, right? Hardening the ACLs might take a little longer, and patching critical vulnerabilities might take a while.

So basically, we lay out the things that I can do across the environment based on the vulnerabilities and risks, the 360 risk view, and based on asset criticality and attack paths. And then we can say, okay, now we get to: what is our remediation plan?

Developing a Remediation Plan

John Livingston: Okay, there are things that I have the capabilities for today, that I can immediately start to do, right? So there are dual NICs, and I’ve got a firewall that sits between them, but these NICs are basically routing traffic around the firewall.

Okay, first thing, let’s use the firewall, right? Let’s put a rule in the firewall that allows the right traffic through, so we don’t need the dual NICs for that. Let’s ensure network device compliance. Implementing password policies of some form on the OT systems, removing risky software: there are things I can do today. And then there are things that are the next level of maturity, and the next level after that. The point is that the ASM approach starts to allow us to lay out essentially a roadmap of things that we can do, to prioritize actions over time and then eventually remediate.

So we have that 360 risk assessment, we then need to move to execution. 

Executing the ASM Plan

And this is one of the things that is often the most concerning to operators, right, because now I’m going to start touching the systems. I’m going to start patching things. I’m going to start hardening configs. I’m going to start, you know, removing software from the system.

So obviously, all of this has to be done with the utmost care and an OT understanding. But there is a need to move to action, and to move to action in a way that is scalable, because we just don’t have enough resources to be going machine by machine in every plant around the world to do this. So we have to find a way to automate.

And so we do what we call ‘Think Global, Act Local.’

Verve’s ‘Think Global, Act Local’ Approach

John Livingston: In that approach, every organization is going to be different, right? But for most larger organizations, we need to find a way to take that data that we’ve been talking about up to a central place. Greg mentioned that part of this discussion is going to be organizational: how do you organize for this? Our strong view is that you have to take the analysis part and scale it globally. Bring that data from 60 sites, 50 sites, 2200, 800, 1000 up to some central team who can do the prioritization, build the playbooks, know what controls we’re going to prioritize, etc. There may be some regional level in there to help local teams. But then there’s the act local part of this, right? A lot of people question, well, is this an OT thing or is it an IT thing?

It’s a both thing. The teams have to work together, but we can scale to a, quote unquote, IT team or a central team to do the analysis.

Those recommended playbooks can then get pushed down to a site or a region or whatever it is, where there are people who understand the process. But before we start remediating, the local team is involved, and by local, we just mean people who really are OT experts and know that process. They’re the ones who approve and execute those actions, and we have tools that they can use to do that in an automated way. But they are involved in that step, right? It’s not, okay, great, we’re pushing patches from 6000 miles away. So this think global, act local, in our view, is an absolutely critical element of being able to effectively do ASM for OT environments.
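
The split described here, central analysis with a local approval gate before anything touches a system, can be sketched as a small data structure. This is a minimal illustration, not Verve’s tooling: the `Playbook` class, its fields, and the example names are assumptions made up for this sketch, and `execute` only records what would be done.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Playbook:
    """A remediation recommended by the central team (hypothetical structure)."""
    name: str
    site: str
    actions: List[str]
    approved_by: Optional[str] = None
    log: List[str] = field(default_factory=list)

    def approve(self, engineer: str) -> None:
        """Record the local OT engineer who reviewed and approved the playbook."""
        self.approved_by = engineer

    def execute(self) -> List[str]:
        """Refuse to run until a local approver is recorded; then log each action."""
        if self.approved_by is None:
            raise PermissionError(f"{self.name}: needs local OT approval")
        for action in self.actions:
            self.log.append(f"{self.site}: {action} (approved by {self.approved_by})")
        return self.log


# The central team builds the playbook; nothing happens until the site approves it.
pb = Playbook("remove-remote-access-tool", "plant-a", ["uninstall TeamViewer"])
try:
    pb.execute()
except PermissionError as err:
    print(err)

pb.approve("local-controls-engineer")
print(pb.execute())
```

Putting the check inside `execute` rather than in the caller means automation can be scaled centrally without any code path that bypasses the local engineer.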

That think global piece also includes monitoring. Alright, so the last step of our ASM journey was maintenance.

Monitoring and Maintenance in ASM

John Livingston: So we have to be able to monitor: how are we doing? And this is an example of basically how I am doing against vulnerabilities. Well, terrific, I made a big move here in June, and I was able to patch and remediate a bunch of vulnerabilities. But then I’m starting to tick back up in July as the vulnerabilities emerge. So being able to track this daily, weekly, whatever it is for your organization, and monitor how I’m doing against my ASM is key, because if we’re not monitoring it, it’s pretty much not worth doing.

So when we think about the way we approach this and why it’s beneficial in OT: number one, we get that detailed data, right, getting to the asset-based discovery. Number two, being able to do that 360 view. Number three, being able to import into whatever platform you’re using the organization’s criticality and context data. Then, because it’s all in that same database, the ability to make those trade-offs to prioritize specific actions. And then that think global, act local piece, right? The ability to automate locally to enable scale among very, very limited resources in the OT world, but to put that automation in the hands of the control systems engineers who understand their systems best. And then finally, being able to monitor this on an ongoing basis. We think that this kind of approach, whether it be through Verve or some other platform solution, is really the only way we’re going to be able to achieve ASM, which is so critical in these very complex environments where a lot of times you can’t take the most direct route to an answer.

So with that, I’ll wrap up the formal conversation. I know there’s a bunch of things in chat it looks like and some Q&A. So Greg, you and I guess we’ll go back and forth a little bit on some of the Q&A and chat comments.

Q&A Session

Greg Orloff: Absolutely. Well, thank you, John. I mean, that’s it. As always, that was a comprehensive overview and a lot to unpack. So thanks for sharing with all of us.

It definitely looks like we’ve got some questions mounting here in the Q&A, and also the comment section. If you’ve posted things in the comments, I’m trying to police that too, but please do transition those over into the Q&A so we’ve got them centralized in one spot. I know I saw a couple of key points here. I’m going to jump back to this one. Let’s see, he asked, ‘Is it passive discovery or agent-based?’

John Livingston: Oh, how Verve does it? Yeah. So Verve uses two elements. For the OS-based devices, the Windows, Unix, and Linux types, so the HMIs, the servers, etc., we put an executable on those devices. It has a very, very small footprint and takes up less than 1%. We basically cap it at 1% of CPU, etc.

And then for the embedded devices, where you can’t put an agent or an executable, we use essentially proactive communications to those devices in their native protocols. So we’re able to discover, okay, it’s a Rockwell ControlLogix or whatever, and we then communicate with that device in the CIP protocol for Rockwell; if it’s Siemens, it’s S7, etc., to be able to pull back that key information from that device. So it’s essentially a joint agent and agentless, or executable and agentless, type of approach.

Greg Orloff: Got it. Okay. Thank you. So that should clear it up. Great. If you’ve got a further question, feel free to drop it into the Q&A with us. So, moving on then, an interesting question came up from Georgios. He said embedded controls for remote monitoring of OT have introduced a multitude of vulnerabilities. Where is the balance to be drawn between safety and ease of control? That’s a loaded question.

John Livingston: So, okay, got it. I’m trying to look for these. It’s a great question. In fact, we’re doing a couple of pieces of work right now with clients on how you design a secure remote operations center. As I mentioned, Verve is both a software company and a services provider, and it’s a question of, where is that balance?

So first and foremost is to understand the sensitivity and criticality of the assets that you’re trying to remotely monitor. Are these things that, if they were impacted, would have a massive impact on your operations? Or is it a very small facility in the grand scheme of things, where if it went down, that wouldn’t be great, but it doesn’t take down the entire network, for instance, or your entire production? So first is criticality.

The second then is what functions are critical for remote operations. What I mean by that is, are you looking to (a) just monitor remotely and analyze without the need to go back in, (b) do basic control, i.e., start and stop, or (c) do true remote tuning in real time? And if we think about trying to lay out the objectives, which is maybe a little too much of my McKinsey role, what is my business goal here? I can then start to think about, okay, therefore, what is the remote connection that I need to have to achieve that business objective? So we have one organization that does all of its remote operations management through PI and RDP, right? They’re taking PI data out, they’re doing all their analysis, and then they can use RDP to the extent they need to. Others will want a fuller suite.

So the point is, there are ways to design a solution based on what the needs are. There is no, quote unquote, right answer here every time for where the balance is. You need to analyze those various pieces and then build a security approach that addresses that level of security need for the criticality, as well as the objectives you’re trying to achieve from a business point of view. I could probably do a whole webinar on [unintelligible] control. Hopefully that was helpful as a starting point.

Greg Orloff: It’s a great canvas for continued conversation. Barry’s just asked us, he says, “How do you test the ASM improved after changes have been made?”

John Livingston: Great. This is that whole monitoring and maintenance point. The key to this is you have to have a real-time assessment going on in the environment, rather than doing manual assessments every 12 or 18 months, which is oftentimes the current status, right? By having a platform that is in there all the time, literally every day when you come in, your attack scores will change. Meaning, okay, yesterday, here were the risks in my environment, here were the vulnerabilities, here were the users and accounts, whatever. And great, we made some progress, right? We cleaned up stuff, we changed some configs, whatever. My risk score has now gone down, and I can see that, like that chart showed. There we were just doing vulnerabilities, but you could do it at the total risk score level. And every day, week, whatever, you’d have a dashboard that has now got new red buttons because those risk scores have suddenly increased, right? New vulnerabilities are found in the environment, or somebody went in and changed the config on a firewall.

And you may not be doing it daily, to be fair, right? I mean, the reality is we may not have that many resources. You may want to be alerted to certain changes daily but do a weekly or monthly analysis. But you can only do a weekly or monthly analysis if that platform is gathering the data all the time. To us, if you’re not measuring change and how that improvement is happening, it’s kind of pointless, right? It’s like, okay, I did this, but do I know it didn’t get worse since yesterday, or last week, or last month? So to us, monitoring change is key, and that’s where that think global piece, in our view, comes into play.
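
The day-over-day comparison described in this answer can be sketched as a diff between two snapshots of per-asset risk scores. This is a minimal illustration only: the scores, the asset names, and the choice to treat a newly discovered asset as starting from zero are assumptions for the sketch, not how any particular platform computes risk.

```python
def score_increases(previous, current):
    """Return {asset: (old, new)} for every asset whose risk score went up."""
    increased = {}
    for asset, new_score in current.items():
        old_score = previous.get(asset, 0)  # a newly discovered asset starts from 0
        if new_score > old_score:
            increased[asset] = (old_score, new_score)
    return increased


# Illustrative snapshots: remediation lowered hmi-01, a firewall config change
# raised fw-01, and plc-09 was newly discovered.
yesterday = {"hmi-01": 72, "eng-ws": 55, "fw-01": 20}
today = {"hmi-01": 40, "eng-ws": 55, "fw-01": 65, "plc-09": 30}

alerts = score_increases(yesterday, today)
for asset, (old, new) in alerts.items():
    print(f"{asset}: {old} -> {new}")

print("total risk:", sum(yesterday.values()), "->", sum(today.values()))
```

The same function works for weekly or monthly comparisons, provided the snapshots are being collected continuously in the first place.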

Greg Orloff: That makes a lot of sense. It seems like Martin has a question. He said, do you have any specific tools supporting a 360-degree-based prioritization process? So how do you sort through all those things once you’ve done that Gantt chart of what the universe looks like?

John Livingston: Yeah. So I’ll tell you the way we do it, and I’m sure there are other tools out there that you could use. I want to be transparent: this is not intended to be just a pitch for Verve. But the way we would do it is because that data is in the Verve platform, right? We’re gathering that direct data from all the endpoints and doing that risk analysis. By having that in there, you can then essentially manage the vulnerabilities and the risks within the platform. Saying the tool does it for you is probably too simplistic a way of phrasing it. The tool enables you to do it, meaning by having it in the platform, you can create those risk scores we talked about, which is an automated way of doing it, but then you can go in and start to prioritize, right? You can do what I was showing you there in some of those dashboards, where you say, okay, I know I’ve got 8000 critical vulnerabilities. Alright, now I’m going to click on just the critical assets, then I’m going to click on those that don’t have application whitelisting, and I’m going to click on those that have no backup, and literally within four clicks, I’ve narrowed down my prioritization. But the key to all of this is you’ve got to have the 360 view in the same database. You can’t be doing this with spreadsheets and trying to match this to this, right? It’s got to be in the same database. Otherwise, it’s probably impossible.

Greg Orloff: That makes a lot of sense. We’ve got a question here from Mohammed, and we’re going to jump back to it. He asks, what’s an attack surface compared to a vulnerability? So maybe can you differentiate between those two for folks?

John Livingston: Yes. So if we think about the narrower, very specific definition of a vulnerability, a vulnerability is a known, well, I guess you could have zero-day vulnerabilities, but it is essentially an insecurity in a piece of software or hardware, or in the configuration of the way that device works. And so it is a relatively specific definition. Now, a vulnerability is different from a risk. I could have an asset that has no known vulnerabilities, no software vulnerabilities, no hardware or configuration vulnerabilities, but I have a risk in that that device has got hundreds of admin accounts, and those admin accounts all have the password “password.” Depending on your definition, right, we would call that a risk. We would not call that a vulnerability. What an attack surface then does is take all of those different risks and layer on one additional piece, which is how would an attacker actually leverage those risks to make an impact? So attack surface adds on this layer of, number one, risk, and number two, impact, right? How is someone going to come through here and actually deliver impact? The definition in the early part of this presentation probably does a better job than I just did of describing that. But that’s the basic concept. And by the way, we will be sending out the presentation so folks can have that.

Greg Orloff: Yeah, absolutely. So you’ve asked about that too in the chat. All content will be available. John’s agreed to share this presentation deck as well. Bernardo has a question here. I’m just going to read this one out. Bernardo asks: “OT and IT convergence is a big challenge. Legacy critical assets without security by design are vulnerable if patches or upgrades are no longer supported but they’re still in use in OT networks. Remote access is a particular risk or vulnerability. How do you currently deal with that?”

John Livingston: Yeah, so almost all environments we see have some form of legacy assets. And those could be legacy PLCs, controllers, or they may be all the way up, in the higher levels of the stack, HMIs and servers, etc. And let’s say you can’t do anything about it, right? You don’t have the budget, you don’t have the means to replace all that equipment. So there are things that we can do to address those risks and create compensating controls because, obviously, there’s no more patches coming out for XP as an example. So what do you do? So there’s a few things you could do.

Number one, we can put on application whitelisting. Several years ago, DHS, the Department of Homeland Security in the United States, did a study of all the ICS attacks that had happened, and they found that something like 35%, or something along those lines, could have been stopped with appropriate application whitelisting. Application whitelisting obviously doesn’t work on a PLC or controller, but it works on a Windows device, and you can get it all the way back to XP, where you essentially lock down so that no programs can run other than those you approve. It is a terrific tool. It really doesn’t work well in IT because of the number of changes that people make, etc. But in OT, we don’t want people to make changes. So even in those control systems where it’s legacy, we can start to deploy application whitelisting.

The second thing we’ll probably need to do is go in and clean up all of the configs, the users and accounts, and the software on those systems. One of the things we often find in these old devices is that not only are they legacy, but over the last 20 years a bunch of things have been added and no one ever cleaned it up. So I might have DVD burners on there. I’ve got some old version of LogMeIn or something. So just getting rid of software. We can get rid of accounts, right? For the last three years, every time one of the OEMs and vendors showed up to do a maintenance window, they added an admin account. So now I’ve got dozens of admin accounts on all my key systems. We can get rid of those. So there are things we can do to clean up the devices.

And then the last thing we can do is start to deploy network protections and segmentation. Even if we’re not going to do physical segmentation, we can at least put in what we call virtual segmentation, meaning we can start to alert on communications between network segments that shouldn’t be communicating. So there are absolutely things we can do in those environments. They’re just not going to be as linear.

And by the way, back to the 360 point, this is exactly the reason why you need the 360 attack surface, right? You need to know, hey, maybe the box isn’t all that important, so let’s not worry about it. Or maybe that box is already behind a really, really rigorously controlled firewall. Okay, great. Let’s not worry about that one. But others aren’t. And so thinking about what we can do in those environments where we can’t patch is really what the whole ASM thing is about.
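
The virtual segmentation idea mentioned in this answer, alerting on traffic between segments that should not be talking rather than physically blocking it, can be sketched with Python’s standard `ipaddress` module. This is a minimal illustration under assumptions: the segment names, address ranges, allowed pairs, and observed flows are all hypothetical.

```python
import ipaddress

# Hypothetical segments and the only cross-segment directions considered legitimate.
segments = {
    "control": ipaddress.ip_network("10.10.1.0/24"),
    "supervisory": ipaddress.ip_network("10.10.2.0/24"),
    "business": ipaddress.ip_network("10.20.0.0/16"),
}
allowed = {("supervisory", "control"), ("business", "supervisory")}


def segment_of(ip):
    """Map an IP address to its segment name, or 'unknown' if it matches none."""
    addr = ipaddress.ip_address(ip)
    for name, net in segments.items():
        if addr in net:
            return name
    return "unknown"


def violations(flows):
    """Return flows that cross segments in a direction not on the allowed list."""
    alerts = []
    for src, dst in flows:
        s, d = segment_of(src), segment_of(dst)
        if s != d and (s, d) not in allowed:
            alerts.append((src, dst, s, d))
    return alerts


observed = [
    ("10.10.2.15", "10.10.1.20"),  # supervisory -> control: allowed
    ("10.20.4.7", "10.10.1.20"),   # business -> control: should alert
    ("10.10.1.20", "10.10.1.21"),  # same segment: ignored
]

for src, dst, s, d in violations(observed):
    print(f"ALERT {src} ({s}) -> {dst} ({d})")
```

Because this only observes and alerts, it can be applied to legacy networks without changing how traffic actually flows.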

Greg Orloff: Let’s shift a little bit then. What do you see as a role for digital twins, where you might perform breach simulation? How’s that coming into play in the IoT security environment?

John Livingston: I think it’s a great opportunity for IoT security. So there are two types of digital twins in my book. One is the very advanced, truly doing the full digital twin of control systems or control system processes, which almost has to be done literally plant by plant, process by process, is very time-consuming. Anyway, that’s one of them. There’s another element which is digital twinning at an environment level.

So, if we think about what we’ve done when we do this 360-degree view, we know every asset, we know how the assets communicate, we know every piece of software, every account, etc. So when you think about what the ASM is doing, it’s really creating a digital twin of your control system. It’s not a digital twin in the sense of knowing every process setting and how to tweak this process, but what it does is it digital twins out the controls and the security environment in those. And then we can run a bunch of scenarios on this digital twin of your environment to see whether that attack would work given the controls that are in place. So I think it’s a great concept. I think the full-on digital twin is great, but oftentimes, it takes a lot of time and expense to create. You can get a lot of the impact from a security point of view by doing a simpler model focusing on the OT part of that environment and doing threat models.

Greg Orloff: Martin says, “Hi, great session.” Thank you, Martin. “Extending discovery through the six steps is really useful. What is your view on preventive techniques that are becoming available to mitigate real-time attacks, disrupt the attack process in IoT, etc.? The attack window remains in a discover-to-remediate approach simply due to time to remediate and constant change, so we’re behind the attack curve. Innovations in cyber are looking to address this, such as software-defined segmentation, moving target defense, etc.” Do you have any view on what’s coming from an IoT or an OT perspective?

John Livingston: So, I think that there will be more and more around active defense in OT. If I just take what we talked about, application whitelisting: application whitelisting is an active defense. It is stopping an attack. Now, granted, it’s a relatively rudimentary one, or a blunt-instrument one, right? It’s not like you’ve got CrowdStrike going to the cloud, doing analysis, and stopping a process. Even within the world of AV, we have active defense going on, right? You’ve got an AV tool that is blocking malware to some extent. So we’re already doing it to some extent, and frankly, segmentation does that to some extent. But I think what is really being referred to here is the more advanced detection and remediation, or EDR-type, tools.

What we’re going to see is that first in the world of the true IoT, really the edge areas where the device or, a better phrase, the process is less critical. What do I mean by that? The downside risk of taking a plant offline because you’ve made a proactive defense, because you saw something in the environment, is very high, right? If I take that down, that cost is incredible. And so I think this is going to come from more of the devices that are actually less critical. People are going to get comfortable with it in more and more of those devices, and then eventually it will move into the truly critical part of a control system, because of this concern of the quote-unquote type two error, meaning we reacted to something, we defended against something, but that was actually an appropriate call to a controller and it’s not the plan. So the short answer is, I think we’re already doing some of it today. I think some of that has been proven to work well. I think those more advanced things are likely to come more from the IoT side. It’s going to take a while until they move into the OT and the critical processes when we think of control systems.

Greg Orloff: Gotcha. So, needing to kind of wrap up because I think we’re coming to the top of the hour: John, from your experience, are there any key first steps leaders can take to create a culture of alignment across an organization for OT cybersecurity?

John Livingston: Absolutely. The first thing we recommend, before you do any of what we’ve talked about today, is to get a workshop together with IT leaders, OT leaders, and the security team, and walk through: (a) are we all aware of what the risks are? Make sure people are aware, because oftentimes there’s just a complete lack of awareness. (b) Getting clear, from an operational point of view, on the things that we cannot give on, right? And then sitting down with that team and building out a high-level objective function. Okay, we recognize these are the attacks, these are the threats, and we’ve got to do something about them. We recognize these are kind of hard stops that OT can’t handle. Great. Now, let’s say, over the next 18 to 24 months, we agree we’ve got to go in this direction. We’re going to put a multi-functional team together that’s going to set that as our North Star, and we’re going to work towards it, and we’re going to have a firm deadline: within 12 months, within 18 months, whatever it is, we’re going to have X improvement in OT security. Getting that initial series of workshops done, which we typically do over a three-workshop session, is absolutely necessary, because otherwise you get no buy-in and none of this goes on. So I hope that helps. That’s what our recommendation would be as the first thing to do.

Greg Orloff: That’s fantastic, John, and I think that was a great way to wrap up the session. So thank you, John, for sharing your amazing insight and your experience. And thanks again to our sponsors: building cybersecurity organization, Autonet, Euro tech keep actor, Verve, and zona. Your support ensures that we have all this great content to share with you folks. Thank you to the audience, our attendees, for listening in and for your great questions. Please make it a point to visit our events page for upcoming educational webcasts as well as events brought to you by IoT World, including our industrial University coming up on October 26th this year. And we also have IoT World’s Manufacturing Day right before the holidays on December 7th. So, without further ado, moving to our next session, incentivizing ICS cyber protection through insurance. Have a great day, everyone. Thanks for joining.
