Managing ICS Endpoint Security During COVID-19

The world has been turned upside down by the rapid spread of the COVID-19 virus in the first few months of 2020. The novel coronavirus has threatened not only the health, future and stability of people, but of businesses as well. Attempts to slow the spread of the pandemic and “flatten the curve” have resulted in national and worldwide mandates to shelter in place, closing non-essential businesses and keeping residents at home as much as possible.

But it’s not just attacks on our health systems we need to be concerned about. Much has been written about the growing number of cyber threat actors taking advantage of the physical virus to spread their attacks. Perhaps most dangerous for the sectors that rely on industrial control systems (ICS, also called operational technology or OT), the number of potential attack vectors increased with the rapid shift to remote access support of these critical systems.

We understand that many companies have robust remote access policies, procedures and technology in place. But for thousands or tens of thousands of others, the urgency of protecting team members has meant leaving a known – or unknown – gap in the security of their control systems.

We are strong believers in the need for robust secure remote access technologies in all ICS environments. In practice, however, those measures, even when deployed, may not hold up when the volume of connections increases dramatically.

So, what can we do to protect against remote cyber threats in ICS environments?

Beyond the core of establishing a robust secure remote access program, we recommend four steps to add greater defensive layers. These steps will not only help secure the environment during the immediate pandemic period, but will also provide a robust foundation for securing these systems over the long haul.

4 Steps for Greater Defense Against ICS Attacks

Step 1: Develop a detailed inventory of all software, hardware, users, accounts, etc. in your OT environment

Yes, we have all heard it before: you can’t protect what you don’t know about. But in this case, the need for inventory goes beyond that. You need a baseline of what is there to be able to manage your endpoints in a way that can protect them from new, potentially damaging software, files, or users that are created during this time. This baseline enables you to do several things:

Ensure you understand the current risks already present in the environment. For instance, are there settings on machines that could enable a remote worker to access other devices without approval? Is there software on devices that doesn’t need to be there and could create unknown entry points into the system? Are there vulnerabilities in the current software that attackers might use to gain access through the remote system and expand an attack?
Provide a baseline to identify changes that occur after a worker uses remote access. This is critical to any secure remote access solution. Without a baseline, it is very difficult to know if changes that have been made are approved or potentially malicious.
Ensure that user accounts, approved network paths in firewalls and switches, and similar access grants are all actually required and follow the principle of “least privilege”.
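To make the baseline idea concrete, here is a minimal Python sketch of comparing a current endpoint snapshot against a stored baseline. The `AssetSnapshot` schema, the host names, and the three tracked categories are assumptions chosen for illustration; a real inventory would come from your asset management tooling and would track far more attributes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AssetSnapshot:
    """Point-in-time record of one endpoint (hypothetical, simplified schema)."""
    software: frozenset = field(default_factory=frozenset)
    accounts: frozenset = field(default_factory=frozenset)
    open_ports: frozenset = field(default_factory=frozenset)


def diff_against_baseline(baseline, current):
    """Return {host: {category: {"added": [...], "removed": [...]}}} for changed hosts.

    Both arguments map a host name to an AssetSnapshot. A host missing from
    one side is treated as an empty snapshot, so new or vanished hosts show up too.
    """
    changes = {}
    for host in sorted(set(baseline) | set(current)):
        before = baseline.get(host, AssetSnapshot())
        after = current.get(host, AssetSnapshot())
        host_changes = {}
        for category in ("software", "accounts", "open_ports"):
            old, new = getattr(before, category), getattr(after, category)
            added, removed = sorted(new - old), sorted(old - new)
            if added or removed:
                host_changes[category] = {"added": added, "removed": removed}
        if host_changes:
            changes[host] = host_changes
    return changes


if __name__ == "__main__":
    # Example data is invented for illustration only.
    baseline = {
        "hmi-01": AssetSnapshot(
            software=frozenset({"HMI Runtime 8.1"}),
            accounts=frozenset({"operator"}),
            open_ports=frozenset({502}),
        )
    }
    current = {
        "hmi-01": AssetSnapshot(
            software=frozenset({"HMI Runtime 8.1", "RemoteTool 2.0"}),
            accounts=frozenset({"operator", "vendor_tmp"}),
            open_ports=frozenset({502, 3389}),
        )
    }
    for host, detail in diff_against_baseline(baseline, current).items():
        print(host, detail)
```

Running a comparison like this after each remote session gives reviewers a short list of what actually changed, which can then be checked against what was approved.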

Step 2: Accelerate the remediation of identified risks using a prioritized risk-rating based on asset criticality and potential vulnerability

Once an operator understands the baseline environment and the risks present, they can build a holistic picture of all endpoints and network connections. That picture should include how critical each asset is to operations, how severe its vulnerabilities are, and whether it is covered by compensating controls such as application whitelisting or layers of network defense. From this “risk-rating”, the operator can then begin remediation on the assets that pose the greatest risk to operations. These actions can include:

Patching vulnerable systems. We recognize that patching is often difficult in operating environments. We have proven, however, that there are safe ways to patch many systems efficiently with automated tools controlled by technicians who know the process.
Eliminating configuration risks. Once the risk assessment is complete, operators will have identified configuration errors or overlooked items, such as firewall rules that are not hardened enough, or device settings that make access easier for attackers, like open ports and services or a lack of lockouts after failed logins.
Providing compensating controls. For some assets, the criticality requires extra steps to ensure protection. This may involve further limiting access or deploying application whitelisting with tight rules on what can run, to prevent new software from running on the endpoint.
Removing software, users, or accounts that are not necessary.
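One way to picture the prioritized risk-rating described in this step is the Python sketch below. It is only an illustration: the 1 to 5 criticality scale, the use of a single worst-case severity score per asset, and the discount factors for compensating controls are all assumptions made for the example, not an industry-standard formula.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    criticality: int             # 1 (low) to 5 (process-critical); assumed scale
    worst_vuln_severity: float   # 0.0 to 10.0, e.g. a CVSS base score
    has_whitelisting: bool = False
    behind_network_defense: bool = False


def risk_rating(asset: Asset) -> float:
    """Illustrative score: criticality x severity, discounted per compensating control."""
    score = asset.criticality * asset.worst_vuln_severity
    if asset.has_whitelisting:
        score *= 0.7   # assumed discount, not an industry standard
    if asset.behind_network_defense:
        score *= 0.8   # assumed discount, not an industry standard
    return round(score, 1)


def prioritize(assets):
    """Return assets ordered from highest to lowest risk rating."""
    return sorted(assets, key=risk_rating, reverse=True)


if __name__ == "__main__":
    # Asset names and values are invented for illustration only.
    fleet = [
        Asset("historian", criticality=3, worst_vuln_severity=9.8),
        Asset("safety-plc", criticality=5, worst_vuln_severity=7.5,
              behind_network_defense=True),
        Asset("eng-workstation", criticality=4, worst_vuln_severity=9.8,
              has_whitelisting=True, behind_network_defense=True),
    ]
    for asset in prioritize(fleet):
        print(f"{asset.name}: {risk_rating(asset)}")
```

The point of the sketch is the ordering rather than the absolute numbers: an asset with a severe vulnerability can still rank below a more critical asset, and existing compensating controls legitimately push an asset down the remediation queue.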

Step 3: Define and alert on “out-of-normal” changes that occur on systems

When vendors and employees connect to these systems, they will make changes. For many operators, the kind of change management process that might be present in a NERC CIP medium-impact facility in the North American power sector does not exist, or change management is more informal. This is the time to address this procedural gap and apply technology that identifies critical changes.

Most operators are unlikely to attempt major changes while teams are working remotely. We therefore recommend that the change management approval process be tightened significantly and operate on an exception-only basis. Once the pandemic period passes, more refined policies can be put in place, but in the near term a limitation on change would be prudent.

At the same time, visibility into any changes that fall outside normal bounds is key to security.
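As a rough sketch of an exception-only approach, the Python below compares observed changes with a list of explicitly approved exceptions and reports everything else as “out-of-normal”. The `Change` record and its `kind` labels are hypothetical; in practice the observed changes would come from your endpoint monitoring and the approvals from your change management records.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """One observed or approved change (hypothetical, simplified record)."""
    host: str
    kind: str   # e.g. "software_added", "account_added", "port_opened"
    item: str


def out_of_normal(observed, approved):
    """Return observed changes that have no matching approved exception."""
    approved_set = set(approved)
    return [change for change in observed if change not in approved_set]


if __name__ == "__main__":
    # Example data is invented for illustration only.
    approved = [Change("hmi-01", "software_added", "RemoteTool 2.0")]
    observed = [
        Change("hmi-01", "software_added", "RemoteTool 2.0"),
        Change("hmi-01", "account_added", "vendor_tmp"),
    ]
    for change in out_of_normal(observed, approved):
        print(f"ALERT {change.host}: unapproved {change.kind} ({change.item})")
```

Because approval is the exception rather than the rule, anything not explicitly listed is surfaced for review, which matches a period in which few changes are expected.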

Step 4: Ensure the incident response policies are clear and recovery technology and plans are in place

Many companies may become so consumed by adjusting to the virus that they forget about the potential for cyber viruses, which may be as damaging to the company as the physical one. It is critical at this time that companies publish and communicate their incident response (IR) plans.

For some, this may be the first time such plans have been developed. They can be simple. But it is critical that both remote workers and those onsite know what to do if they identify a potential incident.

Just as important as the incident response process is ensuring that assets have robust recovery technology and plans. This includes robust, up-to-date, and tested backups. One of the greatest risks to ICS is the increasing spread of ransomware. Knowing that backups are up to date and can be restored easily is a key element of a robust defense against ransomware. We strongly encourage immediate action to ensure all backups are up to date and tracked.
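Tracking backups can start very simply. The Python sketch below flags assets whose backup or restore test is missing or older than a chosen threshold. The `BackupRecord` fields and the 7-day and 90-day thresholds are assumptions for illustration; appropriate windows depend on how often each asset changes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class BackupRecord:
    """Backup status for one asset (hypothetical, simplified record)."""
    asset: str
    last_backup: Optional[datetime]
    last_restore_test: Optional[datetime]


def backup_gaps(records, now,
                max_backup_age=timedelta(days=7),    # assumed threshold
                max_test_age=timedelta(days=90)):    # assumed threshold
    """Return {asset: [problems]} for assets with missing or stale backups or tests."""
    gaps = {}
    for record in records:
        problems = []
        if record.last_backup is None:
            problems.append("no backup on record")
        elif now - record.last_backup > max_backup_age:
            problems.append("backup is stale")
        if record.last_restore_test is None:
            problems.append("restore never tested")
        elif now - record.last_restore_test > max_test_age:
            problems.append("restore test is stale")
        if problems:
            gaps[record.asset] = problems
    return gaps


if __name__ == "__main__":
    # Dates and asset names are invented for illustration only.
    now = datetime(2020, 4, 15)
    records = [
        BackupRecord("historian", datetime(2020, 4, 14), datetime(2020, 3, 1)),
        BackupRecord("eng-workstation", datetime(2020, 2, 1), None),
    ]
    for asset, problems in backup_gaps(records, now).items():
        print(asset, problems)
```

Checking the restore test alongside the backup date reflects the point above: a backup only helps against ransomware if it is known to restore.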

They say that we are living in extraordinary times. These times call for extraordinary measures such as working from home and remote access. Let us not lose sight of the fact that they also call for extraordinary defenses, given the additional risks posed by remote access.

The great news is that this may be the “emergency” that can drive rapid action on endpoint management and security that is so needed in the ICS industry.
