The need for a new approach to OTSM (OT Systems Management)

For the past several years, Verve has emphasized the importance of OT Systems Management for both the reliability and the cyber security of OT/ICS. For too long, these systems have been excluded from organizations’ traditional IT systems management. There may have been some good reasons for this, from a lack of IT understanding of the devices and their operations to the complexity of processes and control system designs. However, in today’s threat landscape, and with growing regulatory and insurance requirements, it is no longer feasible to put these systems behind a firewall and forget them. There are too many threat actors targeting them, and too many ways to bridge the separation between IT and OT networks, to leave these systems unmanaged.

The challenge, however, is finding an efficient approach to do this. In IT, organizations have invested years in technology and procedures that are mainly centrally managed, with a drive toward “lean operations” to maintain those systems. In OT, however, the assets are spread over thousands of miles, dozens or hundreds of sites, and dozens or hundreds of different vendors and device types, with different legacy network architectures and no centralized tooling to manage them.

It is no longer feasible to exclude these systems from management. Verve’s recent analysis of ICS advisories and vulnerabilities highlights the growing risk to these devices.

OT is now in the cross-hairs of attackers: manufacturing ranks as the most-targeted sector, with energy close on its heels.

Furthermore, insurance companies have tightened their requirements for OT coverage, now requiring separate policy applications and details on the steps organizations are taking to manage vulnerabilities in OT.

Over the past few years, Verve has been working with dozens of clients to address these OT systems management challenges. From this experience, we developed a set of approaches that all industrial organizations can take to improve both the effectiveness and efficiency of their OT Systems Management function to drive security and reliability. We are providing our technology platform and managed service support to our clients that need it to help drive a step-function improvement in results with much lower total costs than would be feasible without such approaches.

What is OT Systems Management?

Over the past decade, IT Service Management (ITSM) leveraging COBIT, ITIL (IT Infrastructure Library), and other standards has become a proven, rigorous process in most large enterprises. The basic components of ITSM include designing, planning, operating, and controlling IT services provided to users.

In practice, this includes how hardware is configured, patched, managed, and deployed, how software is developed and deployed, and how teams respond to incidents.

These practices are critical to improving overall IT cyber security posture. In fact, according to the National Initiative for Cybersecurity Education’s (NICE) CyberSeek database, over 70% of job openings in cyber security are related to operations, maintenance, and provisioning, jobs often contained within ITSM functions.

As we move away from the traditional IT realms of PCs, laptops, cloud-based servers, and mobile devices into the world of Operational Technology (OT) found in manufacturing plants, the power grid, building controls, and other cyber physical systems, the role IT plays in managing these devices is less clear.

OT is often controlled by manufacturing engineers, process control engineers, or instrumentation and controls technicians. The systems are critical to the functioning of physical processes, such as power generation and transmission or manufacturing production. These OT systems are built to last ten to twenty years, as opposed to the five-year lifecycles of traditional IT equipment.

OT systems contain many embedded devices running on firmware developed by specific OEMs that are not built with open management interfaces. Programming these systems often means accessing OEM-specific tools to update or reconfigure them.

Over the past decade, corporations have connected OT systems to the traditional enterprise IT infrastructure to drive greater efficiency and effectiveness of operational processes, and they will continue to do so in the decade to come. But with this added benefit come risks from IT networks: ransomware, hacking for espionage, and potential disruption of physical processes to cause physical damage.

Greater connectivity offers the hope of leveraging the cloud for advanced predictive maintenance analysis, improved operational efficiency by adjusting control parameters, and more efficient use of labor by managing sites through centralized, remote access. These financial drivers are significant, so much so that this idea has been branded as Industry 4.0.

As these systems connect, security becomes a much greater issue. Systems formerly “air-gapped” or “islanded” from enterprise IT and its access to the internet and communication applications, such as email and cloud interfaces, are accessing the enterprise infrastructure to take advantage of scale and the power of big data analytics.

Chief Information Security Officers (CISOs) and Chief Information Officers (CIOs) are asked by their boards of directors to secure these systems to ensure they have the same level of security as the rest of the devices within the enterprise. As a result, IT leaders want to increase the integration of IT and OT in driving standardized cyber security across all endpoints and networks.

However, in most organizations, OT assets such as HMIs, servers, PLCs, relays, RTACs, and other intelligent electronic devices are excluded from ITSM processes for a variety of reasons.

For reasons ranging from organizational boundaries, to IT personnel’s lack of skills on OT systems, to regulatory requirements, ITSM practices do not extend to these systems. Further, OT staff are already under headcount pressure to increase efficiency.

Many foundational elements of cyber security are not present in OT. Inventories are inaccurate, configuration and patch databases are outdated, and account and user access management are poorly executed. This is partly because tools have not been available to automate these processes, given the sensitive, unique, and embedded nature of OT assets.

This gap means IT-OT convergence or integration in cyber security has no foundation to build on. The C-suite and board of directors are left in the dark about real risks to cyber physical systems because IT leaders (CISOs and CIOs) cannot measure progress or risk in OT the same way they can in IT, given the weak foundation on which security tools and processes are deployed.

For instance, if a company deploys a detection platform into an OT environment without foundational elements, it can never be certain that all hardware and software assets are accounted for. It cannot effectively protect the systems because it lacks active device management.

It would be like installing locks on all your doors but ignoring the dozens of windows in the house that do not have locks and are easily accessible on the first floor.
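
The inventory gap described above can be made concrete with a small reconciliation check. The following is a minimal Python sketch, not a description of any particular product. It assumes two inputs that the article does not specify: a recorded inventory and a list of devices actually observed on the network, both keyed by MAC address. It reports devices that are on the network but missing from the inventory (the unlocked windows), as well as inventory records with no matching device. The `Asset` fields and the `reconcile` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    mac: str          # assumption: MAC address is the matching key in this sketch
    site: str
    description: str


def reconcile(recorded, observed):
    """Compare the recorded inventory with devices observed on the network.

    Returns a dict with two lists:
      "unmanaged": observed devices that have no inventory record
      "stale":     inventory records with no observed device
    """
    # Normalize case so "00:1A:..." and "00:1a:..." count as the same device.
    recorded_by_mac = {a.mac.lower(): a for a in recorded}
    observed_by_mac = {a.mac.lower(): a for a in observed}

    unmanaged = [a for mac, a in observed_by_mac.items() if mac not in recorded_by_mac]
    stale = [a for mac, a in recorded_by_mac.items() if mac not in observed_by_mac]
    return {"unmanaged": unmanaged, "stale": stale}


if __name__ == "__main__":
    # Illustrative data only; not taken from any real site.
    recorded = [
        Asset("00:1a:2b:3c:4d:5e", "Plant A", "Operator HMI"),
        Asset("00:1a:2b:3c:4d:5f", "Plant A", "Decommissioned PLC"),
    ]
    observed = [
        Asset("00:1A:2B:3C:4D:5E", "Plant A", "Operator HMI"),
        Asset("aa:bb:cc:dd:ee:ff", "Plant A", "Unknown switch"),
    ]
    for category, assets in reconcile(recorded, observed).items():
        for asset in assets:
            print(category, asset.site, asset.mac, asset.description)
```

In practice the matching key and the source of the observed list would depend on the environment; the point of the sketch is only that any control layered on top is blind to whatever lands in the "unmanaged" list.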

To develop robust OT cyber security roadmaps and foundations, organizations with OT systems (everything from manufacturing process controls to building control systems to security access systems) should embrace the concept of OTSM, paralleling their ITSM practices but within the unique environments of operational systems.

Achieving a mature level of OTSM is critical to improve overall ROI from increasingly connected industrial systems and to ensure foundational elements of OT cyber security are in place to protect critical infrastructure from targeted and untargeted attacks.

5 Benefits of a Robust OTSM Program

  • Insight into all hardware and software in the network to ensure vulnerabilities are identified quickly
  • Properly updated and configured systems to reduce opportunities for cyber attacks or reliability issues
  • Operationally efficient system updates, with automation of key operational tasks
  • Consistent reporting and monitoring across IT and OT for simplified progress documentation
  • Effective advanced security controls built with proper visibility and access to the underlying endpoints and network data

Best Practices based on a decade of OT Systems Management support

For the past decade, Verve has helped industrial organizations mature their OT systems management functions through the application of our software platform, the Verve Security Center, as well as through our managed services support. During this time, we have learned many lessons about the success of such programs. Below we synthesize these into key lessons learned that may help others on their journeys.

1. Align leadership on priorities.

OTSM requires a significant change effort. Traditionally, industrial control systems have been long-term capital investments that last fifteen to twenty years between major upgrades.

OTSM requires regular management: updating, configuration management, access management, and vulnerability management. In many cases, this requires changing the mindsets and behaviors of team members and adding functional and procedural requirements. Senior leadership is key to making effective changes within already stretched operational organizations.

Perhaps the greatest difference between success and failure is the degree of alignment between the operations teams and the security and systems management organizations. One of the most effective approaches we have seen came from an organization that appointed one of its most experienced ICS engineers as a core leader of its overall cyber security effort. This engineer had led all of the company’s automation engineering projects and its regulatory security program for OT assets, and had managed some of its most critical assets. The decision immediately brought the security team credibility with operational personnel, as well as an implicit understanding of the challenges of OT systems management compared with the same function in IT.

Whether the choice is structural, such as the one described above, procedural (establishing aligned OT-specific policies and procedures), or cultural (increasing collaboration between the separate teams), alignment among the IT, Security, and OT functions is critical.

2. Centralize systems into an enterprise OT Systems Management platform

Yes, Verve does offer such a platform. However, we didn’t always do so. We began our corporate life as a services organization trying to help organizations mature their reliability and security with support services. We learned this lesson the hard way by not having such a central management console. OT systems are, by their nature, distributed geographically, complex mixes of different vendors and types of devices, legacy and newer devices, different protocols, etc. In addition, they require very high levels of uptime and specialized process knowledge to manage. IT systems management often includes a range of different “Towers” or “Functions,” each with its own technology solutions: Vulnerability scanning, patch management, configuration management, user and account management, software management, CMDB, etc. And IT resources have been trained for years to manage these systems actively. The combination of these things – complexity, specialty knowledge, varied “towers,” and limited resources – is a recipe for high cost and potential reliability risks.

After years of trying to manage these systems by following the traditional IT or OT approaches offered by OEM vendors, we realized that a centralized platform was the only way to do this efficiently and effectively. We call the concept a “Think Global: Act Local” approach, as pictured below. The basic idea is to centralize all risk information (comprehensive inventory, software installations, vulnerabilities on all devices, user and account management, configuration risks and compliance, backup status, anti-virus status and alerts, etc.) into an enterprise database that allows for risk prioritization. This is what we call “Think Global.” At the same time, that platform must allow for control over remediation by those closest to the process. OT Systems Management requires OT knowledge of the process and what is feasible at any given time on any given system. Therefore, the platform must provide automation capabilities to those who understand the process, which is what we call “Act Local.”

This platform must get deep into the asset base as well as provide a “360-degree” view of the risks because the most direct remediation measure may not be feasible in an operational site that runs 24x7x365.

Going deep and direct to the asset provides the asset inventory visibility that such a 360-degree view requires.
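
The “Think Global: Act Local” idea can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not how the Verve Security Center is implemented. The `Finding` fields, the scoring rule (severity multiplied by an asset criticality rating assigned by OT staff), and the `patch_feasible` flag are all hypothetical. The sketch ranks findings from every site in one list (Think Global), then hands each site its own queue in which the direct fix is replaced by a compensating control where the local team has marked patching as infeasible (Act Local).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Finding:
    site: str
    asset: str
    issue: str
    severity: float        # assumption: 0-10 scale, similar to a CVSS base score
    criticality: int       # assumption: 1 (low) to 5 (process-critical), set by OT staff
    patch_feasible: bool   # assumption: local team's judgment on applying the direct fix


def risk_score(finding):
    # Hypothetical prioritization rule: technical severity weighted by process criticality.
    return finding.severity * finding.criticality


def think_global(findings):
    """Rank findings from all sites in a single enterprise-wide list."""
    return sorted(findings, key=risk_score, reverse=True)


def act_local(findings):
    """Split the ranked list into per-site queues with a feasible action for each item."""
    queues = defaultdict(list)
    for finding in think_global(findings):
        action = "patch" if finding.patch_feasible else "compensating control"
        queues[finding.site].append((finding.asset, finding.issue, action))
    return dict(queues)


if __name__ == "__main__":
    # Illustrative data only.
    findings = [
        Finding("Plant A", "HMI-1", "missing OS patch", 7.5, 3, True),
        Finding("Plant B", "PLC-7", "vulnerable firmware", 9.8, 5, False),
        Finding("Plant A", "ENG-WS-2", "dormant admin account", 5.0, 4, True),
    ]
    for site, queue in act_local(findings).items():
        for asset, issue, action in queue:
            print(site, asset, issue, action)
```

The design choice worth noting is the separation of concerns: prioritization uses data from every site, while the decision about what is feasible on a running process stays with the people who operate it.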

3. Develop the right talent – both internal and external.

Because of the sensitivity of these OT systems, and because vulnerability management must be integrated with other systems management functions such as configuration hardening, network management, and AV or whitelisting management, it is critical that the personnel involved in OT Systems Management are OT-knowledgeable. Many organizations have already experienced the impact of IT trying to apply its traditional IT Systems Management tools and techniques to OT: downtime from inappropriate patching, PLCs or other devices “bricked” after being scanned by a vulnerability scanner, and so on.

We often hear this debate of “is OT security and management an IT or an OT function?” It is a false debate – it is and will remain BOTH. You need people who understand security and risks, often IT-experienced personnel. But you also need people with deep process knowledge of how to apply those philosophies in your OT environment.

CyberSeek, a database supported by NIST’s NICE program, provides a good reference for the types of skills that will be required for systems management and security. CyberSeek analyzes open job requisitions for cyber security positions and breaks them into different skill types. Analysis of this database shows that over 75% of the security tasks are related to systems management.

So now that the skills are defined, the question arises whether to build this capability in-house or to leverage 3rd parties. Again, one might argue we are “talking our book” in this area, but the lessons have been learned from real-world experiences with clients. We strongly believe that the right answer combines internal and external resources. Internal resources are necessary because they have the ongoing relationships, internal process connections, and inter-departmental connections. But external resources also bring significant advantages: the ability to maintain scale and training, cross-sector experience from other organizations, and established processes.

One thing we have seen time and again is an organization that attempts to build its own OTSM team. They scale it to 3-5 people. They recruit internally and externally. They train this group on their systems. The new hires often come in after the program is in flight, so they must learn the history and the currently deployed approaches. Within a year, one or two have found opportunities elsewhere, either in other departments or outside the organization (given the high turnover of cyber staff), one or two have not worked out as expected, and perhaps one is ready to be promoted. A year or 18 months later, the company is out recruiting and training again for one or two personnel. Such a team is often just sub-scale. A quality 3rd-party partner brings the scale so that the team assigned is flexible, and when an individual leaves, others are there to step up. The partner has an ongoing recruiting and training function, and it regularly brings knowledge from the other organizations for which it performs OTSM. By integrating external resources with internal ones, the organization can develop long-term knowledge while managing the scale and variability of OT security.

We have seen several companies adopt this OTSM approach to manage and secure their OT assets and networks.

POWER UTILITY CASE STUDY

This North American company operates dozens of facilities across a wide geographic range. The CEO established an objective that IT and OT would achieve the same cyber security standards.

The mandate was to find ways to build systems management foundations across IT and OT. So, instead of saying ITSM and OTSM, they refer to it as SM or Systems Management. IT was well ahead of OT when they started their journey in 2017.

There was no accurate inventory of hardware or software. They had no visibility into their facilities’ configurations, user accounts, password settings or backup statuses. Where they did manage devices, they did so in an ad hoc way with manual processes and dozens of OEM toolkits.

Over a two-year period, the utility company built an OTSM process to mirror their ITSM process, using a range of standards from ISO to CIS to establish basic objectives and scoring mechanisms to track progress. This created clear executive alignment on objectives and processes to manage different environments.

They deployed tools and automation relevant for the environments and trained resources to assess, manage and remediate endpoints across their systems.

As a result, the utility company significantly increased their cyber security maturity in both IT and OT. They built a strong foundation for additional controls and operational consistency. They effectively communicate their progress and status across all networks to the C-suite and board of directors, and they measure traction against security improvement on a regular basis.
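
The scoring mechanisms mentioned in this case study are not described in detail, but the general idea of tracking progress against a set of foundational controls can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the control names, the weights, and the rule that maturity is the weighted share of assets covered by each control. It is not the utility’s actual method, nor a scoring scheme defined by ISO or CIS.

```python
def maturity_score(coverage, weights):
    """Return a 0-100 score from per-control coverage fractions.

    coverage: control name -> fraction of assets (0.0 to 1.0) where the control is in place
    weights:  control name -> relative importance (hypothetical values)
    Controls missing from `coverage` are treated as 0% covered.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(weight * coverage.get(control, 0.0) for control, weight in weights.items())
    return round(100 * weighted / total_weight, 1)


if __name__ == "__main__":
    # Hypothetical controls and weights; the same scale is applied to IT and OT
    # so that both can be reported to leadership on a single basis.
    weights = {"inventory": 3, "patching": 2, "configuration": 2, "accounts": 2, "backups": 1}
    it_coverage = {"inventory": 0.95, "patching": 0.9, "configuration": 0.8, "accounts": 0.9, "backups": 0.85}
    ot_coverage = {"inventory": 0.4, "patching": 0.2, "configuration": 0.1, "accounts": 0.3}
    print("IT maturity:", maturity_score(it_coverage, weights))
    print("OT maturity:", maturity_score(ot_coverage, weights))
```

Recomputing the same score at regular intervals gives the kind of trend line that can be reported to the C-suite and board, provided the underlying coverage data comes from an accurate inventory.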

Establishing OTSM with Verve Industrial

Success in OT cyber security and reliability requires a new foundation in OTSM. Systems management is critical to ensuring connected systems are protected and managed appropriately.

OT must take a page from the IT playbook and deploy processes, tools, and training that enable the 50-60% of cyber security functions that are foundational in systems management. Without it, IT-OT integration and true OT security will be difficult, if not impossible, to achieve.

Verve Industrial Protection has worked with many clients seeking OTSM capabilities in our twenty-five years of business. Leveraging the Verve Security Center (VSC), a true endpoint management platform built for OT, with the support of our OT expertise, these companies doubled their cyber security maturity with measured results to track progress.

The vendor-agnostic VSC makes OTSM possible and efficient through automation and visibility. Its closed-loop approach provides visibility and tracking of assets and vulnerabilities, as well as the ability to manage patches, configurations, user accounts, etc., from a single platform.
