Our webinar on the results from 10 years of Verve’s cyber security assessments in industrial and critical infrastructure environments left you with a bit of a cliffhanger – what do you do when you get an insightful and complete assessment back? What do we commonly see? Here is a hint: it’s often insecure Windows systems.

To reiterate some of the main objectives of our webinar:

  • Choose an assessment methodology that targets the right parts of the organization’s risk or vulnerability profile
  • Use a common language, and standardize on a framework that works for your organization
  • Obtain a complete and accurate assessment that uses a deep, technology-driven approach to gain detailed perspectives on the system(s) under consideration
  • Identify a select set of achievable improvement targets that can positively affect the organization
  • Deliver on them (or why bother?)

We’ve found, with notable frequency, that:

  • Most networks “rot” or “drift” from their initial implementation over time
  • Most endpoints (even if centrally managed) have software and security configurations that could be improved
  • Most networks have a variety of security features and technologies in place that are not being used to their full potential (e.g., outbound firewall rules or VLAN security enforcement)
  • Most organizations do not have secure base images (or configurations) for commodity OT systems, or a sufficient backup/restoration program
  • And most environments still need to define proper OT cyber security policies and procedures

While your organization is unique, we have not seen one that did not have room for improvement: endpoint security, hardening of network device configurations, synchronization of passwords/usernames across assets, tightening of access controls, OT cyber security policies and procedures, and enforcement of the security technologies already in place.

There will always be risks tied to unpatched vulnerabilities on converged IT/OT systems and to human error or misconfiguration, and these risks are a contributing factor in the rise of cyber attacks affecting Operational Technology in general.

In addition to our webinar, we want to suggest an emerging concept that assists decision makers and security teams in improving security: moving toward a specific kind of “blueprint” that serves as the basis for securing assets running commodity Operating Systems (OS), and as part of a transitional approach for moving currently insecure assets to more secure, future-facing configurations.

This blueprint or standard can be applied to many types of assets, but in this case we have chosen one type of asset that is commonly deployed in an insecure state, and we give you an example roadmap and blueprint to secure it.

Put another way, this approach was built around creating a secure “golden image” and rolling it into your organization as part of a maintenance practice rather than a security-only mindset. How engineer of us, eh? (In both the Germanic and Canadian senses of that statement.)

Blueprints for OT endpoint cyber security

Based on Sarah Fluchs’ concept that you can draw up the blueprint for a secure Programmable Logic Controller (PLC) much as you would the blueprint for a house, a discussion emerged between Admeritia GmbH and Verve about standardizing an endpoint security blueprint.

First, we need to describe what it is; second, what our budget and needs allow – in other words, what good, better, and best look like. For example, a house can come in multiple variations: a good, warm, watertight home on a budget; a better version using top-tier materials; or the best money can buy, fully customized.

Using the layered blueprint model, let’s examine a Windows-based HMI as a starting point, but let’s begin with the four components of a blueprint, starting from the bottom (layer 1); a small code sketch of this structure follows the list:

  • Function (FC) – what are the preconditions and objectives of that system?
  • Risk (RI) – what are the risks, or uncertainties that need to be evaluated?
  • Requirement (RE) – what are the security requirements we need to reduce the risk to a level we are comfortable with?
  • Implementation (IM) – what is the design of the overall solution, and what do we need to implement?
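
To make the layered model concrete, here is a minimal sketch in Python (purely illustrative; the class and field names are our own, not part of any Verve or Admeritia specification) of how the four layers might be captured as structured data rather than prose:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Function:
    """FC layer: preconditions and objectives of the system."""
    purpose: str
    business_context: str
    preconditions: List[str] = field(default_factory=list)

@dataclass
class Risk:
    """RI layer: an uncertainty or scenario to be evaluated."""
    scenario: str
    consequence: str

@dataclass
class Requirement:
    """RE layer: a control intended to reduce risk to an acceptable level."""
    control: str
    addresses: List[str] = field(default_factory=list)  # scenarios this control covers

@dataclass
class Implementation:
    """IM layer: a concrete design or build decision."""
    design_note: str

@dataclass
class Blueprint:
    asset_type: str
    fc: Function
    ri: List[Risk] = field(default_factory=list)
    re: List[Requirement] = field(default_factory=list)
    im: List[Implementation] = field(default_factory=list)
```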

Assuming a standard or average HMI commonly found in a facility, we explore this definition for the FC layer: what is the purpose and function of an HMI?

  • The purpose of the HMI is to provide a reliable always-on computer-driven monitoring facility that allows operators to view a process live in near real-time, and provide alarms or controls such that a human can interrupt the process when conditions vary from specification, when failure occurs, or for safety.
  • This system is used by operations for the manufacturing of ABC company’s product, and contributes to the overall revenue generation process used by XYZ business unit.

Now of course, an HMI might do more or serve multiple functions organizationally, but in a nutshell, once you have seen an HMI in one location, all of the others at the same site are usually nearly identical – and that often extends to HMIs for similar processes and vendor implementations elsewhere (with small variations, of course).

Next, we need to have an honest conversation with ourselves about the potential risks an HMI faces from both physical and cyber perspectives. What are some uncertainties to be aware of for an HMI? Here are some examples:

  • What can go wrong? What happens if the system reboots?
  • What happens if an attacker takes advantage of a known vulnerability and compromises it through remote access?
  • What happens if we cannot restore it quickly if it falls to ransomware?
  • What happens if I lose network connectivity?
  • What happens if an account is compromised?
  • What happens if attackers or unauthorized users connect to it over the network?
  • What happens if someone plugs in a removable media device such as a USB stick and it infects the system?
  • What happens if it is isolated and no one monitors the alarms locally or remotely?
  • What kinds of software need to be installed for an operator?
  • How can the HMI be manipulated and brought to failure?

All of the above “what ifs” are valid, general questions for assessing and refining the definition of the elements covered in the FC layer. They act as inspiration, ensuring the groundwork is concrete enough before diving into risks and requirements.

From another perspective, one might use technology, such as the information obtained from ICS cyber security tools like Verve, to understand the usage and profile of an active HMI.

Here, you could use technology to inform the blueprint creation by suggesting what an asset’s usage profile looks like, what protocols are used, established connections to other systems, user logins, account lists, logs, or extra software.

With a list of established connections on hand, work your way through it and check whether you missed an additional purpose the HMI frequently fulfills (and learn HOW it fulfills the purposes you already know). If you employ and enforce user logins, you can determine whether there is a user group doing something you forgot to think about.
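
As a purely hypothetical sketch (the CSV export format and its column names are assumptions, not a documented Verve output), a few lines of Python can turn a raw list of established connections into a per-peer, per-protocol summary you can walk through in exactly this way:

```python
import csv
from collections import Counter

# Assumed export format: one row per observed connection, with columns
# local_host, remote_host, remote_port, protocol.
def summarize_connections(path: str) -> Counter:
    """Count observed (remote_host, protocol) pairs so unexpected peers stand out."""
    peers: Counter = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            peers[(row["remote_host"], row["protocol"])] += 1
    return peers

if __name__ == "__main__":
    # "hmi_connections.csv" is a placeholder name for an exported connection list.
    for (host, proto), count in summarize_connections("hmi_connections.csv").most_common():
        print(f"{host:<20} {proto:<12} {count} connection(s)")
```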

Most current approaches rely on interviews and in-person workshops and hope that attendees provide sufficient relevant information. A tool that provides hard data to build a validated and far more accurate hypothesis is very beneficial.

Similar to the point above for the FC layer, OT security tools like Verve socialize and provoke discussion by sparking inspiration for uncertainties. For example, detecting vulnerabilities on the HMI leads you to think about how they might become a risk scenario. Imagine using Verve to poll technical information on a well-known “model” HMI asset currently in circulation to examine the following threat scenario precursors.

It could answer questions such as the following (a rough code sketch follows the list):

  • What are its qualities?
  • What OS is running on the system?
  • What protocols and to what hosts is it communicating?
  • How often do changes occur on it? Can we ensure authorized changes?
  • Are alternative forms of media necessary?
  • Should we be logging from it, and which sources or events?
  • Should a variety of users and roles be implemented?
  • What addressing schemes are needed (e.g., IPv4 vs. IPv6)?
  • Is there remote access?
  • What kinds of backups are needed?
  • What kind of hardware is needed for secure operations?
  • How do we securely manage and monitor configuration changes?
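
As a rough illustration (the profile fields and thresholds below are invented for the example, not a documented data model), pulling threat scenario precursors out of an asset profile might look like this:

```python
# Hypothetical HMI profile, e.g. assembled from an asset inventory export.
hmi_profile = {
    "os": "Windows 10 LTSC",
    "protocols": ["EtherNet/IP", "RDP", "SMB"],
    "remote_access": True,
    "removable_media_events": 2,
    "changes_last_30_days": 14,
}

def threat_precursors(profile: dict) -> list:
    """Flag profile facts that should feed directly into RI-layer scenario discussions."""
    flags = []
    if profile.get("remote_access"):
        flags.append("Remote access enabled: evaluate compromise-via-remote-access scenarios")
    if profile.get("removable_media_events", 0) > 0:
        flags.append("Removable media in use: evaluate USB-borne malware scenarios")
    if profile.get("changes_last_30_days", 0) > 10:
        flags.append("Frequent changes: evaluate unauthorized-change detection requirements")
    return flags

for flag in threat_precursors(hmi_profile):
    print(flag)
```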

All of the above what/how questions are perfect fodder for constructing requirements, but they often relate to the threat scenarios needed by the RI layer. This discussion has two purposes:

  • You cannot derive threat scenarios without an accurate understanding of the System under Consideration (SuC) and, inherently, a well-defined FC layer
  • You cannot derive the requirements in RE without an adequately defined RI layer

Given the SuC is a commodity system such as Windows, or even Linux, potential threats need to be enumerated and evaluated. Generally, there is a lot of great existing material on how to derive cyber-based threats already, but assessing cyber risk is still a challenge and requires decision-making within an organization.

Still, you can get a good understanding of potential and likely threats and move into the RI phase without sizable barriers. In the RI phase, we discover, deliberate on, and then communicate the identified scenarios so they can be converted into requirements in the RE phase.
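
As one hedged illustration of that conversion (the scenarios and controls below are generic examples, not the output of any particular assessment), keeping the RI-to-RE relationship as a simple traceable mapping makes the resulting requirements easy to audit later:

```python
# Illustrative RI -> RE traceability: each risk scenario maps to the
# requirements intended to reduce it to an acceptable level.
risk_to_requirements = {
    "Ransomware renders the HMI unrecoverable": [
        "Maintain a tested golden image and offline backups",
        "Restrict inbound SMB/RDP to designated jump hosts",
    ],
    "A compromised account is used for unauthorized changes": [
        "Enforce unique, role-based accounts with least privilege",
        "Log and review authentication and change events",
    ],
    "An infected USB device introduces malware": [
        "Disable or restrict removable media via policy",
        "Scan approved media on a dedicated kiosk before use",
    ],
}

for scenario, requirements in risk_to_requirements.items():
    print(scenario)
    for requirement in requirements:
        print(f"  -> {requirement}")
```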

From the combination of the layers {FC, RI, and RE}, we move to the IM layer: construction of a solution and its eventual deployment.


The example pseudo blueprint above is made up of a few layers that need to be formalized, and it might even need to be split into a basic generic formula that extends into different flavors based on different needs or OEM vendors (e.g., a Windows 10 host running a Rockwell HMI in a discrete manufacturing application might look different from one at an Emerson site running turbines).

If another analogy helps, it might be prudent to think in programming terms, where you have a base class and can extend it as needed – e.g., a base class for a vehicle has four wheels, an engine, and a steering wheel, but a car subclass might specify its length and fuel economy, and a truck subclass might have a “box” for your weekend renovation materials from Home Depot.
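
In code, that analogy maps neatly onto the blueprint itself. Here is a minimal sketch, with the vendor-specific requirements invented purely for illustration:

```python
class WindowsHmiBlueprint:
    """Base blueprint: requirements common to any Windows-based HMI."""
    def requirements(self) -> list:
        return [
            "Application whitelisting enabled",
            "Unused services and default accounts disabled",
            "Golden image and tested backups maintained",
        ]

class RockwellDiscreteHmi(WindowsHmiBlueprint):
    """A vendor/application 'flavor' that extends the base with its own needs."""
    def requirements(self) -> list:
        return super().requirements() + [
            "HMI application services restricted to required hosts",  # hypothetical extension
        ]

class EmersonTurbineHmi(WindowsHmiBlueprint):
    """Another flavor for a different site type."""
    def requirements(self) -> list:
        return super().requirements() + [
            "Turbine historian connection limited to read-only",  # hypothetical extension
        ]

print(RockwellDiscreteHmi().requirements())
```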

Bringing the blueprint up to “code” – from pseudo to formal

At this point, we have considered the basics of what goes into a blueprint, but it is not yet formalized – it is still a pseudo blueprint. Once you have a pseudo blueprint comprising the layers {FC, RI, RE}, informed by technology, it needs to be shored up with existing security requirements and controls drawn from a blend of material appropriate for your desired achieved security level (SL-A in ISA/IEC-62443 terminology). To quote Sarah again, “it’s all rather witch-crafty and not yet engineer focused”.

For the purposes of this blueprint, we have determined that the FC and RI layers are mapped out sufficiently, but the RE phase needs a proper level of engineering scrutiny. One approach would be to have an expert dive into this process; another is to use existing frameworks to formalize the requirements you have already enumerated. One example of this might be to (a rough mapping sketch follows the list):

  • Choose one framework as the basis and standard of language in this blueprint
  • Break the pseudo requirements into several areas that align to the capabilities in the NIST CSF, or even the CIS Top 20 Critical Security Controls (where applicable)
  • Augment and update the pseudo requirements with best-practices and select requirements obtained from other sources
  • Review iteratively with adequate cyber security rigor
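
For instance (the function assignments below are our own rough bucketing, not an official crosswalk), grouping the pseudo requirements by NIST CSF function quickly exposes where coverage is thin:

```python
# Rough, illustrative bucketing of pseudo requirements by NIST CSF function.
pseudo_requirements = {
    "Identify": ["Maintain an accurate inventory of HMI hardware and software"],
    "Protect":  ["Harden the OS against a documented baseline",
                 "Restrict removable media and remote access"],
    "Detect":   ["Forward security-relevant logs to central monitoring"],
    "Respond":  [],  # nothing mapped yet: an immediate gap to investigate
    "Recover":  ["Keep tested golden images and configuration backups"],
}

for function, reqs in pseudo_requirements.items():
    status = "OK " if reqs else "GAP"
    print(f"{function:<9} {status} ({len(reqs)} requirement(s))")
```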

Truth be told, even if the cyber security references are IT-focused, they largely work for Windows-based systems across the board because these are commodity (and converged) IT/OT systems. In other words, an HMI is an IT-based system that faces IT-based attacks while hosting OT functionality used by OT users.

As a final step in this formalization phase (once the FC, RI, and RE layers are complete), we need to build a “soft” implementation of the blueprint by way of an iterative prototyping process; after all, you can’t go to implementation without a working concept.

Aside from being an engineering activity, this is a great way to validate the blueprint before moving towards a fully tested or net-new template of a secure HMI, without requiring a larger commitment. For example, you might take an existing asset, harden it, and add periodic programmatic checks as part of the asset inventory process – this is both your gauge of the effort involved and evidence to change control committees that you have performed your due diligence.

Again, you might use OT cyber security platforms such as Verve as part of the validation process, adding an intermediary step that looks at how well assets comply with existing organizational policies and requirements. You can begin by applying what the pseudo blueprint prescribes through the programmatic check functionality available in platforms such as Verve.

This way, it’s not “net new”, but rather a good way to apply the desired security changes/controls and gauge complexity and effort. You may even be able to estimate which hosts to prioritize and transition towards the more secure option before moving to the IM layer.
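
A hedged sketch of what such a programmatic check could look like (the expected values are placeholders, and this stands in for, rather than reproduces, any vendor’s check functionality):

```python
# Placeholder "expected state" derived from the pseudo blueprint.
BLUEPRINT_EXPECTED = {
    "smbv1_enabled": False,
    "autorun_disabled": True,
    "local_admin_accounts": {"hmi_admin"},
    "os_build_minimum": 19044,
}

def check_asset(observed: dict) -> list:
    """Compare an asset's observed configuration against the blueprint and list deviations."""
    findings = []
    if observed.get("smbv1_enabled") != BLUEPRINT_EXPECTED["smbv1_enabled"]:
        findings.append("SMBv1 is enabled but the blueprint requires it disabled")
    if observed.get("autorun_disabled") != BLUEPRINT_EXPECTED["autorun_disabled"]:
        findings.append("AutoRun is not disabled")
    extra_admins = set(observed.get("local_admin_accounts", [])) - BLUEPRINT_EXPECTED["local_admin_accounts"]
    if extra_admins:
        findings.append(f"Unexpected local administrators: {sorted(extra_admins)}")
    if observed.get("os_build", 0) < BLUEPRINT_EXPECTED["os_build_minimum"]:
        findings.append("OS build is below the blueprint minimum")
    return findings

# Counting deviations per host is one rough way to prioritize which assets to transition first.
example_host = {"smbv1_enabled": True, "autorun_disabled": True,
                "local_admin_accounts": ["hmi_admin", "vendor_temp"], "os_build": 19045}
for finding in check_asset(example_host):
    print(finding)
```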

Validating and implementing the blueprint

Next, the formal blueprint NEEDS validation before its eventual implementation. Certainly, we have taken a pseudo blueprint to something we can use as a reference, but moving it into practice has a few extra “hoops” to jump through:

  1. The blueprint needs to be semi-finalized (aka “frozen”)
  2. A prototype needs to be built using the frozen blueprint
  3. The prototype needs to be instrumented such that the blueprint can be programmatically assessed and monitored
  4. Any necessary changes are folded back into the final blueprint release candidate

Validation is under the control of the blueprint engineering team, but it might be tightly bound to organizational artifacts such as governance, compliance, and existing technology policies. So, validation in this sense might also be expanded to include these steps (a minimal test sketch follows the list):

  • Set up a testbed or an environment that mimics a real site/deployment
  • Deploy the blueprint and define the implementation procedure
  • Test the blueprint and compare the results against expectations
  • Review results and build the validation artifacts to convince change management boards or decision makers this is a good idea
  • Prepare for IM phase
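
One minimal way to make “compare it against expected results” repeatable is to express the blueprint expectations as automated assertions run against the testbed; the sketch below reuses the same kind of observed-state dictionary as the earlier check example (all names and values are hypothetical):

```python
def validate_blueprint(observed: dict) -> None:
    """Fail loudly if the testbed deployment drifts from the frozen blueprint."""
    assert observed["smbv1_enabled"] is False, "SMBv1 must be disabled"
    assert observed["autorun_disabled"] is True, "AutoRun must be disabled"
    assert observed["backup_tested_days_ago"] <= 30, "Backups must be restore-tested monthly"

if __name__ == "__main__":
    # Placeholder values captured from the testbed deployment.
    testbed_state = {
        "smbv1_enabled": False,
        "autorun_disabled": True,
        "backup_tested_days_ago": 12,
    }
    validate_blueprint(testbed_state)
    print("Testbed deployment matches the frozen blueprint")
```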

A wise person once said that the best-made plan is often the first to go wrong, so it is CRITICAL that the blueprint be a working (or rolling) iterative model: one that is validated, not used just once, but created, deployed, adjusted, updated, and maintained until a better solution comes along.

In addition to the “simple” objective of transitioning the blueprint from the RE layer to the IM layer, we need to make our validated blueprint acceptable to OT, and it must also “jibe” with business motivators – total cost of ownership, return on investment, implementation plans and roadmaps. Here’s why:

  1. A blueprint that continually changes requires continual investment and support to be a success, but it also lends itself to long-term cost savings rather than a short-term loss
  2. If you are building a blueprint, maintain the base versions from deployment onwards. This is a commitment, but it also simplifies a lot of decisions: you KNOW exactly how systems are deployed because you built them from a blueprint you designed and maintain
  3. If you build base systems, policies, blueprints, etc., be prepared for this to reap rewards in other areas such as improved documentation, risk management, reduced compliance costs/overhead, and even asset vulnerability management – the effort spent upfront will undoubtedly result in long-term cost savings
  4. Standardized images are advantageous in virtualized settings, backup and restoration, and remediating ransomware or malware attacks
  5. Designing a base image so that it rolls forward allows you to implement improved security (such as updates) for “free” when you re-image systems, on-board assets into centralized management, or restore from a hardware failure
  6. Allows for standardized compliance and requirements to be institutionalized vs. applied ad-hoc
  7. Promotes automation through technology – whether through agents to monitor vulnerabilities, or to automatically test or report issues immediately as they arise
  8. Reduces the potential for human error, because images have gone through sufficient testing and hardening by your security teams (e.g., fix once, fix everywhere)
  9. Note that designing and implementing a blueprint must still abide by change controls and requires modification of organizational infrastructure and assets

In theory, it is simple to go from the {FC, RI, RE} layers to implementation, but the in-between phase called validation can be tricky. At the end of the day, creating a blueprint is more than producing a document: it is a virtual representation created through prototyping, building a production candidate, and getting it into production – and keeping it there until retirement – after clearing a variety of gates. In summary, getting the blueprint into production, or even into widespread adoption, takes multiple steps:

  • A plan for rolling out the blueprint reference design needs to be devised
  • The blueprint reference design needs to be tested and validated (or adjusted)
  • Change management needs to be informed and accommodated
  • The blueprint needs to be deployed at a small scale and then rolled out into larger environments

The process above outlines the whole lifecycle, from building the blueprint through deployment, maintenance, and retirement. These phases are easily self-contained, but they can often be interlinked or iterative. Some elements of change or risk management must be enacted throughout the blueprint validation and implementation process.

Perhaps obvious, or perhaps not: as I built out that simplified process, I realized that in the maintenance phase, if you monitor the risks on the endpoint blueprint systems (e.g., keep a live version of them, even as a virtual machine), you can effectively have a security benchmark across your organization for free (provided you stick to the concept and remediate drift). Everyone likes free stuff – and better yet, if a solution can programmatically issue blueprints as secure configurations “automagically”, it immensely reduces cost and error.

Next steps: Maintenance and continuous monitoring

Historically, many asset owners have been reluctant to manage and maintain electronic/cyber assets in OT environments, for a variety of reasons.

Fortunately, blueprints help secure HMIs in environments that emphasize engineering, continual monitoring, and maintenance. This is just an overview, but the concept of building blueprints and deploying reference systems overlaps with security and asset lifecycle programs, which are out of scope for this article and best revisited as part of a larger concept that includes vulnerability and asset management.

For more information on this topic, please check out our other articles and white papers on this subject, and feel free to reach out.

 

Related Resources

  • Blog: 3 Benefits of a 360-Degree Vulnerability Assessment – Defending critical infrastructure requires 360-degree visibility into asset and network vulnerabilities through a vulnerability assessment.
  • Video/Webinar: Enhance Your ICS Security Program with Findings from 10+ Years of Vulnerability Assessments – Download our on-demand webinar to discover how to achieve the greatest risk reduction for the time and money available.
  • Blog: Top 5 Learnings from a Decade of OT/ICS Vulnerability Assessments – What ten years of vulnerability assessments can teach the OT/ICS cyber security industry about vulnerability exposure, risk prioritization, and remediation.
