June 17, 2021
The traditional model of incident management moves a ticket through multiple tiers: L1, L2, L3. This model creates queues that lengthen response times and handoffs that lose vital context with each group. In complex systems and failures, the ticket is slow to reach the correct responders. The end result is long response times and customer frustration.
In the new Prepare/Respond/Review Incident Management Framework, we advise against using this tiered ticketing system and recommend moving toward incident swarming.
Swarming provides a mechanism to remove queues and handoffs from major incident handling and to quickly bring responders and dependent responders together. Incident swarming focuses accountability on reducing recovery time and on sharing knowledge about the incident rapidly.
The incident-team swarming model is an alternative that addresses many of the tiered approach's challenges. It is based on networked collaboration across the incident team rather than a funnel. There are very few tiers and escalation is fast, bringing all on-call members from all teams onto the incident as quickly as possible.
We recommend defining a triage group for every product that includes all parties to be paged or escalated to during an outage. For example, an application that uses a popular database platform would have DBA and storage on its triage list. This model prefers that the full triage on-call list be paged, with members dismissed once the problem has been pinpointed. Tickets can be escalated quickly by the initial intake point (L1 help desk) or routed automatically to the owning team.
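The triage-group idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product name, team names, `TRIAGE_GROUPS`, `Incident`, and `open_incident` are all hypothetical and do not come from the framework or from any paging tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical triage lists: every team to page when a product has an outage.
TRIAGE_GROUPS = {
    "billing-app": ["billing-dev", "dba", "storage"],
}


@dataclass
class Incident:
    product: str
    paged: set[str] = field(default_factory=set)

    def dismiss(self, team: str) -> None:
        """Release a team once the problem is known to lie elsewhere."""
        self.paged.discard(team)


def open_incident(product: str) -> Incident:
    """Page the full triage group at once instead of escalating tier by tier."""
    return Incident(product=product, paged=set(TRIAGE_GROUPS[product]))


incident = open_incident("billing-app")
# Suppose the problem is traced to the database: storage can be dismissed.
incident.dismiss("storage")
print(sorted(incident.paged))  # ['billing-dev', 'dba']
```

The point of the sketch is the ordering: everyone on the triage list is engaged first, and the group shrinks as the problem is narrowed down, rather than growing through successive escalations.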
Most organizations form their swarm teams based on individuals’ areas of expertise and reputation. A combination of the models seen at BMC and CSG can be categorized by these four types:
When incident swarming is well planned and given the required environment and empowerment, it leads to significant business benefits, including:
When combined with the other patterns within the Prepare/Respond/Review Incident Management Framework, swarming can provide an effective and swift mechanism for responding to incidents and outages.
To learn about the full framework, download the white paper here.