Excerpted from the guidance paper DevOps Case Studies. Download the full paper here.
Written by: Jim Stoneham, Paula Thrasher, Terri Potts, Heather Mickman, Carmen DeArdo, Thomas A. Limoncelli, and Kate Sage
Technology Practices Journey
At this organization, the starting environment consisted of approximately eight thousand IT staff supporting two thousand applications for twenty business-facing IT areas. The delivery model being used was waterfall, with some established CI and TDD practices in advanced areas. There was large variation in how development was done across the enterprise, resulting in inconsistency in quality and challenges in delivery.
As a result, the board was considering more outsourcing. A senior vice president in an area that was already doing some Agile work was given a year to run an experiment. The goal was to demonstrate that these Agile practices, which were showing promise, could be established and scaled to drive in-sourcing of work in a centralized development organization.
CI/TDD and Full Agile Stack Adoption
The journey started by establishing consistent Agile management practices, which were applied across all technologies (e.g., standups, visual system management, iterations, show and tells, and iteration planning). Engineering practices were adapted as applicable based on technology (version control for all source code, CI for Java/.Net, TDD, and automated acceptance testing for all stories). Continuous peer-based learning and improvement processes were put into place to drive sustainability and scaling across the enterprise.
The results within one year demonstrated that the experiment was a success. Significant results were achieved in quality (80% reduction in critical defects), productivity (82% of Agile teams were measured to be in the top quartile industry-wide), availability (70% increase), and on-time delivery (90%, up from 60%). As a result, the share of work done by Agile teams grew to over 70% of the total development work in IT over five years, growing from three Agile teams to over two hundred.
Adopting Lean Engineering Practices (DevOps)
The result of the Agile implementation was that work moved very efficiently and quickly in the middle of the development cycle, but there were wait states at the beginning and the end of the cycle, creating a “water-Scrum-fall” effect.
Upstream, work waited to flow into the backlog of Agile teams. Downstream, it waited for dependent work to be done by other teams and for environments. Sixty percent of the time spent on an initiative elapsed before a story card reached the backlog of an Agile team. Once work left the final iteration, high-ceremony manual practices increased the lead time for deployments. This was due to disparate release and deployment practices and technologies, and to dependencies created by large release batch sizes.
At this point, the focus moved to reducing variation in processes such as work intake, release, deployment, and change across the delivery pipeline, and to applying Lean practices (A3, value stream analysis) to identify areas for improvement and reduce end-to-end lead time. The result was the construction of an integrated CD foundational pipeline to provide visibility and an accelerated delivery capability. Other initiatives to reduce wait times included infrastructure automation and more test automation, including service virtualization and test data management.
Small Batch Sizes, APIs, Microservices, Monitoring, and Metrics
By itself, a CD foundation with automated infrastructure would not significantly reduce lead times or accelerate delivery unless the dependencies behind the wait states the Agile teams encountered were also reduced. As such, the next step was to focus on small batch sizes and modern web architecture practices (APIs, microservices). Additional methods to remove dependencies included the implementation of dark launching, feature flags, and canary testing to move from releasing based on a schedule to releasing based on readiness. More emphasis was placed on creating infrastructure as code, including containerization.
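To make the idea of releasing on readiness more concrete, the following is a minimal sketch of a percentage-based feature flag, one common way to implement a canary-style rollout. It is not taken from the organization in this case study: the function names, the flag name, and the hashing scheme are all assumptions chosen for illustration.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99.

    Hashing keeps the assignment deterministic, so the same user always
    lands in the same bucket for a given flag (hypothetical scheme).
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the feature is switched on for this user.

    rollout_percent=0 keeps the code dark (deployed but not released);
    rollout_percent=100 releases it to everyone.
    """
    return rollout_bucket(flag_name, user_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical flag and users, for illustration only.
    for user in ("alice", "bob", "carol"):
        print(user, is_enabled("new-checkout", user, rollout_percent=10))
```

Because the bucket is stable, raising the percentage from 10 to 50 only adds users to the canary group and never removes anyone already in it. That lets a team deploy code at any time and widen exposure as readiness is confirmed, rather than tying exposure to a release date.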
To improve lead times, it is first necessary to measure the entire end-to-end lead time. Having an integrated pipeline and workflow made it possible to implement metrics that baseline lead time and measure improvements, which also shed light on where wait states existed within the delivery value stream.
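As a rough illustration of baselining lead time, the sketch below computes the end-to-end lead time for one work item and the share of that time spent in each stage. The stage names, timestamps, and function names are hypothetical; a real pipeline would pull these events from its workflow and deployment tooling.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400


def stage_durations(events):
    """Return days spent in each stage.

    events is a time-ordered list of (stage_name, entered_at) tuples;
    the final entry marks completion and contributes no duration.
    """
    durations = {}
    for (stage, start), (_next_stage, end) in zip(events, events[1:]):
        days = (end - start).total_seconds() / SECONDS_PER_DAY
        durations[stage] = durations.get(stage, 0.0) + days
    return durations


def lead_time_report(events):
    """Return (total_days, share_of_total_per_stage) for one work item."""
    durations = stage_durations(events)
    total = sum(durations.values())
    if total == 0:
        return 0.0, {stage: 0.0 for stage in durations}
    return total, {stage: days / total for stage, days in durations.items()}


if __name__ == "__main__":
    # Made-up timestamps for a single initiative, for illustration only.
    events = [
        ("intake", datetime(2024, 1, 1)),
        ("development", datetime(2024, 3, 1)),
        ("release_wait", datetime(2024, 3, 21)),
        ("deployed", datetime(2024, 4, 10)),
    ]
    total, shares = lead_time_report(events)
    print(f"lead time: {total:.0f} days")
    for stage, share in shares.items():
        print(f"  {stage}: {share:.0%}")
```

Breaking the total down by stage is what exposes wait states: in this made-up data, most of the elapsed time falls before development starts, the same kind of pattern the value stream analysis surfaced.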
As delivery accelerates, real-time monitoring of operational performance and real-time customer feedback also become more critical for determining whether the business case “hypothesis” (including multivariate A/B testing) is in fact driving more customer value.