The Science Behind the 2013 Puppet Labs DevOps Survey Of Practice
(This is a post written by Gene Kim and Jez Humble)
Last year, we both had the privilege of working with Puppet Labs to develop the 2012 DevOps Survey Of Practice. It was especially exciting for us, because we were able to benchmark the performance of over 4000 IT organizations, and to gain an understanding of what behaviors result in their incredible performance. This continues research that Gene has been doing on high performing IT organizations since 1999.
In this blog post, Jez Humble and I will discuss the research hypotheses that we’re setting out to test in the 2013 DevOps Survey Of Practice, explain the mechanics of how these types of cross-population studies actually work (so you can help this research effort or even start your own), then describe the key findings that came out of the 2012 study.
But first off, if you’re even remotely interested in DevOps, go take the 2013 Puppet Labs DevOps Survey here! The survey closes on January 15, 2014, so hurry! It only takes about ten minutes.
2013 DevOps Survey Research Goals
Last year’s study (which we’ll describe in more detail below) found that high performing organizations that were employing DevOps practices were massively outperforming their peers: they were doing 30x more frequent code deploys, and had deployment lead times measured in minutes or hours (versus lower performers, who required weeks, months or quarters to complete their deployments).
The high performers also had far better deployment outcomes: their changes and deployments had twice the change success rate, and when changes failed, they could restore service 12x faster.
The goal of the 2013 study is to gain a better understanding of exactly what practices are required to achieve this high performance. Our hypothesis is that the following are required, and we’ll be looking to independently evaluate the effect of each of these practices on performance:
- small teams with high trust that span the entire value stream: Dev, QA, IT Operations and Infosec
- shared goals and shared pain that span the entire value stream
- small development batch sizes
- presence of continuous, automated integration and testing
- emphasis on creating a culture of learning, experimentation and innovation
- emphasis on creating resilient systems
We are also testing two other hypotheses that one of us (Gene) is especially excited about, because it’s something he’s wanted to do ever since 1999!
Lead time: In plant manufacturing, lead time is the time required to turn raw materials into finished goods. There is a deeply held belief in the Lean community that lead time is the single best predictor of quality, customer satisfaction and employee happiness. We are testing this hypothesis for the DevOps value stream in the 2013 survey instrument.
Organizational performance: Last year, we confirmed that DevOps practices correlate with substantially improved IT performance (e.g., deploy frequencies, lead times, change success rates, MTTR). This year, we will be testing whether improved IT performance correlates with improved business performance. In this year’s study, we’ve inserted three questions that are known to correlate with organizational performance, which is known to correlate with business performance (e.g., competitiveness in the marketplace, return on assets, etc.).
Our dream headline would be, “high performing organizations not only do 30x more frequent code deployments than their peers, but they also outperform the S&P 500 by 3x as measured by shareholder return and return on assets.”
Obviously, there are many other variables that contribute to business performance besides Dev and Ops performance (e.g., profitability, market segment, market share, etc.). However, in our minds, the reliance upon IT performance is obvious: as Chris Little said, “Every organization is an IT business, regardless of what business they think they’re in.”
When IT does poorly, the business will do poorly. And when IT helps the organization win, those organizations will out-perform their competitors in the marketplace.
(This hypothesis forms the basis of the hedge fund that Erik wants to create in the last chapter of “The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win”, where they would make long or short bets, based on the known operating characteristics of the IT organization.)
We want to acknowledge the amazing contributions from Dr. Nicole Forsgren Velasquez from Utah State University on the survey design. She is well-known for her work with the USENIX community and organizational performance research:
Nicole Forsgren Velasquez is considered an expert in the work, tools, knowledge sharing, and communication of technical professionals and has served as co-chair of WiAC ’12, WiAC ’13, and CHIMIT ’10, and is the general chair of LISA ’14 (Seattle). Her background spans IT impacts, user experience, enterprise storage, system administration, cost allocation, and systems design and development. She has worked with large and small corporations across many industries and government.
Nicole holds a Ph.D. in Management Information Systems and a Masters in Accounting from the University of Arizona. She is currently an Assistant Professor at Utah State University and her public work includes technical white papers, a patent, newsletter articles, and academic research papers. She has been a featured speaker at industry and academic events and was involved in the organization of the Silicon Valley Women in Tech group.
The Theory Behind Cross-Population Studies and Survey Instruments
Like last year, this year’s DevOps survey is a cross-population study, designed to explore the link between organizational performance and organizational practices and cultural norms.
What is a cross-population study? It’s a statistical research technique designed to uncover what factors (e.g., practices, cultural norms, etc.) correlate with outcomes (e.g., IT performance). Cross-population studies are often used in medical research to answer questions like, “is cigarette smoking a significant factor in early mortality?”
Properly designed cross-population studies are considered a much more rigorous way of testing which practices actually work than, say, interviewing people about what they think worked, relying on ROI stories from vendors, or collecting “known, best practices.”
To test a hypothesis (for example, that the presence of high trust correlates with higher IT performance), we put a corresponding question in the survey instrument, and then analyze the results. If we were to plot the results on a graph, we would put the dependent variable (i.e., performance) on the Y-axis, and the independent variable (i.e., presence of high trust) on the X-axis.
We would then test to see if there is a correlation between the two. Shown below is an example of what it looks like when the two variables have low or no correlation, and one that has a significant positive correlation.
If we were to find a significant correlation, such as displayed on the right, we could then assert that “the higher your organization’s trust levels, in general, the higher your IT performance.”
(Graph adapted from Wikipedia entry on Correlation and Dependence.)
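As a rough illustration of the correlation test described above, here is a minimal Python sketch that computes the Pearson correlation coefficient between an independent variable and a dependent variable. The helper name `pearson_r` and all of the numbers are made up for this example; they are not survey data, and the source does not say which statistic the actual analysis used.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical answers on a 1-5 scale (invented for illustration only).
trust = [1, 2, 2, 3, 4, 4, 5]        # independent variable (X-axis)
performance = [2, 1, 3, 3, 4, 5, 5]  # dependent variable (Y-axis)
unrelated = [3, 5, 1, 4, 2, 5, 1]    # a variable with little relation to trust

print(f"trust vs. performance: r = {pearson_r(trust, performance):.2f}")
print(f"trust vs. unrelated:   r = {pearson_r(trust, unrelated):.2f}")
```

A value of r near +1 corresponds to the strongly positive picture, while a value near 0 corresponds to the low or no correlation picture. A real analysis would also check statistical significance, which this sketch leaves out.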
The 2012 DevOps Survey
In this section, we will describe the key findings that came out of the 2012 DevOps Survey, as well as a brief discussion of the research hypotheses that went into the survey design.
In the DevOps community, we have long asserted that certain practices enable organizations to simultaneously deliver a fast flow of features to market, while providing world-class stability, reliability and security.
We designed the survey to validate this, and tested a series of technical practices to determine which of them correlated with high performance.
The survey ran for 30 days, and we had 4,039 completed responses. (This is an astonishingly high number, by the way. When Kurt Milne and Gene Kim did similar studies in 2006, each study typically required $200K to do the survey design, gather responses from a couple hundred people, and then perform survey analysis.)
The first surprise was how much the high performing organizations were outperforming their non-high-performing peers:
- Agility metrics
  - 30x more frequent code deployments
  - 8,000x faster lead time than their peers
- Reliability metrics
  - 2x the change success rate
  - 12x faster MTTR
In other words, they were more agile: they were deploying code 30x more frequently, and their lead time to go from “code committed” to “successfully running in production” was 8,000x shorter. High performers had lead times measured in minutes or hours, while lower performers had lead times measured in weeks, months or even quarters.
Not only were the high performers doing more work, but they had far better outcomes: when the high performers deployed changes and code, those changes were twice as likely to be completed successfully (i.e., without causing a production outage or service impairment), and when a change failed and resulted in an incident, they resolved the incident 12x faster.
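To make the “Nx” comparisons concrete, here is a small Python sketch of how a ratio between two groups of respondents could be computed. Everything in it is an assumption for illustration: the records are invented (they are not survey data), the field names are hypothetical, and using the group median is our own choice here, not a description of how the 2012 analysis was actually done.

```python
from statistics import median

# Invented records, one per hypothetical respondent.
responses = [
    {"group": "high", "deploys_per_month": 40, "lead_time_hours": 1},
    {"group": "high", "deploys_per_month": 50, "lead_time_hours": 2},
    {"group": "high", "deploys_per_month": 80, "lead_time_hours": 4},
    {"group": "low", "deploys_per_month": 1, "lead_time_hours": 100},
    {"group": "low", "deploys_per_month": 2, "lead_time_hours": 200},
    {"group": "low", "deploys_per_month": 4, "lead_time_hours": 400},
]


def group_median(rows, group, field):
    """Median of one field across all rows that belong to the given group."""
    values = [row[field] for row in rows if row["group"] == group]
    if not values:
        raise ValueError(f"no rows for group {group!r}")
    return median(values)


# "More frequent" means high performers divided by low performers.
deploy_ratio = (
    group_median(responses, "high", "deploys_per_month")
    / group_median(responses, "low", "deploys_per_month")
)

# "Faster" means the low performers' time divided by the high performers' time.
lead_time_ratio = (
    group_median(responses, "low", "lead_time_hours")
    / group_median(responses, "high", "lead_time_hours")
)

print(f"deploy frequency: {deploy_ratio:g}x more frequent")
print(f"lead time: {lead_time_ratio:g}x faster")
```

Note the direction of each ratio: for a “more is better” metric the high performers go in the numerator, while for a “less is better” metric such as lead time or MTTR the low performers do.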
We were astonished and delighted with this finding, as it showed not only that it was possible to break the core, chronic conflict, but that it seemed to confirm that just as in manufacturing, agility and reliability go hand in hand. In other words, lead time correlates with both agility and reliability.
(I will post more on my personal interpretations of the 2012 DevOps Survey Of Practice in a future post.)
We hope this gives you a good idea of why we’ve worked so hard on the 2012 and 2013 DevOps Survey, as well as how to conduct your own cross-population studies. Please let us know if you have any questions or if there’s anything we can do for you.
And of course, help us understand what works in DevOps and Continuous Delivery by taking 10 minutes to participate in the 2013 Puppet Labs DevOps Survey here by January 15, 2014!
Thank you! –Gene Kim and Jez Humble