Adapted from War and Peace and IT: Business Leadership, Technology, and Success in the Digital Age by Mark Schwartz.
The traditional project plan tries to manage risks by itemizing them in a register and proposing a mitigation plan for each one. Risks are mitigated to the extent necessary to bring the plan back into line—in other words, to adjust the initiative so the initial business case and plan are maintained. But the uncertainties in the IT domain go deeper—it might be that the very core of the plan needs to change.
Risks can only be itemized if they’re known. The problem is that true uncertainties—unknown unknowns—are probably what will have the deepest impact on the initiative. And the number of unknown unknowns is staggering in the digital world. They range from things we can’t know (Will a competitor suddenly release a new product tomorrow?) to things we just don’t know (Is a hacker about to compromise our system? Is there a bad piece of code in our system that’s about to be triggered when we add the next feature?).
Yes, you can incorporate risk into the traditional investment process by risk-adjusting the discount rate. But even this benign and textbook-adherent way of managing capital budgeting misses an important point. It assumes, incorrectly, that we have to make a single decision regarding our investment right now. But we don’t; agility allows us to make an IT investment in stages. We can choose to risk a smaller amount to begin the project, then gain further information that will help us make decisions about future stages. Such an approach is called metered funding, staged investments, or more broadly, discovery-driven planning.
Venture capital firms practice metered funding—series A investments usually fund a startup as it develops its products, hires its first set of employees, and performs its initial marketing and branding activities. Series B usually occurs once the product is in the marketplace; it’s used to scale up and establish a market position. Series C occurs when the company has been proven successful and is looking to introduce new products, grow more substantially, or prepare to be acquired or conduct an IPO. At each successive stage, investors pay more for the amount of equity they receive because uncertainty has been reduced.
In an old-style waterfall project, you wouldn’t necessarily gain useful information in early project stages that you could use to reduce the uncertainty of later stage decisions. After several months of work, the developers might report that they’re “15% done with component A and 13% done with component B.” That doesn’t give you much information about whether to invest in the next round of funding. But in the digital world you would set up the initiative to quickly deliver results, elicit feedback, and yield information about whether to fund the next stage, or what changes should be made in order to justify additional funding.
With an Agile initiative you can also get an immediate return on the early stage investments, since capabilities are constantly released to production. If the company decides not to fund the second stage, then the first stage’s product is still available for people to use. As I mentioned in Chapter 4 of War and Peace and IT, the return should really be modeled as a return on series A plus an option on future stages. If the option isn’t exercised, there still remains the value of series A.
That’s why the Agile approach makes it effective to innovate through experiments; these are economically justified because of the option value. If you make the first stage short enough and its investment small enough, sooner or later the option values start to outweigh the first stage cost. And the portfolio of ideas being tested, like a VC firm’s startup portfolio, may yield a successful idea that the enterprise can later make a big bet on.
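The arithmetic behind "a return on series A plus an option on future stages" can be sketched in a few lines of Python. This is a toy model for illustration, not a calculation from the book: the function names, the single probability `p_good`, and every number are assumptions, a second stage that turns out not to be worthwhile is assumed to return nothing, and discounting is ignored.

```python
def committed_value(stage1_value, stage1_cost, stage2_cost,
                    p_good, stage2_value_if_good):
    """Expected net value when both stages are funded up front.

    Stage 2 is paid for whether or not stage 1 shows it to be worthwhile.
    """
    stage1_net = stage1_value - stage1_cost
    return stage1_net + p_good * stage2_value_if_good - stage2_cost


def staged_value(stage1_value, stage1_cost, stage2_cost,
                 p_good, stage2_value_if_good):
    """Expected net value when stage 2 is an option decided after stage 1.

    Stage 2 is funded only if stage 1 reveals it is worth more than it costs,
    so the option's payoff is never negative.
    """
    stage1_net = stage1_value - stage1_cost
    option_value = p_good * max(0.0, stage2_value_if_good - stage2_cost)
    return stage1_net + option_value


# Hypothetical numbers: a small first stage that does not pay for itself,
# and a 25% chance that it uncovers a second stage worth funding.
example = dict(stage1_value=30, stage1_cost=50, stage2_cost=200,
               p_good=0.25, stage2_value_if_good=600)
print("committed up front:", committed_value(**example))
print("staged with option:", staged_value(**example))
```

With these made-up inputs the up-front commitment has a negative expected value (-70) while the staged version is positive (80), because the second-stage cost is incurred only when the first stage shows it is justified.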
Metered funding can be used throughout the project’s life, which leads to my next point: we should always cancel successful projects, not failing ones. Here’s why. If the investment decision-making process has done its work well, then the initiative is well-justified and is expected to return business value. Now let’s say that we’re staging our investments, which amounts to periodically remaking our investment decision—perhaps monthly.
Since a successful initiative has been constantly delivering results—this is what we expect in the DevOps world—we can evaluate what it has delivered so far and what we believe will be delivered in the future. And since we’ve prioritized the highest return tasks and accomplished them first, we should be seeing diminishing returns. At some point our oversight process might find that enough has been achieved, so it makes sense to stop investing in the effort: resources should instead be moved to a different initiative. This would be a rational decision and one that reflects very well on the project.
On the other hand, if the project seems to be going off course—it’s not returning what we truly believe it could be returning—then we shouldn’t cancel it. After all, we believe it can return more. Rather, this is the moment to make adjustments to the project to get those higher returns. Is the team running into impediments that can be removed? Does it not have the resources it needs? Is this the wrong set of people to be executing the initiative? At this point we should address all of these issues.
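The reasoning in the last three paragraphs amounts to a small decision rule, which might be sketched as follows. Everything here is hypothetical: the names, the idea of reducing "what we believe it could return" to a single number, and the `shortfall_tolerance` threshold are assumptions made for illustration. In practice these are judgment calls made in a review meeting, not values computed by a program.

```python
from enum import Enum


class Decision(Enum):
    CONTINUE = "keep funding"
    STOP = "stop: enough value has been delivered"
    ADJUST = "keep funding, but fix what is holding the team back"


def funding_review(expected_return, achievable_return, stage_cost,
                   shortfall_tolerance=0.8):
    """Decide what to do at one periodic funding review.

    expected_return:   value the initiative looks likely to deliver in the
                       next stage on its current trajectory
    achievable_return: value we believe it could deliver if running well
    stage_cost:        cost of funding one more stage
    """
    # Diminishing returns: even at its best, the next stage is not worth
    # its cost, so the resources are better used elsewhere.
    if achievable_return <= stage_cost:
        return Decision.STOP
    # Off course: the work is still worth doing but is falling well short
    # of what we believe it can return, so adjust rather than cancel.
    if expected_return < shortfall_tolerance * achievable_return:
        return Decision.ADJUST
    return Decision.CONTINUE


print(funding_review(expected_return=50, achievable_return=60, stage_cost=100))
print(funding_review(expected_return=120, achievable_return=300, stage_cost=100))
```

Note that the rule never cancels an initiative for underperforming. Stopping is reserved for the case where the remaining potential no longer justifies another stage.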
We have often thought of project failure as being the fault of the team assigned to execute it; we cancelled their project as a sort of punishment. This makes little sense for two reasons. First, it’s probably not the team’s fault; after all, they were chosen for the project because we thought they could execute it best. Second, the justification for the project still exists; if it is a real business need, then cancelling the project still leaves that need unfilled.
We should instead take advantage of all the options that new IT approaches present. If we can buy additional information to reduce the risk of our investment decision, if we have the choice of stopping an initiative that has already returned sufficient value . . . well, why not? It would be irresponsible to pretend that we can make long-range, point-in-time decisions despite the uncertainty in our environment. We now have the option of staging investments, learning as we go, and adjusting plans. And if we insist on sticking to a plan that we made early—before the initiative started and when we had the least available information—then we’ll likely miss out.
In Chapter 4 I suggested that instead of soliciting initiatives from around the business and prioritizing them, it would be better to start from the organization’s strategic objectives and cascade down from these to the initiatives. In The Real Business of IT Richard Hunter and George Westerman describe using this approach:
We used to work with the power users in every function from the bottom up to develop the IT strategy, and it didn’t necessarily connect to the business strategy. By coming from the top down, we were able to redirect IT effort on major initiatives.
It might seem impractical to do this for basic maintenance work, which includes any number of small tasks that are difficult to tie to strategic objectives. But this can work for two reasons. The first is that all of the little maintenance tasks that “must” be done . . . must they? Has the company not been able to operate without them? It’s important to focus resources on what is most essential, not on what can somehow be justified.
The second reason is that the initial development work, if done correctly, might make such small tasks unnecessary. In the DevOps model the team that launches a feature continues to monitor its success and make adjustments to it. The feature is not really finished until it’s meeting all the company’s needs, so there is little reason to “maintain” it later by fixing bugs and tweaking functions. The backlog of small requests should shrink accordingly.
The preceding thoughts apply as long as we’re governing discrete initiatives—projects, in old speak. We should always stage our investments and buy down risk. We should experiment freely, creating options that might become valuable. We should cascade strategic objectives into initiatives, rather than improvising initiatives that might or might not be relevant to strategic objectives. We should avoid vomiting user stories, in Pascal Van Cauwenberghe’s phrase. This is the Agile way to govern projects.
But we should not be governing projects. DevOps, as a Lean process, is based on minimizing batch sizes, which means processing very few requirements at a time, finishing them quickly, and moving on to the next set of requirements. Each requirement can be coded quickly and its capability delivered to users—on the order of minutes to days. DevOps can even take us close to single-piece flow, where one requirement at a time can be worked on and delivered.
This is quite remarkable. It would make the IT process amazingly responsive, taking in each new mission need, immediately cranking out a solution, then quickly improving that solution until it is perfect. It would let you change course at any moment to respond to changing circumstances or to try new ideas. It would reduce delivery risk to near zero, since every item would be delivered almost as soon as work started.
But even if we have the technical ability, we often can’t use it because our governance committee only meets once a year, when the Star Chamber room can be rented. And of course we can’t convene the Star Chamber for every requirement. In fact, Lean principles would suggest that we avoid the wait time necessary for getting the hooded figures to make investment decisions in the first place. The only way to take full advantage of single-piece flow is to decentralize governance decision-making.
But isn’t the whole point of governance to centralize these decisions, to avoid the chaos of decisions made separately across the organization? Yes, but there are ways we can decentralize decision-making and maintain centralized direction. I know of three models for doing so: the product model, budget model, and objective model.
The Product Model
In the product model, teams of technologists are assigned to work as part of a particular product group. This group oversees the roadmap for their product, taking feedback from the market and input from the company’s overall competitive strategy. They’re generally responsible for the performance of their product, measured in whatever way makes sense to the company, but they have some freedom to develop and prioritize ideas for their roadmap. For digital products, these are largely digital features, of course. This is fairly close to the model used by Amazon Web Services, where product teams manage their own feature roadmaps in consultation with customers.
In this model the technologists become very familiar with their product and its underlying technology. Because decision-making remains within the group, communication channels are short and lean. The team works toward product objectives, which might be cascaded down from companywide strategic objectives. They also work backward from customer feedback, test hypotheses about which features will be valuable to customers, and gather additional feedback from them as they use the product.
A similar idea can be applied to “products” (business support applications) used internally by the company. The technologists align to whoever is responsible for the product and become experts in both its use and its internals. For example, the technologists might align with the HR group that oversees a human resources system.
The Budget Model
The budget model is the approach we use all the time for spending that isn’t either “project based” or related to large capital investments. Now that we can execute our efforts at the single requirement level, why even have IT delivery projects? There’s just everyday IT work, analogous to routine efforts across the rest of the enterprise. IT folks simply come to work every day and, like everyone else in the company, produce whatever needs producing. This may mean they create new IT capabilities, modify existing ones, or perhaps improve security. Some of this effort might need to be capitalized for financial reporting, but that’s a topic for a later chapter.
When a company allocates budgets and cascades them through an organizational hierarchy, it’s passing governance authority down to the budget holders. Why shouldn’t this be done with IT initiatives as well? Some of IT’s expenditures are already managed as budget items, after all—why not the rest? Such an approach is all the more plausible now that there is very little difference, execution-wise, between maintenance of existing systems and development of new ones. There is simply a rolling set of tasks that must be completed by delivery teams.
If you drop the idea of individual systems or products and consider the entire IT estate as a whole—the single large IT asset I’ve described—then all IT development work simply amounts to enhancements or maintenance work to this asset, whether expensed or capitalized. Investment decisions are really the assignment of budgeted teams to work streams, along with the decision as to how many teams to fund in the first place. If the company funds twenty delivery teams, for example, then the CIO can decide how many of them to put on each objective or set of capabilities, and can move those teams between work streams as deemed appropriate.
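Seen this way, the investment "portfolio" is little more than a table of funded teams per work stream, and a governance decision is a move within that table. A minimal sketch, assuming made-up work stream names and a hypothetical `move_teams` helper:

```python
def move_teams(allocation, source, target, count):
    """Return a new allocation with `count` teams moved from source to target.

    The total number of funded teams is unchanged; only their assignment moves.
    """
    if count < 0:
        raise ValueError("count must not be negative")
    available = allocation.get(source, 0)
    if count > available:
        raise ValueError("not enough teams on the source work stream")
    updated = dict(allocation)  # leave the caller's allocation untouched
    updated[source] = available - count
    updated[target] = updated.get(target, 0) + count
    return updated


# Twenty funded teams, as in the example above; the split is invented.
allocation = {"customer onboarding": 8, "fraud detection": 7, "platform security": 5}
rebalanced = move_teams(allocation, "customer onboarding", "fraud detection", 2)
print(rebalanced, "total teams:", sum(rebalanced.values()))
```

How many teams to fund in total remains a separate decision, made when the budget is set.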
The budget approach allocates funds to the CIO to use in managing the company’s technology assets. It’s the approach most consistent with the Intrax CEO’s message in the Introduction, as it makes the CIO responsible for the returns from the organization’s IT investment portfolio. Yes, this puts a lot of responsibility in the CIO’s hands—just as the enterprise places heavy responsibilities in the hands of other CXOs. They all report to the CEO or board and are managed by them. No CIO is free of oversight.
One reason why this approach has seemed out of the question is simply the traditional business/IT split—that arms-length, contractor-control model. You wouldn’t give this decision power to a contractor, right?
The Objective Model
In the objective model, a team is chartered with a specific business objective, cascaded from a critical company objective. The team consists of technologists together with business operations people—a group the organization believes can actually accomplish the objective. The team then owns the objective rather than a set of requirements. It does whatever it can to accomplish it: testing hypotheses, making decisions, and rolling out IT or business process changes.
I can explain this best by an example. My team at USCIS was responsible for E-Verify, an online system employers use to check whether an employee is eligible to work in the US. Although employers aren’t generally required to use E-Verify, we were afraid that its use would become mandatory as part of a broader immigration reform. If so, we knew it wouldn’t be able to scale up enough to handle that transactional volume.
We also realized that expanding E-Verify wasn’t primarily a technical problem but a human one. The system could automatically determine the eligibility of 98.6% of the people presented to it, but a person (called a status verifier) had to research and adjudicate the remainder. In addition, observers had to monitor use of the system for potential fraud and misuse. Neither set of people would scale with increased use of the system.
So we launched an E-Verify modernization project, initially using the traditional waterfall approach. A team collected requirements, over time organizing them into about eighty-five required capabilities—including hundreds of specific features. They then began designing the system and preparing the many documents required for the DHS investment governance process. After four years, all they had produced was a stack of one-inch paper binders.
We decided to take a radically different approach. We . . . ahem . . . reclassified the one-inch binders as trash, then reduced the project to five well-defined business objectives:
- Raise the number of cases a status verifier could process per day (about seventy at the time).
- Raise the share of cases the automated system could process from 98.6% toward 100%.
- Improve the registration completion rate—a large number of companies were beginning the E-Verify user registration process, but never completing it.
- (A goal around fraud and misuse.)
- (A goal around technical system performance.)
We then made a very Lean investment decision. We said we were willing to spend 100 livres every three months to accomplish each of these goals, but would informally revisit the investment decision every month, and formally each quarter. Meanwhile, we also built dashboards to track metrics continuously for each objective. Because the project executors had all of the technical tools and cloud platforms already set up for them, we expected them to show results in some metrics within two weeks and continuous improvement thereafter.
Having formed a team consisting of technologists (with skills in coding, testing, infrastructure, and security) and business operational folks (status verifiers), we gave them the first objective. We instructed them to do whatever they thought best to raise that number of cases, whether by writing code or making business process changes, and that we (management) would help remove impediments.
More precisely, I said that for every case above seventy they were able to deliver, they would get a gold star. If they did any work that wasn’t intended to increase that number, with a wink I said I’d take one of them outside and shoot them as an example to the others. That was our control for scope creep and feature bloat. I also said that we would meet every two weeks to discuss the results and see what we in management could do to help.
To begin the initiative, we also brought together a broader team—managers, verifiers, technologists—to brainstorm ideas that might help the team in its efforts. We used an impact mapping technique (described in the next section) to create a “mind map” of hypotheses about what might increase that metric. But the team wasn’t required to use the mind map—they were to use their judgment to prioritize tasks. We only cared about results.
Every two weeks we had a discussion to align management and the team, as well as to remove impediments, and every month we reported our results to the steering committee responsible for overseeing the investment. We were able to show immediate gains, and after several months the metric continued to improve. The steering committee chose to continue with the investment.
We did something similar with the other four objectives by assigning each to a team, then regularly checking on progress. Something interesting happened with the registration rate objective (number three). Initially the team showed improvements in the metric, but after a few months it reached a plateau. The business owners and I asked about the ideas the team was trying—the hypotheses it was testing—and agreed with the team that it was doing the right things. We concluded that the metric was not likely to improve any further, perhaps because a certain number of companies who started the registration process realized that E-Verify wasn’t for them, or because people were trying it out to see what it was but weren’t ever planning to sign up.
In reporting back to the steering committee, we therefore recommended that it stop investing in that objective, and instead move the budget to another one—even though we had originally planned to spend more. In other words, the team cancelled the remainder of its own project, with the consent of the steering committee.
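A dashboard can flag this kind of plateau so that the conversation happens promptly. The sketch below is an assumption about how such a check might look, not a description of what the E-Verify dashboards did: the function name, the window, the threshold, and the sample readings are all invented for illustration.

```python
def has_plateaued(readings, window=3, min_relative_gain=0.01):
    """Return True if the metric has barely moved over the last `window` readings.

    readings: metric values in chronological order, one per review period
    """
    if len(readings) <= window:
        return False  # not enough history to judge
    baseline = readings[-window - 1]
    latest = readings[-1]
    if baseline == 0:
        return False  # relative gain is undefined; leave the call to people
    return (latest - baseline) / abs(baseline) < min_relative_gain


# Made-up monthly completion rates: early gains, then a flat stretch.
completion_rate = [0.40, 0.48, 0.55, 0.60, 0.602, 0.601, 0.603]
print(has_plateaued(completion_rate))
print(has_plateaued(completion_rate[:4]))
```

A flag like this is only a prompt. As in the story above, the decision to stop came after asking which hypotheses the team was testing and agreeing that it was doing the right things.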
What had been planned as a four-year project ended after two and a half years because it was so successful. Each objective had been accomplished to the extent that we all agreed it could be, so the remaining funds were returned for use in other projects. You could say that the project had achieved the Agile ideal: maximizing outcomes while minimizing output, or in other words, maximizing the amount of work not done.
To me, this shows the power of DevOps when used with an appropriate investment management process. The amount of money at risk at any given time was only one month of funding, as the investment was reviewed monthly and showed results daily. Value was delivered immediately and frequently thereafter. The teams could innovate freely but only in relation to an agreed-upon business objective. And the process had very little overhead: each month we reported the business results (obtained from our dashboard) and the amount spent to the steering committee, and each quarter we had an hour-long discussion with them.