Episode 10: The Surprising Implications of Architecting for Generality
Guest: Michael Nygard (Part 2)
On this continuation of Gene Kim’s interview with Michael Nygard, Senior Vice President, Travel Solutions Platform Development Enterprise Architecture, for Sabre, they discuss his reflections on Admiral Rickover’s work with the US Naval Reactor Core and how it may or may not resonate with the principles we hold so near and dear in the DevOps community. They also tease apart the learnings from the architecture of the Toyota Production System and their ability to drive down the cost of change.
They also discuss how we can tell when there are genuinely too many “musical notes” or when those extra notes allow for better and simpler systems that are easier to build and maintain and can even make other systems around them simpler too? And how so many of the lessons and sensibilities came from working with Rich Hickey, the creator of the Clojure programming language.
Michael Nygard strives to raise the bar and ease the pain for developers around the world. He shares his passion and energy for improvement with everyone he meets, sometimes even with their permission. Living with systems in production taught Michael about the importance of operations and writing production-ready software. Highly-available, highly-scalable commerce systems are his forte.
Michael has written and co-authored several books, including 97 Things Every Software Architect Should Know and the bestseller Release It!, a book about building software that survives the real world. He is a highly sought speaker who addresses developers, architects, and technology leaders around the world.
Michael is currently Senior Vice President, Travel Solutions Platform Development Enterprise Architecture, for Sabre, the company reimagining the business of travel.
Gene Kim is a Wall Street Journal bestselling author, researcher, and multiple award-winning CTO. He has been studying high-performing technology organizations since 1999 and was the founder and CTO of Tripwire for 13 years. He is the author of six books, including The Unicorn Project (2019), and co-author of the Shingo Publication Award winning Accelerate (2018), The DevOps Handbook (2016), and The Phoenix Project (2013). Since 2014, he has been the founder and organizer of DevOps Enterprise Summit, studying the technology transformations of large, complex organizations.
In 2007, ComputerWorld added Gene to the “40 Innovative IT People to Watch Under the Age of 40” list, and he was named a Computer Science Outstanding Alumnus by Purdue University for achievement and leadership in the profession.
He lives in Portland, OR, with his wife and family.
Gene Kim (00:00:00):
This episode is brought to you by IT Revolution, whose mission is to help technology leaders succeed through publishing and events. You're listening to the Idealcast with Gene Kim brought to you by IT Revolution. The last two episodes were with Mike Nygard, senior vice president of Enterprise Architecture & Platform Development at Sabre, and whose work I so genuinely admire. That first episode was an interview I did with him. The last episode was Mike's 2016 DevOps Enterprise Summit Presentation, where he talks about maneuverability and how to get a team of teams working towards a common objective. If you haven't listened to those yet, I'd recommend you listen to those first because this is a continuation of that first interview.
Gene Kim (00:00:56):
Today, we discuss his reflections on Admiral Rickover's work with the US naval reactor core and how it may or may not resonate with the principles we hold so near and dear in the DevOps community. We talk about and tease apart the learnings from something I recently learned from Dr. Steven Spear about the architecture of the Toyota Production System and their ability to drive down the cost of change. We talk more about the characteristics of great software architectures. Specifically, I asked him to help me understand further the amazing example he gave in that first interview.
Gene Kim (00:01:32):
How can we tell when there are genuinely too many musical notes to quote a phrase from the movie Amadeus or when those extra notes allow for better and simpler systems that are easier to build and maintain and can even make other systems around them simpler too? And how so many of the lessons and sensibilities came from working with Rich Hickey, the creator of the Clojure programming language. As with every one of these episodes, I've listened to it many times because I was so dazzled by the insights. And several passages I had to listen to many more times so I could convince myself that I actually understood what Mike was saying. Okay, let's jump in. We start as Mike and I discuss his reflections on the episode I did with Steve Spear on Admiral Rickover and the US naval reactor core.
Michael Nygard (00:02:30):
I did listen to the first of your two episodes with Steven Spear. So I understand a bit more about what you were saying with structure and dynamics. I'm really enjoying it. I want everyone in my company to listen to that. Particularly the idea of emitting signals to allow coordinated action without requiring micromanaging every detail. That's really good. Then also you started talking about team of teams and I'm like, "This is exactly the situation we're in." What was it? Like 72 hours from sighting to-
Gene Kim (00:03:06):
Michael Nygard (00:03:06):
Yeah. That doesn't work. I also picked up several books about Admiral Rickover. I had been aware of who he was and that he did remarkable things, but I didn't know anything about him or the specifics of how he did it. I'm finding that pretty interesting. In some ways it's counter to the idea of DevOps because Rickover wanted everything solved in advance.
Michael Nygard (00:03:34):
There's a story in one of the books about him having a bundle of envelopes with a rubber band around it. And when somebody came to him and described a problem that one of the nuclear subs was having in the Bering Sea or someplace like that, he went to his desk, went to one specific compartment in his desk, pulled out one specific envelope and gave it to his subordinate and said, "Tell them this." And it was four words written in there that solved the problem.
Michael Nygard (00:04:06):
Rickover had worked out that this problem could occur and he'd worked out what the correct solution was years in advance. That's pretty different from the sort of test and learn and trial approach that we tend to take. But there is a commonality in that he didn't allow any defects or work arounds to persist. Things had to be fixed.
Gene Kim (00:04:33):
Okay, Gene here. I was so thrilled to hear Mike talk about his reflections on Admiral Rickover and I'm going to jump in to state more clearly what I could not during the interview. After the interview, we both talked about how we were both grappling with to what extent the values that Rickover espoused, are they consistent with or not consistent with what we believe in the DevOps community? There is a 1962 memo from Rickover that Steve Spear showed me. It's a pretty remarkable memo that I'm going to read to you.
Gene Kim (00:05:03):
The context is people in The Naval Reactor Organization or NR, granting waivers to their contractors from NR rules, which again, embodied the best understanding of the system as a whole. And it reads, from time to time, I note evidence that NR representatives at field offices, such as a shipyard or laboratory, do not fully understand their primary mission. It is amazing to me how representatives new to these positions uniformly get themselves into the frame of mind, where they conceive of themselves as intermediaries between NR and the contractor.
Gene Kim (00:05:38):
That is, that their job is to judge who is right, NR or the contractor, and then make the decision on their own. In many cases, not even notifying NR. In this way, the NR representative then becomes in effect NR's boss. All NR representatives are of course, encouraged to state their views to me at any time, but it is not their job to assume my responsibility. Another and more serious mistake arises when the NR representative decides what he should or should not report to me. Frequently, he decides not to report things to me because he feels he can handle the matter better himself or he is afraid that by notifying me of the situation, which is his job, I will take ignorant, improper action and upset the applecart.
Gene Kim (00:06:28):
Nearly all NR representatives have had inadequate experience to handle the important and complex tasks they face. I do not expect them to be able to make wise decisions on all matters by themselves. Under some circumstances, it is better to have no NR representative at all because I would not then be lulled into thinking the NR interests are being taken care of. Please bear in mind always that you are the NR representative. That you are to carry out the policies of NR. That you are not to judge NR or to represent the contractor to NR. To achieve the status of a true NR representative requires the acquisition of godlike qualities, but you can try. Signed H.G. Rickover.
Gene Kim (00:07:15):
Holy cow. It's an amazing memo to me in so many ways: the tone of the memo, the incredulity he has that anyone would take an action that by fierce logical argumentation puts the contractor goals over the NR goals or that the NR representative's judgment would be placed over the hard-earned collective wisdom of NR. At times it seems, as Mike says, contradictory to the principles that we love so much in DevOps, but it's difficult to argue that if you want to make the best decisions... Because each decision is informed by all the knowledge of the outcomes of all the decisions made by other people in the organization, which are codified by rules, then we want anything that could improve those rules, put back into the rules. Not corrected or waivered away at the edges.
Gene Kim (00:08:01):
To be more specific, I was also feeling conflicted as Mike was. On the one hand, I think it's easy to call the Rickover approach only applicable to the domains of the simple and complicated. Those are the domains of rules and best practice. So of course, I'm referring to the Cynefin framework by Dave Snowden, where he describes four domains: obvious (formerly known as simple), complicated, complex, and chaotic. The obvious and complicated are the domains where rules and practice can be used.
Gene Kim (00:08:33):
On the other hand, complex and chaotic are the domains where simple cause-and-effect rules don't apply and which usually require a different mode of problem solving. But I don't think anyone can call the creation of a system that has allowed nearly 20,000 hours of safe and accident-free nuclear operations in dynamic, sometimes near-wartime conditions merely complicated. It is clearly in the complex domain and to call it anything else would do a grave injustice to that achievement.
Gene Kim (00:09:01):
I recently read Gene Kranz's book, Failure Is Not an Option, about his experiences as a mission controller for NASA during the Mercury, Gemini and Apollo programs. And you could definitely see a similar philosophy at work there too. One thing that caught my attention was their continual insistence on resolving all the "funnies," as in, "That's funny, why did that happen?" For example, anytime there was an unexpected instrument reading or a fault from the computer or telemetry that wasn't there, at the end of the shift they had to resolve all those funnies. They either had to explain it and resolve it or they would assign the funny to the next shift for them to try to explain.
Gene Kim (00:09:41):
Across Mercury, Gemini and Apollo, they forced themselves to reconcile their imperfect understanding of the system and make it better. They exposed their ignorance of the system through drilling and simulation as a way to challenge their assumptions. Without a doubt, Apollo was definitely in the complex domain and especially in situations like Apollo 13, it was definitely in a chaotic domain. Reading the book, it's so clear that Gene Kranz viewed as supremely important the three-ring binders that every mission controller carried around, which were full of their procedures.
Gene Kim (00:10:14):
And yet reading the book, I kept thinking, "Gosh, that sounds like rules and best practices only applicable for the obvious and complicated domains." But Kranz's goal was to make sure that as many of those problems were thought through ahead of time, especially around anomalies and what could have caused them, understanding the faults often revealed dependencies that were unknown and would trigger generating solutions uncovering even more dependencies about what would be required to implement that solution. By the way, I also learned that Apollo 9 was actually a Hail Mary to beat the Russians to the moon. Lunar orbit wasn't actually planned until Apollo 11.
Gene Kim (00:10:53):
And in the actual Apollo 11 mission, the simulations team were the unsung heroes. They exposed blind spots and key decisions that were happening too early, each time resulting in all lives lost during the landing. The result of those simulations was always a crash rewriting of those procedures, which was critical to enabling a successful lunar landing. Going back to Rickover, I don't think Rickover is saying that he always knew best, but he absolutely believed that the system knew best. I think Rickover and Apollo shared many of the same principles. In fact, I learned Kranz held very dearly the notion that mission controllers knew best, more than the astronauts and certainly more than manufacturers.
Gene Kim (00:11:33):
In fact in space, the edge can't always know best. Those are the astronauts who in an emergency are often under enormous physical strain, overwhelmed with information, sometimes disoriented or almost passed out, unable to make sense of their environment. In fact, in the book and the awesome movie, The Martian, the stranded astronaut Mark Watney had to solve problems all the time. And his return to earth was enabled by being able to tap into all the collective intelligence and resources back on earth, which helped him overcome all the challenges required to survive on the planet for over a year and eventually figure out how to return back to earth.
Gene Kim (00:12:11):
I think that's what Rickover and NASA were all about: empower the edge with the full support of the core. Which means yes, fix the problem at the edge, but make sure that any solutions are brought back to the core. Following the process is the best way we know how to do things. And if there's a better way than the process, improve the process. In short, I think the principles that Rickover held so near and dear are very much applicable to even the DevOps space. In fact, Mike had a thought about that. Back to the interview.
Michael Nygard (00:12:42):
We did that in the form of the system automation, right? We have that same kind of belief that you should take the wisdom and codify it in tests and scripts and automation. In some sense, Rickover had an advantage though because the laws of nuclear physics don't change, whereas in the software world we change our laws of physics every few years.
Gene Kim (00:13:08):
I'm wondering if it's no surprise that those kinds of strict rules apply kind of to the build, test, and deploy, right? Those more mechanical things where the infrastructure really is more in our control, right? Versus the adversary, which is probably not as easily codified and you can't enforce the rules there. Right? I mean, that seems like that would lead to disaster.
Michael Nygard (00:13:35):
Yeah. I also think Glenn Vanderburg did this great talk on what is software engineering really. He determined that the part of what we do that is most akin to engineering is actually the build phase, the construction, the validation and the creation of the artifacts. And if you think about it in those terms, the 688 class of nuclear submarine all had the same reactor, right? So you could write down rules for what to do with this reactor.
Michael Nygard (00:14:12):
Every system we build is different because of the competitive nature of our business. No two companies have exactly the same system. You sort of have to rediscover or reinvent the rules for this company and this company and this company. But the part that is the same is you regard each deployment as a new construction of the same class of system all built at the same shipyard, all built, right?
Gene Kim (00:14:39):
Michael Nygard (00:14:40):
Ideally deterministically. But then what that means is the procedures that work for your particular nuclear reactor may not be appropriate for someone else's. In fact, they might be actually dangerous. Which is why we sometimes see this hard time of picking up somebody else's methodology or somebody else's deployment tool and just dropping it in because it doesn't fit our reactor.
Gene Kim (00:15:07):
Because of the environmental factors in which the reactor exists. Oh, that's super interesting. And just to maybe even concur with that further, right? The biggest aha moment for me in The DevOps Handbook was really the bifurcation of the creative act of design and development, right? Where lead times are measured in weeks, months or quarters versus build, test and deploy, where it should be minutes or hours. Of which the dividing line is the point of code commit into version control.
Gene Kim (00:15:38):
I thought that was really great because I mean, even what you just said is that the part most akin to engineering is that build, test, and deploy phase, where one can imagine a Rickover-like adherence to the rules. But then I think we're also saying that for the design and development phase, it's hard to believe that that same philosophy will lead to good outcomes.
Michael Nygard (00:16:03):
I think that's true. I think Glenn would agree with you as well.
Gene Kim (00:16:07):
Gene here again. Wow. I think this is so interesting. And as a little aside, this is why I'm finding these interviews in this podcast to be so illuminating. This quest to find a more parsimonious set of principles to explain the world around us, to explain the most amount of observable phenomena is just so dazzling. And that's exactly the feeling I had when talking to Mike. Let's go back to the interview where you will hear me tell Mike a story that Steve Spear had only told me the week before. Part of me wanted to present a cleaned up version of the story to you, but I decided to keep the original so you could hear Mike's reaction to the story because it so much mirrors the incredulity I had when I first heard it from Steve.
Gene Kim (00:16:50):
I want to share with you this other thing that Spear told me last week, where I did the verbal equivalent of tripping and falling flat on my face. It was so riveting. So my question was about how this notion of structure shows up in Toyota plants. I asked who in the Toyota plants is creating that organizing logic of how the plant runs that results in these amazing dynamics? And he goes, "Nobody." In software, right? His theory is maybe it's because you do it every two years, so there's this discipline in it. But in manufacturing, a plant is stood up every 15 years, right? So nobody really has... It's not really in anyone's job description, not the chief engineer, not anybody.
Gene Kim (00:17:29):
And I find this a little bit preposterous, but then he told me the story. He said in the mid-90s, he went to visit a Toyota plant with his mentor, Kent Bowen at Harvard Business School, and a VP of manufacturing at a Big Three plant. One of the things that they were showing off at Toyota was the fact that they did 60 line-side store changes per day. I didn't know what that was, but at every work center, it's basically the racks where you store all the inputs, right? So they were changing that 60 times a day, and the VP of manufacturing from the US auto manufacturer said, "That's crap. That's bullshit." I asked him, [inaudible 00:18:09], "What does that mean? That it's a bad idea, that it's absurd, it's crazy?"
Michael Nygard (00:18:17):
Gene Kim (00:18:19):
It's impossible, exactly. Right. Disbelief. And he said, "We tried six and it shut down our plant for three days." And so it evoked kind of... I think how many people reacted when we heard the 10 Deploys A Day at Flickr, right? The Allspaw/Hammond presentation.
Michael Nygard (00:18:41):
My first thought was this is a clickbait title. There's no way. [crosstalk 00:18:46] like some phony definition of deploy.
Gene Kim (00:18:51):
That's bullshit, right? I tried one.
Michael Nygard (00:18:53):
Yeah. We were trying to do three a year and it was shutting us down.
Gene Kim (00:18:58):
Gene here, brief break in. I just wanted to make sure that you caught that reference. I was referring to the famous 2009 presentation by John Allspaw and Paul Hammond about how they were doing 10 deploys a day, every day at Flickr. Mike summarized how so many of us reacted when we heard about that presentation, which was primarily disbelief. And even if they were telling the truth, it just seemed preposterous because it seems so dangerous and reckless and maybe even immoral. I love how Mike suspected that they were even being a little bit disingenuous of how they were using the word deployment, just like the VP of manufacturing from that Big Three auto plant.
Gene Kim (00:19:41):
Alright. Back to the story. So, I was asking what's the difference between a system where you tried six and it blows everything out versus one where you can do 60? He said, it's because the pieces are decoupled from each other. Imagine in the Big Three plant, there's a central MRP planning system that says, "Here's the production control. Here's the routings. Here's whatever." And everything's so coupled together that when you try to change six things, you get something wrong and the whole system falls apart. Whereas the Toyota plant is driven primarily through kanban cards. A kanban card is an envelope with three pieces of information on it: here's who I am, here's what parts I need, here's who I need them from, the parts and quantity.
Gene Kim (00:20:29):
Basically, no one actually needs to know, except for the originator, where the parts need to go. We just hand the materials handler the envelope and they'll be able to find it. So if you and I both have work centers and we trade jobs, all we've got to do is write it down on a kanban card, right? And the parts will eventually find us. When I told that to Jeffrey Fredrick, he said, "Oh, that's information hiding." I recognize that because it allows things to get done without having to tell the central planning system every detail, which impedes the ability to change things.
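As a rough illustration of the information hiding described here, the sketch below models a kanban card as a small record and a materials handler that needs nothing but the card. All names (KanbanCard, deliver, the work center and part names) are hypothetical and chosen only for illustration; this is not a description of how Toyota's system is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KanbanCard:
    """Hypothetical card: who I am, who I need parts from, which part, how many."""
    requester: str
    supplier: str
    part: str
    quantity: int


def deliver(card, stock):
    """Materials handler: fulfils a request using only the card and local stock.

    `stock` maps supplier name -> {part name -> quantity on hand}.
    No central plan is consulted; the card carries everything needed.
    """
    on_hand = stock[card.supplier].get(card.part, 0)
    if on_hand < card.quantity:
        raise ValueError(f"{card.supplier} cannot supply {card.quantity} x {card.part}")
    stock[card.supplier][card.part] = on_hand - card.quantity
    return {"to": card.requester, "part": card.part, "quantity": card.quantity}


# Two work centers trade jobs: nothing changes except what is written on the cards.
stock = {"stamping": {"door_panel": 10, "hood": 10}}
before = deliver(KanbanCard("station_7", "stamping", "door_panel", 4), stock)
after_swap = deliver(KanbanCard("station_9", "stamping", "door_panel", 4), stock)
```

The point of the sketch is that a change of destination touches only the card, so no other part of the program has to be updated when work centers swap roles.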
Michael Nygard (00:21:04):
What's interesting about that to me is the same debate plays out over and over again in different contexts. In the microservices design world, there's this argument between orchestration or choreography. Now, I'm not fond of these two terms because with choreography, there is still a choreographer who decides where everyone goes. But the way it's meant in microservices is that there is not a central controller telling everyone what to do; it is that the services themselves know how to react and who they call. So you have this localized knowledge and you don't require the global sort of controlling mind.
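To make the choreography idea concrete, here is a minimal in-process sketch, assuming a toy event bus and two made-up services (payment and shipping). The event names and the EventBus class are hypothetical; a real microservices system would use a message broker, and this only shows the shape of the idea: each service knows which events it reacts to, and nothing holds the whole workflow.

```python
from collections import defaultdict


class EventBus:
    """Toy publish/subscribe bus; it routes events but knows no workflow."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Copy the list so handlers may subscribe during delivery.
        for handler in list(self._handlers[event_type]):
            handler(payload)


def wire_services(bus, log):
    """Each service registers only for the events it cares about."""

    def on_order_placed(order):
        log.append(f"payment: charged order {order['id']}")
        bus.publish("payment_received", order)

    def on_payment_received(order):
        log.append(f"shipping: scheduled order {order['id']}")

    bus.subscribe("order_placed", on_order_placed)
    bus.subscribe("payment_received", on_payment_received)


bus = EventBus()
log = []
wire_services(bus, log)
bus.publish("order_placed", {"id": 42})
```

An orchestrated version would instead have one function that calls payment and then shipping in sequence, which is exactly the central controller the choreography style avoids.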
Michael Nygard (00:21:51):
I see it inside the design of software as well, right? I've certainly seen software designs where everything worked perfectly, but if you changed one piece, you had to change the whole thing, right? Versus other designs where it's more built out of composition and you can change things pretty freely and they only have local effects. And you don't require someone editing locally to have global knowledge of the whole system in order to be safe. So this idea comes up over and over again, but then how do you have... But who is the person that says, "We're going to build a system that doesn't require global knowledge?" Isn't that sort of a paradox? Like you have to have someone in that global position to say, "We're not going to require global knowledge?"
Gene Kim (00:22:42):
Steve Spear said something to me that it was kind of equally stunning. He said, "I've had the blessing to be able to study the Toyota Production System for 30 years, that the miracle that is Toyota and people still think it's about manufacturing." Also he's saying that the miracle... And I'm going to use miracle [inaudible 00:23:01], right? Is to your point, right? Who decided that the kanban card, they keep these pieces decoupled from each other and not impose a higher-level order on it? And-
Michael Nygard (00:23:13):
And isn't there almost a seductive nature to the idea that if you want optimization, you need a global view and you need to optimize everything from the top? Again, it seems sort of obvious that that's what you should do, doesn't it?
Gene Kim (00:23:26):
Yeah, totally. I think what Spear is asserting is that... I think what he's saying or certainly what the implications are is that that was never actually decided. It was a synchronization problem that Taiichi Ohno was trying to solve. In other words, how do you get parts from A to B, which resulted in the deployment of kanban cards, which has this other property of keeping things decentralized? But I'm dazzled by the implications and the benefits that that caused. Are you finding that also pretty freaking amazing?
Michael Nygard (00:24:03):
It is amazing. I think it takes a special type of mind and a special personality to do that because it requires somebody who can think of simple rules that generate complex behavior, which is not common. It also requires somebody who doesn't desire to be in control day-to-day, which, let's say, it's not the most common attribute of managers.
Gene Kim (00:24:34):
Gene here again. Let's see if I can describe why I think the story is so important. In the typical Big Three automotive plant in the 1990s, everything was tightly coupled together in a centralized system. So when you try to do say six line-side store changes in a given day, it was too easy to miss something. Suddenly parts weren't where they needed to be and now you can't ship completed cars at the end of the production line. And that is what ends up shutting the plant production down. It would take them three days to resume production.
Gene Kim (00:25:09):
What this says is that the cost of change was too high. In other words, there are genuine changes they may want to do, but can't because the potential consequences were too grave, causing too much chaos and disruption. So therefore, the organization is unable to do the things they need to do. This is very much like the team of teams story, where the enemy leader might've been sighted, but the US forces were unable to respond quickly enough to capture them. So contrast this to the Toyota plant, where they were doing 60 line-side store changes per day, presumably quickly, easily and fearlessly.
Gene Kim (00:25:46):
So the incremental benefit of each one of those changes might be small, but it allowed them to constantly experiment, to tweak and tune to improve the standard work. Which over a longer period of time allowed them to continually set the world standard. And this goes to one of the themes emerging about the role of architecture to ensure that the cost of change will continue to be low enough so that everything that needs to get done can be done easily, safely, quickly, and fearlessly both now and in the future.
Gene Kim (00:26:17):
Okay, let's go back to the interview where I started to ask Mike more about the concrete characteristics of great architecture. You may recall from my first interview that he gave an example of a business process that defined not only the payment methods that customers could use, but which payment methods were accepted in a certain country. He described the first option where we can solve the problem by putting more logic into the same place where the payment methods are defined. He gave a second option where we create a second service that would enable country managers to define which payment methods are accepted.
Gene Kim (00:26:51):
Then he presented the exciting alternative of adding a third service, which might seem more complicated, but is actually easier and simpler to maintain in the long-term. Mike had this amazing comment, he said that most people react to that third option just like the court musicians did in the movie Amadeus, which was, "I don't like it. It has too many notes." So, before we go into that payment method example, I wanted to get a better understanding of how you can actually tell whether something has too many notes or maybe you don't have enough notes. Let's hear Mike Nygard talk about this.
Gene Kim (00:27:26):
By the way, when he mentions Rich Hickey, Rich Hickey is the inventor of the Clojure programming language, which he and I both love so much. Here we go. You mentioned the notion of some might think too many notes, which I love, but I also... That reaction is very familiar to me. So I remember over the last 20 years when I pick up a certain software library or trying to use a certain API, my reaction is I would recoil from it. I just want to do a simple thing, like send a log event or draw a rectangle on the screen. And I'm looking at 12 parameters that I have to fill out of which I don't even know what they are. I'm like, "What is a graphics context?" Or, "What is this thing that I need to pass it?"
Gene Kim (00:28:11):
And my reaction is, "Oh." It was disgust, like too many notes. At that time, I built an application on top of re-frame, an architecture in Clojure, which I love. I think it magnificently decomposes the system so that they can be kept apart. But I remember my first reaction was, "Holy cow! What are all these notes? I don't know what an effect is or a coeffect or an interceptor." So I have that emotional reaction of like, "Why are there so many notes?" Can you help me understand how does one develop that sensitivity, that sensibility to understand when notes are useful and when there are too few notes? Maybe start with that example again with the e-commerce payment processing. Can you help me understand that better?
Michael Nygard (00:29:02):
Yeah. Well, one way to do it is to work with Rich Hickey for several years, which I had the privilege to do. Which is a way of saying, one way to develop that sensibility or taste is to work with somebody who already has it. That shoulder to shoulder learning is always kind of the best.
Gene Kim (00:29:24):
Can I interrupt you with one quick question? Have you had that as well where you look at something and you're like, "Holy cow! That's a lot of notes to have to get my [crosstalk 00:29:32]-"
Michael Nygard (00:29:32):
I absolutely can. In fact, my first reaction with re-frame was, "I just want a database field on the screen." One of the things that I had the opportunity to do was work in Objective-C and a little bit in Smalltalk. They have a very interesting approach to one of these things. If there is a method that takes 12 parameters, there will also be one that takes 11 of those 12, one that takes 10 of the 12 and all the way down to the simplest possible thing. So if you wanted to just draw a rectangle, the simplest method would take four points, right? X1, Y1, X2, Y2. So in a sense, the parameters... Smalltalk and Objective-C used named parameters. The set of parameters was a little bit open. You could provide anywhere from the minimum up to the maximum with some optionality in there.
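A loose analogy in Python, not Objective-C or Smalltalk: keyword arguments with defaults give a similar "minimum up to the maximum" range of calls from a single definition. The function name draw_rect and its optional parameters (stroke, fill, line_width) are hypothetical, and the function returns a description rather than drawing anything so that the sketch stays self-contained.

```python
def draw_rect(x1, y1, x2, y2, *, stroke="black", fill=None, line_width=1):
    """Describe a rectangle; only the four coordinates are required.

    The keyword-only options are assumptions for illustration. Callers who
    just want a rectangle never have to learn what they mean.
    """
    return {
        "corners": (x1, y1, x2, y2),
        "stroke": stroke,
        "fill": fill,
        "line_width": line_width,
    }


# Simplest possible call: just the coordinates.
plain = draw_rect(0, 0, 10, 5)

# A caller who cares about one extra "note" names only that one.
filled = draw_rect(0, 0, 10, 5, fill="red")
```

The simple call and the detailed call coexist, so a newcomer is not confronted with every option at once.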
Michael Nygard (00:30:27):
This is one of the things that I learned from Rich, is making your parameters an open set provides a lot of benefits. So if you say I take exactly these six parameters and the use case changes... Say we're talking about distributed systems where you can't easily refactor across the boundary. Now, if I need a version with the seventh parameter, I either have to change everybody all at once and deploy everything all at once. Or I have to add another API method that takes the new seventh parameter.
Michael Nygard (00:31:01):
Well, if I just take a map and I don't enforce the parameters sort of at the boundary, but at just one step beyond the boundaries, sort of validating that I've got a payload I can operate with, well, then I can add a seventh parameter quite easily and I can start looking for it. And if no one is sending it to me, okay, I just behave in the old way. If I start to receive it, great, I can use it and do the new thing. So there's this idea that expansion is safe when you're using open sets. This is one way to get around the problem of proliferation of things that are almost the same, but not quite the same.
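The open-set idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the handler, the required keys, and the optional `loyalty_id` parameter are all hypothetical names, not anything from a real API.

```python
# Hypothetical request handler; every name here is illustrative.
REQUIRED = {"amount", "currency"}


def handle_request(payload: dict) -> dict:
    # Validate one step beyond the boundary: check only what we need to operate.
    missing = REQUIRED - set(payload)
    if missing:
        raise ValueError(f"missing required keys: {sorted(missing)}")

    result = {"charged": payload["amount"], "currency": payload["currency"]}

    # A later, optional parameter: old callers never send it,
    # so for them the handler behaves in the old way.
    if "loyalty_id" in payload:
        result["loyalty_id"] = payload["loyalty_id"]
    return result


old_caller = handle_request({"amount": 100, "currency": "USD"})
new_caller = handle_request({"amount": 100, "currency": "USD", "loyalty_id": "L-42"})
# Keys the handler does not know about are tolerated rather than rejected.
tolerant = handle_request({"amount": 5, "currency": "EUR", "note": "gift"})
```

Because the payload is a map, adding the new key needs neither a second API method nor a coordinated deployment of every caller.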
Michael Nygard (00:31:44):
I also have a few sort of rules of thumb that I apply. In one of my talks, Architecture Without an End State, I talk about this rule that says augment upstream and contextualize downstream. What that really is referring to is upstream and downstream in terms of data flowing through your system. So data in this case may be requests from users, it may be feeds from outside, but you receive data in kind of a basic form. And what some systems try to do is immediately reject some of that data. So filter out entities that we think don't fit our schema. They try to decompose it into a relational format where we're fixing the cardinality of relationships. So, part-whole relationships are one-to-one or one-to-many and changing from a one-to-one relationship to a one-to-many is hugely disruptive, right?
Michael Nygard (00:32:47):
So almost the first thing you do is you take this data coming in and you say, "Whatever fits into my schema is real and anything that doesn't fit my schema doesn't exist." That's already contextualizing. You're throwing away information. What I prefer to do is take in all the data and say, "This data is real and somebody somewhere downstream might be able to work with it." This is part of my war on required attributes for example. Maybe I don't have all the attributes needed to put an item on the online storefront and sell it and ship it and deliver it, but maybe I have enough to show it to the marketing people who are going to slot it into a category and start making it useful, right? Then as the additional attributes are available, then we can use them.
Michael Nygard (00:33:41):
So the context about what I can do with those entities really is determined by downstream systems, not the upstream. What the upstream can do is mix in additional information by joining to other sources, by applying inferences and adding fields. You go through this expansion phase upstream and then as you propagate downstream, different systems get to apply their policy about what they can do with it or what they should do based on the attributes they see.
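A small Python sketch of "augment upstream, contextualize downstream" might look like the following. The field names (`sku`, `price`, `weight_kg`) and the two downstream policies are assumptions made up for the example.

```python
def augment(item: dict, categories: dict) -> dict:
    # Upstream: only add fields; never drop or reject what came in.
    enriched = dict(item)
    if item.get("sku") in categories:
        enriched["category"] = categories[item["sku"]]
    return enriched


def can_show_to_marketing(item: dict) -> bool:
    # Downstream policy 1: a name is enough to start categorizing.
    return "name" in item


def can_sell_online(item: dict) -> bool:
    # Downstream policy 2: the storefront needs more attributes.
    return {"name", "price", "weight_kg"}.issubset(item)


raw = {"sku": "A1", "name": "Travel mug", "vendor_code": "X9"}  # no price yet
item = augment(raw, {"A1": "drinkware"})
```

Here the incomplete item is still usable by marketing, and the storefront simply waits until the extra attributes arrive; no upstream step decided on their behalf that the item "doesn't exist".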
Gene Kim (00:34:14):
I was wiping tears from my eyes. I had a sort of visceral reaction when you were talking about this, when you were talking about coercing data. When I take data in, I am often guilty of changing data to fit my parochial needs in that function, and I destroy data and make it not available to someone else, I mean, that's-
Michael Nygard (00:34:37):
Then don't you regret it later on when you need it-
... or you want to do something else?
Gene Kim (00:34:42):
Right. The notion is that really what should be happening is you can add to it, but you really should not remove things so that other people can use it later. Yeah, I love that and that Rich Hickey notion of you can [inaudible 00:34:58], but you can't-
Michael Nygard (00:35:00):
Take away. Yeah.
Gene Kim (00:35:00):
... destroy. Take away here, right? That suddenly seemed even more important. So what other parts are there to the sensibility of too many notes versus too few? For example, I'm intrigued by your re-frame experience as well, right? I think we both have a tremendous amount of admiration for it, but there's that feeling that all you want to do is put something from the database on the screen and there are four component pieces that need to be understood to even write your first event handler. What distinguishes that from the too many notes problem?
Michael Nygard (00:35:36):
I'm hesitant to say anything that would be critical of re-frame because I think they've done a great job. The documentation is some of the best I've seen. It's very explicit about everything. I want to separate the getting started experience from the day two experience. So, we've all had situations where the initial on-ramp is pretty tough, but then the rewards are high, right?
Gene Kim (00:36:03):
At Clojure in my case.
Michael Nygard (00:36:05):
We can build better on-ramps, right? You can create templating tools. You can create project generators that give you stuff. You could imagine adding some macros in your dev workspace that would create for you the pieces that re-frame needs. Then you only need to sort of unpack them and care about them once you have to make variations. In terms of the too many notes, one of the other kind of recurring patterns I see is this difference between the archetype and the instantiations. I'm trying to be careful about terminology because I see this pattern happening in a few places.
Michael Nygard (00:36:47):
I've been in, say, Java code bases, where there are a high number of classes which only ever have one instance. And they may have interfaces where the interface is a one-to-one match with the implementation and it's only instantiated one time. In those systems, you get a proliferation of classes. If you look really hard at the behavior, you'll start to see that there's a lot of behavior being repeated across the classes. A lot of the interfaces will look like near duplicates of each other, but not quite.
Michael Nygard (00:37:22):
I've worked in other Java code bases, but more commonly Smalltalk code bases, where classes are instantiated many, many, many times, so that it would be extraordinarily rare to find a class that only has one instance. So because a thing is reused, it becomes reusable and the cognitive overhead is way less. I only have to understand the class one time, whereas in the former type of code base, I have to understand each of these sort of megalithic God classes independently. The same exact thing happens with services in a microservices environment. Most microservices environments have only one instance of any given service. So one code base, you can think of that as the class or the archetype, one instantiation and everyone uses that one instantiation.
Michael Nygard (00:38:20):
Well, that means I have to understand how to interact with that service and the other service and every other service independently just like those mega-classes. Whereas if I can find ways to generalize the components, I can reuse the components. One of my favorite examples of these is with Kafka. There are these Kafka connectors. If I need to take a topic, receive all the messages, flatten by a key and make it persistent so I have a materialized view of the latest of that key for the whole topic, I don't need to write a new component. I instantiate an off-the-shelf component with some parameters, some configuration that says what topic, what's the key field, what database, what table does it go into.
Michael Nygard (00:39:11):
If I have a lot of instances of those little Kafka connectors, it doesn't really add that much cognitive overhead to try to understand each connector. What I need to understand then is how is data flowing through the system? So I'm operating at a higher-level because of the simplicity of the underlying components. That notion of simplicity goes along with generality. This is another one of my ongoing arguments that I contend that making something more general almost always means making it simpler, not making it more complex. You don't achieve generality by adding every special case possible. You achieve generality by removing all the special cases.
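The archetype-versus-instantiation point can be illustrated with a toy component in Python. To be clear, this is not the Kafka Connect API: `LatestByKey` is a hypothetical, in-memory stand-in for "one general component, many configured instances".

```python
class LatestByKey:
    """Generic component: keeps the latest message seen for each key."""

    def __init__(self, key_field: str):
        self.key_field = key_field
        self.view = {}

    def consume(self, message: dict) -> None:
        # Later messages for the same key overwrite earlier ones.
        self.view[message[self.key_field]] = message


# Two instances of the same archetype, differing only in configuration.
prices = LatestByKey(key_field="sku")
flights = LatestByKey(key_field="flight_no")

for m in [{"sku": "A1", "price": 10}, {"sku": "A1", "price": 12}]:
    prices.consume(m)
flights.consume({"flight_no": "UA100", "gate": "B4"})
```

Once the one class is understood, each additional instance costs only its configuration, which is the reduction in cognitive overhead described above.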
Gene Kim (00:40:01):
We are so much looking forward to the DevOps Enterprise Summit, Vegas Virtual, which will now be held on October 13th to the 15th. As always, the goal of the programming committee is to bring you the best experience reports and to outprogram all our previous events. This year we expect to deliver on that promise again. I am so excited about the speaker lineup we have for you partly because they are among the most senior technology and business leaders that have spoken at this conference showing you how important the work of this community is.
Gene Kim (00:40:31):
Maya Leibman, the CIO of American Airlines, presented at our annual forum in April, and we were fascinated by the perspectives that she shared with us. I'm so excited that she will be co-presenting with our long-time friend [Bras Clinton 00:40:44] about the American Airlines journey. Since 2014, we've all been dazzled by the CSG journey as told by Scott Prugh and Erica Morrison. I am so thrilled that this year Scott Prugh will be co-presenting with his boss, Ken Kennedy, executive vice president and president of CSG, the largest provider of customer care billing and order management in the US.
Gene Kim (00:41:06):
Ken and Scott will be sharing their story on the interplay between business and technology leadership and how it resulted in their amazing accomplishments over the years. This is just the beginning. Stay tuned for more exciting announcements about our amazing speaker lineup. This will undoubtedly be the best DevOps Enterprise Summit Program we've ever put together. You can find more information at events.itrevolution.com/virtual. Keep going because that's a heck of a claim to make.
Michael Nygard (00:41:37):
Okay. So this is another Clojure example. Suppose I want to find the length of a list, and imagine that we didn't already have length built in as a function, I would reduce over the list applying a plus operator, plus one to my accumulator for each item in the list. So you're already writing the code in your head I can tell. You know exactly what that would look like. It's a one-liner. Now imagine I say, I want a function that can only find the length of lists of prime integers. You have to add code to make that work, right? The more specific thing requires more code.
Michael Nygard (00:42:23):
Now if I want something that finds the length of a list of names, I have to add code to make sure that my list is only full of strings. If we take the same idea into the strongly typed world, the more specific your type signature is, the less general your functionality is. So you have to add more cases to cover more territory. If I have a function that goes from list of ints to ints, there's basically just a handful of ways to write that. If I have something that goes from list of A to int, I can feed it many more things and the code is going to be simpler. Because the implementer is able to make fewer assumptions about the parameters it receives.
Michael Nygard (00:43:08):
If I have list of int to int, I might be multiplying the ints together, I might be summing them, right? There's no guarantee that I'm actually counting them. If I have list of A to int, the receiver doesn't know what they can do with A and so they're constrained to basically, "What can you do with it?" You can count it and then you can do something crazy like divide by two or negate the count or maybe just always return zero. But it is more general and it's going to be simpler because there are fewer operations being done on the parameters coming in.
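The interview example is framed in Clojure; the same contrast can be sketched in Python, which is the assumption made here. `length` is the general one-liner built on a reduce, and `length_of_primes` is a hypothetical specific variant that needs extra code purely to narrow what it accepts.

```python
from functools import reduce


def length(items):
    # General: never inspects the elements, just adds one per item.
    return reduce(lambda acc, _: acc + 1, items, 0)


def is_prime(n):
    # Helper needed only by the more specific function below.
    return isinstance(n, int) and n > 1 and all(
        n % d for d in range(2, int(n ** 0.5) + 1)
    )


def length_of_primes(items):
    # Specific: extra code exists only to reject everything else.
    if not all(is_prime(x) for x in items):
        raise TypeError("expected a list of prime integers")
    return length(items)
```

The general function works on a list of anything, while the specific one carries a helper and a validation branch that add code without adding capability.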
Gene Kim (00:43:43):
Holy cow! This is not where I expected Mike to go, but he just gave us a pretty precise and also a very startling definition of how to know whether code is simpler or more complex. I not only had to listen to this portion of the interview several times to make sure I understood what he was saying, but I also had to re-listen to a LambdaCast podcast that I heard last summer, which I was dazzled by, but didn't actually fully understand until today. But thanks to Mike, I think I understand now and it's pretty amazing what Mike is claiming.
Gene Kim (00:44:19):
Let's rewind and listen to what Mike just said. The greater the number of special cases and logic I allow into my function, the less general it is. And the fewer number of specific cases and logic I allow into my function, the more general it is. Okay? I guess both of those make sense. In other words, if you want to write general code, avoid logic and special cases. I think that's helpful. He then went on to say the more general the type signature of my function is, the fewer operations that can be performed on them. Conversely, the more specific the type signatures are, there are a greater number of operations that can be performed upon them and the less general the functionality is.
Gene Kim (00:45:07):
Okay. This will take a little bit of explaining. I'm going to put a link in the show notes to that entire episode of LambdaCast, which is on this very topic, which is hosted by the very brilliant David Koontz. David Koontz says with every increase in what you know about the types, you have less certainty about what the function can do. If you know nothing about the types, you actually know everything about what it does. So the following only applies to pure, statically typed functional programming languages like F#, ML and Haskell. But it's still an astonishing proof point.
Gene Kim (00:45:43):
I apologize if this is getting too abstract, but this is what category theory, the mathematics that all of functional programming is based upon says about this topic. If you have a function that accepts type T and returns type T, you already know exactly what the function does. The only thing that a function can do is return exactly what you gave it because if you don't know what type T is, you can't make a new one. Therefore the only valid value it can return is what you gave it. In other words, in the scenario where you know nothing about the type, you know already everything about what it does.
Gene Kim (00:46:21):
Let's now consider a situation where you know everything about the type, you now know nothing about what the function does. Here's the proof. Suppose you have a function that accepts type int and returns type int, you now have an infinite number of values that the function can return. It can be a constant: one, two, three, and so forth. It could be negative. You could add one to the input, add two to the input. Basically you have an infinite number of values that it could return and so you really have no idea of what it actually does from looking at the inputs.
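The two cases can be put side by side in Python type hints. One loud caveat: Python does not enforce these annotations at runtime, so the "only one possible implementation" guarantee really belongs to pure, statically typed languages with parametric polymorphism; the code below is just a sketch of the shape of the argument.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def identity(x: T) -> T:
    # Knowing nothing about T, the only honest option is to hand x back.
    return x


# Knowing the type is int allows endlessly many behaviours
# behind the very same signature.
int_to_int: List[Callable[[int], int]] = [
    lambda n: 1,      # a constant
    lambda n: -n,     # negate the input
    lambda n: n + 1,  # add one
    lambda n: n + 2,  # add two
]
outputs = [f(10) for f in int_to_int]
```

All four lambdas share the type "int to int", yet they do different things, which is why the signature alone says so little about the behavior.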
Gene Kim (00:46:52):
Again, to repeat the astonishing claim that Mike makes, if you make something more general, it has to be simpler. When something is more general, it will have fewer lines of code and it will even eliminate the possibility of having specific cases in your code because you don't even know what you're operating on. So the goal is to write things that are simpler and more general: we eliminate as many specifics from our code as possible. I got to tell you, wow, that is a pretty big idea. Okay, let's keep going.
Michael Nygard (00:47:24):
Let me use another concrete example. I don't have a mathematical proof on this, but I have a lot of examples. But this one is actually a debate that I've had inside my company. I was being provocative and it triggered a lively exchange of ideas. We often need to find the location of things on earth. So we're in the travel industry. It's useful to know in which city the airport called ORD exists because sometimes people care about going to the city rather than the airport and so we need to know that.
Michael Nygard (00:48:06):
Well, we can write a service that will take an airport code and return you the lat/long of the airport code, right? Now in order to write that service, somebody has to feed me with the data about airport codes, what they are. And either the same source feeds me the coordinates of those airport codes, or maybe I get them all as one delivery. Well, when I receive a request, what's the first thing I'm going to do in such a service? I'm probably going to look to see if you've given me a real airport code or not.
Michael Nygard (00:48:47):
So I'm adding code to validate that the parameters are legit for the type. Yeah? Then I'm going to go make a query to find out where it is and maybe I'm going to do a radius query with lat/long to find nearby points of interest. And I'm going to return you a place or a set of places. Let us now suppose that I also need to locate hotels, should I write another service to locate hotels?
Gene Kim (00:49:22):
My gut feeling is probably not. That seems like a concretization that is not necessary since you're already doing location of points of interest.
Michael Nygard (00:49:33):
Except that I only accept airport codes in my specific API. So now maybe I need to add a special case or another API function that accepts a hotel identifier. Now in addition to hotels, maybe we want to add theme parks or cruise ship terminals, various other points of interest. My service is growing new APIs, but fundamentally, all it's trying to do is map a name to a location or a set of locations. So what I should really do is take away all the special cases. Imagine the Google search page, if you had to tell Google, if you were searching for a phone number or a zip code or the name of a restaurant or the name of a book, or the name of an author of a book, or the name of a movie adapted from a book.
Gene Kim (00:50:34):
Imagine the drop-down box, right? That you would have to, right? Before you hit enter.
Michael Nygard (00:50:39):
Imagine I had a service that could translate a name into a location or a set of locations. Now, I have the choice where I can make an instance that only deals with airports and I can make an instance that only deals with hotels, but that choice is in the dataset that I loaded up with, not in the implementation of that service. The service is more general. I can choose to run one global one that handles all named locations for everything or I can choose to have many deployments that are composed into different workflows and have operationally independent availability. But I have more options because I've got a more general thing at the core.
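A rough Python sketch of such a general name-to-location service follows. The `Locator` class, the `kind` field, the hotel name, and the coordinates (approximate, illustrative values) are all assumptions for the example, not details from the interview.

```python
class Locator:
    """Maps a name to zero or more places; knows nothing about place types."""

    def __init__(self, dataset):
        self.index = {}
        for place in dataset:
            self.index.setdefault(place["name"].lower(), []).append(place)

    def locate(self, name):
        return self.index.get(name.lower(), [])


# Coordinates are approximate, illustrative values; the hotel is made up.
AIRPORTS = [{"name": "ORD", "kind": "airport", "lat": 41.98, "lon": -87.90}]
HOTELS = [{"name": "Lakeside Inn", "kind": "hotel", "lat": 41.89, "lon": -87.62}]

# The choice of "airports only" or "everything" lives in the data, not the code.
airport_locator = Locator(AIRPORTS)
global_locator = Locator(AIRPORTS + HOTELS)
```

The same implementation can run as one global instance or as several independently deployed ones, depending only on which dataset each instance is loaded with.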
Gene Kim (00:51:24):
Awesome. And so what was the strongest argument for the other case? What was the opposing argument?
Michael Nygard (00:51:33):
The opposing argument was that any given caller certainly only cared about their type of data. In other words, if you're looking for a flight, it's of no use for me to give you back hotels in Chicago, which is true. What that tells me is we need to augment the data that we're passing in with some context.
Gene Kim (00:51:56):
Right. So this is a feeling you have, is you call an API and you get back a whole bunch of stuff you don't care about and you're mystified by why it's being given to you?
Michael Nygard (00:52:05):
Right. So imagine that my parameter is Chicago and I get back restaurants-
Gene Kim (00:52:12):
Gas stations. That's [crosstalk 00:52:14]-
Michael Nygard (00:52:13):
... gas stations, hotels and the O'Hare car rental center and so on. But what that really means is I have some implicit assumptions about what I'm interested in that I didn't tell you about. So, one of two things can happen. Either I contextualize those results by saying, "Oh, I'm going to filter for airports," which means your data needs to contain some kind of classifier or identification. Or I need to tell you to only give me airports. But we're making that implicit assumption explicit in the data, which allows us to simplify and generalize the service on the other end.
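Both ways of making the assumption explicit can be sketched in Python. The result records, the `kind` classifier, and the `kinds` request parameter are hypothetical names chosen for the illustration.

```python
# Illustrative, made-up results for a search on "Chicago".
RESULTS = [
    {"name": "O'Hare", "kind": "airport"},
    {"name": "Lakeside Inn", "kind": "hotel"},
    {"name": "Corner Fuel", "kind": "gas_station"},
]


def search(name, kinds=None):
    # `name` is ignored in this stub; a real service would look it up.
    # If the caller states its interest, honor it; otherwise return everything.
    return [r for r in RESULTS if kinds is None or r["kind"] in kinds]


# Way 1: ask generally, then contextualize on the caller's side,
# which relies on each result carrying a classifier.
everything = search("Chicago")
filtered_by_caller = [r for r in everything if r["kind"] == "airport"]

# Way 2: tell the service what is wanted, explicitly, in the request.
filtered_by_service = search("Chicago", kinds={"airport"})
```

Either way the service itself stays free of airport-specific or hotel-specific code.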
Gene Kim (00:52:59):
That's a super interesting fact. I mean, so I think maybe one of the conclusions is that when you have that feeling, when you make an API call and you get this huge [inaudible 00:53:10] thing of like stuff you don't care about, don't overreact. Maybe that's okay, right? It didn't hurt you, right? That's actually a signal that you're maybe plugging into something that's very generalizable, not just for you, but for every other potential caller.
Michael Nygard (00:53:25):
Yeah. And if you have no way to-
Gene Kim (00:53:28):
Not to be offended by it.
Michael Nygard (00:53:29):
Don't be offended, yeah. I previously used the example about Stripe accepting payments where you simply identify the item that is being purchased rather than having to supply them the entire catalog. This is another example of making something both simpler and more general at the same time because they no longer have to do catalog look-ups and deal with item not found or item is in the wrong seller or any of that stuff. That would be huge complexity on Stripe's end that not only do I not care about it as a consumer of their services, but it would actually be harmful and frustrating if I had to deal with that hidden coupling that there's an implicit item catalog behind the scenes.
Gene Kim (00:54:20):
Wow. I thought that was so cool. So, just in case if you didn't get that the first time around, let me repeat what Mike just said. Imagine that you have a service that takes as an input an airport code and generates as an output a list of items of interest around it such as other airports, hotels, restaurants, and so forth. He presented two options of implementing this. Option A, you create a separate service for each type of area of interest: one for gas stations, one for hotels, one for cruise lines, et cetera.
Gene Kim (00:54:56):
Option B, you create one service that handles every type of area of interest. Using his reasoning, you should choose option B because it is the more general solution, as measured by it handling fewer specific cases. I think I'm definitely starting to understand far better how Mike views the world. So let's go back to that payment processing example that he gave in the previous episode. Again, we have option A, you put all the logic into a central group who defines not only the payment methods accepted, but they would also be responsible for ensuring that each payment method is actually accepted in every country.
Gene Kim (00:55:37):
Option B, you create a separate service that would allow every country manager to define which payments are accepted in each country. Then there's the middle ground of option C, you create a third component, which would find the intersection of the two. Option C seems so unlikely because it adds a third component. I now finally ask Mike to explain why option C is the preferred solution.
Michael Nygard (00:56:05):
I'd really like to talk about case three the most because I've used this word implicit a couple of times and implicit information is kind of the worst kind of coupling. It's the part that's hardest to change because if there's something that's an implicit assumption on the receiving side of a call, they probably assume there's only one instance of a thing, right? Only one item catalog, only one list of payment providers. It's very rare to see that I can tell you which list to use.
Michael Nygard (00:56:43):
This has come up a couple of times in different contexts, but it's also one of the things that I learned from working with Rich Hickey, is take whatever is implicit and sort of ambient or floating in the environment and make it explicit. Make it an argument that you pass along. And oftentimes you'll find the receiving side might not need anything more than the arguments you're giving it. So you can get rid of entire databases, you can get rid of data feeds to populate those databases, reconciliation jobs, because the receiving service just doesn't need it.
Gene Kim (00:57:20):
Keep going, right? I mean, it's funny you mentioned that, right? My reaction when you say that is, "Oh gosh, more fields." But then I think about what you said about the example of that 12-argument API that has an 11-field version, all the way down to four, right?
Michael Nygard (00:57:34):
I need to distinguish between two different types of parameters. I'm going to start with the microscale and I'll illustrate this by contrast. In something like a Ruby on Rails app, you've got this fabulous framework called Active Record, which allows you to get an entity back from the database, manipulate it, save it to the database. And you don't need to know the SQL behind it, you can just work with the object. And in most cases you don't even have to worry about the database because there's just a configuration at startup time that says what database am I connected to.
Michael Nygard (00:58:14):
This works great until you need to use two different databases. Because the database is just kind of a global parameter, there's one. All the Active Record methods assume the database. By contrast, if you were working in say a Clojure system, whether you're working with a SQL database or Datomic, the much more common practice is to have functions that receive the database connection or the database value as an argument. And this way, those functions work with whatever database you choose to pass in.
Michael Nygard (00:58:55):
So it's now up to your application at a higher level to say, "Do I have one? Do I have five? Do I have 10 databases?" The lower level functions are no longer coupled to that implicit or ambient notion of the database. Now that's not an optional parameter, right? Those functions require a database connection to work. So we can't really elide that parameter and you do need to pass it along. The example I gave about the Smalltalk methods with a large number of arguments, the optional ones were modifiers that would give you special behavior or added control, but they weren't the... They were optional parameters, they weren't the required ones.
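Here is a minimal Python sketch of passing the database as an explicit argument, using the standard-library `sqlite3` module with in-memory databases. The table layout and the helper names `make_db` and `item_name` are assumptions for the example.

```python
import sqlite3


def make_db(rows):
    # Build a throwaway in-memory database for the illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)
    return conn


def item_name(conn, item_id):
    # The connection is a required, explicit argument: no global database.
    row = conn.execute(
        "SELECT name FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None


# The caller decides how many databases exist; item_name works with any of them.
us_db = make_db([(1, "mug")])
eu_db = make_db([(1, "tasse")])
```

The same lookup function serves both databases because nothing in it assumes there is only one.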
Michael Nygard (00:59:41):
At the macroscale, we have a similar thing with services. If we're making something explicit that... I'm calling you, if we're making something explicit in the call that you have to have then I must provide it. What that's doing in a way though is making clear in our API specifications and in our contract, exactly what you need to operate. Whereas before you had some hidden requirements, which may or may not be fulfilled and may or may not be applicable to the use case I'm trying to invoke. I had to know more about how you work in order to invoke you, to know if my call is likely to succeed. If it's all explicit in the arguments, then I only need to look at the contract. I don't need to know anything beyond that API specification.
Gene Kim (01:00:34):
What about that scenario, option C, in which payments do I accept? What exactly did you react to that led you to say, "No, we actually do need this third piece"?
Michael Nygard (01:00:47):
It's all about change. Ultimately, almost everything about architecture is how do we enable change at a system scale? If we have the centralized case where there's a master that understands what every payment provider is in every geography or country we're going to have a lot of churn on that, right? We're going to constantly need people to update that. And unless you've provided it with a super good API for allowing changes to be added from lots of different places, you may be dealing with code changes almost on a daily basis. Now we can deploy it. That's no problem. The problem is the attention and the backlog and the queuing time to get that change into that shared master component.
Gene Kim (01:01:36):
So this is like the VP of manufacturing from the Big Three auto manufacturer, right? It's a centralized control, one person needs to know all the information, right? Then everything is reliant upon changes there. Okay, got it.
Michael Nygard (01:01:48):
What we'd really like is for the business unit in each country to make their own deals with payment providers that operate in that country. Or if we've got transnational payment providers, maybe we can make the deal globally for efficiency, but we want that flexibility. We want local adaptation for culture, for example. Not everyone views PayPal the same way around the world. Not everyone uses WeChat to pay for things around the world, right? So we want the people with the local context to be able to contextualize, to make those deals, to set up the capabilities and then sort of inform the global system rather than having the need for coordinated change on both sides of this interface.
Gene Kim (01:02:39):
And then just to kind of argue against that one... By the way, my reaction was like, "Oh, transnational payments? Oh, no." [inaudible 01:02:47] a little bit to that, right? [inaudible 01:02:48].
Michael Nygard (01:02:49):
Maybe a bunch of listeners broke out in hives just now.
Gene Kim (01:02:52):
Right. [inaudible 01:02:54] finally kind of startling to hear additional complexity that certainly wouldn't have shown up in my first version.
Michael Nygard (01:03:03):
Well, so the challenge with option two is that it's not only the payment providers that are in question because we particularly operate as a marketplace with two sides. We have to think about both the seller and the receiver. And both of them may have something to say about what payments will be acceptable. We have to be able to process it in the currency in the region and it has to be something that's acceptable to the supplier of services who will be receiving that payment.
Michael Nygard (01:03:40):
So when you're trying to do that kind of matching, somebody somewhere has to take two sets and find the intersection of those two sets. And the essence of my third option is let's do that intersection late by having one side provide its set in the request data rather than having it all preconfigured and predefined. Just provide it in the request data and then when it finally reaches the end point, that's when you do the intersection.
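The late intersection can be sketched in Python as a single set operation at the endpoint. The request shape, the `seller_accepts` field, and the payment method names are hypothetical and exist only to illustrate the idea.

```python
def acceptable_methods(request: dict, processable: set) -> set:
    # The seller's accepted methods arrive in the request itself, so nothing
    # about sellers has to be preconfigured at the endpoint.
    return set(request["seller_accepts"]) & processable


request = {
    "amount": 120,
    "currency": "CNY",
    "seller_accepts": ["card", "wechat_pay", "bank_transfer"],
}

# What this hypothetical endpoint can process in its region and currency.
methods = acceptable_methods(request, processable={"card", "wechat_pay", "paypal"})
```

Because one side of the intersection travels with the request, a seller can change what it accepts from one request to the next without any change on the receiving side.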
Gene Kim (01:04:12):
And there's something so gloriously right about that third option, but I'll be honest. The red flags didn't go off as you were describing that, what could go wrong if you have all that logic happening in the option number two? What's so hard about having that matching happen in that service?
Michael Nygard (01:04:30):
I'm going to make an assertion that the best granularity for data is request level or transaction level: an instance of a business process. And so if we could, we would pass all of our data within the business process. I mean, all of it. Because then it can change from one request to the next. So all of our rules, all of our policies could change from one request to the next without requiring code changes. All of our, I don't know, catalog and item data, all of our approval levels, what have you, everything.
Michael Nygard (01:05:11):
Now, of course I'm sort of postulating an impossible universe, right? Because we know that we can't carry all that data with every request. The size of the request payload would be ridiculous, which means every piece of data we're storing in advance to make decisions is a performance optimization. My assertion about that then is because that's a performance optimization, it generates complexity as with any kind of a cache. So you can think of a lot of our databases as caches where we are providing a key like an item ID or a carrier code or something along those lines. And we've got cached business rules or policy data or something along those lines.
Michael Nygard (01:05:56):
Well, every cache needs refresh mechanisms, update mechanisms. You need to monitor your success rate and so on. It adds complexity in the name of performance because we can't carry all that payload data around. I've really come to regard a lot of our store databases as cached or materialized views on top of events that we use to accelerate decisions during business processes. If we could make everything fully explicit in the payloads, our systems would be enormously simpler. You would only look at data, make decisions about data and emit more data. All of our services would be pure functions.
Gene Kim (01:06:44):
Tell us what else. That was my gasp of shock, but [inaudible 01:06:50]. I was wondering if that's what you're suggesting, ideally that what makes that solution better is you are carrying around basically every factor you need in order to make a decision in a pure way with nothing implicit, nothing hidden?
Michael Nygard (01:07:04):
Right. We approximate that in some of our systems by this notion of imperative shell, functional core, right? So when you receive a request, you go, you look up everything you need to know, you attach that all to your context, pass it down into the functional core and you get back a value that says, "All right, here's the HTTP response to deliver. Here's some messages to emit. Here are some changes to apply to the database on the way out." But what you've got inside of there is a pure function.
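A minimal sketch of the imperative shell, functional core idea, under assumptions: the order and approval-limit domain, the `Decision` type, and every function name here are invented for illustration and do not come from the interview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """The value the core hands back: what to respond, emit, and persist."""
    http_status: int
    messages: tuple    # messages to emit
    db_changes: tuple  # changes to apply to the database on the way out


def decide_order(order: dict, context: dict) -> Decision:
    """Functional core: depends only on its arguments and performs no I/O."""
    if order["total"] <= context["approval_limit"]:
        return Decision(
            200,
            (("order_approved", order["id"]),),
            (("orders", order["id"], "APPROVED"),),
        )
    return Decision(403, (("order_rejected", order["id"]),), ())


def handle(order, load_context, emit, apply_change) -> int:
    """Imperative shell: look things up, call the core, then perform effects."""
    context = load_context(order)
    decision = decide_order(order, context)
    for change in decision.db_changes:
        apply_change(change)
    for message in decision.messages:
        emit(message)
    return decision.http_status


# Plain lists stand in for the message bus and the database.
emitted, applied = [], []
status = handle(
    {"id": "o-1", "total": 50},
    lambda order: {"approval_limit": 100},
    emitted.append,
    applied.append,
)
```

The core can be tested by calling `decide_order` directly with literal dictionaries; only the shell needs fakes for lookups and side effects.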
Michael Nygard (01:07:36):
Well, a lot of what I'm trying to do in macroscale architecture is extend that idea and say, how can we further apply functional concepts like pass values, not references, be explicit, not implicit? How can we apply those concepts at the level of services in an enterprise scale? One of the amazing things that happens is you automatically get the ability to adapt to certain kinds of changes with plurality. If I no longer have an implicit item catalog, I can pass you a bunch of items, right? That you've never seen before and you can operate on them. That makes both sides of the interaction simpler and more general.
Gene Kim (01:08:19):
Okay. You could actually hear me gasp a couple of times as Mike was talking because I started to wonder if he was actually going to make the claim he did. The critical part of what makes option C better is that it makes both components in options A and B more general and simpler. And that option C could be done as a pure function. That term pure function comes from the functional programming domain. Pure functions are the notion that functions must be referentially transparent. In other words, for any given set of inputs, you will always get the same outputs. This can only be true if there's nothing implicit, no global variables, no back-end data stores it's querying.
Gene Kim (01:09:01):
In fact, what often makes a function impure is that it uses the current system time, which of course will be different every time the function is called. Instead, time must be passed in as an input. So when you do this, you end up with systems that are dramatically simpler, not only to implement but also to test, because you can test them without any of the other system components being present. For those of you who saw Scott Havens present at DevOps Enterprise Summit in 2019 on the work that he did at Walmart and Jet.com, this is exactly what he built to handle the entire supply chain systems for Walmart. I'll put a link to that talk in the show notes, but I'm happy to say that I've already interviewed him for a future episode of The Idealcast. He will talk at length about this exact topic. I think this is all so amazing. I am so happy that I finally understand why Mike's third option is obviously the best option.
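To make the point about system time concrete, here is a small sketch. The expiry check and its names are hypothetical examples, not something discussed in the episode.

```python
from datetime import datetime, timezone


def is_expired_impure(expires_at: datetime) -> bool:
    # Impure: reads the system clock, so the same argument can give
    # different answers depending on when the function is called.
    return datetime.now(timezone.utc) >= expires_at


def is_expired(expires_at: datetime, now: datetime) -> bool:
    # Pure: the current time is an explicit input, so the same inputs
    # always produce the same output.
    return now >= expires_at


expiry = datetime(2020, 6, 1, tzinfo=timezone.utc)
day_before = datetime(2020, 5, 31, tzinfo=timezone.utc)
day_after = datetime(2020, 6, 2, tzinfo=timezone.utc)
```

The pure version can be tested for any moment in time simply by passing a different `now`, with no clock to fake.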
Gene Kim (01:10:00):
Okay, back to the interview. By the way, when you hear me say something that sounds potentially disparaging about monorepos, it's just absolutely not meant that way. I love monorepos and I am in awe of how Google has used them for almost all their internet facing properties. So it's funny that you brought up Rich Hickey because the question that I was dying to ask you is what is it about Rich Hickey and Clojure? One of the things that... I got a chance to talk with him at the last Clojure/conj, one of the last conferences I went to before the lockdown. And something that just clicked for me was that he seems to be viscerally aware of coupling. A couple observations.
Gene Kim (01:10:47):
One is he seems to detest unnecessary coupling and he seems to be aware of it at a level that most of us, me included, cannot see. To the point where his sensibilities almost seem alien. I remember him reacting to the notion of a monorepo and being disgusted by it. I think it's because it's tied to a CI system, so you're not able to work on two separate components without deploying. So you can't really work on two things that have two things in progress and have them interact with each other, which I think is an amazing observation that I certainly never objected to.
Gene Kim (01:11:27):
But now that you mentioned it I do recognize how many workarounds I've had where I was like, "I just want to work on two pieces without committing both of them and deploying both of them." His notion of classes being coupled to each other was actually one of my big aha moments in his JavaOne presentation. I mean, so what is it... Could you validate that sensibility and why are those couplings bad?
Michael Nygard (01:11:54):
I think you've described those two characteristics of Rich pretty accurately. One of the things I learned from him was how to spot coupling that I had previously not seen. The things that appeared to be atomic to me, he regarded as compounds that could be decomposed. I'll give you an example. When we talk about OO programming, Clojure gives you the characteristics of OO, but they're all à la carte. Whereas a class couples together a protocol and an implementation and some state. And in Clojure you can separate all three of those and handle them however you like. You have the option to compose them together however you like.
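The separation of protocol, implementation, and state can be approximated outside Clojure. The following Python sketch is only an analogy: `Book`, `Bundle`, `price`, and `Ref` are invented names, and `Ref` is a loose stand-in for a managed reference, not a port of any Clojure construct.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class Book:
    """Data only: an immutable value with no behavior attached."""
    cents: int


@dataclass(frozen=True)
class Bundle:
    """Data only: an immutable collection of items."""
    items: tuple


@singledispatch
def price(thing) -> int:
    """The 'protocol': an operation declared apart from any data type."""
    raise NotImplementedError(type(thing).__name__)


@price.register
def _(thing: Book) -> int:
    # Implementation for Book, supplied separately from the Book definition.
    return thing.cents


@price.register
def _(thing: Bundle) -> int:
    # Implementation for Bundle, also supplied separately.
    return sum(price(item) for item in thing.items)


class Ref:
    """State: a mutable reference to an immutable value, kept apart from both."""

    def __init__(self, value):
        self.value = value

    def swap(self, fn, *args):
        self.value = fn(self.value, *args)
        return self.value


def add_item(bundle: Bundle, item) -> Bundle:
    """Return a new Bundle; the old one is left unchanged."""
    return Bundle(bundle.items + (item,))


cart = Ref(Bundle(()))
cart.swap(add_item, Book(1200))
cart.swap(add_item, Book(800))
cart_total = price(cart.value)
```

Each of the three pieces can be used without the other two: `price` works on a bare `Bundle`, and `Ref` can hold any value at all.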
Michael Nygard (01:12:45):
I had this discussion with Rich many times about actors and whether it made sense to include actors into Clojure. And maybe after the third time I finally got what he was saying. An actor is a compound. It is behavior plus state plus an inbox that somebody is managing. You have exactly one inbox, you don't have the choice of multiple inboxes. You have exactly one outbox, you don't have multiple ones. So an actor has already made some decisions about bringing together constructs. Everywhere that I just said and, Rich would take those apart and supply each of them independently. So you have channels, you have ways of managing state, you have ways of managing behavior.
Michael Nygard (01:13:30):
Then if he provides those atomic components, you have the option to compose them together, but you're not obliged to compose together. And so splitting things and splitting and splitting and splitting is totally appropriate for a language designer. Rich has incredible sensibilities about that. I'm constantly impressed. He has a strong aesthetic sense that goes along with it. And there is such a thing as taste, and one language designer's tastes may be more in line with yours. I think his taste for splitting things down into tiny pieces, tiny orthogonal, composable pieces gives consumers of his language tons and tons of options.
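A rough sketch of what "take the actor apart" might look like, assuming Python's standard `queue.Queue` as a stand-in for a channel. The names `counter_behavior` and `drain` and the orders/refunds example are hypothetical.

```python
from queue import Queue


def counter_behavior(state: int, message: int) -> int:
    """Behavior: a pure function from (state, message) to the next state."""
    return state + message


def drain(channel: Queue, behavior, state):
    """Apply a behavior to every message currently waiting on a channel."""
    while not channel.empty():
        state = behavior(state, channel.get())
    return state


# Channels exist independently of any behavior or state, so nothing
# forces "exactly one inbox".
orders, refunds = Queue(), Queue()
orders.put(5)
orders.put(7)
refunds.put(-2)

# State is just a value that the caller threads through.
total = 0
total = drain(orders, counter_behavior, total)
total = drain(refunds, counter_behavior, total)
```

Composing one channel, one behavior, and one piece of state recovers something actor-like, but the choice to combine them that way stays with the caller.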
Michael Nygard (01:14:25):
It's funny that you mentioned the monorepo idea though, because we're actually moving towards a monorepo inside my company precisely because we want to couple some things together that have been independent in the past. So yes, those higher-level constructs pre-make decisions for you or predecide things for you. And sometimes we choose that deliberately. It's when you get it by accident or without reflection that the coupling is really a problem.
Gene Kim (01:15:02):
Gene here. Okay. If you're getting a little bit lost because you don't know the Clojure programming language, I'd recommend you watch an amazing Rich Hickey talk called Simple Made Easy that he gave at the Strange Loop conference in 2011, where he talks about coupling. This is where I learned about the term complected. It shows up so prominently in The Unicorn Project. Those concepts from Rich Hickey are at the heart of the first ideal, the whole notion of locality and simplicity, the desire to keep components of the system from being complected together. In Rich Hickey's talk, he talks about splitting apart the notion of identity and state and interfaces and time and namespaces and functions, data structures, and all the benefits afforded by doing so. And when Mike Nygard mentions actors, that comes from the actor construct popularized by the Erlang programming language, used for concurrent programming.
Gene Kim (01:15:58):
The second thing that I wanted to mention is that coupling is neither good nor bad. As Mike was saying, it's only when it is accidental or when there are too many implicit assumptions that it can hurt you. So when you go to a restaurant, typically you want a meal, not all of the ingredients put into a sack and left for you to assemble. And often that's the right thing to do for our customer. But we've all been in a situation where we don't want the entire meal or we don't want the entire piece of furniture, we just want one bolt or one screw. And we shouldn't have to order a whole new bookshelf just to get that one screw. All right. Back to the interview. So what do you think those sensibilities of breaking things down into these small, orthogonal pieces? What [inaudible 01:16:46] is that generalizable or reinforced kind of your sensibilities for thinking about macrosystems?
Michael Nygard (01:16:52):
I would say I'm continuing to explore how those ideas work at the macroscale. We have this challenge of metaphors when we talk about service-based infrastructures or service-oriented architectures or whatever acronym you like to apply. We try to say it's a collection of objects that are distributed, except you don't want to make too many calls because there's a lot of overhead, right? And sometimes the object is just not there when you try to talk to it. Oh, well, so it's not that much like objects, right? We don't enforce type signatures on calls, you can make any kind of call you like and it could respond with a catalog of Weird Al music, if it feels like it. You don't have a byte-level syntactic enforcement like you do with objects. Actually, the more you look at it, it's not really very much like objects.
Michael Nygard (01:17:54):
Okay. So we'll let... Maybe it's like actors. You pursue that path and you're... No, it's not really very much like actors either. Eventually you start to realize, now these service-based architectures are really their own thing. They have their own properties, their own characteristics. We need to think of design techniques that work for these. And it's still... I mean, even though we're 20 years into SOA or more, maybe 25 years into SOA, and we're at least 15 years into the Guerilla SOA or REST style, I think it's still relatively early days to see what evolves and survives change the best.
Michael Nygard (01:18:42):
We had stories from Uber a few years ago about how they had more services than engineers, which probably meant they had some orphaned services that no longer had anyone who knew what they were or what they did or how to deploy them. Well, now we see stories from Uber about now that they've stopped their hypergrowth scaling and sort of flattened off on their employment curve. Now they're kind of pulling back from that and saying, "Well, we're going to take collections of services and put them behind a facade that represents a higher-level aggregated behavior." I totally get that. That's a very sensible pattern.
Michael Nygard (01:19:20):
At one point in one context it seemed like rapid proliferation of services was the right way to survive change and evolve. Now we're thinking actually that may allow coupling of types that we don't like that inhibit other kinds of change and evolution. So we're still trying to figure out what it is that's going to allow us to survive and persist with these architectures. My explorations on applying the principles of functional design is part of that. There are some pieces I'm very certain about. There are some pieces I'm pretty sure will work and there are some that are hypotheses.
Michael Nygard (01:20:04):
I'll give you one example that I'm very sure about. I designed a service at one point that I called a perpetual string service. I was with a company that had a problem around T's and C's. They needed to make sure that when a user came to the site, they had agreed to the latest terms and conditions. This was a SaaS eCommerce company. They had a table with all of the shop IDs and sorry, the date that the T's and C's had been agreed to, but they weren't keeping the old versions of T's and C's. They were being overwritten. And so you couldn't actually go back and say, "What's the difference between what I agreed to before and what I agreed to now?" Or if you got into some kind of an arbitration situation, you had the date they agreed to it, but you had to go to paper to figure out what the text was.
Michael Nygard (01:21:07):
So to me, coming from the functional world, I said, "Well, the problem is you're treating something that should be immutable like it's mutable." What you should store is a reference to a perpetual record of the T's and C's that they agreed to. And when you modify your T's and C's what you're actually doing is making a new contract, not modifying the old one, right? So keep the old one around, make a new one and when they agree to the new one, update the reference to point to that. Well, it turns out there are a lot of cases in a company where the ability to store an arbitrary string of text that is immutable, content-addressable, and that I can rely on fetching forever. There are a lot of use cases for that. So by making the service as simple as saying, "I'm going to put a bunch of texts to you and you're going to give me back a URL. The contract is-"
Gene Kim (01:22:02):
Oh, my gosh.
Michael Nygard (01:22:02):
"... I can always use that URL to get back the original text." Very, very simple, right? You could write that in an afternoon, enormously useful in a lot of different situations. But by the way, now it would just use Google Cloud storage or Amazon [inaudible 01:22:22]. Those effectively are the content-addressable immutable storage I was looking for.
Gene Kim (01:22:27):
That's astonishing. That's really freaking awesome. I love that story because Mike is highlighting how important the concept of immutability is, another core concept from functional programming. It's the notion that when you create a variable, you can never change it. You must create a new one. Rich Hickey described the opposite as place-oriented programming: the notion that in the olden days, we had to care about memory, so we had to reuse memory. That's why we had to use pointers and memory addresses for variables. In Clojure and most functional programming languages, immutable data structures are the norm. I have found that so much complexity in the applications that I've written disappears when you use them. Entire categories of errors no longer happen.
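For listeners who want to picture the perpetual string service Mike describes, here is a minimal in-memory sketch. The class name, the `/strings/` URL shape, and the use of SHA-256 are assumptions made for illustration; a real service would need durable storage behind it.

```python
import hashlib


class PerpetualStringStore:
    """In-memory stand-in for an immutable, content-addressable text store."""

    def __init__(self):
        self._blobs = {}

    def put(self, text: str) -> str:
        # The key is derived from the content, so the same text always
        # yields the same URL and existing entries are never overwritten.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        self._blobs.setdefault(key, text)
        return f"/strings/{key}"

    def get(self, url: str) -> str:
        return self._blobs[url.rsplit("/", 1)[-1]]


store = PerpetualStringStore()
terms_v1 = store.put("Terms and conditions, version 1")
terms_v2 = store.put("Terms and conditions, version 2")

# Each shop record holds a reference to the exact text that was agreed to.
agreements = {"shop-42": terms_v1}
# Accepting new terms moves the reference; the old text stays retrievable.
agreements["shop-42"] = terms_v2
```

Updating the terms creates a new entry and a new URL, so the old contract text can still be fetched from its original reference.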
Gene Kim (01:23:15):
I think that concept is familiar to many of us, but then he made the same claim for databases that in the olden days we had to preserve space in databases so we would routinely overwrite values in our databases. I mean, after all, what else would you do? Rich Hickey created the Datomic database, where you can't overwrite values, you have to merely supersede them. It means that the database only grows and never shrinks. I love Mike's answer to the one thing that he does know about macrosystems at scale, which is that it will take advantage of immutability.
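The "supersede, never overwrite" idea can be sketched with a tiny append-only log. This is not Datomic's API or data model; `AppendOnlyFacts`, `assert_fact`, and `value` are hypothetical names used only to show the shape of the idea.

```python
class AppendOnlyFacts:
    """Toy append-only store: new facts supersede old ones, nothing is erased."""

    def __init__(self):
        self._log = []  # (tx, entity, attribute, value), only ever appended

    def assert_fact(self, entity, attribute, value) -> int:
        tx = len(self._log) + 1
        self._log.append((tx, entity, attribute, value))
        return tx

    def value(self, entity, attribute, as_of=None):
        """Latest value, or the value as of an earlier transaction number."""
        result = None
        for tx, e, a, v in self._log:
            if as_of is not None and tx > as_of:
                break
            if e == entity and a == attribute:
                result = v
        return result


db = AppendOnlyFacts()
t1 = db.assert_fact("shop-42", "terms", "v1")
t2 = db.assert_fact("shop-42", "terms", "v2")
current_terms = db.value("shop-42", "terms")
terms_at_t1 = db.value("shop-42", "terms", as_of=t1)
```

Because the log only grows, a question like "what was true at the time?" remains answerable after later updates.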
Gene Kim (01:23:51):
So I've been dazzled, every interaction that we have. I learned so much and my eyes are all teary from this laughing so hard, but there's something that is bothering me. You talked about the architect's elevator, about going from the boardroom to the boiler room, and you just gave me a new example of the boiler room of like this immutable string service. So it seems like it could be almost trivialized as like all about the bits and bytes, why does Mike Nygard care about strings? And yet it's definitely a board level issue, right? About how do you know what terms and conditions someone actually signed up for four years ago? That's in that class action lawsuit.
Gene Kim (01:24:29):
So on a scale of one to 10, to what extent do you think the most senior leaders are armed with the knowledge of structures that is required to win in the marketplace in terms of the supporting architecture, how to organize teams, how to create these enabling services that could be easily laughed? On a scale of one to 10, one is no concern, every leader... The top leadership knows is properly supported. 10 is grave existential concern about in most organizations, leadership is not armed with that level of knowledge or sensibilities.
Michael Nygard (01:25:11):
I'm probably at a nine. I think that there are exceedingly rare companies where executive leadership and board level leadership has an understanding of these issues. I've been fortunate to work in some companies where it was true and I've certainly seen the effects in companies where it was not true. The idea that managing a large enterprise is all about looking at the balance sheet and optimizing your labor cost and outsourcing non-core, et cetera, I think it's a harmful idea. Because you don't have that profound understanding of the system that Deming said was necessary to make changes. If you view your company as a system, you need to profoundly understand it. That understanding is hard to achieve, it's time-consuming, it rarely comes from outside the company. I think it's a combination of perhaps luck when it occurs or it's a combination of... Or it's exceptionally good recruiting and team building by the very top executives.
Gene Kim (01:26:26):
Mike, this is so fun, and I feel like it's really important. I mean, the things you said, I think, are not well understood. And when I say not well understood, certainly not well understood by me. I think they are really important and everyone needs to understand them better. I mean, with just a tremendous amount of gratitude, thanks for your time.
Michael Nygard (01:26:45):
I enjoy talking to you enormously. It's fun every time.
Gene Kim (01:26:54):
Wow. That was such a cool interview and one of the most challenging to fully process, comprehend and explain, but I think the topics that Mike covered in this interview and the last one are so important for any organization aspiring to win in the age of software and data. I'm so grateful that I got to learn so many of these sensibilities we talked about today by watching as many of Rich Hickey's talks that I could find. And by programming in his language Clojure, which forces you to program according to his sensibilities. If any of these things interest you and you love programming, I recommend you try Clojure out. In the show notes, I'll include a link to my blog post, My Love Letter To Clojure, which has a list of my favorite aha moments and how it's reintroduced the joy of coding back into my life.
Gene Kim (01:27:44):
In the next episode of the Idealcast, I'll be interviewing David Silverman, CEO and founder of CrossLead and co-author of the amazing book Team of Teams, which has been a topic of conversation in every episode we've done. I'll also have on Jessica Reif, who is director of research and development for CrossLead, where she leads their education efforts, which have been delivered to over 20,000 leaders. I'm so delighted that, like so many of us, Jessica Reif comes from a software background. This is an amazing interview where we learn about the story behind Team of Teams and the lessons that leaders must learn from it. See you then.
In this episode of the Idealcast, I'm so delighted to have on David Silverman and Jessica Reif.
Gene Kim (00:00:26):
David Silverman is a coauthor of Team of Teams and is founder and CEO of CrossLead. Dave spent 13 years as a US Navy SEAL, leaving as a Lieutenant Commander, having received three Bronze Stars and other commendations. Later, Dave worked with his colleagues to codify and write down some of their amazing lessons learned for future leaders. As part of that journey, Dave co-founded McChrystal Group, where he served as their CEO for five years. I am such a huge fan of this book, which described how in 2004, the Joint Special Forces Task Force in Iraq was failing to achieve their mission to dismantle Al-Qaeda in Iraq and what they did about it. Their work led to not only them achieving the strategic objectives, but it also led to a deep and critical rethinking of almost everything across all US military services and in commercial industry as well.
Gene Kim (00:01:20):
Also with me is Jessica Reif, the director of research and development for CrossLead, where she continues her work researching and codifying practices into the CrossLead management framework. She currently leads their education efforts, which have been delivered to over 20,000 leaders. I am so delighted that like so many of us, she comes from a software background. She was previously product delivery manager for applied machine learning and engineering teams at Oracle Data Cloud, where she had to solve the team of team challenges in a software development and delivery context.
Gene Kim (00:01:54):
In this episode, I was so honored to be able to learn more about the philosophies and thinking that went into Team of Teams, one of my favorite books I've read in the last decade. I learned about the truly breathtaking scale of the organization and management required to support hundreds of thousands of personnel involved in these operations and how it impeded the achievement of their portions of the mission. I learned about just how dramatic the changes were and the transformation described in the book. It is utterly amazing to hear about how and why it worked and being able to piece together the structure and dynamics both before and after the transformation.
Gene Kim (00:02:35):
It talks about the leadership characteristics that are needed in this new way of working. I learned more about the famous ops intelligence update call, the famous 90-minute call that happened daily involving 3,000 people around the globe 365 days a year, and what was required to increase the tempo of operations by over two orders of magnitude, enabled by breaking some pretty amazing constraints. And as a side note, it is amazing to hear Dave describe the events that I read about in the book, often with a sense of wonder even 10 years later.
Gene Kim (00:03:08):
This is part one of a two-part interview. You may notice that I asked Dave a lot of questions in the beginning to help set the stage and educate us on what the context of the book was. You will hear a lot more from Jessica at the end and in part two of this interview.
Gene Kim (00:03:27):
Dave and Jessica, I'm so happy that you're both here. You both know how much I admire the Team of Teams book. It's been a topic of discussion in almost every one of these podcasts, but also within the broader DevOps enterprise community as well. So could you both describe yourselves in your own words and tell us about your involvement in the development of this amazing book?
David Silverman (00:03:48):
I'm super excited to be talking to your listeners today about this journey. A little bit of background on me. So I grew up a military brat really moving around because my father was in the United States Navy as an aviator. He had served in Vietnam and we moved around a lot as kids. And eventually I found myself at the Naval Academy trying to continue that legacy of service. And I really wanted to be a part of a high-performing team because growing up, that's always where my passion was, was in sports, specifically, water sports. And so, I was fortunate enough to get selected and picked to become a Navy SEAL. And I graduated at this sort of interesting time in recent history of the United States, which is, we've been in a prolonged period of peace and prosperity relative to the military services in the late '90s and early 2000s.
David Silverman (00:04:43):
And then after my first deployment, we came back and everything changed. 9/11 had happened and for the next, well still going on today, really, but for me the next 10 years, it was effectively just a non-stop series of operational deployments around the world, trying to combat this threat of terrorism that had been manifesting for a while beforehand. And so during that journey, me and a lot of peers, you kind of went through this crucible of being very junior officers in the military, and then sort of being baptized and brought up in this new conflict, had to go through rapid change and iteration. And we felt like that experience was pretty transformational, but two, our assumptions were that it wasn't unique to the military, that some of the lessons that we were learning on the battlefield were going to be applicable to broader society and industry at large. And so a group of us got together, started a company to try to translate those experiences into services. And then ultimately try to codify that experience in the book Team of Teams.
Gene Kim (00:05:50):
Awesome. And Jessica, how did you get involved in this amazing effort?
Jessica Reif (00:05:56):
Gene, thanks so much for having us. I'm a huge fan of the Idealcast and really excited to be here today as well. So I've been working with CrossLead for the better part of seven years in one capacity or another. Like you said, I'm part of the research and development org now. I wasn't directly involved with the writing of the book Team of Teams, and it was actually well underway by the time I met Dave, but after the book was published, there has been a lot of interest from enterprises that really understood that they could benefit from applying our firm's methodology CrossLead, which is a set of practices that reinforces the concepts of common purpose, shared consciousness, empowerment, and trust within organizations. So this is really the area that I'm really passionate about and my primary role has been to develop research-based training programs and practices that help teams work together more effectively in complex environments, particularly when they have to continuously adapt to change.
Gene Kim (00:06:54):
This is so great. I've been waiting so long to interview both of you. So Dave, you talked about your desire to go back and start writing things down. Could you talk about what the motivation for that was? If I understand correctly, you were trying to teach other junior officers in the US Navy. What were the goals of that? Who were you trying to teach? What did you teach them and why do you think those teachings were missing?
David Silverman (00:07:20):
Yeah. So, let me go back a little bit. So to frame what's going on here, it's 2001, I get back from deployment in August from a tour of the Pacific where predominantly our mission was to engage in training and mutual bilateral relationships with allies to help them as they sort of advance or professionalize their respective militaries. We come home, I go out to the East Coast with my at the time girlfriend to meet her family and friends, and then fly back on September 10th of 2001 on the flight that the next day crashed into the South Tower of the World Trade Center. And so I get a frantic phone call at 6:00 AM in the morning from her father who believes we're on that flight, panics saying, "Are you guys okay? Where are you guys?" And that was the start of a pretty crazy time.
David Silverman (00:08:14):
So fast forward, the invasion of Iraq has occurred, I've come back from being a part of that experience. And it's now about six or seven months later, I'm sitting in a friend's living room, my best friend, my roommate from college, and one of his mentors, a Naval Academy graduate as well, who is a Marine, and this guy's name was Doug Zembiec. And he asked this profound question in this room and to understand that scenario, when you train to go to war, you have this preconceived notion of what it's going to be like. And so really what this conversation was trying to sort of unpack and unwind what the actual experience was, and we'd had three very different experiences. In the case of Doug, he had just come out of the First Battle of Fallujah, where his unit had taken 70% casualties.
David Silverman (00:09:03):
And it was obviously a very intense and life-changing experience for him. And he made this statement, he said, "Look, every generation in the United States history has served during times of need. And they go overseas, they do things, they learn, and they bring those experiences back to society to try to improve it. I believe fundamentally that this war is going to be different than other wars. I think it's going to be a prolonged fight. I think the percentage of the population that's going to be engaged in it actively is going to be very small because I don't anticipate there being a draft. So it's going to be incumbent upon people like us to capture those experiences and translate them back into society to make it better." I was 25 at the time. I was like that's heavy, heavy stuff in the conversation, but that was the initial seed for what would later become CrossLead and the book Team of Teams.
David Silverman (00:09:55):
And so fast forward, it's now four or five years later, we've now done multiple deployments to various combat zones around the world, all of us. And I get a phone call as I'm walking through the United States Senate offices, where I'm told that Doug Zembiec was just killed in Iraq. And here was a good friend of mine, a guy that was in my wedding. Obviously he was a larger than life figure that had a profound impact on a lot of people, but we came out of that experience. I would go back to Afghanistan and when I came home-
Gene Kim (00:10:26):
What year is that at this point?
David Silverman (00:10:28):
This is now 2009, 2010 timeframe when we got back from Afghanistan. Doug would die in 2007. And we said, "Hey, look, let's take this thesis that Doug had and see if we can live up to this legacy of service and try to give back in some meaningful way." And the experience that we had overseas during that ten-year period, and to be fair, it's still going on today, was pretty transformational. We came from this legacy training and development program in the SEAL teams, which was I would say, tied to industrial warfare best practices, and what we had transformed into was something much more akin to agile leadership and management best practices. We didn't call it those terms because to be honest with you, I didn't even know what agile meant, but what it meant was this idea that you are part of a larger task force that was tightly coupled with a lot of interdependencies.
David Silverman (00:11:26):
And we had to rapidly disseminate and learn effectively in real time so that then we can apply those lessons locally to make faster, better decisions to combat our threat. And the enemy at the time was leveraging significant advancements in technology for them to disseminate learnings, while we were still initially passing up key lessons through a very hierarchal, bureaucratic information flow. And so we were at a significant disadvantage and that disadvantage is where on an aggregate level, we found ourselves losing. And so when we made this pivot to operating much more analogous to what I would say, agile principles and values, all of a sudden we rapidly increased our rate of learning, which then allowed us to bring to bear all of our competitive advantages associated with technology, talent, and training to basically combat and sort of suppress this immediate threat of a violent, radical extremism that was manifesting in the areas that we were operating in.
David Silverman (00:12:22):
So our thesis was, this is not unique to the military. The larger change that was happening in the environment was this advancement of how people fundamentally communicate and collaborate and learn that was being enabled by mobile and social technology and media. And if you could get organizations to think fundamentally differently about that, they could also have decisive effects on how fast they could adapt their operating practices to be competitive in an environment that was increasingly more complex or changing very rapidly.
Gene Kim (00:12:57):
This is amazing. And I have goosebumps for so many reasons. If I understand correctly, one of your immediate goals was to teach fellow junior officers. Can you talk about what you taught and who you were teaching and maybe how receptive they were to being taught?
David Silverman (00:13:14):
Well, I would say while we were part of this task force the immediate goal was to disseminate learnings across the mid-level ranks of the organization as quickly as possible, so that they could apply those lessons locally. So in that sense, yes, we were trying to teach the senior NCOs, non-commissioned officers, and the frontline managers and leaders what was happening and how they can apply those locally at the same pace that they were changing. When we came and got out, our initial thesis was we were going to focus this predominantly on, one, validating that this hypothesis was correct.
David Silverman (00:13:46):
And our goal was to try to do that in industry because our thesis was industry had quantifiable metrics that would judge operational performance that we could attach this framework to and we could demonstrate results. And if we could demonstrate results, then, theoretically, we could take it back and try to address larger social opportunities and issues like COVID-19, which I think is a perfect example of how the globe has been mobilized and should be thinking of itself as a team of teams, and how they're trying to rapidly learn from each other on best therapeutics or best treatments or best vaccines that could then be rapidly disseminated to hopefully save lives and restore the economy back to safety again.
Gene Kim (00:14:27):
Oh, right on. And by the way, I believe so wholeheartedly in your thesis. So that initial teaching was really for the team of teams in the context of Iraq and later Afghanistan, the people who were embedded in these missions that were mutually dependent to achieve the goals. Is that correct?
David Silverman (00:14:44):
Yeah. So, to answer that more broadly, the original task force had a global mandate, but it was concentrated predominantly in those two areas that you've talked about, Iraq and Afghanistan. But very quickly, what we realized was that those two theaters of operation were highly connected, for us, our allies, and also our adversary, with the rest of the geography. So in this case, the Maghreb and the Levant and other parts of the Arabian peninsula were feeding funding, talent, and resources into those AOs. And so, if you're trying to go upstream from the problem and try to address it in multiple different areas, you needed to have a broader coalition that was mobilized around the problem set. And so what initially we thought was decoupled, like, well, what happens in Algeria is really irrelevant to what's happening on a daily basis in Baghdad, became actually highly connected. And you started to realize that maybe there was a [madrasa 00:15:45] that was radicalizing and then sending, through what we called rat lines, talent and money all the way, that was then ending up on the battlefield in Baghdad.
David Silverman (00:15:54):
So, how do I get the people that are trying to solve that problem in Algeria on board and understanding the problem that we're seeing here, so that they can take that in context and figure out how to apply it locally? So, that was the intent of the original process changes that we made, to say, the solution here is part of a larger network of organizations and countries that need to be mobilized around this problem set if we're going to solve it. No one person has the piece of the puzzle, but if we can put all the pieces on the thing, we could probably start figuring out what this thing actually looks like. And then once we're aligned on what it looks like, then we can start to operate much faster and more effectively locally to solve the problem.
Gene Kim (00:16:36):
This is so terrific. So, one of the things that I've been so looking forward to is just really trying to understand the Team of Teams story through the lens of structure and dynamics. And so, you may or may not know I've been spending this whole year, really trying to view the world through how Steve Spear views the world through three simple things, the notion of dominant architecture, in other words, there's a way of doing things that is very dependable, but very resistant to change.
Gene Kim (00:17:03):
Then any system of work is really made up of two things, structure and dynamics. So structure is the way that we organize teams and the ways that the teams interface with each other, so it's most obviously manifested through the org chart, but also through things that maybe aren't visible, which is the way that teams are allowed or incentivized to talk to each other or not.
Gene Kim (00:17:25):
And then there's dynamics, which is almost wholly a function of structure. So the dynamics are about feedback, how quickly we get feedback. It's about the ways that signals are either amplified or extinguished. So when we have a culture that people are afraid to tell bad news, important signals might get lost entirely. Whereas if we are trying to elevate weak signals, signals can be spread throughout the system far faster than it could through any sort of centralized command and control system.
Gene Kim (00:17:53):
Can you tell us about how was your Iraq experience relative to General Stanley McChrystal? In the book Team of Teams, McChrystal said he wanted a different approach. What were the visible things that he did and enacted? What were the resulting challenges that he tasked his leadership? And how does that cascade down to you? And just, I'm dying to hear also your reflections as an outside observer in terms of how you interpret this.
David Silverman (00:18:17):
Yeah, it's a big question. And there's probably not going to be an answer that satisfies it completely, but maybe to give some context here, if you think of bureaucratic structures or charts, I think that the US military sort of mastered how to make that as painful and as rigid as possible. And some of that's out of necessity because of the complexity of the operations and the pure scope of how they're trying to perform. So if you go back to something like an invasion of Iraq, you have a geographic combatant commander who's responsible for a geography, that is a four-star, either general or admiral. And they have a mandate to basically operationally manage, control, and fight forces that are deployed to it from, in this case, the joint staff, which is the higher headquarters at the Pentagon, which takes assets from the different services and says, these assets are yours to use within these constraints for this period of time.
David Silverman (00:19:11):
And now when a war happens, there's a further sub-setting that takes place, meaning there's now, in this case, a four-star commander put in charge of the operation in Iraq, and he or she has unique authorities and permissions associated with that theater. And then inside of it, you'll have a combination of conventional forces, coalition forces, special operations forces, and then a whole host of other government organizations. It could be intelligence apparatus, it could be non-government organizations, you name it. So, the hardest thing when you first get on the ground, especially if you're a relatively junior officer, like myself, was just figuring out where you fit in. You're like, "I don't even know. Can somebody show me an org chart on where I fit in? So I understand what are my operating sort of boundaries and permissions and who do I need to keep informed?"
David Silverman (00:19:56):
And it's highly matrixed and very complex. And I would say, so we spent a lot of time just trying to sort through all that. And if you're relatively junior, you almost need like an experienced PhD to figure out how to map the whole thing out in complex. So when we got there, this wasn't unique, and traditionally what happens is if it's predominately a military operation, there's a military leader who has a civilian counterpart and the two work in parallel to try to prosecute the goals and objectives of the coalition and, in this case, the United States. And special operations would be one of the tools that they're given. Now in the special operations community, we had different types of special operations units. We had sort of high-end counter-terrorism strategic assets, which in this case were being led and managed by the Joint Special Operations Command, of which General McChrystal was the leader for more than five years.
David Silverman (00:20:48):
And then you had conventional special operations forces, which are usually designed to train, advise, and equip and help fight alongside coalition assets, and everything in between, to be honest with you. I mean, even to try to explain the special operations community, it's a pretty diverse group. So what we realized pretty quickly was you have all these individual tribes, per se, operating in this common battle space, and somewhere, somehow it's loosely connected, but for the most part, their cultures, their processes, their decision-making rights were all pretty siloed off, pretty isolated from each other. And what we realized was that this wasn't going to work, we couldn't piecemeal a solution to this level of complexity. It needed to be something that could be much more organic in sort of how it operated and grew.
David Silverman (00:21:35):
And so very quickly, when you get too close to a problem, you sort of naturally figure this out. So if you put a junior person from four different organizations in the same physical location, and you give them a similar mandate to say, "Provide security to the area," they start to collab, hopefully, collaborate, learn, trust each other through relationships. And informally, start to operate more like a team of teams. The challenge is, as you go up the bureaucracy, those relationships are much more, I would say that muscle memory for how they operate is much more rigid, and so it becomes much harder to do that. And so part of the goal of the task force was to break down that mid-level and even senior level friction that was preventing us from learning and operating effectively as a group. And so that was the big change.
David Silverman (00:22:27):
And even for someone like myself, who by 2005/6 was managing a task force in Baghdad of both Iraqis and SEALs and other special operations components from the military, we had it attached into other existing mechanisms, so in this case, the special mission units that McChrystal was in charge of, as well as the conventional mission units that the other special operations commander had, as well as the battle space of the respective conventional force that owned it, and we had a mandate to operate all around the country. So we would find ourselves going into different two-star commands. So I could be working for an army general one day and I could be working for a British general the next day. You could be working for somebody else the next day. And so being able to seamlessly move in and out and keep things deconflicted and operating effectively so that we're all trying to achieve the same end state was sort of the goal. The objective of the larger apparatus was just basically to remove those traditional roadblocks or friction points that existed and were inhibiting productivity against the common mission set.
Gene Kim (00:23:30):
Gene here. I want to pause here for two reasons. One, I want to just marvel at the vast complexity of the organization that Dave just described. Hundreds of thousands of people under a four-star combatant commander overseeing forces from all branches of the US military services plus coalition forces, plus civilian leadership. I'm reminded of a CTO summit that I had the privilege of attending that was held shortly after 9/11, by DISA, the Defense Information Systems Agency and specifically, the DISA CTO at the time, Dawn Meyerriecks. It was such an impressive group, but one of the things I remember most was an org chart that Ms. Meyerriecks showed. She said something like "You're from the commercial sector, and you think you understand complex org structures." And then she showed a pretty typical org chart that we're all familiar with. And then she showed an org chart that was easily an order of magnitude more complex describing so much of what Dave just described, but also included the intelligence agencies, not just from the US, but coalition forces, coalition partners, and so much more. It was really remarkable to me at the time.
Gene Kim (00:24:38):
And hearing Dave tell the stories about how he would be reporting to so many different superiors in such a compressed time period shows just how fluidly people like Dave had to move around the organization. And secondly, I think all this is so great because it describes the conditions in which the dominant architecture may not be able to solve the mission at hand. When the problem space is so vast that no one group can see all the pieces of the puzzle. When information must be shared faster than what the official communication channels allow because there are no official channels between the components of the systems that actually need to talk to each other. When local learnings need to spread much more quickly and broadly than what current programs allow. And all of this is made urgent and important because the adversary is learning and acting far quicker than you. All right back to the interview.
David Silverman (00:25:30):
So that was the structure. Now we couldn't formally do a lot of this stuff. So if you want to say, "Hey, when I combine this unit and I want to create a new unit," we didn't really have that ability to do that. So really this was much more on the second thing you were talking about, the informal way that we operated. So it was more about establish a series of values and principles that would govern how we collaborate and communicate. And in this case, we try to demonstrate trust by giving more information, quality information into a system that otherwise we wouldn't have done, which then enabled other people to kind of get some comfort. And sharing, connecting actually started driving, strategically and operationally, where we were going much faster.
David Silverman (00:26:13):
And so, the big aha for me was, growing up in a pretty tribal culture, it used to be that knowledge was power. If I knew something before my competition did, and I don't mean my adversary, I mean literally the army guys, I had an advantage maybe getting the mission or getting the operation or whatever else. Well, we quickly realized, and you can extrapolate that across like the CIA or State Department or whoever else, in this world, we said, "Hey, well look. That's really ineffective if we're going to be successful as an enterprise, because ... everybody has different pieces of the puzzle." Everybody has different authorities that allow us to operate. Everybody has different resources and tools and toys to play with. If we can figure out how to put our [inaudible 00:26:52] aside and just focus on the outcome, more a mission-based team, then we could potentially be much more effective as an enterprise. Get out of our own way for [inaudible 00:27:03].
David Silverman (00:27:03):
[inaudible 00:27:00] So we really focused on those processes, those leadership principles and values that are pretty analogous to agile. It allows teams to rapidly emerge and work outside of what I would say is the typical organizational chart framework. And that's when you start to get, to me, massive speed and productivity increases. Anytime you found yourself trying to justify yourself inside of a bureaucratic framework, it was just friction that was going to slow you down, right? You said, well, this boss doesn't want you doing it because he's in charge of that person.
David Silverman (00:27:29):
You're [inaudible 00:00:29]. Ultimately, I don't really care who's in charge. I'll give you guys all the credit. I just want to be able to go do this. And as a mid-level manager at that time, who had, I would say, highly competent, qualified operators that I worked with, they just want to go do the job every night. And so your goal as the leader was really just to make sure there was enough stuff to do to keep them focused, because if there wasn't, then, there's nothing worse than a bored Navy SEAL. They tend to find trouble, so it's sort of inherent to their nature. So there was a self-preservation aspect for myself as well in this [inaudible 00:01:13].
Gene Kim (00:28:08):
Awesome. And by the way, just to maybe expose some of my thinking, I genuinely believe that it's no surprise that the people whom your story ... the story resonates most with are software people. Because I think there is something very similar in the experience. So one question before I ask, just to reflect on this, is, if you were to compare the before and after McChrystal era, what were the top differences that actually made a meaningful impact on what you were trying to achieve?
David Silverman (00:28:42):
To me, the biggest difference was how it felt. And so if you step back and you just went in and looked at it, what it felt like is that everybody was part of this larger team, regardless of where you came from before. You now self-identified with being part of this task force. You could go to an embassy in London and you could hear an analyst talking about what they're doing, how they're contributing and how they're part of this larger task force, because the pace and the tempo and the way that people were now connected on a real-time basis allowed everybody to feel like they were part of something bigger than themselves. And some of those parochial legacy identification markers, they were still there for sure, but you almost identified with this new transformational group of people in it.
David Silverman (00:29:28):
And it was really magical to be a part of it. And what was also interesting is it was so diverse. It wasn't just a traditional Special Mission Unit Tier One people.
David Silverman (00:29:39):
It was also all of the layers of other apparatus that come around that, and potentially it started to expand, right? So what may start off as primarily just a handful of what you would call highly specialized units, all of a sudden started to grow to include conventional units and they control [inaudible 00:29:56] because the basic thesis was, there is no person who has the full answer to the problem. And we need to be humble enough to look for capability or perspective that we lack in order to basically solve this problem. And the problem was so dynamic and changing that it was critical to do that. Let me give you an example, so like financing was coming through banking systems. So Treasury, who in my upbringing I would never have had anything to do with, is now heavily engaged in [inaudible 00:30:26] and negotiating with foreign governments and trying to slow down the illicit movement of funds between groups that were supplying our enemy with materials that were then having devastating effects on us. So, it just shows you how wide this thing became.
Gene Kim (00:30:41):
And what were the visible, if you were to study kind of the behaviors of upper leadership in that transformation, what were the things that you found most helpful to help create that condition that you called magic?
David Silverman (00:30:52):
To me, it was sort of, first, a recognition that they don't have the answer, right? There is no silver bullet. There is no strategic move. It was less about that. It was sort of having the vulnerability to say, I don't know, and I need your guys' help to figure it out. And if everybody starts with that premise, then all of a sudden you [inaudible 00:31:12] you worry a lot less about where you fit into the system and you just start trying to figure out, how do we solve the problem most effectively? And that became sort of transformational. So, the leaders that were most effective in this environment, that we found after reverse engineering and studying this, were people that took the approach of, we call them Gardeners, people that manage an ecosystem where the main purpose was less about making decisions. It was more about removing obstacles and barriers to productivity and effectiveness.
David Silverman (00:31:39):
And those leaders were almost... they were naturally more like coaches than they were bosses, because they were sitting there saying, well, I don't know, but have you thought about this? Have you thought about that? Or, hey, I heard something similar from over here. Why don't I connect you two and you guys start talking about it. So they were really just trying to make sure the ecosystem, the network, was actually working. You know, we used to have this saying that in order to defeat a networked organization like Al-Qaeda, we had to become a network ourselves. And within [inaudible 00:32:07] was that: check your ego and your past history aside and focus on the task at hand and try to find insight wherever it comes from.
Gene Kim (00:32:16):
And if I had to push you for two other sort of helpful behaviors in the before versus after, can you give me maybe two more?
David Silverman (00:32:24):
Yeah, sure. So I think being self-aware is probably the most important skill set that we see in leaders today. It's the idea that if you're trying to move or affect a system of any type, understanding the energy, both catabolic and constructive, that you're putting into that scenario and how that's affecting others is really important. And then the other one was, there's so many, but the other, probably two where we were connecting, which to me was like, you have to inspire people towards a common goal and objective and give them kind of a consistent path to kind of say, all right, they chart themselves to. And then the other part of that connecting is actually demonstrating empathy. The ability to walk in somebody else's shoes and understand their perspective, because when you're trying to solve a challenge and people are coming at it from their own biases or backgrounds, the ability to understand that would help speed up the process by which you could actually get to constructive solutioning, vice being fixated on whatever pain or preconceived bias you had coming into the situation.
David Silverman (00:33:25):
And then the last one was just discipline, right? You have to have consistent habits and patterns so that the organization can not spend a bunch of its cognitive load, trying to figure out how to show up or what to do. Because if you have those three things working in parallel, then you start to de-risk the environment for constructive criticism and feedback that allows for rapid learning that's necessary.
Gene Kim (00:33:47):
I want to pause for a moment to compare and contrast some of the leadership characteristics that Dave and Jessica just talked about. Both so far and later in the presentation, as well as in Dave's DevOps enterprise presentation, that we'll be playing for you in the next episode, they talked about as key leadership skills, functional excellence, ability to connect, self-awareness, discipline, decision-making, effective communications and continually learning. I'm amazed at the overlap between this and the transformational leadership characteristics that we found in the 2017 state of DevOps research. Specifically, in that year, we asked every respondent 15 questions, among five domains around transformational leadership. Vision, to what extent does a leader understand the grandest goals of the organization? And to what extent can they get in front of it, not just to be relevant, but to help with the achievement of the most important goals. Intellectual stimulation, to what extent can the leader challenge basic assumptions of how we do work?
Gene Kim (00:34:50):
In other words, just because it was great 20 years ago, doesn't mean that we need to be doing it today. Inspirational communication, to what extent can the leader overcome fears, generate excitement, create coalitions required to overthrow powerful ancient orders, supportive leadership and personal recognition. What we found in the 2017 research is that the bottom third of organizations with the least amount of these characteristics were only one half as likely to be high performers. Dave also just mentioned, you need to create an environment where constructive criticism and feedback can enable the rapid learning that's necessary.
Gene Kim (00:35:31):
This just reminds me of the Westrum organizational typology model from Dr. Ron Westrum, which also shows up in the State of DevOps research. Specifically, Dr. Ron Westrum studied healthcare organizations. And what he found in 2004 was that those organizations with the worst patient outcomes had these characteristics. Information was hidden, messengers of bad news were shot, bridging between teams was discouraged, failures were covered up and new ideas were crushed. Whereas in the highest performing organizations, those organizations with the best patient outcomes, information was actively sought, messengers were trained to tell bad news, responsibilities were shared, bridging between teams was rewarded, failure caused a genuine sense of inquiry and new ideas were welcomed.
Gene Kim (00:36:19):
In a conversation, Jeffrey Fredrick, co-author of the book Agile Conversations, noted that the Westrum model is really about how information flows in an organization. Is information suppressed or extinguished, or is information encouraged to flow? And of course, this brings up the notion of psychological safety. At Google, in Project Aristotle and Project Oxygen, there was a multi-year study trying to understand what made great teams great. And the top factor was always psychological safety, as measured by to what extent people on the team feel safe to take risks, to say what they really think without the risk of feeling insecure, embarrassed, ridiculed, or even being punished. And that factor was higher than dependability, structure and clarity, meaning of work or impact of work.
Gene Kim (00:37:08):
I loved revisiting this work when researching The Unicorn Project. One of my favorite treatments of this was written by Charles Duhigg in the New York Times Magazine article called "What Google Learned From Its Quest to Build the Perfect Team". I'll put a link to it in the show notes. All right, back to the interview. This is so great. Jess, I've just made the claim that I think the software community, perhaps more than most that I'm familiar with, is especially receptive to observations and lessons like this. Can you just talk about that in terms of your own background and maybe your reflections on the changes required at various levels of leadership?
Jessica Reif (00:37:47):
Absolutely. I would say that, I guess just stepping back for a moment, Team of Teams was a really powerful story, but it was largely an inductive analysis of one single massive culture transformation. And sort of what we realized in the process of developing CrossLead is that there's so much applicability and similarity between the way that the special operations forces were operating and the way that agile developers operate. Some of the similarities are especially eerie. So for example, the concept of AARs in the Navy SEALs very closely mirrors the concept of retrospectives on agile software development teams. So, in general, there's sort of this overlap in that we have a collection of cohesive team units that are operating together, but have dependencies on one another that are required to succeed.
Jessica Reif (00:38:42):
So I think that part of the reason why Team of Teams resonates so much with this community is that there's an acknowledgement of the pain and the suffering that's really associated with all of those dependencies. So, Gene, I was listening to your recent episode of The Idealcast, where you interviewed Elisabeth Hendrickson. And she was referring to a situation where the architecture had grown so complicated that no single member of the team could hold it in their mind. And I think that is certainly an example that Team of Teams really draws out: the entire nature of the conflict was so complicated, so complex, that no single individual or team was capable of understanding it. And as soon as you acknowledge that fact, you've done something right for the teams.
Jessica Reif (00:39:31):
You've created a system where they really need to ... where they acknowledged the need to continuously coordinate with one another, learn from one another and understand how an action here is going to have impacts there, and understand how dependencies across teams are going to manifest and better understand the consequences of their action. So I really think it's that self-organizing nature that resonates so much.
Gene Kim (00:39:58):
So if I were to concretize some of the things I've heard, here's what I'm hearing. I'm going to try to use the language of structure and dynamics. It sounds like one thing that is definitely clear is that the structure of the organizations didn't change drastically. Navy SEALs still reported through the Secretary of the Navy, Army Rangers still reported through the Secretary of the Army. There might have been some changes at the top leadership level, but in general, the configuration of the forces at a macro level did not change. And yet, what I've heard was that there were certain behaviors at the very top that helped enable that sense of magic, and that began with a sense of, I don't have all the answers. There is no strategic move that can take us there. And that invited a different dynamic of working and helped accelerate this very fluid, dynamic, informal network that you assembled.
David Silverman (00:40:50):
What I would say is that there were definitely some structural changes that were taking place behind the scenes. Right, and I would say those were operating on three or five year cycles, from planning to execution to finishing, much like you would think about a reorg in a business. I don't think they really mattered though, fundamentally, to the operations that were happening inside of a deployment cycle. And those cycles could be three to four months, or they could be longer depending on what team you're with and what cycle you're operating on. So, there was kind of both going on. We personally spent ... I spent time actually looking at both. When I was overseas, what I really cared about was getting the job done and being effective. And then when I came back, they said, hey, let's take a broader look at the organization and how we can think about officer career progression or enlisted career progression and structure that's going to enable us to fight in the future.
David Silverman (00:41:39):
But those were part of five-year study, procurement, and decision-making cycles, associated with the Quadrennial Defense Reviews that the military apparatus operates under. But that was going to have zero relevance if all of a sudden the enemy was doing this tonight, and all of a sudden they're doing this tomorrow. And how do I reform and adapt to be able to solve this following here? We needed [inaudible 00:41:59] ... it was going to be woefully inadequate. And so when I think about increasingly complex environments that are changing, almost as soon as the big structure change is finished, it's already much less relevant than the initial assumptions that went into making it. And so what I see working with companies over the last 10 years or so is, they go through these organizational structure changes because they're trying to find some optimal model for how they're going to be line block chart organized.
David Silverman (00:42:25):
And I'm not saying there's not value in that. I'm just saying increasingly there's less value in that. Meaning the organizations that I see that are naturally unhealthy are ones that are constantly reorging, because it creates such fatigue in the enterprise and distracts you from doing what you need to do, which is actually focusing on producing quantifiable outcomes rapidly based on changing conditions. And so for me, we spend a lot more time crossly thinking about how you work first versus how you're structured. And our assumption is, from how you work, you'll emerge to the appropriate structure. And if that structure is relevant for a week, great; if it's relevant for six months, better, I guess. But my experience is it's going to change. That's the only thing I can say definitively: it will change.
Jessica Reif (00:43:08):
Yeah. And to add onto that too, it really goes back to the analogy that you referenced earlier between the Chess-Master and the Gardener. The Chess-Master says, I can look at the pieces on the board, I can optimize them in a perfect way for the situation that we're in right now. And that to me is the leader or executive team who says we can solve this problem with a reorg. We can move the pieces around on the board and they will be arranged in the optimal fighting pose for the situation we face today. Whereas, the Gardener's approach is, I don't know exactly what I'm going to need six months from now. However, I know how all these parts are going to have to work together to achieve any particular goal that they set their minds on. And instead of focusing on the optimal placement of pieces, they focus on the optimal interaction between groups, which sets the conditions so that they can accomplish anything that they set their minds to.
Gene Kim (00:44:01):
This is so great. And to concretize this even further: what I'm hearing is that the successful panels relied less on that kind of reconfiguration of the chess board, but another part of structure is what are the allowed interactions between the pieces. I just want to confirm, it sounds like before, people were not motivated to let Army speak to Navy Union, speak to the Intelligence Agencies. The changes that did occur in that era certainly made those interfaces between those teams not only allowed, but encouraged. You did a ton of things to actually reward people who shared information and spanned those boundaries. Is my [inaudible 00:44:40] correct?
David Silverman (00:44:41):
It's amazing. I forget who gave that quote. Maybe it was Mark Twain, [inaudible 00:44:45] or somebody else. But it's amazing how much you get accomplished if you're willing to not take any credit for it. And I think that was kind of the underlying principle for those of us that were overseas a lot, because I will say that one of the other unique aspects of this fight was there was, we call it, the away team. There was a group of people that were fighting very different wars over and over again. Right, so they were gone for the better part of those 10 years. And then there were obviously these systems back home that were maybe operating on different cycles, or they're in and out. And eventually over time, you started to care a lot less about who got credit for stuff.
David Silverman (00:45:22):
What you really cared about was could you accomplish the mission and could you bring your people home safely? And so then you started saying, all right, well, look, I'm happy to let you guys have X, Y, Z, as long as we can get this done. And all of a sudden then people's defenses break down and they start actually thinking about the outcome. It's this idea of ... there's naturally division in the system. And there was [inaudible 00:45:44] friction and the most effective leaders were more humble and took a more empathetic approach and just said, look, it's not about me or even this org. It's really about this outcome and whatever it's going to take to get the best team on the field, with the best resources to accomplish the mission is all that really matters.
David Silverman (00:46:06):
And that ... We didn't start that way. Right, we didn't start that way. And in order to have that credibility, you had to have a lot of success, right? You had to have a lot of continuous success, right? They will say, you know what? We're going to give these guys the benefit of the doubt. One of the interesting things is when I was taking congressional delegations over to see the task force. The commander at the time was ... it was so important to him that this organization was not seen as wasting taxpayer dollars or looking like they had too many creature comforts. So everything was very spartan. And you would walk in and you'd see plywood tables. And you'd have nice technology in there, but you're eating meals ready to eat instead of fancy stuff. And you're not in some palace; everything was what I would say transactional or was temporary, because that is, we're going to move somewhere else, [inaudible 00:20:11].
David Silverman (00:46:56):
And so you'd come over and be like, wow, that's the room you guys sleep in. That's what you're using. What do you guys need? We need more ISR collection platforms. Okay, whatever, give these guys what they want. They're clearly not wasting money. Vice, if you went down to the headquarters, you might be having a meeting in a palace, sitting at a golden linen table that used to be Saddam Hussein's. And even at that, either they're operating in the same mindset, it just looked different. It looked like, wow, okay, these guys look kind of comfortable here. If you went over there, well, these guys are clearly just ... they don't care about the frills. And the impact that had on our credibility, it was pretty profound, it was pretty ingenious by the boss. And you got a lot of benefit of the doubt from delegations, regardless of what party they came from, or their disposition on the war or not. They said, okay, well, these guys are credible, right? Credibility equals freedom of action, freedom of maneuver. And that was critically important to us being successful.
Gene Kim (00:47:51):
Gene here, two things. I love this metaphor of the Chess-Master versus the Gardener. The Chess-Master sets up their position on the board as the all-knowing strategist in the center, who knows all and optimizes the entire system. Whereas, the Gardener focuses more on the interaction between their own pieces and enables the desired dynamics, which allows for far more decentralized decision-making, sense-making and actions, which was required to defeat Al-Qaeda in Iraq, a far smaller, but far nimbler adversary. This reminds me of something that Steve Spear said to me recently. He said often when we're hired, we're hired into a specific role to fulfill a certain set of responsibilities within the static system. In other words, that will be my job, not just for now, but into the future, as well. Contrast this to how people were onboarded into a new role at Toyota. There was no real expectation that they were in a static system.
Gene Kim (00:48:50):
In fact, the higher you rose within Toyota, the more you were explicitly expected to change the system. I think this wonderfully supports the Chess-Master versus Gardener metaphor that Jessica just talked about. And secondly, what Dave just said about credibility equaling freedom of action. This seems like a very important principle. So I'll just underscore it here. And this will come up in my second interview with Dave and Jessica in another episode. Okay, back to the interview. One of the things that I loved about your presentation, Dave, is how you sort of divided up the world into what I think was like three levels, right? Kind of executive leadership, middle managers, and frontline leaders.
Gene Kim (00:49:28):
And something that is starting to, maybe, a hypothesis that's starting to form for me, is that in these types of transformations, really the change that is most challenging is this middle management. Which is super interesting because it alludes to the fact that we need executives to understand that there's a better way of working, but in the technology community, we have this phrase that we use a lot, the "frozen middle". Can you scrutinize that claim that the challenge of team of teams ultimately is really a story about how do you kind of change the middle management?
David Silverman (00:50:04):
I really see four layers, right? I see like strategic level C-suite executives and decision makers. You see people at the bottom that you're calling your "doers", the people that have to get ... they're operationally connected to the outcome on a daily basis. They're just doing the job. Those are your frontline operators and workers. And then you've got these two levels of management that sit between them, right? You've got mid-level managers that are usually directly controlling and leading frontline doers. And then you've got senior management that are probably managing multiple teams of mid-level managers. And my experience is that your friction point is really with the middle to ... the senior and middle level management. And the reason why is sort of intuitive, if you think about the senior C-suite, they're what we call the good idea factory. It's usually their vision, their idea, and to be fair, it's not probably going to affect their lives too much.
David Silverman (00:50:55):
They're not going to really have to do much. I say that tongue in cheek, but it's just that the reality is they've got other things to worry about, vice driving some internal change. And then at the lower level, you're along for the ride, right? You're doing kind of what you are told. You may not like it, you may complain, but ultimately you're going to do what's needed or you're going to opt out. You're going to go somewhere else. So then you've got these two levels of management that really control the actions, and mid-level management, in my opinion, are very practical, rational actors for the most part. And what I mean by that is they're dealing with daily problems that they're seeing manifesting in how their teams are operating, and they're trying to solve those problems directly.
David Silverman (00:51:36):
And if they see something that can help them solve a problem more effectively, or maximize an opportunity more rapidly, they're more or less going to come around because it's in their own best interest to do that. So they tend to be a little more pliable, sorry, flexible, in their mindset on how they sort of adapt to learnings and changes. Right, and usually the goal between those is just to get enough of them connecting and talking to each other so that the learning is not happening in silos. It's just happening across. The challenge, like we talked about structurally before, is with the senior management, in my opinion, because they don't have the same incentive. They're not necessarily dealing with frontline problems. They're seeing problems that are probably more systemic, but potentially, with the problems that exist, they'll see maybe a structure or a decision-making right as something that is personal or validating to their position or authority.
David Silverman (00:52:26):
And so they're reluctant to change because the risk is much greater, right? So either they're trying to get to the next level and they make a big change. It doesn't work. They're going to get penalized for that. At least that's the perception. And/or they're an expert in that domain. And if you change the way of the system, they become less relevant and all of a sudden, that gives them anxiety. So, to me, that's the toughest layer because it doesn't necessarily operate with the same level of solution mindset as the mid level. It doesn't mean it can't do that. I'm just saying, historically, when you see problems in bureaucracies, and you say, well, why are they doing that? When you put yourselves in their shoes, you're saying, oh, it's because this is scary for you.
David Silverman (00:53:07):
This risk is going to undermine your personal identity. And so I've got to figure out how to appease that if we're going to get past this issue. Right, and so when we put processes in place, which was basically agile principles scaled to an enterprise level, we were really trying to create communication and learning mechanisms that could hack those two layers. Right, you could get frontline insights and feedback on what was happening, rapidly disseminated across mid-level managers and not inhibited by some bureaucratic or decision-making bureaucracy that was being managed by the senior management. And then ultimately, to the C-suite, it would start to affect strategy. One of my other big "aha" moments was being in Afghanistan. I was part of the ISAF staff. And we had a team that was solely focused on writing the strategy and the vision.
David Silverman (00:54:03):
And one of these brilliant minds kind of came in and was looking for a break. So he came into my little operations center and was like, "Hey, I have an idea. I was just bouncing around." And what he was trying to do at the time was try to take speeches or guidance that he had gotten from, in this case, the president and the national security council, or the UN secretary general and their council, and figure out how to locally apply those to our strategy in Afghanistan and make sure that we were consistent with authority. And what was so obvious in the conversation was that it was just so out of sync. And what I mean by that was not that we were like doing something different. It was the fact that people in Washington, D.C. could not possibly be expected, because they were so detached from the reality of what was happening on the battlefield, to understand the dynamics of say, how do we integrate with the Taliban?
David Silverman (00:54:49):
What is the best process to do that? They were just so far removed. We almost needed operations on the ground that were dictating what the strategy would be because of the rapid feedback and interaction points we were having with, in this case, the local constituencies. And that needed to, then, go back and make sure it wasn't outside of the values or principles or guard rails that are put in place by higher headquarters. But they were not going to be able to write a plan for, say, recidivism locally. It was not possible, because things were changing so fast. And so, figuring out how to break that down (I don't know that we ever did it, by the way), I think it was a point of frustration in context. But having spent some time in a D.C. think tank and seeing some really smart people, that have not a lot of operational experience, trying to come up with policy back here: for things where you have a large data set, you can do that. I think if you say, well, let's look at Cold War dynamics with rational state actors that we have deep understanding with, you can come up with plans. But for something that was as volatile and rapidly changing as a counter-insurgency, it was, to me, the operations on the ground that were going to start driving what the options were from a strategic level.
Gene Kim (00:55:58):
And by the way, it just reminds me of a story. In London in the 1600s, they were essentially planning the entire Georgian economy, right? (laughing) Thousands of miles away with no knowledge of soil conditions, water levels. (laughing) It sounds like a very similar problem.
David Silverman (00:56:15):
It was not quite that bad, but yes, it was this idea that what we found was that operations started driving intelligence. Not the other way around.
Gene Kim (00:56:24):
Oh, that is very interesting.
Gene Kim (00:56:29):
We are so much looking forward to the DevOps Enterprise Summit: Vegas Virtual, which will now be held on October 13th to the 15th. As always, the goal of the programming committee is to bring you the best experience reports and to out-program all our previous events. And this year we expect to deliver on that promise again. I am so excited about the speaker lineup we have for you, partly because they are among the most senior technology and business leaders that have spoken at this conference, showing you how important the work of this community is. Maya Leibman, the CIO of American Airlines, presented at our annual forum in April, and we were fascinated by the perspectives that she shared with us. I'm so excited that she will be co-presenting with our longtime friend, Ross Clanton, about the American Airlines journey. And since 2014, we've all been dazzled by the CSG journey, as told by Scott Prugh and Erica Morrison.
Gene Kim (00:57:20):
I am so thrilled that this year Scott Prugh will be co-presenting with his boss, Ken Kennedy, executive vice president, and president of CSG, the largest provider of customer care billing and order management in the U.S. Ken and Scott will be sharing their story on the interplay between business and technology leadership and how it resulted in their amazing accomplishments over the years. This is just the beginning. Stay tuned for more exciting announcements about our amazing speaker lineup. This will undoubtedly be the best DevOps Enterprise Summit program we have ever put together. You can find more information at events.itrevolution.com/virtual. Can you react to that notion about senior management? As Dave was talking about this, (laughing) I'm laughing because I think that we see so much of that. Maybe to use kind of language: they really do represent the dominant architecture, and often they're put in a position where I think they're often asking, "What is my role in it?" And so how do you overcome that? Can you validate them? How do you overcome that here?
Jessica Reif (00:58:30):
Yeah, absolutely. So, I think Dave's point and your point on the frozen middle is very much a valuable one, because the frozen middle is the group that has to say, if there's going to be a major change effort, they are the ones that have to come to terms with: what got me here will not get me and my team there. So they are the ones that really have to implement a change. It's really easy for a leader to stand in front of a podium and say, "We are going to deploy 10 times a day," because they have to deploy zero times per day. So there is really no change for what they are going to have to do by making that demand or strategic change. Whereas, at the more middle manager or senior manager layers of the organization, they are going to have to change.
Jessica Reif (00:59:11):
They are going to have to change what they are holding their teams accountable for. I think it is a quote by Dr. Henry Cloud, "You get what you tolerate", and those are the layers that are going to be setting the standards. So if they are tolerating deploys twice a quarter, that is what they are going to continue to get. So it is really that group that has to change their mindset. And we see the same thing a lot with agile transformation. We will hear of an executive that has heard, "Oh, digital revolution! This is the way forward. We are going to pivot our business to be much more digitally focused and customer centric." And while all of those things sound really good, they do not necessarily have an immediate ramification for the executive. They have immediate ramifications for those middle managers, which is who really needs to buy in to drive the change.
Gene Kim (01:00:05):
And what concrete advice would you give to someone who is asking that question? A senior leader, who is asking me, "What is my place in this new system?" If teams get to define the work, define how it's done, (laughing) just specifying their own work. And then left with the question, "What else is there left for me to do?" What do you tell them?
Jessica Reif (01:00:28):
So, two pieces of advice that I would give would be, one: focus on the vision and making the vision as crystal clear as possible. And there is some really good research from Professor Drew Carton, at the Wharton School, who published a paper on the techniques used by NASA, and specifically President Kennedy in the 1960s, before the moon landing, to really establish a visual image. It is not just, we are going to pursue the new frontier of space, but we are going to land a man on the moon. And you can close your eyes, and you can picture what that experience is going to be and what it is going to look like. And I think that the role of those senior leaders is to help create that mental image, for the network of teams that they are responsible for, of what success looks like, and make it something so crystal that they can close their eyes and that they can picture it.
Jessica Reif (01:01:21):
The second, I would say, is setting the conditions for the teams to interact and operate as fluidly as possible. So in a lot of cases, that is addition through subtraction, the senior leaders are often the ones that can remove rules. They can remove rules and barriers. They can improve funding for tools that are going to help teams work better together. So really recognizing what those opportunities are to reduce barriers that make it hard to interact across teams and taking advantage of the leader's authority to remove those roadblocks.
David Silverman (01:01:54):
And just to build on what Jess is saying from a mindset standpoint, because she's a hundred percent right, I go back to servant leadership fundamentally as the mindset that you need to have in today's environment. Which is, it is not about you, it is really about how do you position your people to be successful. And anything you can do to remove those obstacles and barriers that Jess just talked about, whether it is meetings or bureaucracy or pain or friction. That is really what it is about. Your job is to be a steward that allows your people to be successful, and if you are doing your job well, it is really about them. If you are struggling with what you are giving up, you are already in the wrong frame of mind, in my opinion. You are already thinking about yourself and what is right for you, vice what is right for your team.
David Silverman (01:02:43):
And my experience is, as a leader, your job is to take care of your people. And if you do that consistently, what typically happens is your people shock you and surprise you with how awesome they perform, which then reflects very well on the culture that you have established with a team that it makes. I think that ultimately, you are the glue, culturally, for the organization that enables it to be successful. And that is the magic, because if you take that away and you put something else in, the whole thing starts to stop optimizing.
David Silverman (01:03:12):
And so it is just sort of redefining this success metric and how you evaluate performance for senior management, and you change the incentive model. I think where organizations struggle, and I do not think it is specific to software engineering, but financial services is a great example, where traditionally you get promoted because of your ability to basically manage a P&L and make tactical decisions and take on risk. But increasingly as you get bigger, if you are trying to create a sustainable organization, that should matter a lot less. It is other people that have to be able to do that, and do that effectively. And you are basically sitting there trying to make it work. That is how you create a legacy. That is how you create an enduring organization that is going to be successful and resilient.
Gene Kim (01:03:53):
And looking back in your military days, was there a senior leader that you think would epitomize the change that you would wish upon other people? Here is a person who felt like that and had an aha moment and they are acting and behaving in different ways that led to a bunch of incredible successes?
David Silverman (01:04:15):
There are tons of examples. I was very fortunate to work with just some incredible talent over the years for me personally, but the ones that always stuck out in my mind were actually those non-commissioned officers. It was these chiefs and these leading petty officers over the units that were tactical experts at what they did. They were probably better suited than most to say, no, this is the right way to do something or not. But if we were going to get any type of scale or effectiveness, they had to turn into a mentor and a coach figure. And they had that incredible depth in experience. But when they modeled that behavior, that is when we started to see tremendous effects at a localized level on productivity and increases in productivity. You would have relatively junior or new people who are now being empowered, because some of those senior managers, in this case, those chief petty officers or senior chiefs, those NCOs are setting conditions for them to basically operate more effectively as individuals.
David Silverman (01:05:11):
And that is when the whole thing unlocked. And my job really was to stay out of their way. If you are doing it well as the officer, you are really just trying to say, hey, does the chief- chief's really calling the shots on the objective anyways, they are the ones that you've got to listen to, especially when things are chaotic. And you are just trying to manage the whole operation and make sure that it is inside the boundaries of success. That was my experience. I can think of a couple of individual names. I won't give them, just in the interest of their own security and stability, but those leading petty officers, those chiefs that I was fortunate enough to serve with overseas, almost without question, were my heroes.
Gene Kim (01:05:50):
And so these are people with decades in the service, if I understand that correctly.
David Silverman (01:05:54):
Yeah (affirmative). I would say the average tenure, for a chief petty officer, is somewhere between... the earliest you can make chief, if you're screaming up the ranks, is probably eight or nine years. And usually these guys have 10, 15, 20 years of operational experience at this point. They have forgotten more than you, as a junior or mid-level officer, will ever learn.
Gene Kim (01:06:10):
This is fantastic. So let's leave the domain of highfalutin theory and go into actual practices. One of the things that you have talked about a lot, Dave, was one specific thing that you had. If I remember correctly, it was this global daily call, thousands of people on the call around the world, in this very informal network. Can you talk about what that call was? What were the specific objectives and what might distinguish a great call from a mediocre call, a productive call versus an unproductive one?
David Silverman (01:06:43):
The name of it overseas was the Operations and Intelligence update. We chronicled it pretty specifically in the book Team of Teams. What it was, fundamentally, was it started off as a staff meeting between the senior commander and his immediate staff. It would take place between the forward headquarters and the rear headquarters. And it was basically so they stayed synced and could just deconflict on what was going on. It was a daily meeting. It was probably 15 to 30 minutes, and there were probably 10 or 15 people in it.
David Silverman (01:07:06):
But as we started to evolve as an enterprise, what we realized pretty quickly was that at the senior level, we had these insights of what was happening, 'cause we could see across domains and start to put together a picture. And what we realized was that we were losing. We were fighting individual wars that we were winning locally, but in the aggregate we were losing. And at the local level, you are running your own operating mechanisms and your own critical learnings, but you feel pretty detached from, say, a similar unit in a different geography. Even in the same battle space like Iraq, or forget about it if it is something like the Philippines or Northern Africa or something.
David Silverman (01:07:44):
You are like, "Well, our missions are just totally different. So there's really nothing to learn right now." And so what we saw was this gap of information exchange. So what we did was we said, "We need to start opening up this meeting to start to break down those natural, bureaucratic layers in the organization and start to connect dots between them." Because the way we were disseminating information then was: a local unit discovers something, writes a report, it goes to a higher headquarters. It goes to another higher headquarters. It goes to another higher headquarters. It goes back to a training command, gets institutionalized into training, maybe, trains a unit and they come back over. So it was like the learning cycle was...
Gene Kim (01:08:18):
David Silverman (01:08:19):
18 months, it was slow. Now locally, you were learning quick. But the cycle between units was relatively slow. Obviously that would get accelerated if it was something catastrophic cause you would speed up that chain.
David Silverman (01:08:32):
But I would say on an aggregate, it was pretty slow. And even the special mission units that were on shorter rotation cycles, theirs were still probably nine months long. It was still woefully inadequate. We put this meeting in place with a main idea saying- hey, it starts from a position of humility. We do not know what we do not know. And we know that this problem is bigger than, like what Jess was talking about earlier, we can like wrap our mind around. So we just need to find a way, in a disciplined, structured format, to share critical updates on what is happening. And then look for patterns, look for themes or concepts that are applicable, not just in one geography, but in others. And that became kind of the art to it. And so we had this mechanism every day to rapidly cross level key insights from the night before.
David Silverman (01:09:17):
That way you could then disseminate those learnings by centralizing them effectively across a larger task force. Then locally, I could then say, "I am going to take this thing I just heard. And maybe either go do subsequent conversations offline with that group, or I think I have the context I need to make localized decisions." The other benefit was if the organization needed to pivot. Now it is not just about a learning, but it is... we have got to go from this to that for some reason. Well, in the old days, that would be like moving an aircraft carrier. You're talking about 20,000 people. You have got to shift their mindset around something, and you play the telephone game of information dissemination.
David Silverman (01:09:58):
By the time the message gets down to the lower unit, in our case, a pretty big bureaucracy spread across a lot of different time zones and geographies, all they hear is: yeah, we've got to do the same thing we did last night. Okay, got it. So now all of a sudden, you had a vehicle in place, because this meeting became every day for 90 minutes, and it had thousands of people on it daily, across every time zone. Now it became a mechanism where you could rapidly disseminate, not just learnings, but also intent and understanding or a shift in principles and values. And so it could move the organization much faster that way as well.
Gene Kim (01:10:39):
So I have heard one more sort of aspect of structure there, which is that it almost became that your participation in that meeting gave you an interface to everyone else. So that if we need to create a sub team, you could quickly marshal up a group with common interests and actually act upon it, as opposed to waiting for it to percolate through the...
David Silverman (01:10:58):
That's right, it gave you the closest thing you could get, as a mid-level leader in the organization, to visibility of the entire network. You can sit there and be like, "Oh, I did not know those guys were working on that, and I have that same need down here." And especially when it came to interacting with the interagency. I had my own informal networks with my peers that were in other parts of the battle space so I could call if I really wanted to [inaudible 01:11:28]. But the idea that I would go be dealing with a CIA agent who had been looking at this target set, that I happened to be just shifting onto this night, that they have been looking at for five years. Before, my access and availability to them was...
David Silverman (01:11:40):
First of all, I did not think I had the authority to do that. And two, I would not know where to start to look. Now, I can see that. And one of the things that the leadership did is they gave us permission to collaborate informally. And they have kind of left it up to us to say, "If you are already a part of this task force, we are going to give you the benefit of doubt. We will start sharing because that behavior is being modeled above us." And so it gave us the freedom of action to do it down low. And so that sped things up.
David Silverman (01:12:05):
The big constraint in this whole thing was information control, the idea that this is not appropriate to be disseminated to the larger [inaudible 01:12:14]. I see this in companies all the time. They go, "This is proprietary, non-public information that, if it got out, would be a violation of SEC compliance rules or laws, or potentially give our competition advantage." It is almost like the cure is worse than the disease then. 'Cause you are like, "We are not going to tell anybody anything, but then how do they expect you to then do anything?" So, we had a bias towards speed and transparency.
Jessica Reif (01:12:40):
To pile onto that, as far as practices that we have seen, that were outlined to [inaudible 01:12:44], that we have seen wide adoption of with our clients and those that we have connected with over the years... Dozens of companies at this point are doing something really similar to the operations and intel update. Admittedly, we have not sold anybody on a 90 minute meeting that takes place 365 days a year. But what we do see a lot of is companies that are doing a 30 minute meeting, whether it is every day, every other day, two days a week, where they are sharing those critical updates. And Gene, you had asked about what the specific characteristics are of a good meeting and one that is bad.
Jessica Reif (01:13:23):
And some that we have observed specifically are, one: as a good characteristic, the meeting creates dependency awareness. So the stakeholders go to the meeting and they learn something about how what they are working on relates to something that somebody else is working on. The meeting forges connections between group members. Perhaps, if the three of us were on a meeting, and I learned that I have a dependency on Dave, who has a dependency on Gene, then the three of us can have a short sync after the more formal meeting to coordinate with one another. As an outcome, we see that these meetings are actually very effective in reducing other meetings, because otherwise, perhaps I would have a standing meeting with Dave and a standing meeting with Gene. And what the group sync allows us to do is reduce all of those other standing meetings to one that is collaborative and shared across teams and where we are able to do a quick meet after on the high level things that are relevant for that particular week.
Jessica Reif (01:14:27):
As far as when we have seen these meetings go awry, I would say that the two areas are when the meeting becomes just strictly updates that could be communicated better via some other forum, whether it is email or Slack. So if people are sitting in the meeting and they feel this could have been an email, then that is obviously a very bad sign. And then the second is when it becomes a point to point conversation between two of the attendees about a specific challenge or dependency that they are facing. So the role of the leader in those meetings is to make sure that those bad things are not happening and to maximize the good things. And I would say that the outcome metric that we see as, wow, this is really working well, is when you see people that are voluntarily opting into the meeting that are not required to attend. Because it becomes such a valuable source of information, that by missing it, they feel like they are missing out on something that is really important and valuable.
Gene Kim (01:15:29):
Gene here. I love what Jessica just mentioned. She is saying that a critical job, especially for middle managers, is to be able to create concrete manifestations of the vision. In my last episode with Mike Nygard, I talked about having just read Gene Kranz's amazing book, Failure is Not an Option. I learned that the Apollo 9 mission was actually a bit of a Hail Mary. The goal was to pull in the timeline in order to achieve President Kennedy's goal of landing a man on the moon and returning him safely back to earth by the end of the decade. It was breathtaking to read about all the risks involved, but their philosophy was high risk, high gain. Under enormous pressure, in 1969, they eventually came up with a plan that they had sufficient confidence in to achieve the mission as set forth by President Kennedy, back when he was still alive, in 1962.
Gene Kim (01:16:21):
Jessica mentioned a paper by Dr. Andrew Carton and Dr. Brian Lucas. The paper is called How Can Leaders Overcome the Blurry Vision Bias, Identifying an Antidote to the Paradox of Vision Communication. I will put a link to this paper in the show notes.
Gene Kim (01:16:39):
That is super interesting. And so, I am trying to create a word cloud of what is actually going on here. So, I hear a lot of what you just talked about, Jess: marshaling, deconflicting, just awareness of dependencies. But something that I also heard was an initiation of new actions. I just heard something, I am going to go find some people with a common interest, and I am going to marshal a group together, that did not exist before, with potentially the creation of a new short or medium term objective. I am actually forming a new coalition or group. I just want to confirm my understanding there.
Jessica Reif (01:17:12):
Yeah, absolutely. A new group forms, and whether it is just for 15 minutes to talk about the specific thing that came up during the meeting, or it could be longer term that perhaps two teams realize that they are working on something that is really similar. And maybe there are components they could share, or maybe there is something that they can do to mutually make each other's lives easier.
Gene Kim (01:17:33):
Awesome. One of the things that really caught my attention was a dynamic that actually led to people wanting to opt in. David, it sounds like the old behavior was: knowledge is power, I am going to hoard it, and every piece of knowledge that I have, and that you do not have, I can use to my advantage and your disadvantage. Somehow this inverted it, this call created a mechanism that rewarded knowledge sharing. Can you talk a little bit about that, or even validate that that was a dynamic?
David Silverman (01:18:00):
Yeah, that was a dynamic. To Jess' point, you know it is going well when you have that mid-level management layer opting in voluntarily. Because you can't really, at that same scale, force them or hold them accountable to showing up when they are opting in of their own volition. The reason why they would do that is because there are insights that you can gather that you could not get anywhere else. So for us, the currency for operations was heavily correlated to certain critical assets that were in high demand, like helicopters or collection platforms, or close air support. Inevitably, there would have to be prioritization decisions made at the operational level where they would say, "Hey, these are where these assets are going." You would see that in that meeting.
David Silverman (01:18:46):
And then you have a sense of, okay, I now know where the cards are being dealt for the night. And I can now start horse trading with local battle space commanders that have those assets to try to unlock potential latent productivity that might exist. We would then establish those relationships locally and start doing horse trading between them, almost creating like a marketplace where we could basically say, "Hey, you have this, can I borrow it for this? I will give it back to you, and you get some of the credit." And they were like, "Okay, sure, I will do that." And that is what really unlocked the productivity across the enterprise. Where we see them not go well is when we do not spend the time upfront to understand what the interdependencies are. So when you do not have that semblance of currency and interdependencies, then it just becomes another meeting, to Jess' point, where it is a series of updates that people do not necessarily find relevant.
David Silverman (01:19:34):
I do think there is a lot of value sometimes in some exploration that takes place between two senior leaders. As a junior leader, listening to how people think out loud around a decision that they are wrestling with was super helpful context since I was trying to figure out how to get something approved. 'Cause I would be like, "Oh, wow, I did not realize that was what was top of mind for the boss. This thing over here, in this other country, is weighing on him heavily and potentially has implications on assets or resources." That is context that I lacked, where I could have walked in and said, "Hey, here is what I want to do." And the guy goes, "Well, no." And then I go, "Why?" And they go, "Because there is this larger thing at play that you did not realize," and you go, "I could have shortened that cycle if I had had some appreciation for that."
David Silverman (01:20:11):
Plus it had a way to help the leaders at the mid-level scale and develop. So traditionally, what the military does a good job of is they send people throughout their careers to and from education and training venues. You make a rank, you go to school for nine months to a year, then you come back and then in order to advance to the next rank you do it again and again and again. If you are a high demand asset, like a special operations operator, your ability to go take time off to go to school was curtailed dramatically during this fight, because there was more requirements than there were bodies. And the learning that would take place just by hearing leaders talk and think out loud was super helpful to a junior. I would go just for that, because I would be like, "Wow, that is context I otherwise would never have gotten access to unless I was an aide-de-camp or something." And so it really did a lot to professionalize the force indirectly, so you can use the same mechanism to do...
David Silverman (01:21:03):
... one, drive your culture, two, professionalize and develop your talent, then three, increase its overall productivity by improving the quality and the merit of the message. Because what we find in complex organizations is that the hardest thing to do is to stay aligned, because if things are changing, the priorities are changing, you don't know if what you're working on is productive, and there's nothing worse than non-productive time. I just spent a bunch of time building something or coding something or doing something that all of a sudden isn't valued anymore, and you're like, "Well, gosh. That was wasteful. I don't like wasting my time." So if you're hearing how those priorities are shifting, and there's a mechanism to do that much more efficiently than that telephone game, that, to me, is super helpful.
Gene Kim (01:21:43):
That's interesting. So what I just heard was it's not a rebellion/revolution of middle managers, there's actually a vehicle where senior leadership adds a voice where they can model the desired expected behaviors and amplify that across, it must've been a breathtaking scope, maybe even far beyond their official area of authority. Am I hearing that right?
David Silverman (01:22:08):
Yeah, and I think there's room for potentially two different things, but the Keystone forum was really a way for you to connect strategy to execution at the operational level for, I would say, the team. Really what you're doing is you're hacking those layers like I described earlier, but there are also informal networks that were created, almost these liaison groups that acted almost like APIs that could connect information flow across the system. Oftentimes, they would be energized or accelerated based on something they heard in this Keystone forum. They'd be like, "Oh, okay, well, here's a pervasive problem that this group can go spend time on, this informal network of thought leaders." Some are dedicating some portion of their mind share to solving larger problems for the organization. So that also was taking place.
David Silverman (01:22:49):
So we had both of those established, so you had these, basically, change agent networks that existed that would then be tapping into this process that was systemic as far as how we operate. Those two things together allow you to basically pivot large organizations effectively. John Smart is writing about this in his book- [crosstalk 00:02:11].
Gene Kim (01:23:10):
... happier. [crosstalk 01:23:13]
David Silverman (01:23:12):
Because he did something very similar when he was driving a digital transformation at Barclays where he created these change agents, for lack of a better term. Then when they had a specific thing they needed to go focus on, he could mobilize this informal army to basically attack friction, wherever it exists in the organization, to unlock productivity.
Gene Kim (01:23:30):
A couple of things. It's amazing to hear them referencing John Smart's upcoming book, "Sooner, Safer, Happier," describing all the pioneering work he did when he led the ways of working team at Barclays, an organization founded in the year 1634, which actually predates the invention of paper cash. The other thing I wanted to mention here is the dynamics of having this incredibly vibrant ops intelligence call. Jessica mentioned that this was a meeting that people opted into just because it was a source of so much information that you couldn't get anywhere else. It was a way to connect with peers who are solving similar problems that you could collaborate with to better achieve your own objectives. This reminds me of the themes that have shown up in John Allspaw's presentations for many years. Most recently, at the DevOps Enterprise Summit London conference, John Allspaw talked about the need to learn from incidents.
Gene Kim (01:24:28):
He asked questions like, "Are your post-incident reviews being read by people outside of the team? Are they being referenced in code bases? Do people want to attend these meetings?" This reminds me of a comment that Bethany Macri made when she was also at Etsy about how the post-incident reviews, these blameless postmortems, were widely attended because it was one of the best ways to learn about areas outside of your team. In The DevOps Handbook, we quoted Randy Shoup, who over the years has been a chief architect at eBay, an engineering director at Google, and is now again a chief architect at eBay and a VP of engineering. We quote him about his experiences when he was the engineering director for the Google App Engine team, describing how the documentation of post-mortem meetings had tremendous value to others in the organization.
Gene Kim (01:25:20):
"As you can imagine at Google, everything is searchable. All the post-mortem documents are in a place where other Googlers can see them. And trust me, when any group has an incident that sounds similar to something that has happened before, these post-mortem documents are among the first documents being read and studied." I remember having a conversation where he said, "Whenever there's a customer-impacting incident at Google, everyone would be looking forward to the post-mortem documents being published because everyone loves war stories." My point here is that there's a dynamic where sources of incredible insight, like the global ops intelligence call, like the post-incident reviews, are a source of incredible learning and a way to spread knowledge across the organization. All right. Back to the interview.
Gene Kim (01:26:04):
That is fascinating. So actually, you mentioned one thing about the sort of internal marketplace. In the book, "Team of Teams," one of the stories that really caught my attention was how so many missions were scrubbed at the last minute due to lack of availability of certain scarce resources. Was it like helicopter transport, intelligence gathering platforms? So my interpretation of this was that that was really kind of the inability of any centralized planning system to know everything and forecast who needed what, and how do you get a certain scarce thing to who needed it most? So I think you just talked about exactly that, which is that this became almost a, maybe it wasn't the marketplace, but it facilitated this horse trading so that the people who needed something the most could get it and horse trade their way there. Can you talk a little bit more about that, and validate the notion that this really became a way to augment the planning processes, and that it was really middle managers who were needed to get those things to where they needed to go most?
David Silverman (01:27:05):
Yeah. To me, this was probably the biggest single productivity driver for the enterprise, which was there was only so much that could be managed centrally when it came to the efficiency of decisions on critical assets and infrastructure. So in order for you to get the most productivity out of a certain system, there needed to be something much lower, much closer to the problem set, much lower level leader managers that are basically trying to figure out how to, I would say load balance for lack of better term, that asset, that resource to be effective locally. Because if you try to do it centrally at the scale that we were operating on, it just wouldn't work. There was just too many competing interests. So to me, that was typically, if you think of traditional prioritization of any drill, you got the higher headquarters that says, "Okay. These assets are going to be overseas," and then there's a combat commander says, "These assets go to these two countries and these assets go to these three regions in the country."
David Silverman (01:27:59):
Then inside of there, it starts to like... Okay, well now, that's about as low as we can possibly manage it. So what this allowed us to do was I had relationships with the same, my Delta Force counterparts over in another operating base. Because they were Delta and Delta's the best, they had all the best toys, right? So I knew that I had to somehow access or leverage their toys if I was going to be able to get the most productivity out of my force. So I put my best operator physically in that person's headquarters and said, "Hey, you do with him what you want. He's here to basically help provide a cognitive information flow between our two organizations where, should you need any of the resources or tools that we have, he's your guy. And if you want to use them for any other stuff, unless it's something morally or ethically wrong, go, go nuts."
David Silverman (01:28:52):
Going back to those leadership skills, this high performer was all those things we talked about. He's a humble professional. He's highly skilled, competent. He was very self-aware. He understood how to walk in somebody else's shoes. He was extremely disciplined. So very soon, he started to build a relationship and sort of established credibility. They started using him for more and more stuff. Then when I'd say, "Hey, they have this collection platform tonight. Can we potentially use that resource to put our things that we're looking for on this device, so we could try to find those too?" He'd say, "Yeah, we got some extra capacity here. You can do that." I said, "All right. Well, I know I'm not going to have to keep it. Can you please give me a location? And then from there, you can go back to doing whatever it was before, then I'll assume the risk after that."
David Silverman (01:29:33):
So all of a sudden, our tempo dramatically unlocked, and it's funny. I actually remember going up to the higher headquarters to visit one of my other liaisons, and the commander of all the forces in Iraq for the special operations unit, he was like, "Dave, do you guys have stuff on my assets?" And I started laughing. I was like, "Well, yes, sir." And he was like, "I don't understand." And he goes, "Well, I guess it's okay." It was funny because it was like, "Well, that increase in productivity that you're seeing from your centralized force, some of that is you're counting our stuff. We're just giving you guys credit for targets that we're prosecuting, that they don't have the time for or they don't want to, but they're still important." Let's use a software analogy: bugs. Maybe they're a lower priority. We were cleaning up some of that backlog of stuff, and the net productivity increased.
David Silverman (01:30:25):
Then eventually, as we got better and they got better, the quality started to go up and all of a sudden the capacity is expanding and then the whole system is working more efficiently, more effectively. So that was the magic that sort of goes outside of the traditional, I would say, prioritization. It was sort of inside of the sprint cycle, in this case, the day. You had two effective engineers trading lessons and practices to help each other out to basically deliver on time. That was the magic.
Gene Kim (01:30:55):
Gene here. This is amazing. So Dave just described in phenomenal detail, one of my favorite parts of, "Team of Teams." This is around page 178 where they talk about liaison officers that they would send to key partners like intelligence agencies. They seem to allude that in the old days, they would send people who weren't fitting into their unit, people on their last rotation before retirement. I think the implication is that they were often sending not the highest performers into these assignments. I quote from the book, "However, as these interfaces became increasingly important, we realized the potential for bolstering our relationships with our partner agencies by sending a strong linchpin liaison officer. As it turned out, some of our best liaison officers were also some of our best leaders on the battlefield. We started taking world-class commandos and placed them, attired in civilian suits, in embassies thousands of miles from the fight because we knew we needed a great relationship with the ambassador and the other inter-agency leaderships posted there. Everyone hated removing some of our best operators from the battlefield, but we reaped enormous benefits."
Gene Kim (01:32:04):
"Our goal was twofold. First, we wanted to get a better sense of how the war looked from our partners' perspectives to enhance our understanding of the fight. We saw one piece of AQI up close and daily, but we knew that they were part of a larger global system of finance, weapons and ideology about which other people knew much more than we did. Second, we hope that if the liaisons we sent contributed real value to our partners operations, it would lay a foundation for the trusting relationships we needed to develop between the nodes of our network."
Gene Kim (01:32:36):
So we just heard an amazing example of this, in this case, Dave sending one of his people and embedding them in one of the Delta Force units. Right after that passage in "Team of Teams" is one of my favorite stories in the book. It describes how a US embassy in a troubled nation finally accepted a posting of a liaison. Apparently, this officer got a very lukewarm reception despite being, "A walking mass of extroverted energy, habitually upbeat and helpful." They write, "At his new post, he was initially granted no access to intelligence and given nothing to do. So Conway volunteered to take out the trash. Each afternoon, he went office to office gathering refuse and carrying it to the dumpster. When he found out that one embassy colleague loved Chick-fil-A sandwiches, Conway arranged for the next taskforce delivery to include several in its contents. A man the US government had spent hundreds of thousands of dollars to train as a Navy SEAL was, for three months, a glorified garbage man and a fast food delivery boy."
Gene Kim (01:33:37):
"So the story goes, when the situation heated up in the country's capital and the ambassador asked whether he knew anything about force protection and dealing with a growing Al-Qaeda threat, that person was exactly where he needed to be. 'I do,' he said. 'That's what I'm trained in, and I can do you one better. Let me make a call.' Soon, the entire weight of the task force enterprise was at the disposal of the inter-agency team at the embassy. Our liaison officer was there to serve the collective mission from trash to terrorism. The taskforce relationship with that country grew tighter, nearly instantaneously. A new node in our network came online and began to thrive."
Gene Kim (01:34:14):
I just love that story because it indicates the investment that they were willing to make in these relationships in the hopes that they would eventually pay off. I see so many examples of this in the DevOps community, infrastructure teams embedding their best people into dev teams to help them figure out how to securely, quickly, reliably promote code into production to collectively help their organizations win in the marketplace. Okay. Back to the interview.
Jessica Reif (01:34:43):
Yeah. We see this happen a lot within product lines of manufacturing companies, within value streams at technology companies, where resources are allocated at a particular level, and then everything that's done below that is based on shared consciousness within that group of who needs what, when, and how they can collectively leverage the resources that have been assigned to them, also collectively, to achieve whatever their mission is. Because whoever's allocating the budget to that group doesn't know, for example, who's going to need a designer on their project and how much of a designer's time is going to need to be used, or who's going to need a GPU. They don't necessarily know those things. So getting those resources assigned centrally, and then having those forums to create shared consciousness of who needs what, when, is a really valuable way to manage those resources effectively.
Gene Kim (01:35:34):
Wow. It's astonishing. I'm sort of connecting some dots. So I have to imagine this happening at scale must be breathtaking to see, that's creating these relationships across the scope and breadth of an entire organization encompassing hundreds of thousands of people.
David Silverman (01:35:52):
Yeah. It was pretty amazing. If you think about it on a task force level, just in Iraq, not the whole task force, just Iraq. Early days, we were doing two ops a month basically. Then it got up to two ops a week. Then by the end, we were doing 10 ops a night. So same resources, same assets. Quality and intelligence and some of the technology was certainly better over the years. But at the same node, the physical constraint was the number of operators that could go do something. Those numbers didn't dramatically change over that time period. Then more importantly, the actual quality of the operation went up as well. So casualties went down, success rates went up. All of those things started dramatically improving because the culture sort of enabled and encouraged this level of synergy and collaboration.
David Silverman (01:36:35):
Then the last thing I'll say is it had dramatic effects on innovation and creativity. So what it did was localized problems that were happening. The fact that you were crossing those in real time and it was tied to understanding strategic direction and intent, you can start to better, faster create products, services, offerings that were aligned with where the organization was trying to go to. Oftentimes, we'll see companies invest a lot of money in third parties to come up with strategic plans based on their deep expertise in a certain domain and there's value there, certainly. But increasingly as things are changing, we believe that you got to emerge a solution quickly and get real-time feedback. Back to that analogy earlier, the operations will start driving the strategy, the intelligence.
David Silverman (01:37:17):
So if you have a mechanism to connect that, all of a sudden the whole org starts going faster. So what it feels like is just a lot of winning, right? You're winning locally and you're winning at scale. That is when people really start to get energized. It's funny, when you're looking at high-performing teams and culture, it's amazing how much winning will do to solve other problems. If you start with success, a lot of the other stuff that bothers you tends to go away. So this was sort of creating some of these micro successes and as they added up, it started having pretty significant effects. So if you look at [inaudible 01:37:50] you're like, "Oh, it's just marginally more productive." But then when you step back and you go, "Wow, the effect is like a 10 X improvement at this scale on how things are going..." And that can have decisive effects for organizations operating in competitive.
Gene Kim (01:38:03):
Is that what you mean when you say operations starts dictating strategy? In other words, people sense that there is a pattern of winning and that really becomes... that raises the question of how do we win more at the strategic levels? Is that what you mean?
David Silverman (01:38:17):
Yeah. Think of it like launching an MVP. So when you launch an initial product, for us, that was us going on an operation that night. We go on operation night, that was our minimum viable product for the night. We would collect information from that experience and that would then go back into the feedback loop that would inform the next one. So that was happening organically inside our unit. The magic of the, "Team of Teams," was connecting that learning across all the other learnings that were taking place and then trying to disseminate which of those were applicable so that you could rapidly move them across. That's what all of a sudden, it started to make us go much faster effectively.
David Silverman (01:38:49):
So that's why I'm saying, operations started driving intelligence or strategies, because what we were finding on the battlefield determined the next target. We'd say, "Okay. Learn this and based on this, here's the next thing that we needed to go do differently to solve it." So it's no different than launching an MVP, getting feedback and saying, "What the consumers value the most out of this experience was this. This space can still be optimized better. Go focus on that. You're going to get a higher return than you would somewhere else."
Gene Kim (01:39:15):
And if I hear you correctly, that pattern of winning can inform strategy or even become the strategy in the extreme.
David Silverman (01:39:23):
Yeah, that was my sense is that all of a sudden you get that intrinsic motivation of feeling like you did something successful. Then seeing that success applied and scaled to a larger enterprise's outcomes and desires, all of a sudden it makes you feel connected in ways that before, you may have felt isolated, which just improves overall morale, in effect, which then reinforces you want to put more time and energy and motivation to something. So it has this flywheel effect.
Gene Kim (01:39:52):
Gene here. I just want to take a moment to concretize what Dave and Jessica just said; when success at the execution level starts to influence or even drive strategy. I can totally see that happening. If you are striving to create successful outcome by creating a dynamic of learning, what is initially an island of success will keep getting larger. If those successes can be connected with the largest and most important goals and objectives of the entire organization, I think one can quickly see how this effort would have larger and larger influence on the rest of the organization, especially if the other parts of the organizations are trapped in a culture of compliance, a culture of just following the plan. A more dynamic culture of rapid experimentation and learning will, or at least should keep having an ever-growing impact and level of mind share from senior leaders.
Gene Kim (01:40:42):
As Mike Nygard said in his first Ideal Cast interview, "This does require that we double down on the winners as opposed to force-feeding the losers." In other words, all too often, there is this unfortunate dynamic where the projects that can actually get funding are the ones that are late and losing, as opposed to the teams who are actually winning. Those are the teams we should be investing in because they have identified a potential breakthrough.
Gene Kim (01:41:08):
Okay. I can't overstate just how grateful I am and how amazing it has been to talk with Dave Silverman and Jessica Reif about the philosophies that went into one of my favorite books, "Team of Teams," as well as hearing so many of these stories that further demonstrate the lessons in the book. So believe it or not, I only got through half of the questions I had for them. So we will be continuing this interview in another Ideal Cast episode. But before that, you will be hearing Dave Silverman's amazing presentation that he gave at the 2020 DevOps Enterprise Summit London virtual conference. He talks about many more of the lessons that he learned in the "Team of Teams" experience, which I know will resonate with anyone attempting to transform their own organizations. In the meantime, Dave, Jess, can you please tell everyone how they can reach you and the problems that you'd love to work on?
Jessica Reif (01:41:55):
You can reach us at crosslead.com. I'm Jessica, [email protected], and Dave is [email protected]. The types of problems that we enjoy working with the most are really the ones that we talked about in this episode: how do you operate more effectively as a network of teams, and how do you address some of the challenges that come with work in the context of a complex system?
David Silverman (01:42:25):
Yeah, I think Jess nailed it. I'm passionate, we're passionate about multi-team systems, specifically how organizations communicate and make decisions in environments that necessitate flexibility and adaptability. That's always been sort of my passion. I have a bias towards high-performing organizations that are committed to continuous improvement because I think without that, it's sort of tough. So helping instill that culture and then driving the mechanisms that reinforce those behaviors is what I think we spend most of our time with customers and clients talking about. So crosslead.com and [email protected].
Gene Kim (01:43:05):
Thank you both. If you enjoyed this episode, I know you'll enjoy those two upcoming episodes as well. See you then. Thank you so much for listening to today's episode. Up next will be Dr. Steven Spear's DevOps Enterprise Summit presentations, both from 2019 and 2020, where he talks about the need to create a rapid learning dynamic, as well as how to create one. The 2019 presentation talks about many of the case studies we talked about today, but in more detail. And in 2020, he talks about one of the most remarkable and historic examples of creating a dynamic learning organization at scale, which was in the US Navy at the end of the 19th century at the confluence of two unprecedented changes: one in the underlying technologies found in ships, and the other in the strategic mission that they were in service of. As usual, I'll add my reflections and reactions to those presentations. If you enjoyed today's interview of Steve, I know you'll enjoy both of those presentations as well.