Personal DevOps Aha Moments, the Rise of Infrastructure, and the DevOps Enterprise Scenius

Episode 24
Patrick Debois and John Willis
coauthors of The DevOps Handbook
2h 19m

- Intro

Interviews with The DevOps Handbook Coauthors (Part 1 of 2: Patrick Debois and John Willis)

In part one of this two-part episode on The DevOps Handbook, Second Edition, Gene Kim speaks with coauthors Patrick Debois and John Willis about the past, present, and future of DevOps. By sharing their personal stories and experiences, Kim, Debois, and Willis discuss the scenius that inspired the book, and why and how the DevOps movement took hold around the world.

They also examine the updated content in the book, including new case studies, updated metrics, and practices. Finally, they each share the new lessons they have learned since writing the handbook and the challenges they think DevOps professionals will need to solve in the future.

Kim will conclude the series in Part 2 (coming early 2022), where he interviews the remaining two coauthors, Jez Humble and Nicole Forsgren, PhD.

- About The Guests

Patrick Debois

coauthor of The DevOps Handbook, Second Edition

Patrick Debois is considered to be the godfather of the DevOps movement after he coined the term DevOps accidentally in 2008. Through his work, he creates synergies between projects and operations by using Agile techniques in development, project management, and system administration. He has worked at several companies such as Atlassian, Zender, and VRT Media Lab. Currently, he is a Labs Researcher at Snyk and an independent IT consultant.

John Willis

coauthor of The DevOps Handbook, Second Edition

John Willis is an author and Senior Director of the Global Transformation Office at Red Hat. He has been an active force in the IT management industry for over 35 years. Willis’ experience includes being the Director of Ecosystem Development at Docker, the VP of Solutions for SocketPlane, and the VP of Training and Services at Opscode. He also founded Gulf Breeze Software, an award-winning IBM business partner, which specializes in deploying Tivoli technology for the enterprise. He is also the coauthor of Beyond The Phoenix Project and Investments Unlimited.

- You'll learn about
  • The DevOps origin story from coining the term, why it took off, to launching the DevOpsDays conference as an offshoot of the Velocity conference.
  • How people thought of DevOps when it was first presented (their reactions, their mentalities, and their willingness to adopt it).
  • What has changed in the DevOps world since the first edition of The DevOps Handbook was published.
  • How the rise of SaaS companies is altering the DevOps world and participating in its evolution, and how building solid relationships with SaaS vendors and communicating comprehensive feedback to them is integral to DevOps.
  • The significance of speed in changing team dynamics.
  • Why resilient companies like Google and Amazon engineer chaos, and why companies like Toyota are happy when production stoppages happen.
  • Why you can’t afford to provide a high variety of products if you also offer high product variation.

- Resources

- Transcript

Gene Kim (00:00:00): Welcome back to The Idealcast. I'm your host, Gene Kim. Today, we have on Patrick DeBois and John Willis, who are two of my four co-authors of The DevOps Handbook, Second Edition, which was just released last week. The original idea was to interview each one of my four co-authors, so that's Jez Humble, Patrick DeBois, John Willis, and our new co-author, Dr. Nicole Forsgren. I thought I'd quickly ask each one of them four questions and put it together in one episode. So, those four questions were: tell me the story behind your original DevOps aha moment, which each of them wrote about in The DevOps Handbook? What is the most interesting thing that you've learned since the book came out in 2016? What is your favorite DevOps pattern or practice? And what is your favorite DevOps case study that is documented in the book? So, that's what I asked each one of my co-authors. And, holy cow, these questions took each one of the interviews to some incredibly unexpected and super interesting places. Each one of the interviews was so fun and so full of interesting insights that I'd never heard, despite having literally spent hundreds of hours with each one of these people. There was so much great stuff that we decided to break these interviews up into two episodes. Gene Kim (00:01:17): So, just a few words on The DevOps Handbook. When I read through the first edition again earlier this year that we released in 2016, my first reaction was, "Wow, I really love this book." And I know others have valued it as well. Since it came out, it sold over a quarter of a million copies. I think this book stands up really well, even after six years, in a way that so many books about technology don't. So, the book is made up of principles and patterns. So, of course, the principles still apply because the underlying principles should never change. 
But even the patterns still seem right on target, maybe with the exception of a sentence here or there, usually because we mention a tool that no one uses anymore. Gene Kim (00:01:57): But without a doubt, the second edition is so much better. There's 15 new case studies, mostly from the DevOps Enterprise community, including from Fannie Mae, Adidas, American Airlines, the US Air Force. There's over 100 pages of new or updated content, including so many of the solidified learnings from the State of DevOps research and the Accelerate book. There's a new foreword and material from Dr. Nicole Forsgren. There's an updated afterword, including sections from each one of the five co-authors. And there's a new section with new resources at the end of each part of the book. So, I really want to thank Dr. Nicole Forsgren for leading this effort. Gene Kim (00:02:37): As an author, I find second editions of books to be a very challenging endeavor. And I know it's not just me. So, one of my favorite interviews of an author is from Nobel Laureate, Dr. Richard Thaler, who wrote about his pioneering work in behavioral economics. He was recently on NPR Planet Money to talk about his latest book called Nudge: The Final Edition. And he talks about how important it was to him that the words, final edition, be in the title because of the huge effort involved. So, when the interviewer, Greg Rosalsky, says that he looks forward to talking with Dr. Thaler when there's another edition of the book coming out in 13 years, Dr. Thaler responds, "Go to hell, Greg." So, I know that it's not just me. So, seriously, thank you again to Dr. Nicole Forsgren for helping make the second edition possible. And as much as I love the first edition of The DevOps Handbook, I love the second edition even more. It is, for sure, a materially better book on so many different dimensions, and I'm so happy with how it came out.
Gene Kim (00:03:37): So, let's jump to the first interview, where I talk to Patrick DeBois, the godfather of the DevOps movement. And after that, will be my conversation with John Willis, who literally invited me into the DevOps community in 2010. My interviews with Jez Humble and Dr. Nicole Forsgren will come early next year in season three. Gene Kim (00:03:56): So, up next, you will hear Patrick talk about how DevOps enabled him to be able to do infrastructure work with a development group and for the first time actually feel like part of the team; how presenting those discoveries at a conference in 2008 helped him find some fellow travelers, but how the problem statement didn't quite resonate with everyone in the room; his recollection of the famous 2009 Velocity talk from John Allspaw and Paul Hammond about how they were doing 10 deploys a day every day and how that led him to holding the first DevOpsDays conference, accidentally coining the term, DevOps; his DevOps lessons learned during his four years at a startup, which involved the entire company, not just dev and ops; and his views on how important relationships and empathy are, even in the world where so much of what we rely upon is in SaaS services, often behind an API.
Gene Kim (00:04:48): And after that, you'll hear John Willis talk about his DevOps aha moment being in 2007 when he learned about Puppet and configuration management from Luke Kanies; and his adventures with so many of the early pioneers of next-generation infrastructure and cloud; his side of the story of the early days of the Velocity Conference and DevOpsDays from which so many of the early DevOps principles and practices drew from; how those communities eventually led to creating the DevOps Enterprise community in 2013 and 2014 to now; and how both of our experiences and appreciation of conferences helped frame some of the DevOps Enterprise community goals and ideals; some incredible examples of how these early connections were made and led to new connections; and why so many people confuse variety and variation in knowledge work, and why he thinks that is such an important concept. Gene Kim (00:05:39): Okay. Let's go to the interviews. And I hope you enjoy these conversations as much as I do. You're listening to The Idealcast with Gene Kim, brought to you by IT Revolution. Gene Kim (00:05:55): Okay. The first interview is with Patrick DeBois. So, within the DevOps community, we call Patrick the godfather of the movement. This is because Patrick coined the term back in 2009. He organized the first DevOpsDays in 2009 in Ghent. And thanks to John Willis, who you'll hear from next, I was able to meet Patrick in 2010 at the first DevOpsDays in Mountain View, California on the LinkedIn campus. It has been such a pleasure to work with Patrick over the years, not just on The DevOps Handbook, but on so many other projects and interactions. I got to work with him for nearly a year in 2020 during the middle of the global pandemic. It started when I was personally trying to figure out, how do we run an online virtual event. 
After attending scores of them and talking with tens of online event organizers, the one that I really loved the most was the [All the Talks 00:06:46] Conference, which Patrick ran. So, I asked him for his help to help us deliver the DevOps Enterprise Summit virtually, which I thought were astoundingly successful. One person even said that they enjoyed it even more than the live conference. So, Patrick helped us run two virtual events in 2020, and we've now done four of them using a formula very similar to what Patrick had originally created. So, during those two conferences that we got to deliver together, I was able to experience firsthand just what an amazing technologist he is and his incredibly paranoid sensibilities around operations. My favorite moment was when we were doing a live dress rehearsal event. This was before the second event that we were doing in October, 2020. And while we were watching the video stream, we noticed a strange video glitch that would show up maybe once every 30 minutes or so. The video would gray out for just a frame or two. It could have been a local network issue, but we all saw it. We were running out of time with the real event just one week away, and there's still so much work to do. But to my surprise, Patrick became fixated on this issue. I must have made a comment, maybe making fun of him for being spooked by something that seemed so harmless. But then he said, "These are the issues that tend to blow up on you in production." Gene Kim (00:08:02): So, here's the surprising twist to the story. He ended up opening up a ticket with easylive. This is the video streaming platform that we use. And it turns out that there was a known issue that videos hosted on Dropbox would have video glitches because they couldn't be downloaded fast enough to be streamed. So, we ended up uploading over 100 gigabytes of videos into easylive. 
My idea of uploading them into Google storage buckets or S3 was rejected by Patrick because he felt that it introduced an unnecessary failure point. My argument that it has over four nines of availability was not compelling to him. Gene Kim (00:08:40): I tweeted about this on October 10th of last year. I said, "I'll admit that had you told me a month ago that Patrick DeBois would recommend uploading all video files to easylive versus leaving them in S3 or Google, I would've laughed at you. Now, I think his paranoia that the entire world is actually out to get you is probably warranted." I wrote, "For me, this was a phenomenal example of needing to pay attention to those weak failure signals because dismissing these events as unlikely to repeat can lead to disasters. In other words, normalization of deviance. Something happened and nothing terrible resulted, so therefore it must be okay." I quote Patrick DeBois. "It helps knowing how unreliable these services can be at times." Gene Kim (00:09:25): Okay. Enough stories about Patrick DeBois. In The DevOps Handbook, he wrote about his DevOps aha moment. He said, "For me, it was a collection of moments. In 2007, I was working on a data center migration project with some Agile teams. I was jealous that they had such high productivity, able to get so much done in so little time. For my next assignment, I started experimenting with Kanban in operations and saw how the dynamic of the team changed. Later, at the Agile Toronto 2008 Conference, I presented my IEEE paper on this, but I felt it didn't resonate widely in the Agile community. We started an Agile system and administrations group, but I overlooked the human side of things. After seeing the 2009 Velocity Conference presentation, 10 Deploys a Day, by John Allspaw and Paul Hammond, I was convinced others were thinking in a similar way. So, I decided to organize the first DevOpsDays event, accidentally coining the term DevOps. 
The energy at the event was unique and contagious. When people started to thank me because it changed their life for the better, I understood the impact. And I haven't stopped promoting DevOps since." Gene Kim (00:10:35): Can you describe how much you admired how productive they could be, and contrast that with what you saw in operations at that time? Patrick DeBois (00:10:42): Yeah. So, at the time, I was working in the government where we rolled in the Sun systems, like Sun Microsystems, on a carriage, and then we had to cable this hand by hand. But luckily, at that time, there's something like Solaris Zones came out, like, a little bit of virtualization. And I remember the teams doing their TDD dance and kind of coding and getting this feedback loop. And while I was kind of automating the zones, all of a sudden, I got a speed feeling, like, I wasn't installing a machine, waiting for the boot loader, then kind of for the ISO to go on. All of a sudden, you say like, "Okay, start a new VM," it was like, boom, and it was running. That was insane. So, I felt the same energy of feedback levels that they were telling me about, "Oh, you code. You save. You kind of run your tests. And it's running." And that environment wasn't there before. So, I felt whenever they needed something, I was getting faster at the systems level. You need an environment for testing? Boom, it's one script and I got it. And even though you could automate the hardware as well, it was always slower. It was never that instantaneous feedback loop. Patrick DeBois (00:12:03): But I think what I like the most from them is they actually showed more of a human part of collaboration, which is... [inaudible 00:12:13] job, right? Maybe some people remember the [inaudible 00:12:17] operator from hell. It wasn't that disperceived that's the social function in kind of the IT.
You're some guy in the back or some girl and you're making sure the systems work, but there wasn't this gratitude of, "Hey, you're doing a good job and you did something," because on the coding level, the features give value, but the fact that it was running, people assumed it's wrong so you never got this ingratitude. Patrick DeBois (00:12:45): And I think when I got closer to that same team, I kind of felt the same bond. I started seeing what they needed as a feature. And in that case, it was identity and access management system and they needed multiple environments for testing. And I could deliver that to them in a nice way. And they would ask me in the same way. So, I felt way more integrated in their project due to the fact that I could deliver faster than I was before, where they would ask me and I was like, "Okay, hold on. I'm off for two weeks, cabling and racking and stacking the new server that comes in." So, that's kind of a different vibe. If you see what you're building immediately used, immediately you learn what's not working, what should be changed. And then when you want to change this, this gets faster. So, I think that technology change of getting that feedback allowed me also to belong better in the feedback of the business and the development team. Patrick DeBois (00:13:50): And then I got, in general, Agile, the collaboration, it needs to function and so on. Those were values I've been advocating for years because I was always receiving broken software. Not always, of course. But you kind of get the tarball. You get the thing. And then you have no say about it. And then you kind of have to reverse-engineer this. So, that feeling about, again, you're at the source, changing people, what they need to do, what they can help you, and you helping them. I guess, that was the Agile feeling due to the technology change that occurred at that time in my life. That kind of changed my view on the collaboration between the two teams. 
It's not that I hadn't been working with devs before, but definitely the speed of feedback changed at that time. Gene Kim (00:14:42): Yeah. In both directions. Patrick DeBois (00:14:44): Definitely. Gene Kim (00:14:45): Interesting. And by the way, Solaris zones, when you had to spin up a new VM, are we talking about milliseconds? Seconds? Minutes? Just how fast was it? Patrick DeBois (00:14:55): It was a couple of seconds. If you have your template, then it was just almost like a ZFS clone and hub. You had the new system running. It's very similar to what we would now do with the docker, but it's different technology, but the kind of similarities of speed were definitely there [crosstalk 00:15:14]. Gene Kim (00:15:14): That's interesting. Patrick DeBois (00:15:17): Yeah. Gene Kim (00:15:17): Just a little side note. I mean, they say that if build times are more than, say, two to four minutes, productivity goes way down because after you cross four minutes, it's hard not to go work on something else and you lose that sense of flow. When you're talking about spin-up times in seconds, I mean, that must have been incredible. Patrick DeBois (00:15:34): Yeah. It's a life-changer of the behavior because all of a sudden, those long, dreaded, I write a script and I have to wait and I have to see whether it completely completes by the end of the day or something and it takes me, like, 500 iterations to write that one script and I'm never going to rewrite this again, all of a sudden you say like, "Okay, I'm getting confident that when I run this and I run this again, I run this again, it's giving me the same results." So, that repeatability, again, gives you the confidence, like you're saying, that you can deliver what somebody asks in a good way and you're not always drawing the short straw and saying, "Hey, you know what? I don't really know what's going to happen, but I'm on call. And I'll take care of it." 
Patrick DeBois (00:16:22): So, it gives that same level of control, I think, which is similar to when you haven't done TDD and you're changing in an existing app, and you don't have the tests, you don't have the fail-safe environment, right? And that recreation, like, nobody wants to touch it because it's the one [inaudible 00:16:42] system nobody wants to use. And obviously, that same thing is for machines as it is for code, like, that one tarball or the one piece in the code nobody wants to touch. And it's very similar. But if you got the testing and the speed to reproduce, then you feel safer to change it, change it back, see what happens. And I think, again, that feedback loop on the technology side helps the feedback loop on the human side, on the collaboration. Patrick DeBois (00:17:12): And I think if you would now take this into the extreme with kind of mob programming, they're not waiting for the handovers even within the teams. Like, even if you say, "Okay, you're on the next desk and I'll send you a PR in a couple of hours." No, we're even skipping that part. We're just looking at the code together, not just for the preview. And it's interesting how there's a kind of an emotion against it, "Oh, this cannot be effective. Five people looking at the same code at the same time. What a loss of time." But if you think about the learnings, the things you've prevented, the kind of speed, all the things that you're doing at the same time, that totally makes sense. So, I think we're still at the beginnings of the collaboration in many organizations. And I think it has massively improved over that period of time in 2008 to now. Patrick DeBois (00:18:10): But almost on the flip side, like when a new company starts, they're all about team autonomy and collaboration. It's all good. And they have all the tools in place. And I've seen actually the pattern being reversed in companies where they say, "Oh, you're all autonomous.
Everybody can do what they kind of think needs to be done. Not what they want, but what needs to be done." And then after you have 15 or 20 teams, all of a sudden, we're thinking like, "Oh, maybe we need to streamline this a little bit more so we're not reinventing the wheel 15 times." And this becomes the opposite: there's a lot of enthusiasm, I feel I can control my own destiny, and all of a sudden, it changes the dynamic, like, "Oh, maybe we need to collaborate and ask the others." Patrick DeBois (00:18:58): So, one of the nice things there is that what I found is that on these collaboration techniques, and especially in autonomous teams, people mistake that they kind of have the ownership of everything. I think just as DevOps, it's not whether the dev should do it or the ops should do it or who actually does it; it's about the affinity and understanding what the other person does in a good way. And that collaboration is actually making the difference useful of the collaboration. And it's a pattern. On the collaboration side, when you're doing something, the question you should be asking yourself is, how is it impacting the other person? And if you think there's an impact, you should ask that person for advice. You still have to ask whether you're impacting something else. So, you can't do it in your own silo and think like, "I'm a microservice. Here's my API. Go off. Read it." You kind of always have to think like, "If I'm doing something, what's the impact on the system there?" Gene Kim (00:20:04): Back to 2007, 2008. You discovered this totally new dynamic and a way of working. So, you go through the trouble of writing this IEEE paper. And you go present it to the Agile Toronto group in 2008. To what extent were you able to share that enthusiasm? And to what extent was that enthusiasm understood by the community at the time? Patrick DeBois (00:20:23): Well, when you give a talk, and the whole room is looking at you and you're like, " They're looking at it." 
"What is he talking about?" And half of the room is leaving your talk and like, "Okay, what's happening?" And it's happened to me. Not at that time. But then there's always, like, the two people somewhere in the audience that say, "Yes!" Patrick DeBois (00:20:50): And I think at that time, even though I presented, there were people, I got a remark, and then everybody goes away, as it is very common in these conferences, but then all of a sudden, that one person reaches out, is like, "Yes! I think that's a good idea." And obviously, you can't generalize whether that's a good pattern or bad pattern because if it's a bad idea and only one agrees, then maybe it is a bad idea. But if you're enthusiastic about it and you believe in it, you start seeing other people doing similar things. And that's, I think, a good thing. Patrick DeBois (00:21:30): There was a tweet yesterday about, is the customer always right? Do people always know what they want? And I think Dan North's reply to, look at their behavior. So, if you have an idea and you see other people doing similar things, there might be something there. You can't really judge whether it's a good idea or bad. But I got to learn Jez Humble later and Chris Read and the people at ThoughtWorks, who were kind of doing similar things. And I think that's what the power was of the first DevOpsDays the year after is that you kind of get these people in a room and they're all about, "Yeah. I think we should do something about this." And that creates the vibe. Patrick DeBois (00:22:16): Coming back to the 2007, 2008. Of course. Yeah. You feel lonely, but you kind of somehow read hints in people, even if they don't mean it, like Michael Nygard or [Elizabeth Henriksson 00:22:28], who were also in the book, they were like, "Oh, what happens in production?" And you're thinking, "Yeah." And I remember sitting with David Anderson on Kanban. And he [inaudible 00:22:39] the whole book. And I remember sitting in a talk in Antwerp. 
And he explains the whole thing. He was like, "Oh, and who should be in the Kanban pipeline?" And it was like I raised my hand and I tell him, "I think the [inaudible 00:22:53] and the ops." And the answer he gave me was, "Good luck." And I'll never forget that. But he says like, "Yeah, I think it's a good idea, but the mentality is still not there." Patrick DeBois (00:23:07): And I think, in that way, of course, it could have been totally different and I could have dismissed it, but I think even after the first DevOpsDays, I thought like, "Oh, this was fun, a get-together." But I think because others kind of started being enthusiastic as well, that that drove me to understanding, "Hey, maybe there's something there. Why would somebody fly from the other side of the country or the world and come here and then want to run a conference in Australia, in Mountain View, or wherever about the same thing?" And then it dawned on me, still, okay, it might be the one conference and you don't see a series in it. I think a lot of people immediately, when they have an idea, they think about the franchise and how it could grow and become the big world. I'm not like that. But I'm glad others saw the opportunity and took it from there. Gene Kim (00:24:03): You had written how you were dazzled by the 2009 Velocity Conference. Can you talk about what it is you saw there and maybe connect the dots from there to DevOpsDays 2009? Patrick DeBois (00:24:15): Oh. So, when I saw the presentation from John Allspaw and Paul Hammond, I think I remembered mostly the heart. I think a lot of people at the time when they looked at that presentation, it was more about, "Oh, we got the metrics and the monitoring. And we're measuring everything." And they came at it from the kind of technical side. And, for me, this was more proof about the teams collaborating and working as one team. So, that's kind of where it struck more in the narrative. I don't know how it was for other people.
I guess you were in the room, right? Gene Kim (00:24:57): No, I wasn't. It's funny though. It's funny you mention the slide that you remember most. For me, it was the Scotty versus Spock. I was like, "Holy... " right? The fact that they were two very different archetypes was the most amazing depiction I had seen. That's interesting that you mention that. Patrick DeBois (00:25:16): Yeah, yeah. It was different roles at the time, but I guess maybe I didn't see it that much because maybe I was already infected in the way that I lived in both worlds, maybe also doing some coding, doing some admins. And it's always been from the day Java zero something came out, I was doing that coding, but I was also running it in a Netscape LiveWire server or something like that. But it was always kind of a collaboration. Patrick DeBois (00:25:44): And I think you get that quite often from people who were there in the beginning. They had to run their own stuff. And then later, that got more separated when it became more complex. And then, for some reason, we needed to reunite again. And jokingly, I would say now, where everything runs in the cloud so we're back at the mainframe level. But everything's circular in that way that we don't see who's running it and it's somewhere behind that. Patrick DeBois (00:26:10): So, again, I feel, in a way, we're heading back to silos. We're building abstraction layers. We're trying to find ways that we don't need to talk again to each other. We think, "Here's an API. Here's the docker file. Everything solved." Guess what? If things fail, again, we need to work together. So, yeah. That's kind of a circular observation that probably, if you have gray hair, that you start spinning the wheel a couple of times in different directions over the years. Gene Kim (00:26:43): Awesome. By the way, this is great. By the way, just a quick story for you. Did I tell you about... I was hanging out in a Meetup here in Portland. 
This is actually where I met Corey Quinn in person. And someone was asking, "Gene, what are you working on these days?" And I was like I just had to gush about Clojure. And I was just talking about how much fun I'm having with all these apps I'm building for the first time by myself. Gene Kim (00:27:03): And I see this look of horror around me. I'm like, "What did I say?" Yeah. It's like I'm like, "What did I say?" Okay. I was like, "Said Clojure, functional programming language, runs on the JVM, blah, blah, blah." And I'm like, "What did I say?" They're like, "Oh, no, you're fine, Gene." I'm like, "No, really." Then I realized, "Is it that it runs on the JVM?" And everyone's like, "Yes!" So, these are all operations people and, to them, all they heard was, "JAR file. Good luck." Patrick DeBois (00:27:31): Memory leaks. Gene Kim (00:27:34): [inaudible 00:27:34]. Patrick DeBois (00:27:35): Yeah. [crosstalk 00:27:35]. Garbage collection. Gene Kim (00:27:38): And it just blew me away that that scarred an entire generation, where all they heard was, "Out of memory error. Here's a JAR file. Good luck." So, to me, that was like the reawakening that I had forgotten. I had forgotten about those days. Gene Kim (00:27:59): Okay. Gene here. A couple of clarifications and expansions. Number one, I had the story about Clojure running on the JVM horrifying my ops friends at the Sensu Summit in 2019 here in Portland, Oregon. This was such an amazing experience to me that I wrote this up in a [SysAdvent 00:28:19] article. I'll put a link in the show notes. Gene Kim (00:28:22): It's funny to me that Patrick, in the earlier part of the interview, talked about getting a tarball and then having to deploy and run it. I think so many ops professionals have a similar experience about Java JAR files. 
They all have their horrific war stories about being thrown over the wall incomprehensible and completely opaque JAR files, which then invariably detonate in production, resulting in endless firefighting at night, on weekends, during birthday parties. Gene Kim (00:28:50): Just to explain what was so surprising to me. If you had asked me in that moment, what I thought about the JVM, I would've said something like this: "The JVM is amazing. Clojure runs on the JVM and takes advantage of the billions of dollars of R&D spent over 20 years that has made it one of the most battle-tested and [inaudible 00:29:05] compute platforms around. And it can use any of the Java components in the Maven ecosystem." So, Maven is to Java as npm is to Node.js; gems is to Ruby; pip is to Python and so forth. There is just so much innovation happening right now. Thanks to Red Hat's Quarkus, Oracle's GraalVM, Amazon Corretto, Azul, and so much... the JVM enabled me to be so productive. There's never been a better time to be using the JVM than now." Gene Kim (00:29:31): But after that astonishing evening at the Sensu Summit, for weeks, I kept on thinking, "Am I having so much fun programming in Clojure, being a dev, that I've forgotten completely what it's like to do ops? Is it possible that dev and ops really do have two very different views of the JVM?" And so, as an experiment, I put out a following tweet, "I'm performing an experiment and will report out on the results." I asked people to reply to this tweet with two pieces of information, one, which do you identify as, dev or ops; and then, number two, type any words or emotions or emojis that come to mind when I say JVM or Java Virtual machine. Begin. Gene Kim (00:30:05): And amazingly, I got over 300 replies, which included some of these gems that evoked bad memories from the past. Very annoying memory hog. Write once, run anywhere. And it's running slow, let's give it more memory. Pain, anguish, suffering, screams of why. 
Bane of my early ops existence. Another 3:00 AM call-out. Out of memory. And so forth. I got another group of responses that sounded like this: fast, reliable, ubiquitous, easy packaging, brilliant piece of engineering, solid, battle-tested, stable, rich. So, pretty amazing how two different communities react to one simple word. I'll put a link to the 2,000-word blog post in the show notes. Gene Kim (00:30:49): Number two. Mob programming and the notion that we're only at the beginning of learning what collaboration might look like. I'm going to read from the Wikipedia article. Mob programming is a software development approach where the whole team works on the same thing, at the same time, in the same space, at the same computer. This is very similar to pair programming, where two people sit at the same computer and collaborate on the same code at the same time. With mob programming, the collaboration is extended to everyone on the team, while still using a single computer for writing the code and inputting it into the codebase. In addition to software coding, a mob programming team can work together to do almost all the work a typical software development team tackles, such as defining user stories or requirements, designing, testing, deploying software, and working with the customer and business experts. Almost all work is handled in working meetings or workshops. And all the people involved in creating the software are considered to be team members. Gene Kim (00:31:44): What I find so interesting about this is that when I first heard of pair programming back in, I guess it must have been around 2007, 2008, this is when Kent Beck wrote his Extreme Programming book, the notion of pair programming seemed preposterous. It seems like it would just halve your team's productivity. It almost seemed immoral. And yet, I can personally attest to how valuable pairing on a problem can be. I think it's one of the best ways to do knowledge transfer. 
And there are so many types of problems where having two people working the problem, you just end up with far better outcomes and the solution comes much faster. I don't have any experience doing mob programming, but it wouldn't surprise me that there will be a whole category of problems where this is likely the right way to collaborate: not in tickets, not in email, not in Slack, but in one shared work environment. Quite frankly, I'm excited to get more experience doing things like mob programming. Gene Kim (00:32:40): Number three. Patrick mentioned the notion of autonomy versus standardization. This topic that Patrick brought up has come up so many times in the DevOps Enterprise community. I want to talk about three presentations that talked about how large complex organizations have dealt with this problem. Gene Kim (00:32:57): The first presentation came from 2015. Ralph Loura, CIO of the HP Enterprise Group, so this is before they split up the company, speaking with Rafael Garcia and Olivier Jacques. Rafael talked about the need to create buoys, not boundaries, though the metaphor that he presented was that of a river channel. And so, if you use the shared platforms that are officially supported, you are going to be guaranteed to be using the parts of the river channel that are safe, that are dredged, that they can make certain guarantees about, that they have vendor relationships for. But if you need to use tooling to solve a business problem that you understand better than anyone else, you can stray beyond the buoys. So, you have to follow certain principles. You still have security and compliance requirements. He went on to say that maybe those will be the areas of innovation that will create the next generational platforms that everyone within the organization can use. So, I thought that was just a beautiful way to describe a different type of governance system that's very different from the way we used to mandate certain tools to be used. 
Gene Kim (00:34:03): The second presentation I love on this topic came from Target, from DevOps Enterprise 2018. This is from Levi Geinert, then director of engineering at Target, speaking with Luke Rettig, principal product owner, and Dan Cundiff, principal engineer. The story they told was that back in the early days, say, 2014, 2015, they had given as much authority and freedom to the teams, maybe compensating for decades of depriving teams from being able to choose their own tools and techniques. Levi talked about how it seemed at times that every team had chosen a different tool, creating all sorts of problems. One of them was team portability, that if a developer ever wanted to join a different team, they would have to use an entirely new toolset. They talked about their new approach to standardize. And it was essentially one list that lived in a GitHub repo. And the list would divide up tools into three categories. One is recommended. In other words, we love this tool and here are all the groups that are using it. The second category is we haven't decided yet and here are all the teams who are using it. And the third category is do not use, actively deprecating. So often these might be technologies from vendors whose business models are diametrically opposed to Target's. These are technologies that are being actively removed from the organization, and so you definitely don't want to be using them. These might be sometimes database vendors or middleware vendors, and they're now using open source technologies for them. And that's it. I love it just because, again, this is so different than what a centralized architecture group would often do, mandating the use of certain tools and technologies so far removed from where problems are being solved. Gene Kim (00:35:49): The third presentation comes from Comcast in 2020. 
So this presentation was given by Jonathan Moore, Chief Software Architect, and Senior Fellow at Comcast Cable and Michael Winslow, Senior Director of Software Development, and now distinguished engineer. And they took a different approach. They say, "We want all teams to be innovating, but not in certain areas. For example, continuous integration pipelines. Here, we don't want innovation because it comes at the expense of more important things." So in the presentation, they use the metaphor of the railroads in the late 1800s. There were over 10 different railway gauges being used. It meant that trains couldn't transit across the entire rail network. It meant that passengers and cargo would often have to switch trains to get between different line segments. Gene Kim (00:36:39): So they performed an inventory of CI/CD pipelines being used, and they found that there were over 14 tools being used, and the goal was to create one. What's interesting was the deliberate process they went through. The goal was not to make everyone happy, instead they went through a very deliberate process to try to satisfy the needs of as many groups as possible. So they had a scale from one to five. Five, best solution ever. Four was best option from what's available. Three is not my first choice, but I get it. Two is I could support it if... And one would be, it would be a terrible mistake. The goal was to pick a solution that maximized the number who answered three or above, and they did this by understanding who had concerns around number two. In other words, I could support it if... And focused on how they could get those people who answered number two to three or above. Gene Kim (00:37:29): It is an amazing story. And incidentally, by going through this process, they eventually chose Concourse CI as their solution, open source, free and something that already had some internal expertise around. 
And what I loved about their presentation was how Jonathan Moore said that they accomplished this even though there was no short term benefit at all to the teams, but everyone recognized that there was long term benefit to the entire organization. Fantastic presentation. I will put a link to all three presentations in the show notes. Number four. Patrick mentioned Jez Humble, Chris Read, and Dan North. You're going to hear more about those stories in the Jez Humble interview, which will be in the next episode. Number five. The famous 10 Deploys A Day presentation from John Allspaw and Paul Hammond. The full title of the presentation is 10 Deploys Per Day Via Dev and Ops Cooperation at Flickr. I will put a link to the slides and the Velocity Conference video in the show notes. It's so fun to look at these slides again after nearly a decade. Gene Kim (00:38:40): There are two slides I just want to describe. One is a slide I had mentioned, the Spock versus Scotty slide. So a depiction of development is Spock, described as a little bit weird, sits close to the boss, thinks too hard, whereas the embodiment of Ops is Scotty, pulling levers and turning knobs, easily excited and yelling a lot in emergencies. I love this slide just because I think it does a wonderful job characterizing the two archetypes of devs versus ops. But the slide that Patrick mentioned was the heart slide, which is basically a big pink heart with dev and ops in the middle, and it's interesting to reflect back in the early days of the DevOps movement. One of those two slides would show up in almost every presentation about DevOps. Oh, such fun stuff. Okay, back to the interview. The DevOps Handbook came out in 2016. What has been the most surprising thing for you since that book came out six years ago? Patrick DeBois (00:39:40): Yeah. So obviously, it's a little bit on a personal level in a way that it's my own experience [inaudible 00:39:50] as a whole. 
I worked for four years in a startup and I thought, "You know what? I know DevOps, right?" I'm not like that, but some people would say how hard can it be? You do some automation, da, da, da, and you're done. You have an insanely good product and company's going well. And I think it took me a year to convince the team to do check-ins of code and testing and da, da, da. And instead of doing Git pulls on production machines and those parts, so we learned as a team, even though you might have the knowledge, it takes time for people to absorb the knowledge. Patrick DeBois (00:40:35): But I think by learning that once imagine the pipeline was at a level of quality which is good enough, we had an insane need for speed deploys and we could release during live production with tens and hundreds of thousands of people live. So we got the pipeline covered. My aha moment was that even if we got everything of that solved, I started looking at where's our next bottleneck? So I stopped looking at the technology bottleneck. Was it that we didn't get any money? Was it that we couldn't hire people? Was it that our marketing was bad? So I tried to extrapolate the same concepts of where's the bottleneck in the system, but I stopped looking at the technical bottlenecks only. So I extrapolated that to the whole company, and I think we tried really hard to assess that. But at the same time, I actually learned... So for me, that was an aha moment is that each of these parts, they have their own pipeline. Patrick DeBois (00:41:43): So if you're doing marketing, they will tell you they have a pipeline of qualified leads that they need to go through and they need to validate and they need to test, and then hopefully they can convert and then they can check in and then they do checks periodically on the same contact. And the same was true for on the sales side. 
If you are giving fast feedback when you're doing sales and you're saying, "Well, this thing didn't go so well, we had a problem," but the salesperson was able to communicate this really well and do this in a transparent way that the customers could come on our Slack and chat with us what they're happy with or what they're not happy with. So it shows me at least that the same patterns are applicable to other parts of the companies. Patrick DeBois (00:42:36): And okay, in the end, maybe our bottleneck was that I tend to believe that there was not a market with what we wanted to achieve. It could be wrong and we could have done a horrible job, but I tell myself that there was no market because even the competitors started leaving the market for the same money. But the fact that there's so many pieces in a company that need to align, and it isn't just the technology and everybody's actually connecting their pipelines to each other. If I do sales and you're not doing a good job on technology, there's an impact on the sales, and it goes in different ways. So for me, that extrapolation of the DevOps model to a more generic business as a whole was a learning. And then the second part which I learned is that they always tell you keep to your core of your business. Patrick DeBois (00:43:31): So don't do everything yourself. So even though I could build my own monitoring system, I could build my own cloud, I refrain myself from doing and say, "Okay, I'm going to stick to the pattern." You do your core and you do that well. But on that part, I learned that at first I was very curious. Oh, we're using Amazon, but I can't talk to Amazon. Oh, we're using a video service? I can't talk to these people. And I thought, "Is this the end of DevOps where everything's abstracted behind a service and an API and there's no collaboration that actually happens?" 
And I think I started seeing this as the suppliers, like a SaaS or a third party dependency now with DevSecOps you would say open source is a supplier in a way, and they're all kind of people or things that you need to collaborate with. Patrick DeBois (00:44:26): So why do you get the support contract? Well exactly for this, because you want to talk to those people, and in case things fail, you get insight. That's why you talk to these people at conferences and learn why you should change things to the dev or to others. So there are other ways of communicating, but in a nutshell where the first aha was about all these pipelines are connected. The second one is there's also a bunch of pipelines lining from outside your company to inside your company. And that is another point of collaboration and connection of where you need to make sure you are not throwing things over the wall. "Hey, I want to do a campaign. Run it. I won't tell you anything." No, no, you cannot be actively engaged. Patrick DeBois (00:45:16): You want them to understand the business and similarly. It might be and then it's probably more [inaudible 00:45:23] on a supplier level. But I think it's valid also from the internal thing. I learned that while you have continuous integration and continuous delivery, what the external services taught me, especially in the mobile space, is that you're going to be continuously rearchitecting. And in a way that if something doesn't align with your interest or the way it should be done or you need it, you want it to be able to swap this in and out of your environment. And it's very similar where you would say it's the swarming of the ecosystems of the team inside a company. We're independent, but if you want to collaborate, we will collaborate. But it was also how the pressure will build on the internal IT systems. "Hey, these guys can do it. They're outside the company. Why can't you do this?" 
Patrick DeBois (00:46:13): And it's very similar pressure on if one service can do it and then the other services do a better job, you're just going to swap. And there's no hard feelings, but it's a better collaboration. But if you want to take this into a long-term relationship, you got to give the feedback back to their system. So one of the things I always do when I try a new service is I send them pages of notes to give them feedback on what I discovered at first try. And I hope I've got good responses on this is that from engineers that say, "Finally, I can take this internally, this feedback, and tell my boss somebody's suffering from this." Otherwise, other times there's no feedback or no response. But I think if you're building this long term relationship, this is part of how you do it. And it's very similar to what you would do internally. I don't know if this resonates somehow, Gene, with you. Gene Kim (00:47:12): Oh my gosh, no, it's so good. Absolutely. This is great. I do the same thing. It's always good to have friends, especially in people you depend on. What is your favorite pattern that's in the DevOps Handbook? So favorite technical practice, favorite architectural practice, what is- Patrick DeBois (00:47:28): In the DevSecOps world, I actually like the threat modeling, and I think it's a similar exercise to a code kata, or it doesn't really matter whether it's value stream mapping in that way. And I think there's value obviously in creating the view, but I think the actual value is about the collaboration to create the view. That's why it's one of my favorites and what I would always start with in a way that, "Oh, we want to understand what's happening. Let's align our world views on how this is happening." And then the more people you get into the mix, the more broader your view becomes. But also, the more people start adjusting their views. So I would say that's probably my favorite thing. 
And of course, it doesn't matter whether it's threat modeling or in a way you could say a PR review is very similar. Patrick DeBois (00:48:23): I'm looking at what you think is best, but let me give you some feedback. And it's those practices that I like the most. And I like them because they're disguised as a technical thing. But in essence, they're actually the human side of what I think should be in DevOps of the calibration, the sharing of knowledge. That's why I'm sometimes a little bit sad when people talk about GitOps. And no, I like the pattern that people are saying, "We're doing this through the CI and through the check in." I would also say it's a missed opportunity to actually work around the collaboration in the same aspect. So why you're not focusing on capturing more knowledge at the time or on the feedback or turn that into a lesson that the whole team can share. That's why I'm like, "Okay, I like the technical practice," but I missed that same collaboration aspect in that part as well. Gene Kim (00:49:27): When you bring up threat modeling, it so much reminds me of the exercise that you went through in terms of what are all the things that can go wrong delivering an online conference? What are all the points of failure? Is that what you're talking about as well? Patrick DeBois (00:49:40): It is in a way. Yeah, yeah. Yeah. But I think when we did an exercise for the conference, I would think when I explain these things, it showed you the things you couldn't see about the system. And that's what I think the value was is that, okay, I'm not aware this could happen. Then you ask me what's the likelihood this could happen and what is the thing you can make sure it doesn't happen and what could be the strategy to mitigate it? And I could have done this on my own, but the fact that you were aware of this when it happened, and I remember that one time when we had to switch to Zoom because the whole streaming thing stopped because we set the wrong hour. 
I don't know what it was. But the fact that you were aware and that was kind of thought through, again, it's the sharing of knowledge and then making sure you are prepared. Patrick DeBois (00:50:38): And we went through the scenario beforehand, you knew it was coming, and I think that was the value of it. You could think, "Oh, it was a technical exercise and I showed off." But that was definitely not the goal itself. The goal was that, like a threat modeling, everybody understands what is the possibility. We discuss whether you accept the risk or not accept the risk ahead of time, and then you also understand if you have a finite amount of resources, what do you do first? And there's some things you can't do, but imagine I hadn't explained this to you and it would happen. You're like, "Why did this happen?" It's like, "No, I told you." But we both decided together that this was not a likelihood, so we didn't go for that scenario in that way. Gene Kim (00:51:28): There was this amazing presentation given by a team from Vanguard. They invented the index fund, so I think it's $8 trillion in management. But they both came from a background of chaos engineering and they're young leaders and they went through something very similar with their business partners in terms of, "Hey, let's just explore all the things that can go wrong, and we just want to understand what is the best business response?" So some were technical failures, some were business operations failures. But for me, it was exactly what you're talking about. It opened up a line of dialogue and my reaction was if the team in the Phoenix Project had done this, that would've created a tremendously different outcome. What do you mean we could actually shut down all the eCommerce operations and everything else for three days? I think that shared understanding would've potentially have deflected the outcomes. Patrick DeBois (00:52:22): Yeah. 
But I might also think this comes from the fact that in the past there was a lot of fear of not delivering on time because we didn't understand, "Well what goes into a sprint and not, and then the death marches and somebody was going to get fired at the end because it wasn't delivered." So that safety wasn't there, so why would I ever bring up all the scenarios that it could fail. We wouldn't talk about this. Let's hope it goes away. Gene Kim (00:52:58): Accelerated suicide. Patrick DeBois (00:53:00): Yeah, yeah. Yeah, yeah, yeah. And it ties into what you said before about the noise and the signal. Because we want to know and we want to be prepared, we're actually looking into that. Even if it's a weak signal, we want to look into it because we don't want to be surprised too much at that time. Gene Kim (00:53:20): Gene here, just a couple of quick break ins. The presentation I mentioned was a presentation from Vanguard Financial. This is Christina [inaudible 00:53:30] she's a site reliability coach, and Robbie Datesman, IT Delivery Manager. This is one of the best presentations on operations and infrastructure I've ever seen. Christina does an amazing job walking through the modernization of the vast technology stack, some parts which go back over 40, 50 years. I will put a link to it in the show notes. But the part of the presentation I referenced was Robbie Datesman and the work they are doing to support Vanguard's financial advising technologies. This is a comparatively new and important business area for them, and he specifically talked about an exercise seated with all of their business partners, something called failure modes and effects analysis, essentially going through something very similar to what Patrick talked about. Gene Kim (00:54:16): Not only what could go wrong, but what would the business response ideally be, and talk about potential countermeasures. 
I marveled at the dynamics that this type of dialogue creates in terms of joint ownership of the risks, the mitigations, and properly prioritizing how to prevent these things from actually happening. Again, this is one of the best operations and next generation infrastructure talks I've ever seen, highly recommend this talk. I will put a link to Patrick's tweet on all the failure modes of an online conference, everything from streaming, video mixing, speakers, interaction, host coordination, audience communication, and site, and the potential fallback measures. Number two, Patrick talked a lot about how do you do integrated problem solving between not just dev and ops, but between engineering and sales, marketing, finance, and so forth? One might notice that this is very much the theme of so many of the Idealcast episodes over the last year and a half. Gene Kim (00:55:13): How do you integrate a problem solving in a way where you can keep the communications at the edges as opposed to up and down the organization? I find it super interesting and exciting that Patrick spent so much time studying this as well. Okay, back to the interview. This is so good. So let's talk about case studies, whether in the DevOps Handbook or not. Over the last 15 plus years, you've seen a lot of case studies, success stories, people telling stories about how they went from maybe a not so good place to a good place. What's your favorite one? What have you found tantalizing, interesting, inspirational? Patrick DeBois (00:55:51): I remember when I was really working on mobile continuous delivery and we held a conference. And Microsoft came in and they said, "Oh, we have this OneNote application." And people didn't give it good ratings and there was a lot of issues, and I know the team worked really hard to improve this to make this better. But I remember them telling me, "At a certain point, we decided in this period of getting feedback from the people to enable a feedback button in the app." 
And what they told me is that the next thing they had to do is scale the systems to collect all this feedback. So for me, that was a nice story about, "Oh, you think you got everything figured out, you got everything under control," and I'm sure they had operational systems running and good observability in a way. Patrick DeBois (00:56:47): But the fact that I opened up that to the feedback, and then all of a sudden it wasn't only business feedback. It was like, "Oh, this is too slow. And why, if I'm pressing this button, it goes red twice?" I don't know what, but I think it's a good story about when devs are building stuff, why they should not just look at their test results, but they should actually look at the production feedback to learn from that. And I'm not saying I'm on a mission, but in DevSecOps, you have this narrative about dev first and let's do it as early as possible in the value stream, and I think it's definitely good given the cost of when things happen in production, that you do things early. But what I don't agree with is that it might give you a false sense of security. Patrick DeBois (00:57:42): "Oh, we tested everything we knew. If we didn't see the bug, there is no bug." It doesn't work like that. So I started saying if you shift left, you should keep shifting right as well, just to understand that not everything can be fixed and you cannot foresee everything in the pipeline from the beginning. You do have to be prepared when things happen in production and you do have to listen to the feedback in production. And they might, again, not be strong signals, but I think part of you owning the service is actually looking at those signals as well. And I know people dismiss it as saying, "Well, this is not my job. Once it goes into monitoring and metrics, we're just going to use it to debug stuff." But I think it's proactively looking for those signals, why you do this. And that's, for example, why I like it. 
Again, it's coincidental that it was Microsoft. Patrick DeBois (00:58:41): Another example is when you go into Visual Studio Code and a part of a code that gets executed. They have a way of integrating, for example, with their serverless code, is that you actually get the feedback of how many exceptions we're running through this function, how many people executed this call, and then you can even set breakpoints to parts of the code and it goes directly what is running in production. So I think that kind of feedback loop is still undervalued and I'm sometimes scared with platform teams and people are separating the layers again and it becomes a silo on its own. It shouldn't be, but there's a danger of abstracting things and that somebody who isn't responsible for things under the hood, they just say, "That's not my part." But it should be visible. And in my opinion, a good dev should still seek out that feedback and not just look at their test results or their backlog alone. Gene Kim (00:59:44): So what are you working on these days? Patrick DeBois (00:59:47): Yeah. So when people ask me to introduce myself, I tell them that over two-plus decades, I've done pretty much all roles in IT that I could think of. The only one I haven't done is front-end designer because I'm color blind. But I think the last two years now that I worked in the DevSecOps space in a way was a surprise to me that the industry that spun up was becoming so big because I always saw security as a quality feature. And as an admin and an ops person, it's always been on my mind. So I didn't see it was becoming such a big thing, but I understand the community we needed to bring in on security was needed. Patrick DeBois (01:00:37): I think the dev centric approach, and I would rephrase that to have people decide and do the work where they have the best position to make a decision, so that's how I rephrase things in the DevSecOps. 
And I think when I started the journey, I also spent some time on supply chain. And in that way, I'm still interested in how this will evolve and the more breaches we have there. And again, there's a lot of will, still a lot of tooling to be built, and we're not even sure whether it's going to be a competitive advantage on the business. So that's a similarity we had in the past. When the blog post was there, Ops is a strategic weapon. Then all of a sudden it got proved we got more revenue if the systems keep running and so on. But we're still at the point that we need to prove that somehow with the security efforts. Patrick DeBois (01:01:35): So a lot of them are still ramping up and working on getting the low hanging fruit and doing the scanning. But the tough question is you have a finite amount of people, do you do features, do you do ops or do you do a security part? And that's something that intrigues me how we're going to ever balance that in the backlog in a good way. So that's what interests me. And then more recently, actually I got pinged by the promoter of my thesis from way back in the day. And he all of a sudden said, "Do you remember me?" I said, "Yes." He said, "I'm working on digital twins and I think DevOps is the key to unlock digital twins." So his life's work was around modeling things, modeling factories and modeling now with sensors, we do predictions on things that are going to happen, but are too expensive to actually run in the real world. Patrick DeBois (01:02:33): And I started seeing similarities in the large data center. We're trying to build a cost model, for example. And all of a sudden I started seeing models and models. Why is it that when we do a model for the UI, it's not connected with the final thing? Why is it a throw away model and then we do this model again, where we're doing the testing, and when we are doing the testing, we're throwing that model again away when we're doing our reasoning when things fail in production? 
So I think there's something there about not losing metadata, not reinventing the model. Or maybe in a better way, saying keep the models in sync. And coincidentally, that's also the hard part of the supply chain security is that we've thrown all the metadata away and now we have to reverse the metadata. Patrick DeBois (01:03:29): And on the no metadata, we have to reason whether it's secure or not. So all of a sudden we're thinking, "Huh, what if we don't throw away who built it? What if we don't throw away how this was built?" And so I think there's a similar dynamic that I'm reading into it. But I think over the years, there's been an allergy towards models where when I started, everything needed to be strongly modeled, and all the specifications needed to be done. And we kind of threw this overboard because we said, "Well, even if the model isn't perfect, that's fine and we'll change the model whenever we need and whenever it works." But I think reality made then that we're ending up with eight or nine different models in our company. So should we start syncing them somehow, whether it's on a compliance level, a financial level, or even in the future with your machine learning? Where does what run and how does it run? So anyway, so it's not my daily job, but it's my passion, I guess. Gene Kim (01:04:35): It was so great talking to Patrick just now. So after a short break, I will be speaking with another one of my co-authors, this time, John Willis. So here at IT Revolution, we've been hard at work bringing you, in 2021, two DevOps Enterprise Virtual Summits, this podcast, two issues of the DevOps Enterprise Journal, and a new immersion course from Dominica DeGrandis, renowned author on flow efficiency who dives into the fundamentals needed to help you better understand your organizational workflows and make them more efficient. And we just published a second edition of the DevOps Handbook, which this episode and the next episode are all about. 
I mentioned that as much as I love the first edition of the book, I love the second edition even more. There are 15 new case studies, mostly from the DevOps Enterprise community, including Fannie Mae, Adidas, American Airlines, and the US Air Force. There's over 100 pages of new or updated content, including so many of the solidified learnings from the State of DevOps research and the Accelerate book. There's a new foreword and materials from Dr. Nicole Forsgren, there's an updated afterword from all five co-authors, and there's a new section with new resources at the end of each part of the book. I want to thank Dr. Nicole Forsgren, who joined the author team. On top of everything I mentioned above, this expanded edition includes new material from her. She's a good friend and a renowned researcher. She is currently Partner and VP of Research and Strategy at Microsoft and was Lead Researcher on the State of DevOps research and Lead Author of the Shingo award-winning book, Accelerate. In this world we're living in, where we need to adapt more quickly than ever and create resilient organizations that can respond to turbulent times in order to help our organizations survive and win in the marketplace, the topics covered in The DevOps Handbook, Second Edition are more important than ever.
Gene Kim (01:06:28): Okay, let's get back to the interview. To introduce my next guest, I'd like to share a fact with you that you might not know. Many people ask me, "How did you get into this DevOps thing anyway?" And the answer I would usually give is, "Well, it was just a natural continuation of my 22-year journey studying high-performing technology organizations, which drew me to the center of the DevOps movement, which I've always thought of as so urgent and important." But if you kept on asking me, "Well yes, but how exactly did you stumble into the DevOps movement?"
And eventually, you would hear me say, well, in 2010, I was at Tripwire at the time and I got this email out of the blue from someone named Damon Edwards, and John Willis, who I'd never heard of, invited me to be on a panel at a conference that I'd never heard of. But I went to it anyway. I was intrigued and I went to it and I was blown away by what I saw.
Gene Kim (01:07:19): And it was there that I met the amazing group of mavericks, and they were at the epicenter of the DevOps movement. That included John Allspaw, John Willis, Patrick Debois, Andrew Shafer, Dominica DeGrandis, and so many more. That event was DevOps Days 2010, the first DevOps Days event in the US. John Willis is likely one of the first people who succeeded in bringing these alien DevOps concepts into large enterprises. He cares deeply about improving the lives of technology workers in large complex organizations because we all know it needs improving. We were co-authors of The DevOps Handbook, and John was probably our most enthusiastic expert on next-generation operations and infrastructure.
Gene Kim (01:08:01): He was at the frontier of that movement. He was a part of the Velocity community at the time of the famous Allspaw and Hammond talk in 2009, he was VP of Services at Chef in 2010, and he has certainly been on the frontier ever since. He and I also worked together on the Beyond the Phoenix Project audiobook, as well as the amazing panel that we did with Dr. [inaudible 01:08:24], Dr. Richard Cook, and Dr. Steven Spear on resilience engineering, safety culture, and lean. And I'm so excited that I'm getting to talk to him today. So John, why don't you introduce yourself in your own words and tell us what you're working on these days?
John Willis (01:08:40): Yeah, thanks. So my day job these days is working for Red Hat.
Andrew Clay Shafer, another pioneer of this DevOps thing we do, at the end of probably summer of 2019, called me up and said, "Hey, what would you think about coming over to Red Hat with me?" And I quite frankly said, "You're out of your mind. Did you butt dial me, dude?" And he said, "No, I got Kevin Behr and Jabe Bloom," and I think the world of those guys, and I think Andrew's one of the smartest guys I know. And I'm like, "Well, okay. Never say never." I listened, obviously, and then the next thing you know, I'm on a phone call with Jim Whitehurst and Whitehurst is-
Gene Kim (01:09:15): The CEO of Red Hat.
John Willis (01:09:16): Yeah, CEO of Red Hat, and he's saying, "We really want you to come in." And all right, I even joked. I said, "I didn't think I'd ever say this to a CEO of a $32 billion company, but why do you want me?" And he gave this really great answer; these great leaders give great answers when you put them on the spot. He said, "You and Kevin and Jabe and Andrew have been part of this 10-year discussion about DevOps. We point our customers at some of your writings, your presentations." He said, "We, frankly, want you here for the next 10 years to help us evolve that." And I was like, "All right, done. You win. Good to go." Some people hate the term thought leadership, and I'm not crazy about it either, but we've been given an opportunity to really think about the next 10 years and bring that into large customers. And it's just been a blast. They say, you want to be the smartest person in the room? Be the dumbest person in the room. Right? So, I am. I just work with these three brilliant men and just get to sort of absorb all these great ideas, and I love sort of boundary spanning and being able to work with people who have all these additional spheres of education.
Jabe, for example, is getting a PhD in design transition at Carnegie Mellon, focusing on IT. I just get to absorb an insane amount of knowledge that I would not otherwise get. So, yeah, having a fun time.
Gene Kim (01:10:40): Awesome. One of the many things I love about you is that you are a person who has been near, or maybe in the middle of, each one of the major epicenters of DevOps over the last 10-plus years, from at least 2008 until now. So can you tell us about the early days of the movement? I'm thinking about, you had mentioned Andrew Shafer, Patrick Debois, our fellow co-author, and Jesse Robbins. Tell us about what those early days were like. Why did it take off?
John Willis (01:11:07): 2008, 2009, there were these sort of things swirling around. We originally called it the Cambrian explosion. It was just, you had cloud popping in, you had all this stuff. And so I'll sort of try to not make this too long of a story, but I had been doing work with an IBM portfolio product called Tivoli. Proprietary and big; you go into large corporations and you install a ridiculous amount of software that typically never worked back then. And it was very frustrating, actually. I had one of the most successful consulting companies doing this work, and we'd get our work because they'd spend two or three years with a big consulting company and it just wasn't going anywhere. And they had heard about us, about 30 of us that actually knew the product better than anybody else, but it was still very frustrating.
John Willis (01:11:56): And then I saw Luke Kanies give a Puppet presentation. And when I first saw him, it was at an O'Reilly open source conference. And I thought, like, what is this young kid going to tell me about configuration management? And five minutes later, my life has changed. I'm literally taking notes. He starts cutting me off as one of the guys that asked way too many questions. And we went to dinner after, and I literally begged Luke for a job.
And I think that's the only time in my career where I actually had to beg for a job. There was a window of about a year where every time I'm like, "Dude, you have to hire me. I know how to take this to the enterprise."
John Willis (01:12:37): And then I ran into Adam Jacob, who was building his own sort of version of Puppet, that's a whole story unto itself and-
Gene Kim (01:12:44): And this is Chef.
John Willis (01:12:45): This is Chef. Sorry. Yeah. And he was like, yeah. In fact he used some curse words, like, yeah. When I first asked him, I'm like, okay. So I went to work at Chef as the old guy at, like, 50. That was just a tremendous time, and what was happening with Patrick Debois was, basically, we had the O'Reilly conference. And so I remember sitting in the back of the room, watching John Allspaw give his presentation with Paul Hammond. And I kind of... that was the 10 deploys a day at Flickr, and I sort of half-heartedly joked that people were throwing up in the back of the room. Like, you can't do this, this will destroy humankind. This is just... the idea of doing 10 production deploys. And on that same day, what people don't realize is Andrew Clay Shafer gave a presentation that wasn't recorded, called Agile Infrastructure. And he basically was showing that picture that we've seen thousands of times, which is the Dev, the Wall, and the Ops. He called it the Wall of Confusion.
John Willis (01:13:45): And like I said, that wasn't videotaped, so it didn't have the lasting effect. But sometime after that, not too long, I heard about Patrick running this DevOps Days thing. And so I sort-
Gene Kim (01:14:00): That was in 2009.
John Willis (01:14:01): Yeah. And one of the things about Chef at the time, which was the beauty of Chef, was they just sort of gave me a credit card and said, "Go figure this thing out."
I mean, it was... not those exact words, but literally I went to... I think even today people say I've been to more DevOps Days than anybody, and probably spoken at more, and I just was able to go. And that was the first one, and it was this incredible... I told you, I was sort of getting to the point where I wasn't sure the work we were doing, or I was doing, was really meaningful. And then I saw all these young people, Stephen Nelson-Smith, Lindsay Holmwood, obviously Patrick, and Kris Buytaert, I can go on and on. And they're giving these presentations about stuff that Luke was talking about, and I was like, this is good stuff. And I literally got completely jacked out of that. And then, oh, and then that's where we met you.
John Willis (01:14:55): So I don't... about six months later... it was only about 40 people at that first event in Belgium. And it was a significant event, but Damon, myself, and a couple other people decided to organize one in Silicon Valley right after Velocity, O'Reilly Velocity. And there were 300 people at that one. And so about a week later, me and Damon did a podcast and we were like, we just couldn't even... what happened? What was that? Just the amount of energy that was in that. And you knew, I mean, I knew personally in Ghent at the first DevOps Days that this was real for me. I didn't know how real it was for everybody else, but then, when we had it at LinkedIn, the first one you were at, at the end of that one, I'm like, "This is real." And then there's kind of a third, not ending, part of the story that we got involved in. We can talk about, like, working together and doing lots of cool things, but there was still a lagging question about, was it real for the enterprise?
Gene Kim (01:16:01): You had mentioned to me years ago the story of the birth of the Velocity conference. And I think Adam Jacob was a part of it.
Jesse Robbins, the Master of Disaster at Amazon, you were telling me about how that community needed a home, those forward-thinking sysadmins with wild ideas of what operations and infrastructure should look like.
John Willis (01:16:19): The first Velocity I went to... I know I remember 2009 really well, and probably 2008; all bets are off when you get over 60, buddy. So I think I was at 2007. I think that one was at the San Francisco Marriott. Oh, that's right.
Gene Kim (01:16:39): That's right.
John Willis (01:16:39): That was the first one-
Gene Kim (01:16:41): Right by the airport.
John Willis (01:16:42): Yeah. By the airport. And there was a bunch of interesting things there. I hadn't been involved with Chef yet. Let's start with Jesse Robbins. Jesse Robbins was basically a fireman. He still to this day goes to Burning Man and basically runs, creates a whole village of EMTs. I mean, that's what he does every year. So he was my CEO at Chef, right. So I started at Chef in late 2009. I was like the seventh or eighth person in there. I was the grownup, basically. And so, his sort of history was that he was very early in Amazon. So, as I mentioned, a fireman, like, becoming a rack-and-stack, early [inaudible 01:17:20] admin, he moves up through the ranks because they grow so fast. At some point he becomes, self-titled, the Master of Disaster.
Gene Kim (01:17:30): Gene here. Sometimes it's hard to keep up with John, so I may do more frequent break-ins than normal. Real quickly, Jesse Robbins was at Amazon from 2001 to 2006; his official title was Availability Program Manager and Master of Disaster.
John Willis (01:17:44): And the reason he came to that: he took a fireman's or EMT approach to how he did infrastructure. So he was on one of the original papers, LISA papers, or ACM Queue papers, with Tom Limoncelli.
And Jesse, Tom Limoncelli, and then I'm going to apologize for the other two, it was a woman at Google, and it was the original sort of break it in production, or there was the paper about GameDay. They built this GameDay and Jesse, I don't know who was the biggest part, but Jesse, he would tell these incredible stories of GameDays at Amazon. I mean, you'd just be sitting around at lunch or dinner and he'd just rattle off, like, these incredible Bezos... And I don't know how much time you want to spend there.
John Willis (01:18:33): One of my favorites of all time is he would do GameDay and basically shut off the core routers between the different data centers. Everybody knew it was coming. It was sort of, but you didn't know exactly when, and he'd say he'd do it and they'd get on sort of a large call with everybody. Like, I don't know, 400 people, I don't know. And it'd be like, "Okay, Jesse, you've made your point, turn it back on." And he'd be like, "I can't turn it on. [inaudible 01:19:03] blown up"-
Gene Kim (01:19:03): And just pause for a second. So the purpose of these GameDays is to rehearse large-scale production disasters to see if they are as resilient as they thought they were.
John Willis (01:19:14): Exactly right.
Gene Kim (01:19:14): So he would pick a day. He would tell everybody when they were going to shut a data center down, and everyone was responsible for creating services that were resilient enough to survive it.
John Willis (01:19:23): Everybody's on a bridge call and they're like, "Okay, Jess, you made your point. We got to fix this, we need to turn it back on." And Jesse would be like, "I can't turn it on. It's blown up." It's like veteran firemen who are doing a fake fire and they have sort of the guy, the person who's playing the fake dead person. Like, he doesn't wake up. You don't shake him, say, "Hey, come on." So there'd be guys-
Gene Kim (01:19:46): Please turn on the router again, please.
John Willis (01:19:49): That's all... totally.
And then Jesse's like, "I can't and I just knocked it off." And then, I mean, they would threaten, "Hey, I'm going to Bezos." He goes, and this is my version of it, because he was always very theatrical in some of these great stories he would tell, and he would be like, "Well, you can call Jeff. But all I'm going to tell him is the same thing I'm telling you. The data center is blown up."
Gene Kim (01:20:11): So to be specific, mission-critical revenue-generating services are down.
John Willis (01:20:16): Yes. But more importantly, sort of like data center to data center.
Gene Kim (01:20:20): Yeah. Right. In fact, I'm reading from The DevOps Handbook. Robbins describes how at Amazon, they would literally power off a facility without notice and then let the system fail naturally and allow the people to follow the processes wherever they led. Gene here again. So as John was alluding to, these GameDays were a precursor to what has become known as chaos engineering. And John mentioned this famous ACM Queue paper called Resilience Engineering: Learning to Embrace Failure, a discussion with the aforementioned Jesse Robbins, Kripa Krishnan from Google, John Allspaw, and good friend Tom Limoncelli, who was then a site reliability engineer at Google. We wrote about this in The DevOps Handbook. Robbins observes that when you set out to engineer a system at scale, the best you can hope for is to build a reliable software platform on top of components that are completely unreliable. That puts you in an environment where complex failures are both inevitable and unpredictable.
Gene Kim (01:21:17): He quipped, "A service is not really tested until we break it in production." So in the GameDays they would define and execute drills, such as conducting database failovers or turning off an important network connection, to expose problems in the defined process.
Any problems or difficulties that are encountered are identified, addressed, and tested again. Robbins said, "You might discover that certain monitoring or management systems crucial to the recovery process end up being turned off as part of the failure you've orchestrated. You would find some single points of failure you didn't know about that way." These exercises are then conducted in an increasingly intense and complex way, with the goal of making them feel like just another part of an average day. By executing GameDays, we progressively create a more resilient service and a higher degree of assurance that we can resume operations when inopportune events occur, as well as create more learnings and a more resilient organization.
Gene Kim (01:22:05): So also in that paper, there are other great stories of a similar program led by Kripa Krishnan, who was a technical program director at Google, leading the program for seven years. During that time, they simulated an earthquake in Silicon Valley, which resulted in the entire Mountain View campus being disconnected from Google, major data centers having lost complete power, and even aliens attacking cities where engineers resided. As Krishnan wrote, some of the learnings in their disasters included: when connectivity was lost, the failover to the engineering workstations didn't work; engineers didn't know how to access a conference call bridge; the bridge had capacity for only 50 people; and they needed a new conference call provider who would allow them to kick off engineers who had subjected the entire conference to hold music.
Gene Kim (01:22:49): And when the data centers ran out of diesel fuel for the backup generators, no one knew the procedures for making emergency purchases through the supplier, resulting in someone using a personal credit card to purchase $50,000 worth of diesel fuel.
So whether in the Amazon context or in the Google context, the result is that by creating failures in a controlled situation, we can practice and create the playbooks we need. We allow people to develop relationships with people in other departments so they can work together during an incident, turning conscious actions into unconscious actions that are able to become routine. Okay. Back to the interview.
John Willis (01:23:25): Yeah. It was the original Netflix Chaos Kong, right from the get-go. Well anyway, so long story short, Jesse was becoming really famous. He wrote that other really great article about, if you used sort of infrastructure as code, a tale of two startups. There was this great article, a tale of two, and it was on O'Reilly Radar, and so he was tapped early to be one of the chairs of O'Reilly and in fact-
Gene Kim (01:23:53): Of the O'Reilly Velocity conference?
John Willis (01:23:56): Yeah. I'm sorry. Thank you. When this sort of, we called it the Cambrian explosion. When that Cambrian explosion started happening in IT: Puppet, Chef, cloud, infrastructure as code at scale. By the time you get to 2009, that's when you see Allspaw, 10 deploys a day, him and Hammond. We both talk about that being a moment.
Gene Kim (01:24:19): Yeah.
John Willis (01:24:19): So I'd gone again. I was at the first DevOps Days with Patrick. Patrick doesn't remember it this way, but I remember Patrick because he went to O'Reilly and said, "Hey, I really want to do this sort of DevOps thing." And they said, "Yeah, we're probably not going to do that. You should do it." So he did it and that's how I met him. And then Patrick said to me, I know he said this at that conference. And he knew I worked for Chef at that point.
John Willis (01:24:41): He said, "It would be great if you could sort of go back and try to do this in the US." So I pinged Damon Edwards, our good friend, Damon Edwards. And I said, "We should create this conference. The one I was at in Ghent was really insane." And it was mind-blowing.
And we created the first DevOps Days that we ran, at LinkedIn.
Gene Kim (01:25:02): Yep. That's right.
John Willis (01:25:03): In 2010, and you were there, and that's how we loosely met. I've told this story. I mean, I've taken a lot of time, but this is my favorite story of all time. I'm on a panel with you and Patrick is the moderator and he sort of lovingly makes fun of my age. It was many years ago. So I was old even back then. And you said something, in my memory, like, "Oh, he's not that old," or something like that.
John Willis (01:25:25): And I said, "Well, thank you, panel guy number four," or whatever. And after I got off, as Damon always does, he says, "You know who that was?" I'm like, "I don't know, panel guy number four." And he said, "No, that was Gene Kim." And I had read your ITIL book [crosstalk 01:25:40] I had that. It's the same thing you and I when [inaudible 01:25:46] same thing. Like, I met him and I'm like, "Oh my God, you're that guy, that's how I learned Solaris."
Gene Kim (01:25:50): Like The Porsche book.
John Willis (01:25:51): Yeah. So, and then every year, the first three or four years, maybe five years, we did the two-day conference piggyback, and one other really cool thing-
Gene Kim (01:26:01): And by the way, my feeling of being there, I remember seeing Patrick Debois play the Charlie Chaplin movie. And then the I Love Lucy excerpt of like-
John Willis (01:26:09): Yeah. [crosstalk 01:26:12]
Gene Kim (01:26:12): And my feeling was like, "Holy cow, this is my crowd that I've always been looking for." And what was amazing was that because it was self-selected, the people who stuck around after an insane Velocity conference, they were that crowd too. I mean, it was the die-hards.
John Willis (01:26:27): So that was the next story I was going to tell, which was, when we were doing the planning, the first thing I had to do was go to Jesse and say, "You think O'Reilly would be upset if we run this conference right after your conference?"
And Jesse asked and they said, "I don't care." At that point, nobody even knows what DevOps is. So they're like, "Sure, go ahead, you can't use our name or anything like that." And then the other debate was, should we do it before? And everybody said, "No, that's not going to work, you got to have Velocity." Because at that time, Velocity was the go-to place. I mean, like, if you were doing what we do now, you needed to be at Velocity. Everybody worldwide, it was the one place where Ben Rockwood, Patrick, and you just list the people.
John Willis (01:27:13): So we then said, "Okay, it's going to be the two days after." And then the debate was, wait a minute. We all know that Velocity is this insane, open up your head and just dump brain fuel in; would people really be up for another day of this kind of activity? And we flipped the coin and said, you know what, we just got to do it. And I remember being on the second day, I think it was two days, second day at like three o'clock. So here's the thing, DevOps Ghent was great. Eight, nine months later, there's 300 people. It did. This is a whole different ballgame.
John Willis (01:27:50): And two days after Velocity, at like three o'clock in one of the second-day sessions, there's still about 280 people in that room. And I'm like, "You know what, now I really know this is real." Because all our fears about people not wanting to stick around for the next two days after they had just been... and again, for people listening that weren't there in the early days, Velocity, I mean, it was, you were just... I was watching one of your old, I watched one of your PuppetConf presentations all day. So it's weird. We watched Ben Rockwood's original and it rolled into your PuppetConf. And I remember you had put up... I think it was Jon Jenkins, Amazon numbers, the thousand-
Gene Kim (01:28:32): That's right. 11,000 deployments per day.
John Willis (01:28:34): And you know what, I came from Velocity.
It was a [inaudible 01:28:37] and he did his Velocity.
Gene Kim (01:28:38): Yeah.
John Willis (01:28:39): Anyway, I think that's when I knew, this is for real.
Gene Kim (01:28:44): By the way, let's talk about who was there at Velocity. I'm just kind of conjuring up the memories. I mean, so it was Facebook, Amazon, and Netflix was there-
John Willis (01:28:51): Theo Schlossnagle,
Gene Kim (01:28:52): Theo Schlossnagle.
John Willis (01:28:56): [crosstalk 01:28:56] and then it was Artur with his-
Gene Kim (01:28:59): Fastly.
John Willis (01:29:00): Fastly was there. And they were big in those days. They're big now, they IPOed and all, but like, they had the team of people.
Gene Kim (01:29:09): Yeah. What a perfect stage of people already primed, thinking in a DevOps-y like way.
John Willis (01:29:17): Yeah. But what happened there is that Damon and I, and you, we, I think that we eventually got you on board. I remember watching Allspaw give a presentation, maybe 2010 or something, about how they did change management. My background is, we worked with Chase Manhattan Bank. With Chase management is circuits and it's ITIL service management. It's serious stuff.
Gene Kim (01:29:42): High out cost environment.
John Willis (01:29:44): Absolutely. These are very changes for circuits. So I'm watching John, and I love John Allspaw, but to him, it seemed like a novel idea. And I'm thinking, man, if we could create a conference where I could bring some of the people I worked with at B of A and Chase and JP Morgan Chase, and they could come to the same conference, and this is way too far ahead of its time, and give presentations in the same conference, where John would sit in on their sessions and say, "Oh wow, that's interesting. That's a much more complex scale problem." And they would sit in on John's presentation, or of people like John, and go, "Oh my goodness, we're doing it all wrong." So that was like, I wanted to make that happen.
We couldn't make it happen with Velocity. It was finally IT Revolution and [inaudible 01:30:32] Enterprise Summit where we put out the shingle. And boy, we soared. That first RFP was like, oh my God.
Gene Kim (01:30:41): Do you know who I met at the Velocity conference? And at ChefConf 2012, 2013? Jason Cox, then Director of Systems Engineering at Disney.
John Willis (01:30:53): Yeah.
Gene Kim (01:30:53): In fact, so among all those unicorns, web startup companies, were Disney people.
John Willis (01:31:02): Well, again, ChefConf is where I met Courtney. ChefConf was Jason, Courtney. I mean, I'd have to shake my memory [inaudible 01:31:12], but there were so many influential people in. Because the thing, more sort of a historian sort of view or observation from my perspective, is that Puppet had done a great job pre-cloud. They were in all the universities, they were running at Facebook scale. They were running certainly at Google. It was really the only way to do real scale infrastructure as code. Or in fact, I take that back. Facebook was running CFEngine.
Gene Kim (01:31:41): This is hundred-thousand-plus server deployments.
John Willis (01:31:45): Yeah. It's crazy. Yeah. In fact, the way I met Adam Jacob: I met Luke first at OSCON. I wrote about this in the first DevOps Handbook. But as my sort of overview, which is, a friend of mine, at an OSCON in 2007 or '06, I don't know. It's probably '07. I go to OSCON. I don't know anything about open source. I'm getting in fights with Tim O'Reilly and Mark Shuttleworth. But somebody says, "Oh, you got to go see this session on this thing called Puppet." I'm like, "What is it?" They're like, "A monitoring tool." "Yeah, I'm all in on monitoring." And it wasn't a monitoring tool, and I'm sitting in the back of the room and I'm watching this young kid talk about free words, man. And I've said this, and I think in the original DevOps Handbook where I said, he changed my life, I'm in the front row.
John Willis (01:32:27): I'm like, oh my God, everything I've been doing for the last 15 years is wrong. And then I ran into Adam Jacob. Talking about scale, Adam Jacob, he was killing it with Puppet in the Seattle area. He had a consulting company who were the best Puppet consultants.
Gene Kim (01:32:43): That's right.
John Willis (01:32:46): And it's really two funny stories here. One is, I had been doing some podcasts with Luke and having a lot of dialogues with Luke and working with Puppet. And I saw this presentation, or Luke said, "Hey, you ought to check this guy out. He just did a Facebook application where they implemented 5,000 bare metal servers in like, I don't know, three days or two, something crazy." So I immediately called Patrick when I found out he was going to... called Adam up, and I tracked him down and I did a podcast with him.
John Willis (01:33:18): And then it was at that O'Reilly in 2009 where I'd asked Luke for a job one more time and he said no. I literally was looking away, I hate that, I'm so pissed. Then Adam Jacob says to me, he sees me. He says, "John, why are you so down in the dumps?" I'm like, "You know what, maybe I'll always count for John." And I'm feeling like maybe I've hit the Peter Principle, like, these young kids. I've gone a bridge too far on this, on what I think. And I go to Adam, like, "[inaudible 01:33:48] you think any kind of heck why you'd want to hire me at your new startup?" And he throws out his fist at me and he does a curse word, like, yeah. And they hired me. So again, there's so many memories of Velocity. I literally, the sort of fist bump hire from Adam Jacob was literally sitting on a couch. This one was actually when they moved to San Jose-
Gene Kim (01:34:18): The High Convention Center in San Jose.
John Willis (01:34:20): Sitting on one of the sort of corridor couches, and said, "John, why don't you sitting down?" I'm like, yeah. One other point, Gene, too, is you sort of alluded to this.
The thing about DevOps Days and Velocity was, we knew every year that everybody that you wanted to meet was going to be at Velocity. And they knew that we were going to run DevOps Days. So it was basically like Ben Rockwood and John Allspaw. And we just go down the list: Theo, Jesse, and Adam and Luke, James Turnbull. I mean, you just go down the list. You knew they were all going to be there. If you're a geek, what other place on the planet would you want to be? Usually, they were in early June sometime, wherever that sort of combo was. And that happened for three or four years in a row.
Gene Kim (01:35:12): Yeah. I remember those times. This was around 2013, and we wanted to see if we could get an enterprise track created at the Velocity conference.
John Willis (01:35:21): Yeah.
Gene Kim (01:35:22): And I felt like we were winning. We had Jez Humble there lobbying for this, as well as Adam Jacob, and you and Damon. And we had one session on this and it warranted a second session on it. And then I remember when sort of all the air left the room. It was two people. One of them said, "The ExxonMobils and the Chase Banks and so forth, they're not like us. We don't want them here."
John Willis (01:35:50): Yeah.
Gene Kim (01:35:51): And I remember Adam Jacob saying, "I've got news for you. They're already here at the bar hanging out with us."
John Willis (01:35:59): That was just the thing early on. If I could have got the person who ran configuration and change at JP Morgan Chase in Columbus, Ohio, and John to see each other's presentations, I think the world would've changed earlier. But the problem was, there was already very tight real estate for sessions. And then the idea of sort of bringing in the enterprise, which, probably the web people would've been like, no way. And even the sysadmins, you remember in the early days going to LISA groups and some of these old-timer sysadmins. I remember being in Boston one time at a local meetup.
And again, I'm adding my literary license, but I felt like I was getting chased down the streets with pitchforks, because I was talking about Chef. Like, what if it runs away with the systems? What if it just starts recruiting? What if it.... They thought it was that movie, WarGames. It's immoral.

John Willis (01:36:59): In the early days, there were a lot of sysadmins who were like, "that's cowboy stuff, we don't want that stuff in our sort of sysadmin pure." So anyway, you're right. I think it was great that you sort of brought this up. Even myself, I have such fond memories of Velocity, O'Reilly in general, because like I said, OSCON, I don't think, if I hadn't sat in on Luke's presentation, I would've had the journey I had.

Gene Kim (01:37:29): But it just shows how conferences are amazing when you can attract like-minded thinkers who are truly visionary. And by that, I remember the way I described DevOps Enterprise 2014, that we finally got going because we couldn't get started within the Velocity conference. It was incredible. There was a sense that there was something genuinely momentous happening. That everyone felt there was a universality to their problems. That was felt regardless of what industry you were in, how old the company. I mean, it was magical.

John Willis (01:38:02): And it was a couple years ago you said it, but it's always been in my mind. You said there's no velvet rope here. And I thought that was the best way to describe this place. I have a great story. You're going to love this. So I had met Dwayne Holmes over at Marriott. This guy is the deal. We tried to get him to speak, but Marriott.

Gene Kim (01:38:21): No, he spoke.

John Willis (01:38:22): I know, but it took a few years, right?

Gene Kim (01:38:24): Oh yeah. [crosstalk 01:38:26] it took him leaving that company.

John Willis (01:38:28): And I had gotten John [inaudible 01:38:29] in too.
At the time he was at KeyBank, now he's at PNC [inaudible 01:38:35]. And I realized these two people, especially me as an infrastructure freak, I'm like, oh my God, I've got to get these two guys together. So Dwayne came to Marriott, but he wasn't able to speak. And so we go in the speakers' room and I'm like a fly on the wall. I'm like, I need you two to meet each other. And they were sort of like tigers: what do you do for this? You can see they have such a level of expertise. They're not going to waste their time.

Gene Kim (01:39:01): Just how good are you?

John Willis (01:39:03): Exactly. And it wasn't in an ego way, because everybody in the know knows those guys have no ego at all. But it didn't take very long to start questions about, well, how do you deal with a vendor? I was like, oh my God, I'm sitting there.

John Willis (01:39:15): This is Marriott and a major bank having discussions on anything from vendor relations to how do you run NGINX. I mean, I'm sitting there. Guess who's sitting to the left of me? Steven Spear. I turn to him, because you know me, when I'm around him, I can't stop asking him questions. I'm like, "what do you think about this?" And I turned to him. I said, "you know what, this is the only time I think ever where you're sitting next to me and I won't be talking to you. I'll listen to this conversation of Dwayne Holmes and John Rzeszotarski." I don't think he got it, but I say to myself, in what universe would I not be chewing Dr. Steven Spear's ear off if I'm sitting next to him in the speaker room? Except I've got John Rzeszotarski talking about large-scale infrastructure, DevOps at a large bank, and Dwayne Holmes. Dwayne Holmes at the time ran 60 to 70 billion dollars of Marriott revenue through Docker and Kubernetes. And that was five years ago.

Gene Kim (01:40:22): Right. Earning him the title of a Google Uber title.

John Willis (01:40:27): Yeah. Got those crazy certifications. In fact, what I do at DevOps mostly is bring people to other people.
I watch most of the sessions, except for Jason Cox's, which can't get recorded. I know all the other ones are recorded. So generally I spend a lot of time getting, "John, is there any way you could introduce me to this person?" I'm like [let's go confirm 01:40:47]. And then, here you go. So I spend a lot... That's one of the things I love. Because I get to sit in and listen in on the conversation.

Gene Kim (01:40:56): Reminds me of that saying, you're only as good as the top five people you hang out with. Even just to listen to them interact. How much you learn just by osmosis.

John Willis (01:41:04): Well, Dwayne works with John now, right?

Gene Kim (01:41:07): Yeah.

John Willis (01:41:09): So I mean, I put those two together, and when John found out he was basically leaving Marriott, he literally hung up on me and said, "I'll get back to you."

Gene Kim (01:41:17): Oh, that's great. Hey John, this is really good stuff.

John Willis (01:41:25): Yeah. You know me, I'm the historian here, right?

Gene Kim (01:41:29): Gene here. I hope you're having even a fraction as much fun as I'm having hearing about the early scene around DevOpsDays and the O'Reilly conference over 10 years ago. Part of me thinks this might be a little self-indulgent, but I thought it was so interesting to hear the stories about how DevOpsDays came to be, its relationship with the Velocity conference, and how it even led us to creating the DevOps Enterprise conference in 2014. So here are a couple of clarifications. One, Dwayne Holmes was senior director of DevSecOps and enterprise platforms at Marriott. He gave a presentation in 2020 about the platforms he developed that, as John mentioned, were supporting over 30 billion dollars of annual revenue.
I was so delighted that he was finally able to share his story, even though it was anonymized in the DevOps Enterprise video, which among other things earned him the title of Google Cloud Certified Fellow, having built and managed one of the world's largest Kubernetes installations. Looking at the Google Cloud page on this, there are fewer than 50 people with this highest level of certification.

Gene Kim (01:42:34): I'm reading: the Google Cloud Certified Fellow program is for elite cloud architects and technical leaders who are experts in designing enterprise solutions. The program recognizes individuals with deep technical expertise who can translate business requirements into technical solutions using Anthos and Google Cloud. Number two, John mentioned John Rzeszotarski. He is phenomenal. He attended DevOps Enterprise 2016, back when he was director of DevOps at KeyBank. I didn't meet him then. John actually introduced me to him, telling me that I had to hear about what he did after he got back from the conference, because apparently he went back to KeyBank with a sense of mission, took advantage of a crisis, which was the entire consumer banking property going down, used that to spark a revolution of his own, and ended up presenting at DevOps Enterprise 2017. In 2019, John became SVP of technology infrastructure at PNC Bank.

Gene Kim (01:43:29): He's another amazing person I met at a conference whose work I admire. And I love that Dwayne Holmes is now working for John Rzeszotarski. Another great example of how the world works. Number three, John mentioned a paper that Jesse Robbins wrote called Operations as a Competitive Advantage. And by the way, Patrick also mentioned this paper in his interview. So Jesse Robbins published this article in 2007 as a part of O'Reilly Radar, the same year that he became one of the co-chairs of the Velocity conference. This is a famous paper in the ops community because it describes how ops shouldn't be an afterthought, as we typically viewed it.
And as John described it, it's really the tale of two startups, both having to deal with a user base that is doubling every week, swamping all server capacity. In the first startup, the ops team is spending more than half their time racking and stacking new servers, trying to get new capacity online, with their workload growing linearly.

Gene Kim (01:44:27): The number of hours, as graphed, is growing almost at a 45 degree angle. In the second startup, the ops team is using automation, in their case Puppet, the company that was founded by Luke Kanies, whom John referred to, to automate server provisioning. And they are spending less than 5% of their time scaling capacity and managing operations. So this is what Jesse Robbins called operations as the secret sauce, the competitive advantage that was a make-or-break capability for web 2.0 companies trying to keep up with user growth.

Gene Kim (01:45:03): Number four, I had mentioned that I met Jason Cox from Disney, who is so prominently featured in the DevOps Handbook, at a [Chef Cov 01:45:07]. You might have heard me react when John said that that's when he met Courtney Kissler too, another person who was prominently featured in the book. I did not know that that's where John had met her. I met her at Velocity 2013, back when she was a senior director and later VP of technology at Nordstrom, and later at Starbucks and Nike. And she is now CTO at Zulily. And you will hear John talk later about why Courtney's Nordstrom case study is his favorite case study in the DevOps Handbook. Number five, John mentioned the notion of no velvet ropes at DevOps Enterprise. That's something that's always been very important to me. As you can hopefully tell, I really love conferences. I feel like I owe so much of my entire career to conferences.

Gene Kim (01:45:50): I met every one of my co-authors at a conference.
I'm pretty sure I met almost every one of the IT Revolution authors at a conference. So much of what I learned that went into The Phoenix Project and the DevOps Handbook, I learned at a conference. In fact, I learned about Clojure at a conference. That was from Mike Nygard, Velocity 2013. So many people who I cite in the Idealcast, I met at a conference. I had a real reason to think about this in 2019 as the world was locking down during the global pandemic and every conference had to move to a virtual format. In support of this, I wrote an 8,000 word blog post called My Love Letter to Conferences to better understand what made great conferences so great. How are they structured to create that magical dynamic that John and I were talking about in the Velocity conference context? John and I are talking about some of those magical dynamics where you learn from incredible talks.

Gene Kim (01:46:39): You're exhilarated by being surrounded by the best in the game. You find fellow travelers who not only share similar goals, but also experience similar struggles that you hope to conquer together, and so much more. And I wrote that the connections you make at conferences often lead to lifelong friendships, and maybe can even change your career. No doubt they can change your career. I'll put a link to that blog article in the show notes, which also includes something that touched me very deeply. I went through all the pictures I had taken at conferences over the last 15 years, conferences about DevOps, information security, audit, operations, the ITIL community, and I picked 800 of them. I put them together in a YouTube movie and in one insane picture that I think is like 19,000 pixels tall. And I think I even tweeted out that picture, @-mentioning every person whose picture I found. And yet, as great as all those experiences are, there are some feelings I've had at conferences that I wouldn't want anyone to feel at a conference ever.
Gene Kim (01:47:41): Sometimes I felt like I was on the wrong side of the velvet rope. In other words, all the people that I wanted to talk to were on the other side of that velvet rope and there was just no way to get on the other side of it. Sometimes I would find myself in a sea of people and I wished I could just find someone to talk to about any number of topics that I wanted to learn more about. So there's a ton of things we deliberately do to make sure that these velvet ropes don't exist, whether accidental or intentional. For example, I ask every speaker to end with a slide that describes the help they're looking for. That creates opportunities for people to help each other. And so my personal goal is that we help foster a community that is actively helping each other. Number six, I love how John is very deliberate about how he wants to spend his time at conferences. He mentioned that at DevOps Enterprise, he loves helping connect people with each other and being a part of those interactions. I just want to point out that the best conference experiences I have found tend to involve planning and being very intentional. Networking is more than just being friendly. It's about finding the right people to help you achieve your own goals, whether it's finding people with certain expertise, finding people with connections, or looking for helpers or fellow travelers.

Gene Kim (01:48:53): By the way, I should talk about some of my own goals. One of the things I love about the DevOps Handbook is that there are 65 case studies within it. And almost all of them came from conferences that I attended, and at least half of them came from DevOps Enterprise Summit. So how did that happen? Whenever I was at a conference, I was always looking for people who were sharing great experience reports. I always tried to meet the people behind the transformation. These are people I've learned so much from over the years and admire.
Many of these people ended up presenting at DevOps Enterprise, and some of those ended up being featured in the DevOps Handbook. Okay. Back to the interview. In those early years, what was the most fun moment for you as DevOps was just starting to take off?

John Willis (01:49:39): Well, this one's an easy one because I've been, fortunately or unfortunately, obsessed by this, but there were two. They usually wound up in open spaces, and the two greatest open spaces, I'll give you the first one quick, which was... John Allspaw was running it. And the question was, when is it okay to fire somebody? And it starts off like, when is it okay to fire somebody for making a mistake? And everybody in the room, like 50 or 60 people, were like, 'never'. Okay, we're done. I'm like, no, no, wait a minute. It's not that easy.

Gene Kim (01:50:10): Quick clarification. The topic of the open space was when is it okay to fire someone because they might have contributed to an outage.

John Willis (01:50:18): What if the same person makes the same mistake twice? Now the room is split in half. Half the people are like, it's still not okay to fire them, and then the other half's like, well, you know, whatever. And then I said, oh, we're not done yet. What if the same person makes the same mistake, say, three times in a row? So now it's just John Allspaw and one other woman from some company in Brazil or something. And everybody's like, nah, John, sorry. You know, like... And what was funny is we sort of ended the meeting with this agree to disagree. And then later we had this amazing conversation where John really started making me understand that it's really never okay. And that's a longer story, but the other one was in open spaces. I think it was, like, the third DevOpsDays.

John Willis (01:51:06): And it was basically on the theory of constraints. And if you remember how we met, I met you very early, maybe five years into your 10 years of building The Phoenix Project.
And I always say to people that you gave me this gift. I said, can I get an early copy? And you said, well, I think you should read this book by Eli [inaudible 01:51:25] first. Right. Well, sure. If that's what you want to do. So I read it and I was like, this is great. And then I read Critical Chain and I read a bunch of his books. And I remember calling you and saying, Gene, can you introduce me? And you're like, yeah, I can't, he's deceased. But anyway, I was full with [inaudible 01:51:41]. That was the guy, the theory of constraints, everything.

John Willis (01:51:45): And we get into this open space and the beautiful Ben Rockwood, in sort of a... He would never do this. So I'm just trying to describe it, almost like tapping me on the head. Like John, John, it all goes back to Dr. Deming. And I'm like, no, no, it's Goldratt. Stop. No, I don't want to hear that. And he's like, I'm sorry, John. And I spent the next year really trying to answer that question. And I started with my sort of Deming to DevOps. And so that's probably why I am where I am today, where I'm sort of freakishly obsessed with Dr. Deming, more than any other one sort of session or moment really. And when Ben Rockwood says something, it's like the old E.F. Hutton thing: when Ben Rockwood says something, you listen. So.

Gene Kim (01:52:31): No, that's great. Those open spaces in the DevOpsDays. It's such a great format.

John Willis (01:52:37): They're amazing. Yeah.

Gene Kim (01:52:38): So what has been the most surprising thing for you since the DevOps Handbook has come out? You mentioned going through the index of the DevOps Handbook and you said something just really beautiful about that.

John Willis (01:52:51): I knew we were doing this and I started looking at some of the stories again to refresh my memory, and then all these names popped up and I thought, oh, you know what? I want to look at all the names. And I started going through the index of the names and I realized there's so much about the book.
One is, today it still stands. And I know the updated edition has some additional stories, but the story and the narrative of the DevOps Handbook is rock solid today. And I think there's a lot of reasons for that. But as I think of it personally, for me, when I go through, I think the thing we did was we included a lot of stories from our community and we gave attribution everywhere, right? And so I look through the index and I look at the names there and that's this collective group that I just... I think about almost every one of the people that are listed, and I get a smile on my face: how much they've contributed to my career, how I've contributed there.

John Willis (01:53:47): And if I name a couple, I'm going to leave out beautiful ones, for which I apologize up front, but you know, obviously Damon, Scott Prugh, Courtney Kissler, Randy Shoup, Tom Limoncelli, Dominica if I haven't said it, John Allspaw, Josh Corman. You introduced me to Josh Corman. Josh Corman changed my life. I love Josh Corman. And all these people that I'm looking at in this book, it's almost like a painting of my career over the last 15 years. And again, I apologize for the names I've either screwed up or sort of missed.

Gene Kim (01:54:28): No, it's great. A good call, qualifying that by all the people we didn't mention. But I do love that, because it is really kind of an expression of this collective genius of this very productive scene, right? The scene of people who really helped codify kind of the better known way of doing things.

John Willis (01:54:44): Well, and again, Gene, I've said this to you, and I've said it to other people. I mean, you deserve an incredible amount of credit here because you put this sort of stake, or this thing, where you created this collective. We all contributed, there's no doubt, right? You didn't single-handedly do this yourself. But I look at what you've created.
I don't think the DevOps movement would be where it is today, certainly in the enterprise, without sort of The Phoenix Project, without your sort of getting us all together in the way that created the most optimum output. So again, I've always sort of felt like that about what you've done and how you brought us all together. I mean, all those people I just talked about are dear friends now. And most of them, not all of them, most of them, I would not have known if it wasn't for you.

Gene Kim (01:55:37): And by the way, it's been equally rewarding for me, buddy. So what is the most important thing that you've learned since the DevOps Handbook has come out? You were talking about revisiting the three ways, cybernetics, variety and variation, and holy cow, it is so true. You have become one of the best scholars of Dr. W. Edwards Deming I've ever seen. So I can't wait to hear how this has come together in your mind.

John Willis (01:56:02): You know, the conversation always, for a lot of the lean people and people I really respect like Mary [inaudible 01:56:09] or David J. Anderson, Don Reinertsen.

Gene Kim (01:56:12): Don Reinertsen, yes.

John Willis (01:56:15): They'll always put this line in, saying that you can't really map industrial economy work to knowledge work.

Gene Kim (01:56:26): Right. So in one domain you work with your hands and in the other domain you work with your head.

John Willis (01:56:31): And their primary argument is that knowledge work is novel and it can't be sort of cauterized or sort of put in, like you need the freedom. I think we started the line in the sand that starts decoupling that argument in the DevOps Handbook, because we described deployment lead time.

Gene Kim (01:56:54): To be specific, we said design and development is everything to the left of code committed, and then build, test and deploy is to the right.

John Willis (01:57:02): That's right. And we called it deployment lead time. And I always said everything to the left of that was basically ideation.
But the David J. Andersons, the Don Reinertsens, the Mary J. [inaudible 01:57:11], I don't think they read that part or really got it. So even one of the podcasts I recently did with Mary, and I love Mary [inaudible 01:57:17]. I mean, I adore those two people. And what they mean is that you need to be creative. You can't be rote processed. I would argue that to a certain extent, the deployment lead time is sort of a first-order answer to that question. The primary argument is that knowledge economy work needs variation. And one of the things I've studied post the DevOps Handbook is the Toyota supply chain. And they talk about the four Vs of learning. And if you read that, they're very specific about the difference between variety and variation. People think Toyota was just a production line, right? No, no. They were designing cars. There was heavy ideation in Toyota.

Gene Kim (01:58:00): And designing the production system, designing the subject lines.

John Willis (01:58:04): [crosstalk 01:58:04] Thank you. Thank you. And designing the production system as well. And they made a clear distinction between, I think, the things that we conflate. We conflate variety with variation a lot. And when we use that argument, we say you can't have controlled variation for innovative or knowledge work, right? You need to sort of create new ideas. And I think that, and I know this is very meta what I'm saying, but if you really look at the four Vs and variety, variety was exactly that. It was basically an economic tool to understand how much you could stretch out your sort of variety based on the economics of doing that, which is all about what we do in design, development, ideation, software development. And variation is about understanding causal relationships of the things you do. The four Vs of learning: the first V is velocity, right? Okay. That's speed. Right. We get that. That's pretty easy. Second V is visibility.
Gene Kim (01:59:06): Variety, variability, velocity, and visibility. And by the way, which book did you cite?

John Willis (01:59:11): It comes from the Toyota supply chain. It's really well described. I suspect that's the first place it was all described, but it's certainly the best description of them. So again, velocity, visibility, right? Okay. Get it. Important. Now you get into two other topics, which are variety and variation. And here's the thing, I think, and I keep wanting to write this "in defense of variation" blog article, which is, I think when David J. Anderson and a lot of these people talk about how you can't have variation in knowledge work... I mean, the core of their argument is, one, they see variation. They say constraining variance in knowledge work is a bad thing, right? In other words, they say you can't do it and you shouldn't do it and it doesn't fit. I say, first off, you don't understand variation. There's a difference between variation and variety. So variety is more about... Jabe Bloom's work, which has really helped me understand the Taguchi loss function. The Taguchi loss function is this interesting idea where instead of tightening everything to the lowest tolerance level, you're actually finding the edge. So a lot about tolerance is how far your upper and lower control limits can be.

John Willis (02:00:28): And then that takes in an economic concept. So again, I'll do a little sort of timely shameless plug on the Jabe Bloom article. We talked about epistemology there, right? And one of the things that Jabe did really well was pragmatism. He talked about this guy Peirce, who was one who found there's a [inaudible 02:00:46]. And the reason why he came up with pragmatism was he was trying to make the perfect pendulum.
And one of the things he found out was there was a point of diminishing returns, which is like the parable of the rabbit getting to the road: he goes halfway there and then he goes another half and, like, theoretically never gets there, right? But he realized, you know what? There's a point at which I should stop trying. And that stop trying meets really the economic, and then third, which plays into [Chut 02:01:21] was a pragmatist, which is you use statistical probability to figure out where that is.

Gene Kim (02:01:28): And if I heard you correctly, you control it to the point where it makes sense. It is not the end in itself. So what is variety?

John Willis (02:01:36): So I think that's where variety is. So the layman's version of variation is you tighten it, you just tighten it forever. A more expanded, or sort of Deming version or operations research version, is statistical process control, where you sort of figure out what's special cause variation versus common cause, and then you can work on a process to use statistics. All that came from pragmatism, which came from Peirce's pendulum.

Gene Kim (02:02:08): I read one thing that... Kind of, variety is the thing that you want. You actually do want variety... You want to offer a wide variety of things to customers. You can't do that if you're dominated by variation, right? If you have internal variations, you can't afford a high variety.

John Willis (02:02:23): So when I go back and I say, here's the mismatch between sort of these giants, who are giants, David J. Anderson, Mary [Popavich 02:02:31], and Dave [inaudible 02:02:32], the mismatch is, you're right that you need to measure. You can't just isolate the variation of knowledge work. But what you can't do is just say knowledge work has to be just some hippie, forever-expanding thing with no controls, right?

Gene Kim (02:02:53): Totally random, dominated by entropy.

John Willis (02:02:57): Everything, and even cost. Do you allow a developer to sort of do a thousand experiments?
10 experiments? Five? And so variety is a constraint that overlaps variation. This is my view of... I'm really going to sit down and write an article and really do it real justice. And that's why I haven't written the article yet, because there's a few more things I want to dive deeply into before I start taking on David J. Anderson. But the point is, with variety you also have to have controls. You don't get knowledge work for free, right? And I think therein lies the problem: when they say that the sort of lean or Toyota production system can only be useful to a point and there are things in knowledge work where it won't work, I say bull crap. Go to the Toyota supply chain and understand the difference between real variation, [inaudible 02:03:54] cross control, and how they describe variety. Because by the way, there was a lot of ideation going on at Toyota. They weren't limiting the idea factories and the dojos and the things that they were doing to create incredibly innovative ideas for new types of cars.

Gene Kim (02:04:12): Yeah. This has been a longstanding topic between me and Steven Spear: everyone sees the Toyota production system, or the manufacturing plant. The two incredible creativity efforts are the design of the car and the design of the manufacturing system.

John Willis (02:04:27): That's right. [crosstalk 02:04:28].

Gene Kim (02:04:29): Incredible variety.

John Willis (02:04:31): That was knowledge work, man. So.

Gene Kim (02:04:33): Absolutely. And by the way, it's so cool to hear Elon Musk, right? The CEO of Tesla, say production is so much more difficult than design, the car design.

John Willis (02:04:43): Oh yeah, totally. Again, two summers ago I visited the Toyota factory in Toyota City. It is incredible to see kanban boards, first off. You can know it on paper. We can talk about it in presentations.
But then you see how you can see a kanban from every place you are standing in the factory, in the [inaudible 02:05:08], and actually get a sense of what the little color codes mean. Like, you literally start seeing where they are in production. But the other thing there, as you go through the museum, is you get to see sort of the history of it. I wrote something about this, I called it the factory-less factory or something like that. They now... Back in the day, the way we think about it is all these sort of kanban and just-in-time things all sort of clamped on as the car comes through. Well, first off, that's all Bluetooth, right? And it's all sort of robotic things that are filling things up. But what they do now, and this is something they talk about in the museum, is it really is just a roadway. It's a concrete roadway. Everything is dynamic. So all the things that were sort of the big bar thing above and all those things that we saw classically, there's no infrastructure.

Gene Kim (02:06:06): Are you saying there's no physical production line? It's actually all on conveyor-

John Willis (02:06:11): It's all commercial, it's all robotic. It's sort of Bluetooth, it's pathway. And here's the thing, the reason they did that is they needed to... One of the reasons is to shrink or shorten the size of the line, depending on the sort of kanban-ish output design. So if they need to basically make it, I don't know, 1000 feet, five, I don't know the logistics of it, shorter, because it's all dynamic, they can just do it. And if [crosstalk 02:06:37].

Gene Kim (02:06:36): Floor, having to move all the [crosstalk 02:06:41].

John Willis (02:06:41): All the dynamics of all these things that are not coupled now. It just sort of fits in, and then I think you take that further, that you can basically build a factory like that. You know what I mean?
So to your point, the innovation of how you create a production infrastructure to [inaudible 02:07:04] more important than the innovation of the [inaudible 02:07:08]. Because that's where the numbers come from, right?

Gene Kim (02:07:12): I love it. So the DevOps Handbook is principles and patterns. What is your favorite pattern in the DevOps Handbook?

John Willis (02:07:19): Yeah. I think the thing that resonated the strongest right off the bat, even before I really started deeply studying systems thinking and feedback loops and complexity, is the second way, right? Which is the feedback loop concept, and probably the Andon cord... It's sort of mythical in so many ways. But that idea, and I guess here's the real key point, right? And we do point out a fair amount of authors' work in the DevOps Handbook, but Mike Rother wrote Toyota Kata.

John Willis (02:07:52): One of the things that he said really helped me, and you recommended that book to me and it was great, because when he talked about the Andon cord, the way he expressed it was that not only could anybody pull the Andon cord, and just to be clear, it didn't mean the line stopped. There was a point at which the line would stop, but it was a significant event when you're producing 2,000 cars a day on the line, somebody pulling the cord and potentially slowing the line down or even stopping it. And the thing I love most about that from a feedback perspective was that Rother said, when he went over there and worked in Toyota for a little while, and that's why he came up with the Toyota Kata concept, he said that the first thing the floor manager would come up to you and say, before they knew anything of why you stopped the line or pulled the Andon cord, was thank you. He wanted to thank you for creating a learning opportunity. And that is so foreign to Western thinking, right?
John Willis (02:08:52): The idea that even if you stopped the line because there was some shadow and there was nothing wrong, that was a learning opportunity. Right? And I think it was Rother's story where they said... The two stories, one is in Rother's and one is in a book about the Kentucky plant, but the one in Rother's book was, I think it was this plant where the Andon cord would get pulled 10,000 times a day. And all of a sudden they went down to like 8,000, and sort of in the Western world, that'd be a celebration. Yay. 2001 [inaudible 02:09:24] and the plant manager pulled everybody in and said, we got a big problem here. We're learning 20% less than we were. Right? And then there was this other great story of the Kentucky plant, where they were building like 2,200 cars a day and the reporter or the automobile analyst said, how do you build 2,200 cars a day? He goes, oh, it's quite easy. We pull the Andon cord 5,000 times a day. And that is the sort of core of why we think about feedback loops, why we think they're important, why creating a psychologically safe environment for people to sort of metaphorically pull an Andon cord anytime, anywhere, any person, any gender, because it is a learning opportunity. So, yeah, I think that's the pattern.
Gene Kim (02:10:15): Love it. Last question. Maybe. 50 case studies in The DevOps Handbook, now heading to 75. Which one is your favorite case study?
John Willis (02:10:25): Well, I never answer questions simply. If I had to be pinned, I would say it's Courtney's... I've said this before. I just loved, coming from a mainframe background, the story of... And I always sort of ruin the story. I probably should have reread it [inaudible 02:10:39]. You always fix it up for me, but it was a mainframe application and it was always sort of blamed for latency problems. And every year they'd have these discussions about, oh, we got to get rid of that and all that.
And one of the things that is great about Courtney, I know you are, we're just huge fans of Courtney Kissler in so many ways, is she took such a pragmatic approach. She went and said, you know, there's these things called [inaudible 02:11:05] mapping, and why can't we mix the two? Sort of the Gartner of bimodal, like, no, no. Don't even... She just applied [inaudible 02:11:15] and what they found, it was basically a Java. It was a manual process that basically could be fixed with sort of, I don't know, 40 or 50 lines of code.
John Willis (02:11:26): Right? And it was gone. I mean, that whole latency issue was off the table. And it's that kind of thinking...
Gene Kim (02:11:34): I remember reading this just a little bit ago. So the issue was that there was a form on the 3270 screen, presumably. They were asking a floor manager for information they didn't have, like employee ID number. So they make a little note to say, I'll do it when I'm in front of my PC, in the background.
John Willis (02:11:50): You had to go up to another floor to actually get the data. Right? And of course-
Gene Kim (02:11:55): [crosstalk 02:11:55] They wrote a simple web app so that they stopped asking for that information, made it easier for the store floor manager to input the information and boom, no one's complaining about the mainframe application anymore. So to your point-
John Willis (02:12:07): That's systems thinking. That's systems thinking, right? That's not looking at like, well, mainframes are not DevOps. We can't really do value stream mapping, right? Today I think people work really well, but when she did that, like there was sort of an advanced thinking to actually do that kind of exercise with a mainframe application. And then the other one, you asked me for one, but I can't leave out Scott Prugh's journey at CSG. I tell people today when they're beginning their journey, go watch... I don't know what the first one was, 2014 or-
Gene Kim (02:12:40): That's right.
John Willis (02:12:41): I said you get this glorious view of where you are now, watching him every year, and you get to see he comes back and it's different and it's better. And you know Scott. It's methodical, it's truth. So I always think that I love his journey and for us to all see that year after year. Many times we talk about should we have repeat speakers or in the balance of that. But the argument always for what Scott does is just every year you get to see his journey and it's always better than the last year. And it's just this beautiful, continuous improvement story.
Gene Kim (02:13:22): Love it. And I love that story too just because I think one can make the claim if you can do it with the technology stack that he had, it really says you can do it for anything.
Gene Kim (02:13:33): Awesome. Is there anything else? You tell us about the podcast? Tell us about what you've been doing. It's blowing me away just to what extent you are studying Deming. What's been the funnest thing that you've learned?
John Willis (02:13:46): Well, you've told me for many years now you ought to write a book, right? [inaudible 02:13:50] In my mind, I've always sort of created an outline over the last 10 years and I've grabbed great stories, and when the pandemic came, I realized I was getting this gift of about 50 or 60 hours a month of non-travel. And I figured, like, use that time. So I've gotten much more serious about really understanding the narrative of his impact everywhere. So I've got a podcast. It's called Profound, based off of the idea of the System of Profound Knowledge, which was sort of his last theory. And then I've been writing a blog sort of supplemental to that, just these interesting stories. So if you follow those, you'll just get some really good, beautiful...
John Willis (02:14:33): I interview different people, not just people in our industry, but I interviewed the guy who wrote a book about Hawthorne, right?
He has nothing to do with IT. He's a professor of library science. And he wrote a book about Hawthorne. And he just told me all the glorious stories about Hawthorne and Cicero, which was the plant where Dr. [inaudible 02:14:52] invented [inaudible 02:14:54], where Deming did his internship. So Gene, I was telling you the other day, I find all these glorious stories and I've got them all compiled. Hopefully near the end of the year, I'll have something in a form where people can read it. But I just read one the other day, again, these glorious stories. So you got to remember when Dr. Deming went to Japan and influenced part of what they call the miracle in Japan, he wasn't the only person, but he had impact, right?
John Willis (02:15:22): He was 50 years old. That's the thing. I don't think people get the deal. That was when he was 50. He comes back to America and he's basically obscure and nobody really knows who he is in America. And then it's this documentary on NBC that just explodes. And it's called If Japan Can, Why Can't We?
Gene Kim (02:15:39): That's 1980?
John Willis (02:15:40): In 1980. Yeah. 1980. Right. And what's interesting is at this point, I had a car. I was like... 1980. I'm like 19 years old, 18 years. And you knew this whole... Everything was Japan, not just cars. I mean, it was memes. You basically figured that if you were going to go work, you were going to work for a Japanese company. It was just, they had won. They had won this war.
Gene Kim (02:16:05): TVs, VCRs, cameras.
John Willis (02:16:09): Manufacturing, even these sort of cultural references of Die Hard with the big building. It's all...
Gene Kim (02:16:17): The Nakatomi Plaza.
John Willis (02:16:19): Yeah. Right. That was it. That's how people thought. That was the state of the art: if you were going to talk about the most interesting company, it was a Japanese company. So this thing comes on. In like the last seven minutes or so is Dr.
Deming and everybody in the country listening is like, oh my goodness. It was an American that taught them how to do this. So Donald Petersen, the president of Ford, invites him to Ford. Then this thing, starting about '83... I'm calling it Deming mania, right? From '83 to '90. So he's 83 years old. And I think about the number of people he trained in a 10-year period. It was ridiculous, like a million people. And so here's an 83-year-old guy flying all over the world teaching his four days with Deming, right? At 93. So I was just reading this story. I wrote a small blog article about it. 93 years old, about a couple of months before he died, he's giving his four days with Deming. On the last day... I mean, he's got an oxygen tank, he's sick, he's 100 pounds. And one of his students comes up to him and says, we don't mind if you sort of leave for the rest of the day. Nobody will mind. Get some rest. You're coughing. He's 93, he's about two months from dying.
John Willis (02:17:36): His body is shutting down and his response was, "I have a responsibility to teach people this stuff." At 93. I don't know. And then apparently two weeks before he died, he was teaching his four days with Deming in a Los Angeles seminar. And he didn't do it for the money. He had really nothing. He lived in a small house outside of... He felt a responsibility... Again, not to sort of say we're Deming or anywhere near the spectrum of Deming, but I get that sort of kinship of like a lot of what you do, what I do, what Jez Humble does, what Nicole does, what Damon does. We do it really because we feel... You said this, and I know you may cut out all the cool stuff that I say about you, but you've said this many times in your early days: John, I want to improve the lives of millions of people. And I remember that, one of the first times we met, and I think I feel such a kinship with him in that regard. That to me was one of the most beautiful stories, is even at that point, he's like, no, no.
I mean, he wasn't saying it, but I know I'm going to die, but I have a responsibility to take every last breath I have to try to help.
Gene Kim (02:18:51): Love it. Keep up the great work, John, thank you. And that is our show. Thank you for listening. For updates on new episodes and the lineup for next year's season, please go to and sign up for our newsletter. Up next will be my interview of the two other DevOps Handbook coauthors, Jez Humble and Dr. Nicole Forsgren. The Idealcast is produced by IT Revolution, where our goal is to help technology leaders succeed and their organizations win through books, events, podcasts, and research.

Gene Kim

Gene Kim is a Wall Street Journal bestselling author, researcher, and multiple award-winning CTO. He has been studying high-performing technology organizations since 1999 and was the founder and CTO of Tripwire for 13 years. He is the author of six books, including The Unicorn Project (2019), and coauthor of the Shingo Publication Award-winning Accelerate (2018), The DevOps Handbook (2016), and The Phoenix Project (2013). Since 2014, he has been the founder and organizer of DevOps Enterprise Summit, studying the technology transformations of large, complex organizations.
