Open Source Software as a Triumph of Information Hiding, Modularity, and Creating Optionality

Episode 21
Dr. Gail Murphy
Professor, University of British Columbia
2h 11m

- Intro

In this newest episode of The Idealcast, Gene Kim speaks with Dr. Gail Murphy, Professor of Computer Science and Vice President of Research and Innovation at the University of British Columbia. Dr. Murphy’s research focuses on improving the productivity of software developers and knowledge workers by providing the necessary tools to identify, manage, and coordinate the information that matters most for their work.

During the episode, Kim and Dr. Murphy explore the properties of modularity and information hiding, and how one designs architectures that create them. They also discuss how open source libraries create the incredible software supply chains that developers benefit from every day, and the surprising new risks they can create.

They discuss the ramifications of system design considerations and decisions made by software developers and why defining software developers’ productivity remains elusive. They further consider open source software as a triumph of information hiding and how it has created a massively interdependent set of libraries while also enabling incredible co-evolution, which is only made possible by modularity. Listen as Kim and Dr. Murphy discuss how technologists have both succeeded and fallen short on the dream of software being like building blocks, how software development is a subset of knowledge work, and the implications of that insight.

- About The Guests
Dr. Gail Murphy

Professor of Computer Science & Vice-President Research and Innovation, University of British Columbia

Gail C. Murphy is a Professor of Computer Science and Vice-President Research and Innovation at the University of British Columbia. She is a Fellow of the Royal Society of Canada and a Fellow of the Association for Computing Machinery (ACM), as well as co-founder of Tasktop Technologies Incorporated. After completing her B.Sc. at the University of Alberta in 1987, she worked for five years as a software engineer in the Lower Mainland. She later pursued graduate studies in computer science at the University of Washington, earning first a M.Sc. (1994) and then a Ph.D. (1996) before joining UBC. Dr. Murphy’s research focuses on improving the productivity of software developers and knowledge workers by providing the necessary tools to identify, manage and coordinate the information that matters most for their work. She also maintains an active research group with post-doctoral and graduate students.

- You'll learn about
  • Why defining software developers’ productivity remains elusive and how developers talk about what factors make them feel productive.
  • The value of modularity and how one can achieve it.
  • Ways to decompose software that can have surprising outcomes for even small systems.
  • How open source software is a triumph of information hiding, creating a massively interdependent set of libraries that also enable incredible co-evolution, which is only made possible by modularity.
  • How we have exceeded and fallen short of the 1980s dream of software being like building blocks, where we can quickly create software by assembling modules, and what we have learned from the infamous leftpad and mime-magic incidents in the last two years.
  • Why and how, in very specific areas, the entire software industry has standardized on a set of modules versus in other areas, where we continue to seemingly go in the opposite direction.
  • A summary of some of the relevant work of Dr. Carliss Baldwin, the William L. White Professor of Business Administration at the Harvard Business School. Dr. Baldwin studies the process of design and the impact of design architecture on firm strategy, platforms, and business ecosystems.
  • How software development is a subset of knowledge work and the implications of that insight.

- Resources

- Transcript

Gene (00:00:00): Welcome back to The Idealcast. I'm your host, Gene Kim. It has been an amazing experience recording these podcast episodes. I get to have long in-depth conversations with some of the people I admire most, and also to work with my friend and fellow researcher, Dr. Steven Spear, to better understand how organizations work, both in the ideal and not ideal, and to test theories on how high-performing organizations work, which is in support of the book that we're authoring together that should be coming out sometime in 2023. If you want to be the first to hear updates about our progress, go to

You're listening to The Idealcast with Gene Kim, brought to you by IT Revolution.

Gene (00:00:49): Welcome to another episode of The Idealcast. In June and July, the headlines were dominated by the widespread shortages of what seemed at times almost everything. There are currently shortages of semiconductor chips of seemingly all kinds, which have in some cases halted production of some automobiles and household appliances. Personal protective equipment such as masks and gloves and sanitizing products are still in short supply, and disturbingly, plastic lab supplies, such as Petri dishes and pipettes, which are so critical to scientists, remain back-ordered for months. A friend of mine at a large retailer told me last year how almost every retailer in 2020 struggled to deal with shortages that changed almost monthly. In March 2020, it was paper goods and essentials, resulting in the famous pictures of empty shelves of toilet paper and paper towels. The next month, it was laptops and consumer electronics as almost every school switched to online learning, requiring purchases of laptops and cameras, and almost every adult switched to working from home.

Gene (00:01:55): I remember during that time, it was virtually impossible to find any USB cameras or microphones, regardless of price. The global pandemic has created a bunch of first-ever events in supply chains. 
Purchases of clothing ground to a halt, dropping nearly in half in some parts of 2020. The result is that in many retailers, there are tons of excess clothing piled up in warehouses or in the back of stores. On April 22nd, 2020, oil prices went negative for the first time to -$37 per barrel. To vastly oversimplify, it was because of the huge drop in demand for oil, and no place to store it. This wreaked havoc in the sometimes abstract financial options market, which then started wreaking havoc in the real physical markets. And this year in March 2021, the Suez Canal was blocked for six days. 12% of global trade was delayed because the 50 ships that normally would pass through couldn't.

Gene (00:02:52): On June 1st of this year, the New York Times published an article from Peter S. Goodman and Niraj Chokshi with the title How the World Ran Out of Everything. The article reads, "In the story of how the modern world was constructed, Toyota stands out as the mastermind of a monumental advance in industrial efficiency. The Japanese automaker pioneered so-called just-in-time manufacturing, in which parts are delivered to factories right as they are required, minimizing the need to stockpile them. Over the last century, this approach has captivated global business in industries far beyond autos. From fashion to food processing to pharmaceuticals, companies have embraced just-in-time to stay nimble, allowing them to adapt to changing market demands, while cutting costs. But the tumultuous events of the past year have challenged the merits of paring inventories, while reinvigorating concerns that some industries have gone too far, leaving them vulnerable to disruption."

Gene (00:03:45): To say that this article angered Dr. Steven Spear would be putting it mildly. For those of you who have watched or heard him speak, it was a little surprising. 
He's a mild-mannered bow-tie-wearing professorial type, but bring up this article, and he'll immediately recite a long four-point refutation of the claims made in the article. In the previous month, the Wall Street Journal ran an article by Sean McLain with this title, Auto Makers Retreat From 50 Years of ‘Just in Time’ Manufacturing. The lead reads, "Pressured by pandemic, the hyperefficient supply-chain model pioneered by Toyota is under assault." Steve will point out many problematic claims in this article as well. I thought these discussions were so rich with insights and so relevant to how organizations work, we decided to dedicate an entire episode just to this topic. What's amazing to me, just as the New York Times article describes, is that these philosophies and techniques have gone far beyond manufacturing into supply chains in general, most famously by Walmart and its ecosystem of suppliers.

Gene (00:04:46): In the 1980s, Walmart spent over $4 billion on its Retail Link system, which would transmit their retail data to their suppliers and enable them to replenish goods on Walmart shelves directly at the stores, eliminating the need for so many warehouses. I so much enjoyed re-watching a 2016 DevOps Enterprise Summit talk from Rich Jackson, principal systems engineer at Walmart Technology and Rosalind Radcliffe, distinguished engineer and chief architect for DevOps at IBM. Rich Jackson describes brilliantly just how important supply chains are to Walmart. He says, "Inventory is a huge deal for any retailer. Retailers want to have enough inventory on hand so that customers can buy it off the shelves. On the other hand, retailers don't want too much, because it ties up cash and space." He describes how their CFO described how out-of-stock issues in 2015 cost him $3 billion in lost sales, but he also describes the balance sheet impact. He described how Walmart had $45 billion of inventory in a specific quarter. 
And he said, "If we can reduce that by 1%, that's $450 million of cash that could be reinvested into something that can deliver more value."

Gene (00:06:01): Rich Jackson quotes Sam Walton in saying, "People think that we got big by putting big stores in small towns, but that's all wrong. We got big by replacing inventory with information." It is not an overstatement to say that Walmart revolutionized supply chains, and there are now entire MBA curriculums based upon their work. In previous episodes, we've talked extensively about the MIT Beer Game, which is literally about the dynamics in the supply chain.

Gene (00:06:27): So in this episode, you will learn what supply chains are and why they're so vast and complex, and why they're so important to society as we know it, how just-in-time revolutionized manufacturing, and then later the entire manufacturing supply chain, and then the supply chain for basically everything, and the argument that just-in-time increased the resilience of the supply chain, not decreased it as the aforementioned articles claim, why Toyota is one of the auto manufacturers least impacted by the semiconductor shortages, partially as a result of what they learned during the Fukushima earthquake and tsunami in 2011, how the virtuous structure and dynamics of the Toyota supply chain are almost exactly the same as the structure and dynamics of great systems discussed in previous episodes, such as the COVID mass-vaccination clinic with Trent Green, and team of teams. Specifically, they have the dynamic ability to reconfigure themselves with a low cost of change.

Gene (00:07:27): We'll talk about how these characteristics enabled Toyota to resume production so stunningly quickly in 1997, when the factory of the sole supplier of a P-valve part burned to the ground, halting all production. 
This resulted in grave prognostications that it showed the vulnerability of just-in-time, projecting weeks of downtime and economic ruin, but in actuality, they were able to resume production in days. And how similar stories unfolded at even larger scales in 2002, when all 29 ports on the US West Coast were closed due to a labor management conflict, and what this has to do with modularity and architecture. We'll discuss how these principles are very similar to the famous Netflix Chaos Monkey and the entire field of what we're now calling chaos engineering.

Gene (00:08:16): And we'll explore what the opposite of just-in-time is, and how it is very similar to the dynamics found in the Soviet centrally-planned economy, which turns out to be an extremely useful example as we talk about its structure and resulting dynamics, and how inventory is a substitute for knowledge. And we tie all of this up into the three basic tools of finance: net present value, option theory, and portfolio diversification. This is such a fun and educational interview. It was so much work to even boil down the points we covered to even that long list, and it concretizes so many of the concepts that we've been exploring through this entire podcast series. I hope you enjoy this interview as much as I did. Over the past couple of days, ever since that New York Times article came out, you and I have been having a bunch of energetic discussions, and I've been learning a lot about supply chain. So can you talk about that New York Times article, describe what you found problematic, and maybe describe what people might not have seen in terms of how just-in-time actually increased the resilience of the supply chain, not decreased it?

Dr. Spear (00:09:34): Gene, that's a fantastic question, so let's think about what that article tried to do. They said, during this last year-and-a-half nightmare that's been COVID, that we've run out of stuff. 
Originally, it was toilet paper and beans in a meaningfully-sized can, PPE for healthcare workers, which was actually a serious thing, and now we're seeing shortages with electronics, micro-processors, et cetera for the auto industry. And they said, the reason for that is that over the last 20, 30 years, largely inspired by Toyota, industry has adopted this just-in-time approach towards managing their supply chains. They've reduced to a dangerous point, to a fragile point, the amount of inventory in their system. So that was their premise, that somehow the shortages that we have today are due to the last 30 years of inventory reduction.

Dr. Spear (00:10:24): I would offer that their understanding of how these systems work is flawed. Their attribution of cause and effect, just-in-time to missing stuff, is really flawed logic in terms of cause and effect. And I think we should sort of step back and sort of unpack what we've been discussing. Why did we have shortages a year ago? Why do we have shortages today? And I think it's also worth talking about... How does just-in-time actually generate huge amounts of value where it's practiced well? And here's my concern, Gene. When you run an article on the front page of the New York Times, people will draw conclusions from it, and it might influence their thinking and it might influence their behavior. If you run an article on the front page of the New York Times and it's fundamentally wrong in so many important ways, but it carries the authority of the newspaper of record, then it might steer some people in really counterproductive directions. And that, for me, is a huge concern that people will read that, and whether it's policy-makers or decision-makers, public sector, private sector, that it leads them to the very, very wrong conclusions.

Gene (00:11:38): You had described what happened to supply chains due to the global pandemic as smashing a Swiss watch. Can you walk me through that?

Dr. Spear (00:11:45): So you think about a Swiss watch, it's famous for the precision with which it keeps time. And behind that is this idea that particularly for mechanical watches, there's exquisite engineering in there in terms of the precision of the parts and the precision by how the parts come together in an integrated harmonious fashion that they keep such exquisite time. Now, you start thinking about how you end up with something in front of you, product or service, that you can put to good use. That thing, that experience took shape because the contributions, the creative contributions of hundreds, thousands, tens of thousands of people came together in harmony. And the way in which they come together in harmony is the design, the development, the execution of these very elaborate business processes by which you know what to do, when to do it, how to do it, why to do it, for whom to do it, who you're dependent on. I mean, it's a very complex thing. COVID happens.

Gene (00:12:45): Actually, before you go there, how complex are supply chains? In fact, maybe we even talk about... What is a supply chain, and what makes it so complex?

Dr. Spear (00:12:53): A popular product like a smartphone, you would know better than me how many pieces of hardware are inside a smartphone, but if it runs into the thousands of discrete parts, I would not be surprised. Certainly the many, many hundreds, right?

Gene (00:13:07): So a car is probably like 25,000 parts. An airplane is like 10 million parts. There's a lot of parts.

Dr. Spear (00:13:13): These are crazy flipping numbers, and an airplane is really an interesting example, because an airplane is often built, assembled whether it's by Boeing in Washington or Lockheed Martin in Texas, or whomever else, or wherever Airbus does its magic. But oftentimes, a lot of the major systems, entire engines, wings, fuselage, is to a subcontractor, a partner in the enterprise who's far removed. 
So it's a lot of parts, and they come from very far away. And the reason they come from very far away is it takes a whole lot of skill, man, to design and build those things. And there's no one in the world... You could probably almost make the argument there's no one in the world today, no organization that singularly could design and build an aircraft, and that's why they tap into partners all over the world.

Dr. Spear (00:14:02): So that's part of the complexity, which is these things come from organizations which specialize in the discrete pieces because it requires so much scientific and technological knowledge to make the piece. And Gene, you used as your reference point... It's sort of ironic. You used as your reference point the number of parts in a watch, a car, or a plane, but how about the lines of code? I mean, how many millions and millions of lines of code go into a modern car, and certainly a modern aircraft? It's extraordinary. So anyway, I think your point's a good one, is that whether it's the literal parts or the figurative parts, there are so many of them coming from so many specialists, individuals, and enterprises around the world that this web of supply is wildly complex.

Gene (00:14:52): In fact, maybe we can just dive into one aspect of that. So it's one thing to identify all the things that you need to integrate into a whole, but the choreography required to make sure that they appear in the right place at the right time in the right quantity...

Dr. Spear (00:15:05): Right. Oh my gosh, yeah. I mean, Gene, it's fantastic. First of all, just to know all the pieces you need, it's staggering that anyone knows all the parts that go onto an airplane or in a car. When they're all done [inaudible 00:15:17], "Oh, we forgot one." It's staggering. It's staggering that they know where they come from. And that's just sort of like a structural thing, that there's a thing in a place, but then there's this whole dynamic element of... 
Steve is sitting on one side of the country, Gene is on another side of the country, and Steve gets a ping, and it tells him who needs what from him by when, and the qualities and the characteristics of that thing.

Dr. Spear (00:15:39): And Gene is getting a different ping on his side of the country, and that thing has an entirely different meaning to Gene in terms of who needs what, when, in what form, how it's created, et cetera, et cetera. I mean, we're not aware of it. We're aware of the little pings on our phone giving us some latest update on what the Kardashians are doing, but in reality, the world is pinging all the time, giving people insight into what they need to do to be useful and valuable to somebody else.

Gene (00:16:07): I remember in the Admiral Richardson interview, when Admiral Richardson sort of maybe oversimplified the complexities of manufacturing, and you brought up an example where the contention on the loading docks, having a minute granularity was too big. You need to shrink the window down to 30 seconds, given weather, traffic, and so forth. So that really opened up my eyes in terms of just how intricate the dance is to make these supply chains work. Is that typical?

Dr. Spear (00:16:33): It's unbelievable, Gene. Any time you pick up something, you should look at it and say a blessing and say, "I'm observing a miracle that this thing took form." Literally, everything. I live in the Boston area. I eat beautiful navel oranges in February. They did not come from anywhere in New England. The fact that I can sit there and go into the store, and not even a fancy store, and get a beautiful navel orange... You start thinking about the complexity of the systems that someone knew to grow that orange, God knows where they even grew it, and that it ended up in a cart, on a truck, on a plane or whatever, and ended up on my plate. It's staggering.

Gene (00:17:18): This reminds me of a conversation I had with [John Lauterbach 00:17:20]. 
So he came from the supply chain distribution, and he remembers the sense of marvel and awe as he sat in a parking lot of a Walmart. And he saw all the people with goods coming out, and realizing that sometimes, you have to have this... He looked out at the velocity of goods coming out and wondered, "What does it look like on the other side? There has to be an equivalent velocity of stuff coming in," and his mind marveled at that.

Dr. Spear (00:17:43): Oh, Gene, one of the best research days I've ever had, this guy indulged me. And we started I think in Maine at a Walmart warehouse, and worked our way down from a warehouse, to a cross-dock facility, to a Sam's Club, and then one of these super Walmarts. And so I did what you're describing, is I saw the back end of it all flowing in to allow people to leave out the front door. Oh my God, it's amazing. I just want to say that for people who haven't seen that kind of physicality of work, the concern people doing that work have for your wellbeing is just extraordinary. We're going through these facilities, and they're doing produce and that kind of thing. There was an apple, and it wasn't... And the guy I was working with turned to one of the associates and said, "This apple, would you serve this to your mother?"

Dr. Spear (00:18:34): And the guy said, "Nah. I don't know." He said, "We got to figure out how this apple ended up this far into the system," and this was way up in the system. "We got to figure out how this apple ended up in the system, because if you wouldn't serve it to your mother, why would we sell it to anyone to serve to their mother?" And then later in the day, I discovered that tomatoes and strawberries are really, really hard. And so we're at one step along this enormous complicated, fast-moving, complex supply network, and someone picks up a tomato and says, "That's a beautiful tomato." And you had these several adults standing around admiring this one tomato. 
So anyway, I just want to say for those of us who maybe we have jobs that we do from desks and offices in front of computers, next time you pick up a tomato, have some appreciation of not just the hard work and the complexity, but the care and concern that hundreds of people had that when you ate that tomato or served it to your mom, that it would be a good experience for you.

Gene (00:19:35): Oh, that's awesome. And so let's talk about the economic angle. So this reminds me of a conversation I had with some friends at Walmart, and they talked about how by leaning out the supply chain, change the replenishment strategies... I think this came from their CFO, that every time Walmart stocks out, the collective impact of stock-outs was $3 billion in sales. That's when you leaned out too much, and if you carry too much inventory, that $45 billion inventory, that shouldn't be there. So if you could reduce it by 1%, that's $450 million of capital that could be freed up and deployed to more productive things. So I have to imagine that's just a sliver of the total value.

Dr. Spear (00:20:15): Gene, you make such a good point, because the stock-out, which is what sort of triggered this article... All right, so there's the opportunity cost or the delayed gratification, which is, "Oh, I didn't get my beans now. Maybe I'll get them in the afternoon or tomorrow. I have to come back." But your point about the overstock, I mean, what really is an overstock? It's the presence of something that no one wants. And then you start taking that one step further. In order to have that overstock, what does it mean to have that overstock? It means that someone invested some hours of their life which they'll never get back. We have to keep coming back to this, which is it's people doing work on behalf of other people. That if you have an overstock, not an understock, you've asked someone to do something which no one cares about.

Dr. Spear (00:21:04): Anyway, so let's keep going before I lose it here. Anyway, an overstock is a pathology, and we should understand it as a pathology, because there's the carrying costs of the overstock, and there's the storage of the overstock, and there's the complexity of keeping track and moving it, and this thing and that thing, and actually the physical hazard of managing overstock situations rather than properly stocked situations. The other thing is anytime you walk into a situation and say, "Oh, there's an overstock," realize someone has been asked to do something and no one was in a position to say, "Thank you," for that effort.

Gene (00:21:37): So help me understand. Before we dive into the details of the supply chain, what do you think is the... Over the last 30 years as the supply chain has been leaned out, how much value has been recaptured so far? So global GDP is about $80 trillion a year. Just order of magnitude.

Dr. Spear (00:21:54): Look, I'm not an economist. I can't give you a precise number, but I would say... First of all, I'll say the introduction of just-in-time strategies has been a source of enormous value. And as a non-economist, the way I can sort of prove that, air quotes around prove, is that when we look at Toyota which has been the inspiration and the exemplar of just-in-time, this highly-synchronized, self-stabilizing, retuning, repurposing way of managing these very complex networks, their ability to create and deliver value into their market has been dominant, and it's been dominant for 30, 40 years. We can let listeners go out and sort of figure out how does that extrapolate to the entire economy, but it's very big. I'll give you another example. I was recently doing work looking at a company, and I have reason to believe that they've been wildly high fidelity in terms of taking lessons from Toyota and creating their own just-in-time systems across multiple business types.

Dr. Spear (00:23:04): You look at their rate of return over the last X years, and the last X years have been pretty good in terms of the S&P 500. They're about 5x each year. So if the S&P gives you back about 3% a year, these guys have given about 15% a year. And I looked at one of the numbers, it was crazy. If you look over a 10-year period, they have outperformed the S&P by 20,000%, or something crazy like that, but I'll stop pretending expertise in economics. But the impact of creating these well-tuned, well-synchronized, agile, resilient, complex, dynamic systems, the value of that to society has been simply off the charts.

Gene (00:23:45): [inaudible 00:23:45]. Okay, I have learned so much from this interview, and learning so much from listening to this interview again. I have a bunch of things I'd like to elaborate on, especially after I was able to do some research. So let's do the last one first, about how much value that was likely created by mass adoption of just-in-time practices. I scanned a bunch of academic papers, and I found one called An Econometric Analysis of Inventory Turnover Performance in Retail Services by Drs. Gaur, Fisher, and Raman, from respectively the Stern School of Business, Wharton School, and Harvard Business School. The figure that caught my attention was that inventory is typically 36% of total assets, 53% of current assets in 2003. That totals $449 billion. A metric that is commonly cited is called inventory-to-sales ratio. According to, the inventory-to-sales ratio is used to determine the rate at which companies are liquidating their inventories. Put simply, the inventory-to-sales ratio measures the amount of inventory that a company is carrying compared to the number of sales that are being made.

Gene (00:24:59): They say that efficient inventory is a crucial aspect when running a business, because when you keep a large stock, then you will risk not selling them, hence reducing the efficiency of the business operations. 
So they described the two extremes, a low inventory-to-sales ratio means that sales are high and inventory is low, which indicates excellent performance for the business. In other words, a low inventory-to-sales ratio means that the business can quickly clear its inventories by way of sales. This shows efficiency in the operation of the company, hence leading to high chances of making a profit. On the other hand, a high inventory-to-sales ratio means that the company is witnessing a high level of inventory compared to the speed of sales. This can be interpreted that the stock was not aligned with customer taste and preference, leading to dwindling sales for the firm.

Gene (00:25:49): When inventory levels are high, the firm might be forced to incur storage and maintenance costs, which reduce the profit margin of the organization. Okay, so that's the definition. I started looking for what data is available, and then ran into the amazing FRED site, run by the Federal Reserve Bank of St. Louis, which has amazing data of the inventory-to-sales ratio, but it only goes back to 1992. But then Twitter saves the day. I put out a request for help on Twitter, and James Cham at Bloomberg Beta linked me up with Dr. Daniel Rock, an economics professor at the Wharton School of Business. And he showed me that the FRED site actually lets you get access to data that goes all the way back to 1946, and actually has some tools to allow you to calculate the ratios. I wrote up a Twitter thread that summarized my results, and I'll put a link to that in my show notes.

Gene (00:26:42): I wrote, "Thanks to Dr. Daniel Rock, using FRED, I was able to calculate the ratio of inventory to sales from 1946 to now. If you divide up the data into two portions, the first is after the US exited the 1982 recession, the inventory sales ratios are about one-third lower than pre-1982." So now that we have that, let's take a look at the last data point, which is Q1 of 2021. 
At that time, total inventory carried by US corporations was $2.6 trillion, and total corporate revenue was $11 trillion. So here's the math. If the percentage of capital as a function of revenue tied up in inventory were expanded to pre-1982 levels, that would be $850 billion more of capital tied up in inventory. So that additional amount is about 7.7% of corporate revenue that has been freed up to more productive activities. To put that in scale, there's a widely-cited average that says corporate profit margins average around 7.9%.

Gene (00:27:52): In other words, compared to pre-1982 levels, the amount of inventory not being carried on the balance sheet is on par with the amount of profits being generated. That is a whole lot of value. That's nearly $1 trillion of inventory that would have been carried around had it not been for just-in-time. This reminds me of that Sam Walton quote from Walmart, "People think we got big by putting big stores into small towns, which is wrong. We got big by replacing inventory with information." In a reply, Dr. Rock responded, "It's a very interesting hypothesis and question. I'm not sure what's going on here, but there's a plausible story here," which I thought was super encouraging. Number two, let's go back to the beginning, when Steve talked about the massive number of suppliers you'll find in the automotive and aerospace ecosystem. I've always been amazed, and have never fully understood the stories about how the US automotive supply chain involves so many small companies.

Gene (00:28:53): I would read news articles that would talk about how many of them are small family-run businesses, which was apparently one of the reasons that the US government had to bail out General Motors in 2009, because of how many small companies were now at risk because they were suppliers for GM. I'm reading from a March 19th, 2009 Reuters article with a title of Half of U.S. auto suppliers face bankruptcy. 
The article reads, "More than half of the top US auto parts suppliers could file for bankruptcy protection in 2009, with at least one million job losses, according to a study by global consultants A.T. Kearney." An article from IndustryWeek, March 28th, 2019, headline reads "Suppliers Hit Hardest in GM Closures." Paul Erickson wrote, "... back in 2008 when General Motors was on the verge of bankruptcy, the U.S. Department of Commerce estimated that for every job GM would lose, and they would experience significant job losses going through bankruptcy, three supplier jobs would be lost." Gene (00:29:57): So apparently, these dynamics also occur in the aerospace ecosystem. There are some fantastic glimpses into this in the 21st Century Jet documentary about the Boeing 777 that was discussed in the episode with Dr. Ron Westrum. So in episode three, a Pratt & Whitney executive talks about how there are 50,000 parts in the engine that goes onto a 777, which is about one-quarter of the total price of the airplane. And he describes how during testing, they found that the turbine case could be ruptured, and had to be strengthened. And one of the most critical suppliers then became the Flanagan Brothers. This is a small firm. According to this documentary, it is, quote, "A sleepy town in Glastonbury, Connecticut." They've been around for over 50 years. Their first business was selling copper fruit bowls, and their expertise is metallurgy. Gene (00:31:03): Pratt & Whitney needed Jim Flanagan's help to create a new jet engine housing that would be made of nickel alloy that could weigh 4,000 pounds because it's six feet in diameter, and Jim Flanagan figured out how to reduce that weight by 90%. The expertise that was needed is creating these large diameter parts made of strong materials that can withstand high degrees of heat without distortion, requiring drilling, milling, exotic grinding.
So these are not conventional skills and they had those capabilities. There's a fantastic quote by an executive at Boeing who said, "Despite all the engineering might at Boeing and Pratt & Whitney, we are depending on the Flanagan brothers to design this case. The ability for the first Boeing 777 to fly hinges on this last-minute need." Gene (00:31:46): So you can find that story in episode three of 21st Century Jet. In episode two, another small supplier shows up. They talk about how ASTA is manufacturing the rudder that goes on the 777 that needed to be made stiffer. There's a fantastic interview of Al Tyler, rudder program manager for ASTA, who's having to deal with these ever-changing requirements from the Boeing 777 team. Here again, the delivery date of the Boeing 777 hinged upon a very small supplier. I looked up ASTA and around this time, they had only 95 employees. It's easy to imagine that design changes and pressure like this could have potentially driven them out of business. Gene (00:32:33): Interestingly, they were acquired by Rockwell and are now part of Boeing. I'm speculating here, but I wonder if those acquisitions were driven by the fact that such a small supplier could actually jeopardize the entire Boeing 777 program. I am just fascinated by how many technical niches there are for all of these suppliers because of what Steve calls the multitude of all the unique and local idiosyncrasies of all the problems being solved and how critical all of the suppliers are to make all of these amazing products. Okay. That takes us to number three. We just talked about the shape of the supply chain for the automotive and aerospace industry, where there are so many suppliers. Gene (00:33:17): Steve had asked me if I know how many parts go into an iPhone and I didn't, and I was actually very surprised at what I found. Apparently, the number of parts in an iPhone is around 350 parts according to CNBC.
That seems like an astonishingly small number of components, but consider that the iPhone uses 70 of the 82 usable elements in the periodic table. So I'm speculating that supply chain is actually much deeper. In other words, those suppliers have suppliers that have suppliers. So there are 43 suppliers listed in the official iPhone supplier document. Again, not my area of expertise, but super interesting that supply chains differ so much by industry, which gets us to number four, which is how much code goes into these products. Gene (00:34:11): There is a phenomenal information graphic by Visual Capitalist. The title is "How Many Millions of Lines of Code Does It Take?" In 1990, Photoshop took about 100,000 lines of code. Quake III took about 300,000 lines of code. The space shuttle, 400,000, the F-22 fighter jet, about 2 million, the Linux 2.2 kernel, about 2 million, Windows 3.1, 3 million, Windows NT 3.1, 5 million, the Boeing 787, 8 million, the F-35 jet, 20 million, Windows 2000, 30 million, Microsoft Office 2013, 45 million, the Large Hadron Collider, Windows Vista and Microsoft Visual Studio 2012, 50 million, Facebook, 60 million, Mac OS X Tiger, 85 million, car software for the average modern high-end car, 100 million, and Google for all their internet services, 2 billion. Gene (00:35:23): And it's interesting to note that these days for most applications, 95% of code that shows up in an application is not code that we wrote. Instead, we pull it in through software dependencies. For several years, I got to work with Dr. Stephen Magill analyzing the software supply chain within the Java Maven ecosystem. So Maven is to Java what npm is to JavaScript, gems are to Ruby and so forth, and we got to see how long it took for vulnerable components to be addressed and for those fixes to propagate through the transitive dependency graph. I'll put a link to that in the show notes as well, which gets us to number five, the dominance of Toyota.
There's a fantastic video from random stats that shows an animated infographic showing the top 15 biggest car manufacturers in the world from 1999 to 2017. Gene (00:36:17): It shows not only the rankings, but how many cars they sold from 1999 to 2017. You may have seen a similar one from data grapha showing Moore's law and transistor counts from 1965 to 2019. I'll put a link to both of those in the show notes, but in 1999, the top manufacturers were GM, Ford, and Toyota selling 8 million, 7 million, and 6 million cars respectively. In 2004, the top manufacturers are still GM, Toyota, and Ford. By 2012, Toyota takes the top spot at 10 million cars with 9 million cars being sold by GM and Volkswagen. Gene (00:37:01): And by 2017, Toyota and Volkswagen are tied for first at 10 million cars with Hyundai selling 7 million cars. It's amazing to see Volkswagen surge in volume, which Dr. Amy Edmondson talks about in her fantastic book, The Fearless Organization, about how their goal to become the number one automotive manufacturer is likely what led to the diesel emissions scandal. Okay, let's go back to the interview. You had mentioned in that Swiss watch, the metaphor, the analog is the amount of lubrication you need. The amount of lubrication... Walk us through that. Dr. Spear (00:37:35): Oh, Gene. So this is a good thing, and I want to tie it back to this phrase lean, and just in time because the phrase lean, I think is in many regards misleading. It's sometimes used synonymously with just-in-time, which is used synonymously with what Toyota created, but it's misleading because lean sounds like you deliberately stripped away the inventory, or in the case of a watch, a lean watch, you deliberately stripped away the lubrication. Now let's build back up from the watch example. The reason a watch has lubrication is because of people's inability to create perfect gears.
If we created perfect gears, then they would operate without friction, without binding, without mismatch and so on and so on. Dr. Spear (00:38:24): In the world, there are people who specialize in the creation of perfect or near perfect gears, and those gears are unlubricated. They're very expensive because it requires a huge amount of talent to design and create them, but they're not lubricated and the reason you have lubrication in gears is because those gears are imperfect. Now, and in fact, you could think that the less perfect that gear, the more lubrication you need to compensate for its imperfections and all its mismatches, all right. Now let's carry that from the watch into the world of supply chain. Dr. Spear (00:38:57): Why typically do you have inventory is because there's a mismatch between the demands that people have on the system in terms of wanting or needing goods and services and the ability of the system to match what's wanted in terms of volume, quality, quantity, time, location, place, et cetera, et cetera. If you thought about perfect matching would mean I want something, boom. There it goes. You want something, boom. There it goes. Mismatching means that someone's got to keep a little bit extra just in case, that the system is not aligned with what you want. Dr. Spear (00:39:34): Now, the reason I just want to spend a little bit time talking about just-in-time versus lean is that one of the attributes or characteristics, but it's totally a dependent variable of these systems that are really good just in time is they require less lubrication in the form of inventory. The mistake a lot of people think is that the removal of the inventory was the decision to cut costs, and that's not the case. It's like, again, back to the watch, you will never buy an expensive watch and then remove oil from it to cut the cost of the oil. You would have a better and better watch, which would require less and less oil. 
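Editor's sketch: Dr. Spear's point that inventory is a dependent variable of mismatch, not a cost to be cut, can be illustrated with a small calculation. The Python below is a minimal sketch and not anything described in the episode: the function name `required_buffer` and the delivery schedules are hypothetical, chosen only for illustration. It assumes demand and deliveries are counted per period, and it computes the smallest starting inventory that avoids a stockout.

```python
from itertools import accumulate


def required_buffer(demand, supply):
    """Smallest starting inventory that avoids any stockout.

    demand[i] is the number of units wanted in period i and
    supply[i] is the number of units delivered in period i.
    (Hypothetical model for illustration only.)
    """
    if len(demand) != len(supply):
        raise ValueError("demand and supply must cover the same periods")
    # Running shortfall: cumulative demand minus cumulative supply.
    shortfalls = accumulate(d - s for d, s in zip(demand, supply))
    # The worst shortfall is what the buffer has to cover; it is never negative.
    return max(0, max(shortfalls, default=0))


demand = [10, 10, 10, 10]
matched = required_buffer(demand, [10, 10, 10, 10])  # deliveries track demand
alternating = required_buffer(demand, [0, 20, 0, 20])  # every other period
lumpy = required_buffer(demand, [0, 0, 0, 40])  # one big delivery at the end
print(matched, alternating, lumpy)  # prints: 0 10 30
```

In this toy model the total delivered is the same in all three schedules. Only the matching changes, and the buffer goes from 0 to 10 to 30 units, which is the sense in which a better-matched system simply needs less "lubrication."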
Gene (00:40:13): Let's go to the proof point, which you sent me this morning. That blew my mind. So it is well-known that the automotive industry, in fact appliance manufacturers, I mean, they're all, in some cases, we're looking at lead times measured in six, seven months because of the lack of availability of semiconductor chips. I did not know until I read this article that apparently, Toyota is actually weathering the supply crunch better than other auto manufacturers in a very surprising way. Steve, can you walk us through this article? Dr. Spear (00:40:44): Yes. The first line is, if you just read the first line and there was an article about how Toyota learned from the Fukushima disaster in 2011. I think it was in Reuters. We should give credit to where it is, but that they learned from the disaster, the Fukushima disaster in 2011, and if you read first glance, see, it proves the point. Just in time is very fragile because Toyota is stocking these electronic components for cars. All right. You got to read more than the first line. What happened in 2011 is that Toyota, like anyone else in Japan, really got whacked by the Fukushima disaster. I mean much like we got whacked by COVID because it basically dismantled these complex systems we depend on. Well, Fukushima de-powered these systems because the electricity got cut off, and in addition to all the social, all the human suffering and whatnot- Gene (00:41:39): Actually, just to interrupt you for a second. So this is the nuclear accident that resulted in hundreds of thousands of people being evacuated, power disruptions. Dr. Spear (00:41:47): Tsunami, badly placed generators, flooding, this thing, that thing, loss of the plant. Fukushima caused everyone in Japan to lose productive time.
Toyota, first of all, I think if we go back and we look, Toyota was probably the fastest Japanese automaker to recover because they're actually very, very good at reconfiguring systems which were meant for something else to fill the gap and we can explore examples of that, and so I think if we go back, and I feel very confident in saying this, though I don't have the article in front of me, that in 2011, Toyota was fastest to recover of all the majors in Japan. Dr. Spear (00:42:22): Anyway, park that idea. What did Toyota learn out of that experience? In the course of recovering, they realized that across their entire product line, which must be tens and tens and tens and tens of thousands of parts, that there were 500, 500, which they were very vulnerable to disruption in the supply of those 500, again, out of the tens of thousands of parts that go into their entire product line. So what they did is they got with the suppliers of those particular parts and said, look, everything else we're running on these same day... It's crazy. In Japan, I've seen this, Gene. Dr. Spear (00:43:00): It's not same day delivery of parts. It's same hour, same 30 minutes. It was like, people delivering things with such precision that they're delivering 15, 30 minutes apart at a time and just showing up every 15, 30 minutes with the placements, but anyway, they said, look, most of our system, we're going to run with that level of precision integration, that kind of thing, but your part, if there's another one of these Fukushima things, we're really screwed and so we're going to ask you as one of the 500 out of the tens of thousands, for that, we want to put a little bit aside just in case. Now, before we say, well, that still somehow weakly proves the point that just-in-time is fragile and just in case is necessary, think about how Toyota was able to do that.
Dr. Spear (00:43:44): The only reason they were able to come to that degree of precision, those 500 parts out of the tens of thousands is that in order to set up a well-functioning just in time system in the first place, you have to have such clarity, Gene, such clarity about what is needed by whom, where and when, why, and from where it comes and how it gets there and how it's made and all the inputs and outputs. You have to have such a granular, detailed, precise visualization of your system as a whole that you can see this whole thing and say, aha, that's the one. Those are the 500 out of the tens of thousands. This conversation has been premised a lot on the production, the delivery, the availability of physical things and things like lines of code. Dr. Spear (00:44:32): But I'm starting to think about how much collective intelligence Toyota had to apply in order to figure out what the 500 critical ones were. In my book several times in The High Velocity Edge, I talk about the role of leaders in developing capability in chapter seven, chapter nine, et cetera, and it just dawns on me that to have all the products Toyota makes around the world and that they could find the 500, it wasn't like someone at Toyota headquarters sat down two guys, Gene and Steve sitting there and saying, oh, Gene's on which part? Steve's on. No, it was thousands and thousands of people who had the intellectual acuity and acumen to contribute to that discovery. It's staggering. It's collective genius. Gene (00:45:16): It's interesting because it occurs to me that's not inwardly focused. It's really outwardly focused. You have to, all your suppliers, all your vendors, you have to understand what are their vulnerabilities, that it's kind of triggering an awe inspiring line of thinking. Is that kind of what you're thinking as well? Dr. Spear (00:45:31): Oh my God. Yeah.
Gene, I'm just sitting here thinking about the collective intelligence required just to identify the 500 parts because you start thinking about it because tens of thousands of parts. There's no one in the world who knows what those parts are and even when you narrow it down to the 500, it means there are 500 separate conversations with suppliers about which part and where it's going to be stored in what quantity. I mean, oh my gosh. Gene (00:45:53): There's three things I'm pulling out of this article right now. One, it was driven by their business continuity planning function. So that led to the two to six months of buffer for these things. Two is Toyota has been thus far largely unscathed by the global shortage of semiconductors following a surge in demand. It is the only automotive maker properly equipped to deal with chip shortages, according to a person familiar with Harman International, the audio systems supplier. Toyota surprised rivals and investors last month when it said its output would not be disrupted significantly by chip shortages, unlike Volkswagen, General Motors, Ford, Honda, Stellantis, and others, as they have been forced to slow down or suspend some production, and its earnings forecast has been increased by 54%. So maybe they're the only people selling cars and so they're now benefiting from the ability to maintain production. Dr. Spear (00:46:42): Gene, it's inspiring. Gene (00:46:44): Should we talk about maybe based on what we've just learned here about the simplification stabilization aspects of this, or should we go to the Toyota brake fire in 1997, one of the many examples showing how resilient the supply chains and manufacturing capabilities of Toyota are? Dr. Spear (00:47:00): Yeah, well, let's go with the factory fire. So for those who haven't read that case, which they can find in chapter nine, Toyota had a supplier of a P valve.
It was a valve that went on a line feeding into the brakes, and the supplier suffered a catastrophic factory fire, let's say, on a Saturday and it was like Wile E. Coyote losing his shack in a Road Runner cartoon. It was gone. The only thing they had left was some sample parts and some drawings of parts, and it's interesting because if you look at the press coverage that followed on Sunday, it was predicting, actually Gene, it was one of these same things. Oh, this proves the fragility and vulnerability of just-in-time systems, and- Gene (00:47:50): February 4th, 1997. Yes. Dr. Spear (00:47:53): There you go. Right. It just proves it, and of course, as you spin through that case, you find the Wall Street Journal article from a few days later saying, that's odd. Toyota's up and running with no apparent disruption in production at all. Gene (00:48:06): Actually, just to pause there for a moment. Yeah. So it's that $10 P valve part that hobbled the 14,000 car per day output. They were projecting Japanese GDP would be reduced by 0.1% per day. Dr. Spear (00:48:20): Per day. Gene (00:48:21): Per day. All right. So very grave prognostications about what would happen. Dr. Spear (00:48:26): It's horrible. That just in time thing really is an awful thing, but anyway, how does the story end is that within, I think, what, about three days, Toyota's back in production and within a week, it's back in full production and a few days after that, it's running a little bit of overtime just to catch up with the things they missed over Monday and Tuesday, and so, in the case we explored, well, how is that possible? If you're running a system which is, and now I'm going to deliberately use the term lean in the way it's the wrong use of it to me, low inventory, how can you run a system with just such low inventory, suffer the catastrophic loss of the only supplier in the entire world for that part and recover within a few days?
Dr. Spear (00:49:08): And the answer is, and Gene, it gets back to our theme here of a distributed and collective intelligence is that Toyota was able to go into its supplier network and say, look, we need to take advantage of your really, really deep understanding of the complexity of your own systems and we want to ask you to reconfigure them, and here's an example of a part, here's the drawings for such a part, and here are the technical experts who understand material properties, et cetera, et cetera, but go do... Again, this is Richard Sony and this is that extreme delegation. In your own plant, regardless of what it does, whether it makes sewing machines, whether it makes prototype parts, whatever it does, get with your people and figure out how to make this precision safety critical P valve. Dr. Spear (00:50:00): And wouldn't you know, even makers of non-automotive parts, if they're part of this Toyota ecosystem, they start showing up with production ready parts. So anyway Gene, look, in 1997, everyone said, this proves the fragility of just-in-time. That is the misinterpretation. They saw capital burning down. I mean, the bricks and mortar of a building, and they said, you lost the bricks and mortar, you lost the metal equipment inside. You no longer, but what they missed was capability, Gene, and capability is resident in people and people, when we're really capable, we have a tremendous ability, but also tremendous agility and the capacity for repurposing and redirecting our efforts. Gene (00:50:45): Amazing. By the way, there was something that you said that I was not actually aware of to the extent of which there was a catastrophic loss. If I heard you correctly, the ability to actually tell people how to make these parts, it sounds like that was also impaired. To what extent is that true, or did I just misinterpret that? Dr. Spear (00:51:02): I don't think you misinterpreted it, Gene. Look, here's what they had. They had parts.
They said, this is what a P valve of this type or that type looks like. There you go. They had drawings, engineering drawings of those parts. They had the expertise of the engineers who had designed those parts, and both the design engineers and the manufacturing engineers who knew the design and the making of those parts, but they didn't have the equipment. See, that factory because it was the sole supplier, and that's what they did. They had very, very specialized equipment for making those parts. Dr. Spear (00:51:34): No one else in the world, or at least within the Toyota ecosystem had similar equipment. So when Steve's factory burns down like Wile E. Coyote's shack all the way to the ground and Gene is asked to make those parts, Gene's got to take equipment which was never meant to make P valves and repurpose it. Now, how does Gene do that? Because the resident capability of the people within Gene's organization is such that they can look at the part, they can look at their own equipment, and they can reconfigure their own processes to make that P valve, even though they've never made one before. Gene (00:52:10): And I'm quoting from your book, one of the things that really delighted me was some of this equipment, they stripped exhibition demo models from showrooms. They took equipment that was already sold to other people and repurposed that in service of this mission to recreate this capability, to build this never made before outside of that plant, a P valve part. Dr. Spear (00:52:28): A hundred percent. Gene (00:52:30): Okay, so because my other question, which is, so my interpretation of this is that basically, the entire Toyota supplier network was mobilized in service of creating this capability. Can you talk about what's the motivation for someone to actually go along with this?
I mean, they all have their own profit motives, and I know there's a very holistic, what's good for Toyota is good for everybody, but this has got to be, how do you concretely take these organizations that have their own missions and other lines of business all divert such effort to this emergency for Toyota? Dr. Spear (00:53:08): Yeah. So Gene, I guess it gets to the question, do you want to play on a winning team or not? What I mean by that is I think ESPN, during COVID, had a thing on Michael Jordan and the dude is out there in terms of his... I mean, first of all, the physical skill's incredible, but that was necessary, but not sufficient. The amount he was driven, his competitiveness, his absolute determination to extract the most he could out of his own innate potential was off the charts. It meant that if you were on the Chicago Bulls during the Jordan era, you had to push yourself to the same level. Even if you didn't have the same physical gifts, you had to have the same character gifts. Now, if you did, you got to play with Michael Jordan and the other people on that team, Scottie Pippen, et cetera. Dr. Spear (00:53:59): And you got to win a succession of championships. Look, I'm going to throw this in as a little bit provincial here. In New England, Bill Belichick runs a different kind of football team and not everybody can go along for the ride with Bill Belichick and Tom Brady and all that because they exist way outside of what any of us can imagine in terms of commitment to the craft. However, if you're willing to commit to your craft in a similar fashion, you get to be on the Patriots and just have to sit around all day polishing your trophies and rings where others don't. So this ties back to why were the folks part of the Toyota network willing to do this? I don't know. It's like you're asking why did the folks on the Chicago Bulls train so hard? I mean, that's who they were. 
They wanted to be winners, they were committed to being winners and they were winners. Gene (00:54:54): What are the simplification and stabilization artifacts that we can see that are exhibited in the 1997 recovery from the Toyota brake fire? Dr. Spear (00:55:03): Yeah. So I had a chance to speak directly with people who were involved in that. This whole, and again, I want to say very, very high performing just in time approach depends on building systems with very fast, very frequent, very fine-grained feedback, and this is why you can operate with so little inventory because when something is slightly out of whack, it generates a signal somewhere to pick up, speed up or slow down, do more or do less, do this or do that. It's these systems with these incredible properties of feedback and self-correction that they always seem to be in tune even with a world which is unpredictable. Dr. Spear (00:55:52): And so anyway, let's take that feedback story forward. So what happens is that you have people who are inside this ecosystem and that their means of being depends on feedback triggering self-correction, self-improvement. So now you hand them this P valve and say, all right, can you make it, and truthfully, the first answer is no, I've never made a P valve. The first question, what is even a P valve, but these are people who've been socialized to this ethos of have an idea, build it at small scale, get quick feedback, modify your understanding, build another version, build another version, build another version. If you're really uncertain, build multiple versions and have them test it in parallel and get very, very fast feedback, both local feedback and systemic feedback. Dr. Spear (00:56:48): I mean, Gene, just as sort of a parenthetical, I mean, Toyota, I think pioneered a lot of the practices that the IT community has adopted as agile, which is when faced with a new situation, assume ignorance.
You just assume you just don't know, and if you assume ignorance, but you know you need to have great wisdom and understanding, the only way to acquire that is learning loops and the way to get to wisdom quicker, better, faster is have richer learning loops that occur more and more frequently. I don't think we should be surprised by that because that's in the IT system, the lean startup stuff that Eric Ries and Steve Blank have written about. It's feedback based, learning based. Gene (00:57:30): So you had ended on this, you're alluding that the tech space has learned a lot through this kind of fail safe experimentation, and what that immediately brought to mind and what I had shared with you, I think earlier was the famous Netflix Chaos Monkey, where Netflix is one of the few organizations to actually survive the Amazon Web Services failure that took out almost every one of their customers in 2011, that was running in kind of one particular availability zone, US-East-1. Gene (00:58:00): And so for days, people were wondering what was Netflix doing differently that caused such a different outcome for them and they shocked the world by describing how they had been running something called Chaos Monkey that randomly kills virtual machines in the cloud in the middle of the day, so that developers got used to single servers just disappearing and enforcing, encouraging developers to design things in such a way that it could survive such a failure, and so it's no surprise that they were very good at creating things that were resilient enough so that when one availability zone went down, everything could keep running. Dr. Spear (00:58:41): Right. Gene (00:58:42): Steve, this reminded me of an amazing thing that you wrote about, about what happened at the Aisin mattress factory plant.
From my notes, they had two production lines, each capable of producing say 100 units per day and on slow days, they would send all production onto one line experimenting with ways to increase capacity, identify vulnerabilities in the process, knowing that if something went really wrong, they could just switch over to the second line. To me, those stories sound very, very familiar. Can you react to that? Dr. Spear (00:59:12): First of all, I think that the stories are very similar, if not identical. So what you have in both cases is organizations, in one case, Netflix, and the other case, a Toyota supplier. In both cases, there's a sense of vulnerability carried by management that if somehow, they needed to overload their system by design or their system got overloaded by act of God, that the system might crash. So what do they do? In both cases, I think the way the narrative goes is they take advantage of moderate times. You said the Netflix was midday, right? Gene (00:59:47): In midday. I used to think it was in the middle of the night, but no, they wanted to take compute servers down in the middle of the day because the developers are already at work. They don't have to, if things go wrong in the middle of the night. Dr. Spear (01:00:01): So I'm going to offer another spin on that. Let me come back. I think the way the case was set up in the book, each line had a rated capacity of a hundred units. So on days when they had, let's say instead of a hundred units, I'm sorry, a total of 200 units demanded so 100 each. Rather than taking 180 units and going 90 and 90 or 180, they instead went 120 and 60, and why did they do that? Because they knew how to do a hundred and they certainly knew how to do 90 and 80, but what they didn't know is how to operate at 120. So here, on a day where they could underload one line, they could overload the other and find out all the problems that would present themselves. 
They could see those problems at 120 and then solve them and gradually raise their capacity. Dr. Spear (01:00:54): Now here's where I think the story runs similar is if you're telling me that Netflix did this midday, here's what's popping into my head. People are not watching Netflix midday. They're watching it in the evening and the night time. So if you're saying midday, Netflix is doing it because it's when they're developers are "on the clock". So what you're saying is that's an under demand, over capacity moment. So why not do a stress test because what's going to go wrong? You've got all your developers and you don't have the eight o'clock heavy viewing crowd, which is going to riot because they can't turn on to their latest installment in the Marvel universe. So anyway, I think the stories, at least the way you told me your story, it sounds very, very similar to the story I observed from the folks at Toyota or Toyota's supplier, Aisin. Gene (01:01:49): And what's the underlying principle at work that was demonstrated at that Aisin plant? Part of it was this desire to figure out ways to increase capacity, but certainly, there's more to it than that. Partly, it's experimentation? Gene (01:02:02): Certainly there's more to it than that. Partly is experimentation, doing it safely. Dr. Spear (01:02:05): I think in both cases, there's a sense that more can be known about getting value out of the system in a way that's safer and safer. So, in the case of Netflix, at least the way you gave the accounting, it wasn't so much that they wanted to increase their capacity to serve yours. Now that might've been the reason, but they were concerned about an unplanned and unpredicted outage that would cause them disruption. And so what they were trying to do is simulate that outage so they wouldn't suffer from it. Now, in the case of the Toyota supplier, the issue was less worrying about an unplanned outage, but to take advantage of marketplace demand. Dr. 
Spear (01:02:54): But I think in both cases the underlying belief was that whatever we're doing today doesn't reflect all that can be done, because what we're doing today reflects what we know today and does not necessarily reflect what could be known. So, in the case of both Netflix and this Toyota supplier Aisin, what they did was they created a stress situation, which forced people to solve problems which they ordinarily wouldn't even be seeing. And by being in this stress situation, not only were they solving the problems, but they were building useful, practical knowledge. They get added capacity, added agility, added resilience, added all those very desirable properties. Gene (01:03:39): I want to thank all the listeners who have sent their encouraging feedback. Here are some of the comments that we've seen on Twitter. [Nico 01:03:47] writes, "One of the best podcasts out there is with Gene Kim and ITRevBooks, thoughtful, insightful, and entertaining. The latest episode with Dr. Ron Westrum was brilliant and hits home on why so many organizations struggle to innovate. Add to your playlist." [Deca Guppter 01:04:02] writes, "A gem that's worthy of listening to again and again. And I learned so much. All of us read so much, but listening to him was something else. Thanks, Gene and ITRevBooks." [Adam Zimman 01:04:14] writes, "The season is on fire. It's been like the ultimate candy store of leadership principles. If you spend time thinking about leadership and organizational design, the conversations have been remarkable. Keep them coming." Gene (01:04:24): And longtime friend and collaborator Dr. Nicole Forsgren writes, "I cannot recommend this enough. I'll be listening to it again. It's just so good." [Kate Lanyon 01:04:32] writes, "I had to say a big thank you to you and Trent Green for The Idealcast episode on the mass vaccination site. It's a big inspiration to us here at Eugene and helped us set some important goals going forward." 
By the way, she also sent me this amazing diagram of the notes she took from that episode. We will put a link to her diagram in the show notes. Okay. Thank you again, for all the kind words. You can reach me anytime on Twitter. I'm @RealGeneKim. Let's go back to the show. Before we leave this topic, we've talked so much about how people who participate within the Toyota production system supplier network are famous for being able to do, say, 60 line side store changes per day, versus more tightly coupled systems where disastrous things happen when you do six. What's the link between this kind of experimentation behavior happening in the Aisin plant and the low cost of change? Dr. Spear (01:05:31): So, the experimentation at the Aisin plant comes back to this idea that we simply don't know all that's knowable about the work we do. So, what we need to do is take advantage of opportunities to add to what we know so we can put that wisdom to good practical use. Now, something else we've discussed is this idea that given the system you're running is always going to have absorbed uncertainty. And given the system you're running, you always want to have the opportunity to creatively solve problems and build your understanding. The previous conversations we had were around this idea of design your systems to allow you the latitude to experiment, reduce the cost of experimentation. All right, now, let's spin that back to the Aisin folks. I bet, and unfortunately I wrote that case 10 years ago, before "design systems to reduce the cost of experimentation" was really top of mind. Dr. Spear (01:06:38): But just by recollection, that system was designed where experimentation was wicked, wicked cheap. And I'm just going by the memory I have of that system being ... Oh, well, Gene actually. So, we've talked about what are the attributes of a system that is designed to reduce the cost of experimentation? 
So, one, it's linear versus non-linear, and the linearity allows partitioning and nesting, which reduces the cost. The system has standards. So, built into how you're doing the work are refutable hypotheses. The systems are inherently stabilized so that a local disruption doesn't become a systemic disruption, and they're synchronized so that they don't get out of whack. Now, one thing I can say with great confidence, the production line at this Toyota supplier Aisin had all of those qualities. It was simple and linear. It was standards everywhere. It had stabilization and synchronization. So, that system was designed to make the cost of experimentation very, very low. And wouldn't you know that when given the opportunity to experiment, they took full advantage of it. Gene (01:07:53): Gene here. Okay, a couple of clarifications and amplifications. One, the Fukushima nuclear disaster. According to the Wikipedia page, the Fukushima Daiichi nuclear disaster in 2011 was caused by the Tōhoku earthquake and tsunami. It was classified as a level seven international nuclear event, joining Chernobyl as the only accident to receive such a classification. It resulted in 154,000 people being evacuated. The earthquake was on March 11th, 2011, generating a tsunami 14 meters or 46 feet high that arrived shortly afterwards, swept over the plant's seawall, and flooded the lower parts of reactors one through four. Incidentally, the Fukushima disaster also shows up in Dr. Amy Edmondson's book, The Fearless Organization, serving as a case study about both how warnings were ignored as well as how leadership can be done in a psychologically safe way. A fantastic book. Number two, there was a great Forbes article that describes the steps that Toyota took after the Fukushima disaster. Gene (01:08:57): The article says that most of Toyota's Japanese based production plants were closed because of the supply chain suffering from a widespread shortage of parts that persisted for weeks. 
To reassure investors, leadership quickly issued a litany of plans, policies, and declarations to ensure that such a disruption could never wreak the same degree of havoc. The most ambitious was its aim of building an earthquake-proof supply chain. The claim appeared bold, even foolhardy, given the company was based in one of the world's most geologically active regions. But it spoke more to the mitigation efforts of containing the effects of sudden shocks rather than preventing them. Since the 2011 incident, Toyota has worked closer with local suppliers to share supply chain information to protect Japanese manufacturing. This is a database of supplier information that identified vulnerabilities and parts information of over 650,000 supplier sites. The reason why I bring this up is that it's not just a vendor manager in Toyota understanding their supplier. Gene (01:09:56): Instead, that vendor manager must understand that supplier and all of the supplier's suppliers, all of its dependencies, or as we talk about in software, all their transitive dependencies, to fully understand the extent of the supply chain risk, an extremely ambitious undertaking. Number three, Steve talked about the Aisin Corporation. According to its Wikipedia entry, it is ranked 359 in the Fortune 500. It has $30 billion in annual revenue. Number four, the famous Netflix Chaos Monkey. So, many of us will remember April 22nd, 2011, because it was one of the first widespread AWS or Amazon Web Services outages. Every customer that was reliant upon AWS east one availability zone went down. And there was one very curious exception, which was Netflix. For days people were wondering, what is it that Netflix is doing differently that enabled them to have such a radically different outcome than so many other people using AWS? Gene (01:10:54): The answer was revealed in a now famous blog post called Five Lessons We've Learned Using AWS. 
They write, "We've sometimes referred to the Netflix software architecture in AWS as our Rambo architecture. Each system has to be able to succeed no matter what, even all on its own. We designed each part of that distributed system to expect and tolerate failure from other systems on which it depends. One of the first systems our engineers built in AWS was called the Chaos Monkey. The Chaos Monkey's job is to randomly kill services within our architecture. If we aren't constantly testing our ability to survive failure, then it isn't likely to work when it matters the most." It's such a great article. So, later that year in July, Netflix wrote another article about how Chaos Monkey was just one of what they called the Simian Army. The other members of the Simian Army include Latency Monkey, which introduces delays into all their restful client server interactions to simulate service degradation. Gene (01:11:57): Another one is Conformity Monkey, where they delete instances that don't adhere to best practices and just shut them down. Dr. Monkey takes unhealthy servers and takes them out of rotation, and they proactively investigate them, trying to figure out how they got into that weird state. And Chaos Monkey gained a larger sibling called Chaos Gorilla. So, Chaos Gorilla simulates an outage of an entire availability zone, not just a single server. And later came Chaos Kong that actually simulates the downing of an entire region. And now there is a whole field called Chaos Engineering. So, this is a subset of resilience engineering focusing on how to operationalize these type of practices that were pioneered at Netflix. Okay. And this gets us to number five. Steve talked about the four characteristics of simplification, standardization, stabilization, and synchronization. I want to take a moment to describe what these are. 
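As an aside for readers who want to see the Chaos Monkey idea in miniature, here is a toy Python sketch. It is not Netflix's implementation: the `Service` class, the `chaos_monkey` function, and the `recommendations` caller are all hypothetical names invented for illustration. It only shows the two halves described above, something that randomly kills instances and a caller written to tolerate that failure.

```python
import random


class Service:
    """A toy service with several instances; it is 'up' while any instance is alive."""

    def __init__(self, name, instances=3):
        self.name = name
        self.alive = [True] * instances

    def is_up(self):
        return any(self.alive)


def chaos_monkey(services, rng):
    """Kill one randomly chosen live instance; return (service name, index) or None."""
    candidates = [(s, i) for s in services for i, ok in enumerate(s.alive) if ok]
    if not candidates:
        return None  # nothing left to kill
    service, index = rng.choice(candidates)
    service.alive[index] = False
    return service.name, index


def recommendations(service, fallback=("popular title",)):
    """A caller that expects failure: it degrades to a fallback instead of crashing."""
    if service.is_up():
        return ("personalized title",)
    return fallback


rng = random.Random(0)  # seeded so the run is repeatable
services = [Service("recs", instances=2), Service("ratings", instances=2)]
killed = [chaos_monkey(services, rng) for _ in range(3)]
print(killed, recommendations(services[0]))
```

The point of the sketch is the pairing: the random killer is only useful because the caller has a declared fallback, so running it in quiet hours exposes any caller that does not.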
Gene (01:12:51): So, in the realm of structure and dynamics, these are the four properties that we are defining as necessary to get these amazing characteristics of being adaptive, resilient, and having a low cost of change. We've talked about some of these concepts in previous episodes, but over the last couple of months, we've started to concretize these concepts. So, briefly simplification are what are those linear and explicit flows of work. And these are what allow and enable modularity, meaning partitioning and nesting. Standardization means that everything has a strong declaration of how they will work and what will happen when those conditions fail. Stabilization is that local effects stay local as opposed to causing global chaos and disruption. And synchronization is that all coordination can be done locally, now what we're calling fast mode, not requiring the coordination of a centralized mothership that knows all such as a centralized production control system. Gene (01:13:49): So, for the sake of intellectual completeness, let's see if the Netflix Rambo Architecture satisfies those four characteristics of simplification, standardization, stabilization, and synchronization. So, the first property is simplification, which is ideally achieved through nesting and modularity. And so this is certainly the case at Netflix, which is considered one of the exemplars of this practice. Microservices and service oriented architectures are all being used at Netflix to make sure that things stay modular, to make sure that services don't entangle with each other. And what I read from that blog post that those services were independent of each other, being able to operate on their own, even if dependent services fail. So, yes on simplification. Standardization speaks to the very strong declaration that things will behave in a certain way and to find what the explicit fail-over behaviors are. 
So, much of the Simian Army is to simulate these types of things going wrong, specifically to test that things continue to operate as designed. Gene (01:14:48): So, yes on standardization. The third property is stabilization, where effects stay local without causing global chaos. And for sure, yes on this. The whole point of Chaos Monkey, Chaos Gorilla and Chaos Kong is to simulate even large scale disasters happening to make sure that the service can continue to run as designed. This is certainly a great example of any system's ability to stabilize itself. And the last property is synchronization, where coordination can be done locally, as opposed to being controlled by some all-knowing centralized mothership. For sure, yes, on this. The best demonstration is that so much of the Simian Army actually removes non-conforming services, such as killing improperly configured VMs. They rely on things like AWS auto-scaling and load balancing to make sure that things can be taken out of rotation without disrupting the service. These ensure that when inopportune things happen, the service can continue to operate. Gene (01:15:47): So, to summarize, it certainly does seem to indicate that the Netflix Chaos Monkey has the same architectural properties as the Toyota supply chain and the Aisin [inaudible 01:15:58] factory. Okay. Let's go back to the interview. That's awesome. Okay. So, we first talked about the 1997 Toyota brake fire and how the supplier network mobilized to bring back a significant amount of the pre-fire capacity. So, something else happened then in 2002 that was even bigger. You asked in the book, what if the entire supplier network went off the grid? And this is exactly what happened when 29 ports on the US west coast were shut down from September 29 to October 8th, due to a labor management conflict. And the three things I wrote down here is that you couldn't divert ships to Canada, because the unions were sympathetic. 
You couldn't go to Mexico because of presumably insufficient rail and road infrastructure. And you couldn't go through the Panama Canal because the ships were too large. So, I have to imagine people were prognosticating doom then, and yet there was a remarkable sort of resilience in the system. Do you want to walk us through that? Dr. Spear (01:16:55): Yeah. So, I think the storyline is that ... All right, look, Gene, we got into this conversation because of my rant against folks like the New York Times who declared the shortages we experienced during COVID as a consequence of just-in-time policies. As we discussed, they reflected, I don't know, a grotesque misunderstanding of what just-in-time policies are. They thought just-in-time means rip out all the inventory you have there as a buffer and protection and operate, I don't know, like on a tight rope without a net. And of course, that's not what we're saying. We said the reason the amount of inventory in systems has been able to go down is because people, at least some, not all, but some have learned how to create systems which are much better at the self synchronization attribute. And the more and more something is synchronized, the less and less it needs arbitrary inventory for buffer. Or I think it's even more accurate, the less and less it's going to generate needless inventory. I mean, I worked in a factory for some short period where the mechanisms of pacing production were terrible. So, not surprisingly the things that we actually had to ship on any given day, we never had them in stock. It was always a rush order to expedite them through the system. But we had plenty of stuff that no one wanted. All right. So, anyway, let's tie this all together. 
The ability to operate with lower inventories and the ability to operate without generating needless inventory is a function of having a much, much better understanding of the elements of the system and how they come together. Okay. That's where we are. So, what happens when the ports close? Well, those who have poor understanding of how their systems are constructed, all they see is we can't deliver. Because they see the system as an all or nothing logistics. Dr. Spear (01:18:54): When the port closes, to them it seems like they've lost all their logistics capacity and they can't deliver. Now in the case of Toyota, which is the exemplar, perhaps even the modern inventor of just-in-time. They look at that and said, well, unloading a ship and putting the stuff from the ship onto a truck is only a small piece of the much, much larger whole of how we get things from their origin to their destination. So, we just have to replace that one piece. Everything prior to that we can do and everything after that we can do. All we have to do is fill in the missing gap. And so they go ahead and they figure out how to unload in this port or that port, I think in Mexico, the lineup rail and truck and this thing, and that thing, how to get across the border, which is they've been moving things across the border. Dr. Spear (01:19:45): So, now it's an issue of scale. And clearly there had to be some administrative things that they also had to disaggregate and solve those problems. But then they were able to put the system back together. And the beauty of it, Gene, is that for the person who, or the many, many parties who got the parts made and onto a ship from their perspective, nothing changed. And for the people who took stuff off the trucks for them, nothing changed. The only thing that changed was this little piece in the middle and the reason it was changeable ... 
Again, it gets back to the phrase that we've been using with such a liberty and joy is this idea of designing systems so the cost of change in the cost of experimentation is low. Gene (01:20:27): That's interesting that you mentioned that you could be participating in that system and not even know that there was this massive disruption, sort of like the Kanban card example that if they weren't your input or the output, you may never know. [crosstalk 01:20:43]. Dr. Spear (01:20:44): Oh, Gene, let me just think. We've been talking also through several of our conversations around this 60 line changes. This goes back to the example of visiting a Toyota plant with the head of manufacturing from a big three. And they said on any typical day they're changing the location of line side stores up to 60 times a day. And his response was one of profanity. And he said it was impossible that they could do that, because they once tried to move six things on a given day, and they shut the plant down. Now that's six versus 60 is like this port thing on a small scale, which is if you design a system without concern and attention to design it, to minimize the cost of change, then it won't have simple linear flows, which naturally partition, which naturally nest and change is very disruptive. Dr. Spear (01:21:39): On the other hand, if you have, as a conscious objective function, the idea of designing these complex systems to minimize the cost of change and minimize the cost of experimentation, then whether you have to change them internally, where you put parts, or you have to change them externally, where you drop parts off, at what port, and reconstruct that disrupted element. Your system is two things. One, your system is more on conducive for that kind of change. The other thing is the people who work in the system, you're tapping into the distributed and collective genius. The people in the system have the capability to do that experimentation and do that change. 
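For software readers, the port story above maps onto a familiar pattern: stages that only agree on a narrow hand-off, so one stage can be replaced without the others noticing. The following Python sketch is an illustration under that assumption, not a model of Toyota's actual logistics; every function name and route label is hypothetical.

```python
from typing import Callable, List

# Every stage takes a shipment description and returns it with its own step appended.
Stage = Callable[[str], str]


def load_ship(item: str) -> str:
    return item + " > shipped"


def unload_west_coast(item: str) -> str:
    return item + " > unloaded at west-coast port"


def unload_alternate(item: str) -> str:
    # Hypothetical replacement for the one stage that was disrupted.
    return item + " > unloaded at alternate port"


def truck_to_plant(item: str) -> str:
    return item + " > trucked to plant"


def deliver(item: str, stages: List[Stage]) -> str:
    """Run the item through each stage in order; stages know nothing about each other."""
    for stage in stages:
        item = stage(item)
    return item


normal = [load_ship, unload_west_coast, truck_to_plant]
rerouted = [load_ship, unload_alternate, truck_to_plant]  # only the middle piece changes

print(deliver("brake part", normal))
print(deliver("brake part", rerouted))
```

Because `load_ship` and `truck_to_plant` depend only on the shared hand-off and not on which port sits between them, the reroute is a one-line change. A design where stages reached into each other's details would make the same change far more disruptive, which is the six-versus-60 contrast in the conversation.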
Gene (01:22:27): So, the thing that blew me away was how they bought up all the 747 freight capacity and presumably, they would have bought more, had there been more. And there wasn't more so they had to go find Antonov transport planes. Dr. Spear (01:22:40): Oh, yes. [crosstalk 01:22:42]. Worlds biggest cargo planes [inaudible 01:22:44]. Gene (01:22:45): And had to land in Mexico due permitting problems. And then they also had to handle all the cargo that was dumped by the cargo ships in Mexico, because presumably they don't care that the ship was making and they had to dump off the cargo somewhere, not my problem. And then the mobilization of a whole bunch of leaders to try to figure out how to get the cargo north. I love this phrase, because we solve real problems all the time in our daily work when real crises hit. It's just a matter of degree. Can you talk more about any of that? So, one is if ship's not available, we'll send it by air. And the fact that there was all sorts of cargo showing up at ports that presumably maybe never have been an entry point before. Can you fill out that story? Dr. Spear (01:23:32): Yeah. I think the issue here again is a system designed and populated by people where change is the ordinary. And so when faced with having to change it was just a matter of maybe a slight redirect or an acceleration. But it wasn't from a dead start and having to figure out where to go. Gene (01:24:00): Believe it or not. I've read this case study numerous times. And for the first time during this interview, it just occurred to me that Toyota was not the only automotive manufacturer hit by this. Anyone who had inbound parts from Japan or from Asia would have been impacted from this. So, I guess this is another natural experiment. What happened to the other auto manufacturers? Dr. 
Spear (01:24:25): Well, I have to believe that they were less quick to understand their existing system and understand where the port closure created a gap that had to be reconstructed, and less adept at reconstructing things. So, when Toyota quickly started realizing, wait, we got to fly stuff that's really, really critical to keep our North American operations open, there was no air cargo capacity for those other folks. You as much said so, when you said that they bought up the commercial air cargo capacity, which is why they started leasing these Russian military cargo jets. I mean, these are gigantic. People have seen maybe pictures of the C5, which is the giant US military transport, and these Russian ones are even bigger than that. They have overhead cranes inside. They're phenomenally huge. Gene (01:25:16): I'm sending you a bunch of links right now. So, already an auto manufacturing plant in Fremont, California, shut down, idling 5,000 workers. The so-called NUMMI plant was shut down for an unspecified amount of time. And there's actually another article, "Just-in-time inventory system proves vulnerable to labor strife." Dr. Spear (01:25:36): Yeah. This is one of these silly things, Gene, because what is not shown here is how quickly NUMMI was able to reopen because of just-in-time and the human capability within that system, that they were able to redirect around the ports. And it also begs the counterfactual. Most people aren't intelligent about the inventory they have on hand. And we can park for a moment the story of microprocessors at Toyota. Most people are not intelligent about the inventory they keep on hand. There's a couple of types. One is we've got no flipping clue about where this system will fail. So, we have this stuff just in case. There's that. And then there's this stuff. The reason we have it on hand is because we made a mistake in getting it in the first place, because no one wants it. 
I mean, look, Gene, I don't know if you see this in your local supermarket. Let's say it's periodic. Dr. Spear (01:26:30): I don't know if it's once a week. But periodically you'll see someone going around with a shopping cart and filling it up, pulling stuff off the shelves and throwing it into the shopping cart. Why is that? Because they had overstocked on those things. And so the milk is beyond its recommended sell date. The eggs are beyond their recommended sell date. All these things that are "perishable," they're beyond the recommended sell date. Now why is that? Because they bought stuff that no one wanted. All right. And that's the real nature of these non just-in-time systems is that it's not like they have a lot of emergency stuff on hand for the just-in-case, oh, good thing we strategically and thoughtfully stockpiled that one thing. They have a lot of stuff on hand simply because they didn't know they didn't need it on hand. Dr. Spear (01:27:17): And that stuff is not there as a cushion or buffer. That stuff typically is there clogging up the works because you've got to track it, store it, move it, et cetera, et cetera. It costs to carry it. There is an exception to the general rule that inventory is the unintended consequence of an ill coordinated system. And that's what we were talking about earlier, which is after Fukushima Toyota said, holy cow, we can really get clobbered by angry mother nature. And they went down the list of the tens of thousands of parts that go into their entire product portfolio. And they picked out the 500 where they felt most vulnerable. And so those they stockpiled. And it turns out Toyota's running through this latest shortage of microprocessors and other chips because they knew to stockpile. Among their 500 things were those particular chips. But that's not how inventory normally operates. Most people don't have on hand the stuff they actually need. It's all this stuff they made by mistake. 
Gene (01:28:20): Just as a, I can't share all the details right now, but we have an experience report coming from a large merchandiser who described how they were able to increase profitability of a certain segment within a store by 40% just by reducing the amount of stockouts and reducing the orders of inventory that never sells, which is just awesome. That's great. So, I will look for more articles about how Toyota fared versus other manufacturers. I'm assuming that probably won't be hard to ... We should be able to find those just because certain parts do come from Japan and only from Japan. Dr. Spear (01:28:57): Right. We were talking, this whole conversation got triggered by that nonsensical article on the front page of the New York Times last week. The Wall Street Journal beat them to the punch because I'm seeing here an article, a similar theme, "Automakers retreat from 50 years of just-in-time manufacturing." [crosstalk 01:29:21]. It's like, no, no, they're not. "The hyper efficient supply chain model [inaudible 01:29:27] by Toyota is under assault." It's not. In fact it's demonstrating its resilience, it's demonstrating its agility. And if anything, they're going to double down on making sure that they're even more agile, more synchronized, more in tune with what's going on. Gene (01:29:47): And so they talk about Tesla and how they're going to buy nickel directly too. And it just seems like the common thread here is that smart manufacturers are buffering themselves from strategic supply chain risk, whether it's Toyota and semiconductors and "stockpiling those," having a larger safety buffer, or nickel, Tesla, [crosstalk 01:30:11] point. Because it's not free. This is a cost they will pay willingly and happily to buffer themselves from some sort of exogenous risk. Dr. Spear (01:30:24): Yeah. So, trying to remember, we'll have to go back and look if it's in my dissertation or my thesis, but I have a chapter devoted to inventory. 
For sure it's in my, I'm sorry, my dissertation or my book. And I'm pretty certain it's in the, I'm looking for it. Pretty sure it's in the dissertation, where I talk about how Toyota uses inventory. And they use it in a very, very targeted fashion. In the dissertation, the point is that even if it's exactly the same material, the same finish, going to the same work in process, rather than having one huge pile of it, they will have small stores located in different parts of a plant, because the inventory is there as a buffer, as a protection against a particular problem. So, if it's unreliability on a machine making something, there's a little bit there. If it's to give the appearance of immediate availability where something has a significant cycle time, like making a part or transporting a part, they'll have a little bit in each of these places. Dr. Spear (01:31:25): And so when you say that Elon Musk has decided to get nickel on hand, that's not anti just-in-time. That's recognizing that you're at risk and you're going to create a little bit of insurance against that risk, like Toyota with these microprocessors. Nevertheless, you're still going to have these elaborate, elaborate sprawling supply networks with this attempt to have an extraordinarily fabulous, adaptive, agile, resilient synchronization across it so the right things show up in the right places at the right time. And if it just so happens that nickel is in short supply through the normal market, great, we can go open a closet and we've got a little bit of extra nickel to tide us over. But that's not a rejection of just-in-time. And to say so is a rejection of common sense. Gene (01:32:15): The purpose of this particular interview was to capture some of the thoughts of why you got so angry when you read the New York Times article, and then later the Wall Street Journal article, about how COVID shortages were made worse by just-in-time. 
So, we talked about what are supply chains, why they're important, how Toyota is actually able to beat the earnings forecasts and avoid shutdowns. We walked through [crosstalk 01:32:41] Dr. Spear (01:32:41): Over and over again. I mean, it's not once, Gene. It's not once. It's over and over for decades. I mean, you think about it, it's not just competitors which have screwed with just-in-time, it's the fabric of creation. If you have a tsunami that wipes out a power plant in Fukushima and destroys it temporarily ... Dr. Spear (01:33:03): ... hampers the economy of an entire country, yet the exemplar of just-in-time does best through that? All right. That's giving you a hint. Whether it's, again, mother nature screwing with humanity, with COVID, and puts a pandemic infectious agent throughout every society in the world. And yet the exemplars of just-in-time ride this out better than anybody else. It's telling you something, that there's something about these systems that, yes, under regular conditions they operate with a much higher fidelity to what the market actually wants than any other system. But the fact that they're the ones that are outperforming everybody else should give you a suggestion that you shouldn't devote a lot of ink on the front page of your paper to say how they stink, when the data and the evidence is all to the contrary. Anyway, sorry. Gene (01:34:03): One of the things that's coming out in the work that we've been able to do together is opposing poles of these simple systems that are pull based versus centrally planned, kind of the mothership knows all and broadcasts out instructions to every one of the nodes. Dr. Spear (01:34:19): Right. Gene (01:34:21): If you overlay that onto supply chains, clearly the first is like just-in-time, pull based, and what is the opposite of just-in-time? Dr. Spear (01:34:31): Well, the opposite of just-in-time. 
We went through about how do you design the system in such a way that it lends itself to low cost of change, low cost of experimentation. We said it has these characteristics of simple/linear standardized, so you can see disruption quicker, stabilized when you actually have a disruption and synchronized. All feeding the possibility of change and experimentation. So the non just-in-time have the lack of those attributes. That they're highly integrated, they have ambiguous standards if at all. They don't naturally self stabilize locally so local problems become systemic problems and they're not synchronized, naturally synchronized, so they have to be, "Synchronized through these elaborate mechanisms of data gathering, data processing, instruction generating." But the problem with those is that they lack the fidelity and the cycle time necessary to keep pace and accuracy with the system they hope to manage or control. They're always falling behind. In fact, these centralized systems, because they simply don't have the computational speed, depth, fidelity, granularity, there is as much a source of disruption as stability. Gene (01:35:58): Right. I'm not sure if I can in my head explore all four axis at the same time. I'll give it a simple case and then an exaggerated case. Dr. Spear (01:36:07): Yeah [inaudible 01:36:08]. Gene (01:36:08): One thing that comes to mind is build-to-forecast. Maybe an annual plan where you don't know what gets ordered so you create a forecast and you build with that forecast. That's how you get in a situation where you have too little of what's really needed by the market and too much of what no one wants too. Dr. Spear (01:36:22): Right. Gene (01:36:22): And so- Dr. Spear (01:36:25): You know what that is? What build forecast really means? Cover your eyes and throw darts at a wall. What is a forecast, but a guess? 
Again, it gets back to what I think the Wall Street Journal and the New York Times missed about these just-in-time systems: they are typically much faster through the system, and they're capable of much greater variety than the non-just-in-time systems. What does that mean vis-a-vis forecasting? It means that when they finally make a commitment, they're making a commitment later, when the information is actually more perfect. Let's say I have to decide today what I'm going to do a year from now. What are the odds I'm going to be right? They're pretty slim.

Dr. Spear (01:37:10): On the other hand, what if Gene Kim can make a decision a month out? He can procrastinate for 11 months before making a commitment. Why can he delay 11 months? Because, one, his time through the pipeline is so fast he only needs a month to get his work done. Steve, the other guy, it takes him 12 months. All right. So I've got to start today to be ready. The other thing is, if Gene not only is faster through the pipeline by a lot, but he can repurpose his pipeline at the last minute, then you just sit around and you wait till like, "Oh, what is the information I have about what I need to do a month from now?" You don't have to guess nearly as much.

Gene (01:37:50): Right. In fact, I think that gets to, then you can build-to-order. I only build what is ordered and then I'll never have the risk of-

Dr. Spear (01:37:56): Yeah.

Gene (01:37:56): Building something [crosstalk 01:37:57] didn't want.

Dr. Spear (01:37:58): I guess that's right. You can do one of two things. You can either build-to-order and tell people to wait for a month, in the example we set up. Or you can build-to-forecast, but your forecast is going to be way better 11 months from now, for what's needed 12 months from now, than it is today. Either way Gene wins, Steve loses, end of story.

Gene (01:38:18): By the way, I learned from you that there are three things that the finance space has theories for. Time value of money.
So the sooner I get paid, the better. Option theory. The longer I can procrastinate decisions, the better, because I'll have more information.

Dr. Spear (01:38:31): That's right.

Gene (01:38:32): Third is portfolio diversification. If I can't get more information, I should at least diversify my risk and not put all my eggs in one basket.

Dr. Spear (01:38:38): Yes.

Gene (01:38:39): I think two of those can tie. If I have less cash tied up in the carrying cost of all that inventory, that's better.

Dr. Spear (01:38:45): Boom.

Gene (01:38:47): Less inventory. Two is, by deferring the build to when I have more information or when it's ordered, I can defer those decisions. Yes?

Dr. Spear (01:38:57): Yep. I think, Gene, you actually get the third one, which is by having greater flexibility as to what you actually build a month from now, you've diversified your risk.

Gene (01:39:09): Say a little bit more about that. I didn't quite get all of that.

Dr. Spear (01:39:11): Yeah. You've created capacity to build A, B, or C. Not only can you delay for 11 months to build, but at the 11-month point, you can build A, B, or C because you've diversified your risk against that change. I'm screwed, because I've got to start today. I've got, to your time value of money, I'm carrying costs for a year, not a month, 11 months out. That's one. Two, I have to decide today what I'm going to do, so I'm definitely not building to order, because who the hell knows what they want to buy a year from now? Nobody. I've lost the option value. And the other thing typically is the folks who are really slow, they're not just-in-time. Not only are they slow, but they're not agile or flexible. I've got no risk diversification either. I'm deciding today and the only decision I can make is an A, because I don't have the capacity for B or C anyway. I mean, Gene, I'm taking my life savings and just doing nothing but lotto tickets.
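
The Gene-versus-Steve comparison above can be put into numbers with a minimal Python sketch. This is an illustration, not something from the episode: the capital amount, the 1% monthly carrying rate, and every name in the code (`Producer`, `carrying_cost`, `months_of_deferral`, `demand_coverage`) are hypothetical assumptions. Only the 1-month and 12-month lead times, the 12-month horizon, and the A/B/C product options come from the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Producer:
    name: str
    lead_time_months: int  # months needed to get work through the pipeline
    buildable: tuple       # products the line can switch to at commitment time


def carrying_cost(p: Producer, capital: int, monthly_rate_pct: int) -> float:
    """Time value of money: simple (non-compounded) cost of capital
    tied up for the whole lead time."""
    return capital * monthly_rate_pct * p.lead_time_months / 100


def months_of_deferral(p: Producer, horizon_months: int) -> int:
    """Option value: how long the commitment can be postponed
    before work has to start."""
    return horizon_months - p.lead_time_months


def demand_coverage(p: Producer, scenarios: tuple) -> float:
    """Diversification: fraction of possible demand outcomes
    the producer is able to serve."""
    served = sum(1 for s in scenarios if s in p.buildable)
    return served / len(scenarios)


# Hypothetical numbers, chosen only for illustration.
HORIZON_MONTHS = 12
CAPITAL = 100_000
MONTHLY_RATE_PCT = 1
SCENARIOS = ("A", "B", "C")

gene = Producer("Gene", lead_time_months=1, buildable=("A", "B", "C"))
steve = Producer("Steve", lead_time_months=12, buildable=("A",))

for p in (gene, steve):
    print(
        p.name,
        carrying_cost(p, CAPITAL, MONTHLY_RATE_PCT),
        months_of_deferral(p, HORIZON_MONTHS),
        round(demand_coverage(p, SCENARIOS), 2),
    )
```

Under these assumed numbers, Gene ties up capital for one month instead of twelve, can wait 11 months before committing, and can serve any of the three demand outcomes, while Steve has to commit today and can serve only one.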
Gene (01:40:16): Gene here, just a brief break in here to explain that last section. Over the last many months I've been learning from Steve that finance basically has only three known theories. One, the time value of money. In other words, the sooner I get paid, the better. Two, option theory. The longer I can procrastinate decisions, the better, because I'll have more information. Three, portfolio diversification. If I can't get more information, I should diversify my risks so all my eggs aren't in one basket. I spent months with Steve exploring this because I think it's so powerful and so parsimonious and has such high explanatory power of phenomena we see every day. Okay. Back to the podcast.

So it gets, I think, to my exaggerated version. I remember reading in, I think it was Basic Economics by Dr. Thomas Sowell at Stanford University. He described the forecasting processes of the centrally planned Soviet economy, which was uninspiring in terms of its inability to deliver basic goods. I'm sorry. Parts to only hundreds of thousands of factories. I mean, it was an epic description of the failures of the centrally planned economy. It was just really awesome. Can you maybe speculate or react to why it's obvious or why that problem was so prone to [crosstalk 01:41:40].

Dr. Spear (01:41:41): This is fascinating. If you think about it, I also took a course in college called Soviet economics. It's interesting, the Soviets had some reputation at least of good Russian engineers. Engineers typically think in terms of feedback. The systems are dynamic. And yet here you had engineers who were trying to operate within a system which allowed no feedback, certainly not political feedback, Stalinism and all that sort of thing. But not even market feedback. Because the reason our economies function well is that there are a lot of signals around demand. There are a lot of signals around supply.
There are signals around establishing combinations of prices and quantities and this and that thing. But the Soviets, I think they tried to operate a system that was signal free. It was signal free in terms of not matching supply and demand. It was signal free in terms of not matching intermediate supply and demand within supply chains. Everything was scheduled. It was a disaster.

Gene (01:42:53): [inaudible 01:42:53] here. Okay. Here are some clarifications and expansions. One, let's start with describing the opposite of just-in-time and some incredible passages from that book, Basic Economics by Dr. Thomas Sowell. Reading that book, there are some amazing passages. One is this, "Inventory is a substitute for knowledge." I think that's such a profound observation. The full context of that quote, "Inherent risks must be dealt with by the economy, not just through economic speculation, but also by maintaining inventories. No food would ever be thrown out if they knew exactly how much food to cook." Here are some other passages that I highlighted on pages 67 to 70. He writes, "Demand for all types of gas at filling stations in the United States is too vast and dynamic for any centralized allocation and distribution body to plan. Pricing signals in the market serve this purpose. Oil companies just service demand."

Gene (01:43:53): There is a pretty amazing counterexample that he gives in the Soviet economy. He says that the government raised prices of moleskins, leading hunters to hunt more of them, leading to distribution centers filling up with pelts because there was actually no demand for them. There were requests to lower the prices that were never acted on, because the decision makers were too busy with 24 million other prices which needed to be set. The more urgent business was adjusting prices to address shortages of other things that could be made instead.
He writes, "The result was that too many things were being made that nobody wanted, too little of things that people actually needed. The Soviet Union had painfully low standards of living for millions of citizens despite living in one of the most resource-rich areas. Lower than not just the US, but Japan and Switzerland, which had far fewer resources."

Gene (01:44:44): He writes, "The pricing signals allow individuals and entities to watch very few prices." So I think this is the whole point of what an efficient supply chain does. In the ideal, fast lead times enable building to order, deferring decisions to allow better accuracy. Dr. Sowell writes, "Nobody in any kind of economic or political system can possibly know the specifics of all things. The advantage of a price-coordinated economy," and I'm assuming an effective supply chain, "is that nobody has to. The efficiency comes from the fact that vast amounts of knowledge do not ever have to be brought together, but are coordinated automatically by prices that convey in summary and compelling form what innumerable people want." He writes, "In these systems, people receive instructions from others on what people want. In contrast, the central planner is giving instructions to others, compelling them to obey." I think that's so lovely because it beautifully, and maybe cartoonishly, describes the pull versus the centrally planned system.

Gene (01:45:45): Again, I'll just repeat that one quote, "Inventory is a substitute for knowledge." I just want to expound a little bit on how the Soviet centrally planned economy compared with the four characteristics. The first is simplification, where we want simple linear flows. In the Soviet economy, supply and demand couldn't be signaled directly between the producer and the consumer, whether it was moleskin pelts or tractor parts. Everything instead had to go through planning committees and these price setters, which couldn't keep up with the dynamic changes in demand in the marketplace.
Gene (01:46:18): Standardized. Where there's a strong declaration of how things should go and what should happen if things go wrong, if we need parts that aren't there. He writes, "The significance of prices in the allocation of resources can be seen most clearly by looking at situations where prices are not allowed to function. Two Soviet economists described a situation in which the government raised the prices it would pay for moleskins, leading hunters to sell more of them." He writes, "State purchases increased and now all the distribution centers are filled with these pelts. Industry is unable to use them all. And they often rot in warehouses before they can be processed. The ministry of light industry has already requested [inaudible 01:46:58]," I guess the manufacturer.

Gene (01:47:01): "has already requested twice to lower the prices, but 'the question has not been decided yet.' This is not surprising. Its members are too busy to decide. However overwhelming it might be for a government agency to try to keep track of millions of prices, a country with hundreds of millions of people can far more easily keep track of those prices individually, because no individual or business enterprise has to keep track of more than the relatively few prices relevant to their own decision-making."

Gene (01:47:27): That strikes me as compartmentalization, information hiding, abstraction barriers, and so forth. You only care about the things you need to care about. Dr. Sowell continues, "The net result was that many Soviet enterprises kept producing things in quantities beyond what anybody wanted, unless and until the problem became so huge and so blatant as to attract the attention of central planners in Moscow, who would then change the orders they sent out to manufacturers. But this could be years later and enormous amounts of resources would be wasted in the meantime." Not sure if that fully addresses standardization, but I think we're in that territory.
Gene (01:48:04): The third is stabilization. In other words, decisions and effects stay local as opposed to global. I think it's pretty easy to show that decisions in the centrally planned Soviet economy were made centrally, not locally. I think that last example of the slow feedback loops around reacting to over-production is a great example of that stabilization.

Gene (01:48:26): The third one is stabilization. Effects stay local as opposed to causing global chaos and disruption. I think they failed that test because there's little ability for nodes in the system to fix things locally. They had to lobby the central planners and price setters. The fourth is synchronization, that is, local coordination versus central coordination. I think by definition, it was a centrally planned economy.

Gene (01:48:54): Number two, the 2002 shipyard strike, the work stoppage in which over 300,000 containers were backlogged. I'll put a link to some articles on this, including some great quotes from the LA Business Journal, October 7th, 2002. They write, "For auto parts, the time that goods sit in warehouses has dropped to three to four days from one month; for apparel, a week from four to six weeks; for non-perishable foods, three to four days instead of a month; and for electronics, one week from a month." By the way, there's actually very little I could find on how Toyota fared compared with other auto manufacturers. Apparently it was primarily Toyota and Honda that were affected by the port shutdown. If I read it correctly, most of the impact was on fully assembled cars, not automotive parts, because so much of the assembly was still being done in Japan at that time.

Gene (01:49:50): Okay. Number three, we talked about how, in response to the 2002 shipyard strike, Toyota created this air bridge from Japan to the US, buying up all commercial air capacity and then renting these An-22 transport planes. I learned a couple of years ago from someone who works in supply chain something pretty amazing.
She said apparently these days three companies dominate the use of all air freight capacity between China and the US during the holiday retail season: Apple, Amazon, and Nike. Apparently this is a very finite resource that is fully utilized.

Gene (01:50:27): Okay. Number four. I want to mention one more thing that I found very surprising in this interview. When I previously studied the Toyota response to the 2002 port shutdown, I read it as a story of heroics, of fast thinking and expediting. Buying up all the air freight capacity, finding those An-22 transport planes, and figuring out how to move cargo north from Mexico to the US. But after listening to Steve and upon some reflection, I think the story is really one about modularity. Steve said that most people in the plants never knew or cared how the parts got there. A mark of modularity is that you have permission to be incurious about what's on the other side of the interface.

Gene (01:51:11): To steal a phrase from Zach Tellman, who wrote the amazing book Elements of Clojure. He said, "We don't need to care about where the electricity comes from, from that electrical outlet, or what actually happens when you flip the light switch." The just-in-time supply chain is designed so that it is modular, so that you can interchange components with each other. Just like Netflix could keep running when [inaudible 01:51:33] can kill the server, or when an entire Amazon availability zone went down, the just-in-time supply chain can keep running when 29 US ports shut down. And you can swap out parts being delivered by sea with parts delivered by air. This can only happen when you have a system that has those characteristics of being simple, stabilized, standardized, and synchronized.

Gene (01:51:55): Lastly, I promised to describe more about the three underpinning theories of finance: time value of money, option theory, and portfolio diversification.
I'll save the majority of this for a future episode, but let me explain one comment that Steve made. He said that a system with a great supply chain and great operations can win on all three dimensions. So, time value of money: we win because we have less carrying costs tied up in inventory. Two is option value. We can delay making decisions on what to make till when we have more data. And then he mentioned how we can win on portfolio diversification as well. I want to explain this a little more.

Gene (01:52:32): Basically what he's saying is that if you have a low cost of change, such as being able to do single-minute exchange of dies instead of taking days, you can build different car models or even different model years, all on the same plant line. There are some reports that some car plants can build multiple cars and trucks on the same line, gasoline and hybrid models on the same line, all in the same plant. This would allow a plant to make far more products in different configurations, at different times and quantities, than one that has a high cost of change. This explains how we can win on portfolio diversification as well as on time value of money and option value. Okay. Back to the interview.

Dr. Spear (01:53:15): I think a couple of things came out of this, which I just want to say at the end, so that we splice it into whatever audio or written version we get. One, this misunderstanding of just-in-time is not unique to just one reporter at the New York Times; it represents a broader misunderstanding. Because if the Wall Street Journal could write basically the same article, and the Wall Street Journal is supposed to be highly tuned to business, to markets, and markets depend on transactions, information, self-synchronization, et cetera, et cetera, all the wonderful things that markets do when they're operating well, if the Wall Street Journal could blow this story, that tells you that people really just don't understand just-in-time.
And so in terms of positioning our thing, we should start off with, say, these prestigious publications got it wrong. The consequences of getting it wrong are really, I think, quite profound, because for the people who look at either paper as the authoritative source, it could lead to very bad policy, both public policy and corporate policy: say, "Oh, jettison just-in-time." That's takeaway one.

Dr. Spear (01:54:29): Second thing is, I think, as we've gone through this conversation, the idea of inventory: it's not just the inventory you have just in case. Which sounds almost like Toyota and the microprocessors, or Elon Musk and the nickel he's stockpiling. Inventory, as often as not, is the stuff you didn't need, which is why it's there, because if you actually needed it, someone would have made it and someone else would have bought it right away. So there's that. I think the other thing which came up is how, in these really well-functioning systems, which are low inventory because they run well, the amount of collective genius that's tapped into is off the charts. I don't want to miss that because I think I've always explained just-in-time as solving a technical dynamic problem. But I don't think I've ever framed it as a social solution to a technical problem. It really is a social solution to a technical problem.

Gene (01:55:33): Right. As opposed to the central mothership knowing all, forecasting all, controlling all. Those are the opposing poles.

Dr. Spear (01:55:43): Those are the opposing poles. They're seeking a technical solution to a technical problem.

Gene (01:55:48): A technical solution to a socio-technical problem?

Dr. Spear (01:55:51): Right. Yeah, that's right. They're looking for a technical solution to... I like what you said, looking for a technical solution to a socio-technical problem without tapping into the social element of the enterprise, which is the collective genius.
What we see with these really high-level just-in-time systems is that they've created systems, again, through simplification, standardization, stabilization, and synchronization. Through that they've created systems which lend themselves towards tapping into the social capital of the enterprise.

Gene (01:56:28): Common threads to what we heard in the Trent Green interview about the vaccination rollout.

Dr. Spear (01:56:34): That's right. I think when we go back and look at the Trent Green story, Trent was responsible in terms of having formal responsibility, but he can't claim sole credit for the order-of-magnitude increase in vaccinations. The reason he can claim credit is that he set up a system in which he was able to tap into the distributed and collective genius of his colleagues to see and resolve problems at speed. And because they were able to do that, they were able to get local discovery, both around impediments and also around solutions, and roll those up into the system. And then the next recognition of a local impediment and the next generation of a local solution could get rolled up systemically. The huge credit to Trent is creating the conditions in which that could happen under stress. That's remarkable.

Gene (01:57:35): Very similar to General Stanley McChrystal fully enabling the 22-year-old person on the intelligence surveillance platform, being able to go from sighting to capture, and numerous other things, right?

Dr. Spear (01:57:49): Yeah [crosstalk 01:57:49]. I'll tell you, Gene, I think the McChrystal reference is a good one. I hadn't thought of it, but it's a tight analogy that you're making and I agree with it. Before, we were talking about the difference between a just-in-time system, which is one of distributed intelligence, and a planned system, which is one of central intelligence. It's not a matter of taking the same amount of intelligence and flattening it out.
There's actually much more intelligence distributed than you can possibly concentrate.

Dr. Spear (01:58:21): Now, I think McChrystal, the team of teams narrative, is very similar, because the way I understand that narrative is they were dependent on a central authority to process the inputs and generate outputs in terms of commands and instructions. The thing was, the operating environment was just too data rich and too data fast for the central authority to process it in any kind of meaningful fashion. So what was the pivot? It's that General McChrystal allowed for the distributed intelligence to express itself. He created these systems in which it was possible, and again, if you do a close read of the book like you and I both have, the systems he created had these combinations: simplification, standardization, stabilization, and synchronization.

Gene (01:59:15): And so in all of those interviews with David Silver and [inaudible 01:59:17], it revealed all those structures that enabled those disparate pieces to actually work towards a common purpose in ways that could not have been done before.

Dr. Spear (01:59:25): Never been done before. Because if it all had to flow up to the central authority and back down, the world was moving way faster, with much finer pixelization, than the central authority ever could keep up with, which is why they were losing. It wasn't until you started getting a match between the system itself, its dynamic characteristics, and the dynamic characteristics of the environment in which it was trying to compete. Until you started getting a match there, it was losing. And when it got a match, and then when it got superiority on that, then it could start winning.

Gene (01:59:59): Awesome. Thank you so much, Steve. That's it for this episode. Please join me next time, when I will be speaking with Dr. Gail Murphy. Dr. Gail Murphy is professor of computer science at the University of British Columbia.
She is one of my favorite academic researchers in all things related to architecture, modularity and developer productivity. I was introduced to her work by a good friend of mine, Dr. Mik Kersten, who is author of the book Project to Product. Both she and Dr. Kersten were co-founders of the company, Tasktop. Gene (02:00:32): If you've enjoyed this episode so far, please leave a review or you can reach out to me on Twitter. I'm @RealGeneKim and mention @ITRevBooks. Of course, please make sure to subscribe. The Idealcast is produced by IT Revolution. Our goal is to help technology leaders succeed and their organizations win through books, events, podcasts, and research.


Gene Kim

Gene Kim is a Wall Street Journal bestselling author, researcher, and multiple award-winning CTO. He has been studying high-performing technology organizations since 1999 and was the founder and CTO of Tripwire for 13 years. He is the author of six books, including The Unicorn Project (2019), and co-author of the Shingo Publication Award-winning Accelerate (2018), The DevOps Handbook (2016), and The Phoenix Project (2013). Since 2014, he has been the founder and organizer of DevOps Enterprise Summit, studying the technology transformations of large, complex organizations.
