
February 4, 2019

How Finding & Fixing Faults is the Path to Perfection

By IT Revolution

The following is an excerpt from a presentation by Dr. Steve Spear, Principal, HVE LLC, titled “Discovering Your Way to Greatness: How Finding and Fixing Faults is the Path to Perfection.”

You can watch the video of the presentation, which was originally delivered at the 2018 DevOps Enterprise Summit in Las Vegas.

I’m going to make a case for three key points.

  1. Learning is good. Not surprising coming from a guy at MIT wearing a bow tie and glasses. But what I mean more particularly is that knowing how to get smarter, better, faster matters a lot. And I’ll give some examples of the profound differences between those who learn very, very well and those who learn in a more normal fashion.
  2. All it requires is that you aggressively seek out fault. I grew up in New York, the same neighborhood that the Trump family is from. It’s a natural thing where we come from: no matter what you’re doing, “That sucks. That’s awful. It’s terrible.” But it turns out that finding fault is a necessary trigger, the first signal that you don’t understand something or that you need to do something differently. So we’ll build on that with some examples.
  3. Then, of course, that leads to the third critical point, which is easy to say and hard to do: beyond finding fault in your thinking, you have to find the fault in your doing and correct it.

Let me start on that first point

My roots begin in trying to understand why Toyota had, and still has, such a hugely dominant presence in its sector.

Toyota came to the US market in the late 1950s and they showed up with a car called the Toyopet. Now, most people haven’t even heard of the Toyopet. What does that tell you about it? Because you’ve heard of a Model T, you’ve heard of a Lamborghini, a Chevy Corvette. So if most people haven’t heard of it, it suggests that it sucked, right?

Now, let’s add some dimensionality to the word suck. When the Toyopet came to the United States, if you had to drive up a hill, there was no guarantee you’d get to the top of the hill. If you wanted to increase the odds, it was better if you were in reverse.

The first market for the Toyopet was California. I’m willing to bet that there are some hills, either east of Los Angeles or by San Francisco, that you can still go by today and see rusting piles of metal that used to be Toyopets.

Let’s take this through. Toyota starts off with this abysmal product called the Toyopet. Turns out they were even abysmal at making an abysmal product. Toyota’s productivity in 1957-58 was about one-eighth the world standard. Which is probably a good thing, because otherwise there would have been a lot more Toyopets.

But then from 1957, when they left the US market, until 1962, Toyota went from one-eighth the world standard in terms of productivity to equal to the world standard. By the late 1960s, their productivity was double the world standard.

Now Toyota comes back to the US market in 1973 with small, fuel-efficient cars (compacts, microcompacts, subcompacts), which were invited in by the rising price of gasoline. At first, the cars were appealing because they needed far less fuel than the other cars on the market, but people quickly came to realize that they were also incredibly reliable.

In 1973, Toyota reestablishes the competitive bar in that sector by showing up with a car that they can make with twice the productivity, and that is affordable and incredibly dependable. It’s also easier to maintain and service than what’s on the market.

Alongside that value proposition of affordable reliability, Toyota also started showing up with not only small cars but midsize cars. Again, by the world standard, doing a major model upgrade like that on a car was a big deal. It took thousands of engineering years’ worth of work; you had to retrain the workforce, reequip the workforce, redesign your supply network, etc.



The world standard for such a thing was four years and Toyota proved that at half the cost you can do it in half the time. Their standard was two years. That has a devastating impact on everybody else because now Toyota is not only selling a car which is more affordable, more reliable, but it’s fresher too. You start thinking about the products you buy and the difference in perception if something is updating on a two-year cycle versus four.

Now that raises the question: how does Toyota go from the Toyopet in 1957, which they were really pathetic at making, to this dominance in terms of affordability, reliability, time to market, etc. by the mid-1980s and early 1990s?

The starting point for the Toyota folks was something like this: “Why the hell are we selling a Toyopet in the first place?” Then some other guy says, “Because we don’t know how to make a better car. If we knew how to make a better car, we would sell it, but this is the best we can do.”

So Toyota gets into this mindset that, whatever they’re doing, they have to seek out problems in it very, very aggressively, with wild aggressiveness. Then they say that when you’re on a problem, rather than recognize it and cope, be heroic, firefight, etc., do as much as you can to understand the problem, or at the very least to understand its causes, and do something about them.

This is kind of an interesting point, because the conventional view of people turning wrenches in factories is that you’re measured by how many wrenches you can turn. But in Toyota’s case, they’re saying, “Wait a second. If we’re staying ignorant about what we’re doing and someone else sees that we have a problem, which we didn’t really understand or know how to articulate well, then they’re the expert on this situation.”

It may sound highfalutin to say, but the reality is if you have a situation that no one else is able to resolve, when someone comes to an understanding that’s meaningful enough to make some kind of positive impression on it, they’re the expert and the rest of us aren’t.

What we have to do is build into people’s activity not only seeing problems and solving problems but sharing what they’ve learned. Because, as I said at the start, knowing how to get answers smarter, better, faster matters a lot.

It depends on finding fault, which is hard to do because we’re psychologically and socially not equipped to say, ‘Oh, hey, look at how badly I suck.’ There’s a whole element here: if you’re going to build your competitive presence on this dynamic of solving problems and spreading what’s learned, then what you really have to make sure of is that this whole thing is buttressed by leaders who are constantly modeling, coaching, and enabling this finding of fault.

What is the result of that kind of behavior? As we said, our starting point was the Toyopet, an absolutely abysmal car made in an absolutely abysmal fashion. By the time Toyota gets to the mid-1990s, they have products in every significant category in the domestic automotive market ranked one or two.

Here’s another auto example

The world auto market (that’s you and me) sent the same signal to the world’s automakers, which was that we wanted double the fuel efficiency. Everyone tried the same approach: tinkering around the edges with existing architectures, making them more aerodynamic, using plastic and aluminum to reduce the weight, adding electronic controls on the engines to get better combustion. While all of that helped incrementally, it didn’t deliver twice the fuel efficiency.

Everyone went back to the drawing board for round two, having gotten the answer wrong the first time, and they came up with the idea of a hybrid. The idea was to take an electric motor, which is great for acceleration, and couple it with an internal combustion engine, which is great at a steady state, with a whole mess of software on top to coordinate them. That way you could get double the fuel efficiency.

Chevy’s expression of this was the Volt. Toyota’s expression of that was the Prius. And this goes back to knowing how to get to the right answer faster. Toyota showed up with the Prius, with a 10-year head start on the Chevy Volt and with that 10-year head start, not only have they put that technology through six, seven, maybe even eight cycles now of modification and improvement, they’ve diversified its application.

The Prius came out with a value proposition of making a statement as a green product, but then they moved it onto the Camry with a value proposition to taxi fleets: ‘Look, fill up your tank once a day instead of twice,’ and basically pay for the car in the first year with your fuel savings. Then they started putting it on the Lexus, where it wasn’t so much the savings on the gas cost, it was the performance.

Toyota has taken their hybrid drive and put it on 24 different platforms. And in the years since the Chevy Volt and the Prius came to market, the Chevy Volt has had about 100,000 units sold. Toyota with its hybrid system has had 9 million units sold.

Again, it was the same problem, and the difference in result is 90 to 1. If one of my students came in with a result like that, I’d have to say, ‘Damn, we’ll just give you the Nobel now, or you’re cheating, one or the other.’ All of this is to the point that getting to the right answer fastest matters a lot.
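As a quick check on that 90-to-1 figure, here is a minimal sketch of the arithmetic. The unit counts are the approximate ones quoted above; the function name is only illustrative.

```python
# Approximate unit sales as quoted in the talk.
VOLT_UNITS = 100_000
TOYOTA_HYBRID_UNITS = 9_000_000


def sales_ratio(leader_units: int, follower_units: int) -> float:
    """Return how many units the leader sold per unit the follower sold."""
    return leader_units / follower_units


ratio = sales_ratio(TOYOTA_HYBRID_UNITS, VOLT_UNITS)
print(f"{ratio:.0f} to 1")  # prints "90 to 1"
```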



This isn’t just in the automotive industry: there are countless others

To illustrate, pharmaceuticals is another crazy industry, where you invest billions of dollars over a decade to get to market with a product first. They have this system where if you come up with even a puff of an idea, you file a patent.

But once the patent is filed, the clock starts ticking. Once the clock stops ticking, your revenues go to zero because someone is going to make the generic version of it. When we looked at this, we calculated that every day earlier you could get to market with a pharmaceutical was worth $3,000,000. The right answer, faster.

Here is another way this can play out. If you’re in the market with a therapeutic first, you get about 50% of the revenue that therapeutic will ever yield. If you’re second, you’re at about 25%; if you’re third, you’re at about 15%; etc. If you’re sixth, it was a waste of time and a huge burn of money.
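To make those two rules of thumb concrete, here is a rough sketch of the arithmetic. The $3,000,000 per day and the 50/25/15% shares are the figures quoted above; the function names and the lifetime-revenue number in the usage lines are hypothetical, chosen only for illustration. Shares for fourth and fifth place were not given, so the sketch leaves them out rather than guessing.

```python
from typing import Optional

# Value of reaching the market one day earlier, as quoted in the talk.
VALUE_PER_DAY_USD = 3_000_000

# Share of a therapeutic's lifetime revenue by order of market entry,
# as quoted in the talk. Later positions were not stated, so they are omitted.
REVENUE_SHARE_BY_ENTRY = {1: 0.50, 2: 0.25, 3: 0.15}


def value_of_earlier_launch(days_earlier: int) -> int:
    """Estimated value, in USD, of launching the given number of days earlier."""
    return days_earlier * VALUE_PER_DAY_USD


def expected_revenue(entry_position: int, lifetime_revenue_usd: float) -> Optional[float]:
    """Estimated revenue for an entrant, or None if no share was quoted."""
    share = REVENUE_SHARE_BY_ENTRY.get(entry_position)
    if share is None:
        return None
    return share * lifetime_revenue_usd


# Hypothetical usage: a 30-day head start, and an assumed $1B lifetime market.
print(value_of_earlier_launch(30))
print(expected_revenue(1, 1_000_000_000))
print(expected_revenue(2, 1_000_000_000))
```

Under these assumptions, a month's head start is worth tens of millions of dollars, and being second instead of first halves what you can expect to earn.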

Here’s an example of how this ties into the leadership

This is Hyman Rickover, known as the father of the nuclear navy.

When he started in the late 1940s, no one really had control over atomic power. They had explosive atomic power, but no one was in control of it. That required the invention of new science, the invention of new materials, the invention of new processes, etc. This was also spread across not only the navy, uniformed and civilian, but a contractor workforce which ran easily into the tens of thousands.

They asked Rickover how he was able to accomplish this and he said, ‘Design whatever you’re doing so you can see what’s stupid about what you’re doing.’ That’s Rickover’s language, but what he said was: find out what you’re doing wrong, and when you see that you are doing something wrong, find out why, immediately. Then whatever you’ve learned from that experience (it could just be that the problem exists, it could be how to resolve the problem), make sure you tell somebody else. That’s the multiplier effect.

There are a lot more details about how you lead in that environment, but here’s the result of running an organization that way. The US put to sea the first nuclear-powered submarine in 1955. Since then, across all the different generations of attack submarines and ballistic missile submarines, all the different crews that have worked on them, and all the different shipyards that have built and maintained them, the US Navy’s record on safety related to submarines and reactors is perfect.

Since 1955, there’s been no injury or environmental damage due to reactor failure on a US submarine or aircraft carrier.

This is Rickover’s dynamic: if you’ve got a problem, call it out loudly. If it’s called out, we’re obliged to try and solve it, and if we solve it, we’ve got to show somebody.

Now, who’s the competitor for Rickover and the navy in the 1950s? The Soviet Union. And for them, let’s just say we can assume their dynamic was not this. They did not have this level of transparency. What was the result of trying to develop nuclear power in an environment where you can’t raise your hand? It’s this:

I could have filled up this slide with failed submarines from the Soviet navy, but what was the difference? It’s the same basic science, the universe has laws and they were trying to harness the same laws. They’re trying to develop the same materials, the same processes, etc. What was the difference? It was the phrase “I got a problem.” On one side, this was okay. The other side, it was an issue, and all of this results in a huge difference in performance.

I was inspired with some of these ideas by doing a ‘Karate Kid’ immersion at Toyota

This was as they were trying to stand up a first-tier supplier factory. What really got impressed upon me was the absolutely profound, aggressive, energetic commitment to making sure that no matter what you were doing in an operating setting, you could see what was going wrong.

When these failures in operation get ignored and aren’t solved, they snowball to catastrophic effect. I have an example in my book of a nurse who gets confused between a vial of heparin and a vial of insulin, which are similar in so many ways, and it took just the right set of circumstances to cause patient harm.

NASA has experienced similar things, both with the Challenger and the Columbia. When you get to the root cause of why they suffered those catastrophes, the answer in both cases is that there were known problems. Known problems that were infrequent enough, small enough to be waved away, normalized and dismissed until the moment that they congealed in a fairly fatal way.

Here’s my last example — it’s June 4th, 1942

The Japanese navy shows up at the Island of Midway in the Pacific. They show up with twice as many aircraft carriers as the United States Navy. They show up with twice the pilots, twice the planes, double everything. Now, given that, who should have won the battle of Midway? The Japanese.

They show up with twice of everything, but it didn’t end that way. The United States won the battle of Midway and it was a catastrophic defeat for the Japanese navy, which impacted their ability to wage war in the Pacific for the remainder of the second world war.

Recently, I’ve been reading a book about this, and the folks who wrote that book had access to Japanese archives. They didn’t write the battle from the US perspective. They wrote it from the Japanese perspective.

So I made it all the way through to the end of the book, and there’s this last chapter that says, “Congratulations. After reading this gigantic book, when do you suppose the Japanese lost the battle of Midway?”

I’m thinking that’s a trick question, so I start flipping through the book looking for the answer and I find a spot where, in late May, the Japanese navy has some of its ships late to deploy out of the harbors around Tokyo. And I think, that’s it. These guys didn’t get to where they needed to be on time. I flip back to the end and the authors say, ‘Dear reader, you’re probably thinking in late May. The answer is the Japanese navy lost the battle of Midway no later than 1929.’

I’m going back to the table of contents, and 1929 isn’t even included in the book. Then I keep reading and it says that in 1929, the Japanese admiralty had locked in their assumptions on how the war would be fought at sea, drawing from the lessons learned in the 1905 encounter with the Russians. By 1929 they locked in on that doctrine and it determined how they designed their aircraft carriers, how they designed their aircraft, how they designed fueling, arming, launching, recovery, training, tactics, etc.

Now, here’s the thing, once they locked in on that assumption, they left it unchallenged.

The book concludes with a description of the Japanese admirals’ rehearsal of the battle plan they had for Midway. They set up a giant boardroom table, they’re over here in all their admiral finery and way down here they got some junior officer whose job it is to fight the US side, while the admirals fight the Japanese side.

They take a stick and they shove a little wooden ship forward and he takes his turn. Back and forth, back and forth. After a few moves, the junior officer is not fighting according to the battle plan. He looks down at the table, sees the battle plan, sees the pieces, realizes that they have twice of everything, and moves his own way.

As you read this, you get the sense that they kept going through junior officers, petty officers, sailor recruits, everyone, to get them to play by the battle plan, but the admirals kept losing.

What’s the real cause of the loss? They had it in their heads that they weren’t going to seek out problems. They used the war game to rehearse the battle plan when they could’ve been stress testing the battle plan. Because they didn’t find the bugs in their plan, the fault in their thinking became fault in their doing. They didn’t correct it and the good guys won.

They did the exact opposite of what Toyota did. They did the exact opposite of what Admiral Rickover did. They didn’t seek problems, they didn’t solve problems, they didn’t spread learning. And why was that? Pathological leaders.

Now, here’s a set of actions you all can take.

During planning, when you have a plan or code, whatever it is, do you show it to your colleagues and advocate for it, or do you say, ‘Here’s my design. It’s the best I can do. Now tell me what’s wrong’? I know it’s uncomfortable, but that’s necessary.

Similarly, if you supervise other people, when they put something up, do you say, ‘That’s nice,’ or do you help them discover the fault?

Here’s what you can do in terms of just basic habits of the day. When you go up to somebody and say, ‘Hey, how’s it going? What are you doing?’ their inclination will be to say that everything is going fine. Your question needs to be ‘What’s not working?’

Then the next question to ask is, ‘Well, why isn’t it working? What’s your understanding? You’re the person who has got the most tactile sense of it not working and why. What can we learn from that? What can we change and what can we teach?’

Thank you.



