January 18, 2024
This post is excerpted from Deming’s Journey to Profound Knowledge by John Willis.
In 1982, the US was in a crisis. A crisis of identity, of energy, of economic sovereignty—you name it. Although W. Edwards Deming was enjoying the beginnings of Demingmania and his fame from the NBC documentary, Deming’s overall attitude was one of pessimism. He feared that business, government, and even education were too far gone to bring back from the brink of self-destruction . . . but he had to try, anyway. Out of this urgency came his 1982 book Out of the Crisis. He wrote it almost as a last-ditch effort to reform these three critical sectors of American life before they imploded of their own weight.
By the time I finished his book, I couldn’t believe how a man born in 1900 and writing in 1982 could so accurately capture the essence of everything we were trying to say in DevOps. (Of course, at the time I had no idea his fingerprints were already all over software development.) I immediately assembled a presentation I called “Deming to DevOps” to challenge my global community to reexamine the tech industry through Deming’s eyes. He truly was “the Prophet,” able to foresee the very problems we faced in software development and, more importantly, to provide us the principles we need to solve them ourselves.
As I presented in the previous chapter, the world is just now beginning to grapple with the digital crisis we’re in. Cybersecurity is minimal or overlooked altogether in industries and organizations that affect our everyday lives. Anyone from anywhere in the world with a little know-how can exploit these digital systems to disrupt our lives or even threaten them. The digital world is far too complex to be solved with simple solutions. We need a systemized, unified framework and a united front to keep the bad guys from hurting innocents.
Deming’s System of Profound Knowledge is the lens through which organizations can experience profound change. However, the 14 Points for Management still remain the best applicable lessons on how to achieve Profound Knowledge. Were he alive today, lecturing a company or governmental organization on the cybersecurity threat, Ed might present his 14 Points using true-to-life examples, such as something like this:
“Create constancy of purpose toward improvement of product and service, with the aim to become competitive and to stay in business, and to provide jobs.”
In 2019, Springhill Medical Center in Mobile, Alabama, recorded the first death related to a ransomware attack. For eight days, the facility’s capabilities were vastly reduced, including blocked access to patient records and even disabled fetal heartbeat monitors. A pregnant patient was unaware of the attack when she checked in to have her baby. Because of the disabled equipment, the medical staff was unaware the baby was in distress from the umbilical cord wrapped around its neck. The infant suffered a severe brain injury and died nine months later.
Most people in an organization view governance, risk, and compliance (GRC) activities as an annoying inconvenience—something to leave to the IT risk teams. When you silo cybersecurity, seeing it as an independent component instead of a system, you put your entire organization at risk. Cybersecurity can’t be a box to check off; it must be a way of thinking embraced by everyone.
Earlier, I mentioned Shannon Lietz, the person who coined DevSecOps. She says security needs to be designed into an organization’s systems—not something that gets bolted on after the fact. It must be a shared mindset between software developers, operations people, compliance professionals, and security scientists. Moreover, it’s leadership’s responsibility to educate the entire organization about the roles and responsibilities related to cybersecurity. Annual staff trainings aren’t nearly enough.
Ed said companies should “create constancy of purpose.” By this, he meant an aim, direction, or purpose—a desired outcome. Eliyahu Goldratt named his book The Goal after this concept. Bestselling author Simon Sinek calls it the why. Most modern organizations have mission statements that pay lip service to this idea, but few truly have a purpose. Fewer still have a shared purpose across the organization.
The mission of any organization should be to responsibly provide its products and services to others. Recall Paul O’Neill at Alcoa. The 1800s railroad and shipping magnate Collis Potter Huntington once said, “We shall build good ships here. At a profit, if we can. At a loss, if we must. But always good ships.” Quality is paramount.
For Deming, this type of long-term thinking was the opposite of the short-term mentality he saw so prevalent. American managers didn’t have a steadfast commitment to quality but to quarterly earnings. The same is still happening today. Organizational leaders don’t design security (a.k.a. digital quality) into their systems at the outset; it’s something to be addressed at some future point in time.
Just as Deming advocated continuous improvement in processes and products, he would advocate continuous improvement in cybersecurity as well. It’s an integral part of delivering quality today. Springhill Medical Center is a good example of an organization where security was not part of the mission.
“Adopt the new philosophy. We are in a new economic age. Western management must awaken to the challenge, must learn their responsibilities, and take on leadership for change.”
In Deming’s way of thinking, it wasn’t just about adopting new practices but embracing an entirely new mindset. Deming once said, “It is not necessary to change. Survival is not mandatory.” As new cybersecurity threats arise, leaders in organizations from small businesses to global enterprises to national agencies must adapt to these new challenges. Unfortunately, many organizations’ leaders still approach IT with the mindset of Henry Ford, Frederick Taylor, and Alfred Sloan. At the federal level in the US, these deterministic approaches have been institutionalized and enshrined in law and practice. As I hope I’ve demonstrated in this book, those practices were out of date a century ago. We cannot possibly imagine that the old ways are up to the task.
When he published Out of the Crisis, Ed wrote of a new economic age. IBM had just presented the first personal computer to the world the previous year. Answering machines weren’t altogether commonplace, and the marvel of fax machines was still new. I doubt Deming ever wore a pager, but I wouldn’t doubt if his secretary Ceil paged more than one executive to arrange her boss’s schedule. On the other hand, I wouldn’t be surprised to learn that Ed had played with VisiCalc, the world’s first spreadsheet, created by Dan Bricklin.
When Out of the Crisis was published, we were in the midst of the Cold War stalemate with the USSR. Being the eternal philomath he was, I imagine Deming read Donella Meadows’s The Limits to Growth about the dangers of unchecked growth and overpopulation. If I had to bet, I’d say he probably knew of the work of two leading Israeli scientists pioneering the field of behavioral economics, Daniel Kahneman and Amos Tversky. All of this is to say that Deming wasn’t naive. He knew the world had become quite complex since World War II.
This was, in fact, the very reason he was so passionate about his message in trying to raise the alarm to American leaders. He foresaw the problems we live with today. Perhaps Point #2 could have simply been “Wake up!” The way leaders had managed companies and national agencies in the 1950s and 1960s led to the problems of the 1970s. Companies like Nashua Paper, Ford, Marshall Industries, and others embraced the fact that what they’d been doing wasn’t working anymore.
If he were here today, he’d shout at organizational leaders, “Wake up! The way you’ve approached IT for the last four decades still doesn’t work! The problems are too complex; simple solutions won’t work. You have to embrace a new way of thinking about security and safety.” Today, leaders wait until there’s a problem before they fix it. This approach must be replaced with a process of continuous learning, continuous improvement, continuous adaptation, and continuous change.
Two decades ago, IT was a department. Today, most companies are built around their tech. Two decades ago, cybersecurity was an activity. Today, it must be part and parcel of how organizations are designed, built, and grown. Imagine what Deming would think about a self-driving car with 500 million lines of code that sometimes has to make life-or-death decisions without real-time human input.
Deming was appalled when an acquaintance of his was allowed inside a critical US chemical plant despite the name and date on the visitor’s security badge being completely wrong. Deming worked on classified projects during World War II, so he knew a thing or two about security and keeping out enemy spies. But without being on a war footing, the country had grown lax about security. The same holds true today: In 2018, a former SunTrust employee was able to steal 1.5 million people’s banking information, allegedly because the company had a laissez-faire approach to clearing former employees’ access codes. Banks spend millions of dollars on physical security, armored trucks, time-lock vaults, dye packs, and more to foil armed robberies. They’re used to that sort of frontal assault. Yet they leave their digital back doors wide open.
It’s a new world. It requires new methods.
“Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place.”
Deming never liked the term total quality management. To him, quality was already an all-encompassing concept. Putting “total” in front of it was redundant or, worse, suggestive that quality could be done piecemeal instead of including the whole organization.
Inspecting a product once it’s built doesn’t improve its quality. Inspection merely discovers a lack of quality. Quality isn’t something to be added at the end but to be designed into the product from the get-go. Deming said quality isn’t so much about improving the product as it is about improving the process.
That’s what Walter Shewhart gave us in 1924 with statistical process control: a tool to help improve the manufacturing process, not the finished product. A high-quality product was merely the output. In 1984, Deming said it would take another fifty years before the world fully appreciated what Shewhart gifted us. I think Deming may have given us more credit than we’re due.
Before 2003, most organizations building software relied on the old inspection model. A quality assurance (QA) team, often referred to simply as testers, would look for issues after the code was built. Strip away the cool, shiny tech exterior and you’d find Henry Ford and Frederick Taylor at the heart of software engineering. In 2003, Mary Poppendieck published Lean Software Development. That’s when things began to change. I’ve already related how John Allspaw and Paul Hammond shocked the global IT community in 2009 with the revelation that they were doing ten software deployments a day at Yahoo’s Flickr. Two years later, Jon Jenkins dropped the bombshell that Amazon was deploying code on average every 11.7 seconds.
Mary, Tom, John, Paul, and Jon pioneered the new software delivery method called continuous delivery, the seeds of DevOps. This shifted the focus from improving a software product to improving the software process. The DevSecOps movement took the same idea and introduced the concept of designing security in at the beginning instead of inspecting for it at the end.
I had the good fortune to interview Harper Reed for this book. He was the CTO of Project Narwhal, the software behind Barack Obama’s successful 2012 reelection campaign. The team used the DevOps method to design and adapt the system to anticipate nearly every conceivable disaster. This foresight allowed the system to stay up and running through the “Great Amazon Outage” of October 1; Hurricane Sandy later that month, which ravaged the entire Eastern seaboard; and the tremendous stress of last-minute get-out-the-vote efforts on election day. By building quality in from the very beginning, Project Narwhal’s vulnerabilities were exposed early on and addressed.
Contrast that to Mitt Romney’s get-out-the-vote software, ORCA. It was, in part, supposed to allow volunteers outside polling stations to report who had voted. The idea was to direct time and efforts to registered Republicans who hadn’t yet voted or to increase turnout in low-polling areas. As I understand it, the Romney campaign hired a prestigious tech consulting firm (probably one like Accenture or EY) that developed software using the old assembly-line (waterfall) method—you know, waiting until the end to inspect for quality. The campaign touted it as their secret weapon, declaring that Narwhal was nothing compared to what they were building.
How did things turn out? As the tech news provider CNET put it, “Orca was supposed to give the Romney campaign a technical advantage over Obama on election day. It got harpooned instead.” Bugs, a complete shutdown due to a mistaken denial-of-service attack, system overload—it was a mess. Meanwhile, Narwhal swam on smoothly.
“End the practice of awarding business on the basis of price tag. Instead, minimize total cost. Move toward a single supplier for any one item, on a long-term relationship of loyalty and trust.”
I’ve consulted at hundreds of Fortune 500 companies over the last forty years.
Their cafeterias suck.
There have been one or two with food and service north of mediocre, but that’s it. These companies have no idea how much money they’re losing by going with the lowest bidder. Beginning around 2006, I noticed many of the newer web- and cloud-based companies I consulted for had terrific on-site food services. I’m not sure if it was by accident or design, but it seriously boosted company performance. Instead of driving down costs, these companies spent a lot more. Better service translated into higher employee morale. Instead of people going off-site and spending an additional thirty or forty minutes for lunch, they stayed on-site and took shorter lunch breaks. Most importantly, employees ate together and organized more lunch-and-learns, increasing cross-pollination and collaboration among them. Imagine how much going with the lowest food service vendor would have cost!
Western companies often spend a lot of time and money finding other suppliers and quickly switching, even for a marginal financial gain. They’ll outsource to the cheapest supplier, not considering the hidden costs they’re buying. Companies often use multiple suppliers to protect against supply-chain disruptions and to make vendors compete against each other for the company’s business. But as Japanese manufacturers so ably demonstrated after World War II, tighter relationships with fewer suppliers lead to increased process alignment and lower overall costs.
But the answer isn’t to just work with a limited number of suppliers. For example, when it comes to cybersecurity, one major obstacle is organizations’ purchasing and procurement departments. Many of the Fortune 500 won’t allow employees to purchase software from anyone but approved vendors. A company’s purchasing team and cyber risk team have two different motivations. Purchasing’s job is to get the best deal financially—which is rarely the best product. In some organizations, it can take years for a software provider to become an approved vendor. At today’s pace, some enterprises are buying brand-new software that’s already obsolete.
To save further costs, many purchasing departments choose to get nearly all of their software from a single source. This one-size-fits-all approach is called an enterprise license agreement, and the discounts are substantial (sometimes up to 40%), representing millions of dollars in “savings.” The problem is that their single vendor almost certainly doesn’t have the best solution for every single software product. Take the SolarWinds breach. In my professional opinion, their network-monitoring tool isn’t the best one on the market. However, it does have a lovely price point. While this type of cyberattack can happen with any vendor, there are several higher-end tools I’ve worked with that have tighter controls over their internal software supply chain.
Deming admonished organizations to use fewer suppliers but for a specific purpose: to reduce variation. I read about a great example in Tom Limoncelli’s book The Practice of System and Network Administration. He wrote about Google’s internal Death Squad team, which ensured only two versions of the Linux operating system were in use throughout the company at any given time. Doing so meant that software developed in one part of the company (say, Google Maps) would work well with software developed in another part (say, Google Earth). This was a key component in Google’s ability to control its systems at scale.
If the Obamacare IT team had heeded Deming’s advice from 1982, it might have avoided the rollout fiasco of Healthcare.gov in 2013. On October 1, the website launched to allow people to enroll. However, there were serious tech problems that prevented many people from enrolling for weeks. This, despite costing $1.7 billion, according to the US Inspector General in an investigation the next year. A consultant for the project told me Healthcare.gov used more than twelve different logging frameworks (the way computers record and store system activity data). Reducing variation at the macro level was obviously not part of the project’s specifications.
If only they’d listened to Deming.
“Improve constantly and forever the system of production and service, to improve quality and productivity, and thus constantly decrease costs.”
In Out of the Crisis, Deming observed the West focused more on specification, while the East focused more on uniformity. We saw this with the Mazda transmission example. US-based suppliers built the transmission “to spec,” while their Japanese counterparts minimized variation to their pragmatic limits.
In IT, most cybersecurity activities are specification-based. The National Institute of Standards and Technology (NIST) maintains the National Vulnerability Database, a catalog of known vulnerabilities in software. Public and private organizations alike rely on this database to know whether a specific collection of code has documented vulnerabilities, each rated on a scale of severity. The famous Equifax breach vulnerability, for example, was a ten, the most severe.
In the industry, there’s a term called “zero-day vulnerability.” This happens when someone (a vendor, white hat hacker, tester, etc.) first discovers a flaw in software that’s already been deployed to the real world. Those organizations using the software have, in theory, zero days to fix this; we assume the bad guys already know about the vulnerability. If they didn’t before, they certainly do once it’s documented. As soon as there’s a patch to fix the software, everyone downloads it and updates their code ASAP. Their software is now up to specifications.
This is a never-ending cat-and-mouse game. The bad guys do bad things while the good guys try to keep up as well as they can. Scanning software for known NIST vulnerabilities is akin to a specification implementation. In other words, “We should not have any software with these vulnerabilities.”
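To make “specification thinking” concrete, here is a minimal sketch of an automated check against NIST’s National Vulnerability Database. It assumes the public NVD 2.0 REST endpoint and its JSON field names as I understand them from the published documentation; the keyword and the 9.0 threshold are purely illustrative.

```python
import requests  # third-party HTTP library

# Public NVD 2.0 REST endpoint (per the published NVD API docs).
NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def worst_known_score(keyword: str) -> float:
    """Return the highest CVSS v3.1 base score among NVD entries
    matching a product keyword (0.0 if nothing is found)."""
    resp = requests.get(NVD_API, params={"keywordSearch": keyword}, timeout=30)
    resp.raise_for_status()
    worst = 0.0
    for item in resp.json().get("vulnerabilities", []):
        metrics = item["cve"].get("metrics", {})
        for m in metrics.get("cvssMetricV31", []):
            worst = max(worst, m["cvssData"]["baseScore"])
    return worst

# Specification thinking in one line: "We should not ship any software
# with a known critical vulnerability." Keyword and threshold are illustrative.
if worst_known_score("apache struts") >= 9.0:
    raise SystemExit("Known critical vulnerability on record: do not ship.")
```

A real pipeline would scan its own dependency list rather than a keyword, but the logic is the same: compare what you have against a published specification of what is unacceptable.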
Shannon Lietz takes what I call a uniformity approach. Instead of a purely defensive stance—“How can we protect ourselves against attacks?”—she takes a proactive stance, which she calls adversary management. She monitors the behavior and motivations of attackers. How often do adversaries attack? What’s their purpose or aim? What do they target and what do they ignore? How long will they mount an attack before giving up?
In effect, Shannon uses the PDSA cycle to continuously improve her organization’s cybersecurity. Instead of being a step behind, she tries to stay a step ahead, learning from hackers so that she anticipates their moves. She is institutionalizing a continuous security improvement process. She shares this information throughout the organization, a subtle way of reminding all that cybersecurity is everyone’s responsibility.
“Institute training on the job.”
Remember Peter Senge from the Macy Cybernetics Conferences and his work with John Waraniak at GM? Deming was ninety years old when Senge, relatively unknown outside of academia at the time, asked Deming for feedback on his forthcoming book The Fifth Discipline (soon to become a bestseller and a classic). To his surprise and gratification, Deming promptly returned his letter with a note Senge felt obliged to quote in his book’s introduction:
Our prevailing system of management has destroyed our people. People are born with intrinsic motivation, self-respect, dignity, curiosity to learn, joy in learning. The forces of destruction begin with toddlers—a prize for the best Halloween costume, grades in school, gold stars—and on up through the university. On the job, people, teams, and divisions are ranked, reward for the top, punishment for the bottom. Management by objectives, quotas, incentive pay, business plans, put together separately, division by division, cause further loss, unknown and unknowable.
As I related, Doris Quinn told me that when she accompanied Deming on his four-day seminars, he collectively spent less than a day presenting on statistical process control. The lion’s share was devoted to telling managers and executives how to better manage their people. He would relate the story of how Japan in 1945, decimated beyond hope of repair and with no natural resources to speak of, relied on its people to be reborn and go on to become the second largest economy in the world. People, he said, were the key. He would often lament that America’s worst waste was companies’ failure to fully utilize their people’s abilities. One of his admonishments to fix that was to “institute training on the job.”
If you’ve ever been part of a big company, you should be rolling your eyes at the thought of corporate training. If Deming were here today, he’d abhor what passes for training in most companies. His idea wasn’t for people to file into a classroom, listen to a presentation, and then return to the factory floor or their desks with no idea how to put their training into practice. He specifically said institute training on the job.
When people are being trained, they need to understand what the job involves and why it’s done in the first place. Then, they need to continually learn through practical experience, experimenting with new methods and ideas, studying results, and striving to be as perfect as pragmatically possible.
“Institute leadership. The aim of supervision should be to help people and machines and gadgets to do a better job. Supervision of management is in need of overhaul, as well as supervision of production workers.”
“Help people and machines and gadgets do a better job.” That was Deming’s definition of leadership. It wasn’t about who was in authority and who got punished. It was about improving the process.
To my mind, leadership is mentoring: being more a coach than a police officer making sure she hits her daily ticket quota. Leadership is about inspiration, setting the vision, and “constancy of purpose.”
Deming often gave his take on the old adage: “In God we trust. All others must bring data.” Leaders don’t make knee-jerk reactions; they analyze data. They know how to separate common cause variation from special cause to sort what can be brought under control versus what can’t.
Most organizations think one of a manager’s main functions is to put out fires: all the problems that crop up over the normal (and abnormal) course of business. Problems like security vulnerabilities. In the 1980s, the UK government developed a set of recommendations to standardize IT management practices across what is now His Majesty’s Government. It’s called the Information Technology Infrastructure Library, or ITIL for short. Deming was quite popular in the UK; one of the best books on Deming is The Deming Dimension, written by Dr. Henry Neave, a statistician and founder of the British Deming Association. I was gratified to learn that ITIL’s continuous-improvement concepts were based on the PDSA cycle.
According to ITIL, an IT “incident” is an unplanned interruption to, or reduction in the quality of, an IT service. There are four severity levels, from P1, the most severe, down to P4. Most organizations have only enough resources to resolve P1 incidents, so managers responsible for cybersecurity are primarily motivated to reduce P1s. The more they see that number shrink, the better they believe things are.
Incidents are software quality issues that need to be dealt with quickly and quietly. This is not leadership. This is firefighting. Deming, in contrast, would probably get rid of the four arbitrary categories and instead use the System of Profound Knowledge to manage security incidents mathematically: for example, identifying common-cause and special-cause patterns across all incidents rather than viewing them through the P1-to-P4 lens.
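As a minimal sketch of what that might look like (with invented incident counts, purely for illustration), here is a Shewhart individuals (XmR) control chart calculation over weekly incidents. The limits come from the process’s own week-to-week variation, so any point outside them signals special-cause variation worth investigating, while everything inside is common-cause noise to be addressed by improving the system, not by blaming individuals.

```python
# Weekly security-incident counts (invented data, for illustration only).
incidents = [12, 9, 14, 11, 10, 13, 8, 12, 31, 11, 10, 13]

mean = sum(incidents) / len(incidents)

# Moving ranges: the absolute change between consecutive weeks.
moving_ranges = [abs(b - a) for a, b in zip(incidents, incidents[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Shewhart individuals (XmR) chart limits: mean +/- 2.66 * average moving range.
upper = mean + 2.66 * mr_bar
lower = max(0.0, mean - 2.66 * mr_bar)

for week, count in enumerate(incidents, start=1):
    label = "SPECIAL CAUSE" if count > upper or count < lower else "common cause"
    print(f"week {week:2d}: {count:3d}  {label}")
# Only week 9 falls outside the limits; the rest is routine variation that
# firefighting individual incidents will never reduce.
```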
Leadership would be those same supervisors using these incidents as on-the-job training opportunities. John Allspaw says incidents are “unplanned investments.” The resources are a sunk cost. Why not turn them on their head and see them as possible sources of return on investment (ROI)? Why allow the opportunity to slip by? That’s leadership. That’s helping “people and machines and gadgets do a better job.”
“Drive out fear, so that everyone may work effectively for the company.”
Around 2010, I remember some Facebook developers telling an audience that the company’s newly hired software engineers deploy their own code on their very first day of work, even before finishing all their paperwork. Someone in the audience asked, “What would you do if they broke the system?” The developer replied, “If they can break our system on the first day of work, they have done their job, and we have not.”
Starting around this time, a trend emerged in the software industry called the blameless post-mortem. Instead of assigning a problem or vulnerability to a certain person, blame would be placed on the system. (I can hear Deming cheering, even now.) After the incident had been handled, the stakeholders would hold a post-incident review to learn what had happened, how it had happened, and how to improve their software delivery process. The digital world is too complex to settle for simple solutions, so the best organizational teams stopped looking at the problem (a.k.a. the outcome) and started looking at all the parts that together make up the software delivery system.
This is exactly what Deming meant when he told managers to “drive out fear so that everyone may work effectively for the company.” Instead of the old Ford and Taylor command-and-control style of management, Deming directed managers to build trust throughout the organization. Eliminate fear: the fear of not getting a bonus; the fear of making a mistake; the fear of not meeting annual MBOs, MBRs, and KPIs; and the fear of not measuring up to their peers.
Deming was especially miffed at those who didn’t understand basic mathematical functions but were managed by them. General Electric’s Jack Welch was famous for his dictum that every business unit must fire the bottom 10% of performers each year. It doesn’t matter how hard a group works or how much it achieves: by definition, half of any group will always rank in the bottom half. That’s how rankings work. In each of GE’s business units, it didn’t matter if the bottom 10% were outperforming the top 50% from a few years earlier; the bottom 10% had to go. This was a fear-based organization, built on the old deterministic mentality of the “rugged individualist” that prevails, especially in American culture. Deming saw this at Hawthorne and then again at the Census Bureau. Japan showed him a better way: organizations built on trust and mutual loyalty.
In 2012, Google set out to answer the question “What makes teams successful?” The project’s name, Aristotle, came from the Greek philosopher’s famous quote, “The whole is greater than the sum of its parts.” After surveying hundreds of Google employees across 180 teams, one of their five key findings was something called psychological safety. The more secure team members felt in expressing ideas and participating in discussions, the better the team’s overall performance. The more the discussion was dominated by one person or a small group of people, the worse the team performed.
Amy Edmondson, a leading researcher on psychological safety, defines it as a shared belief that the team is safe for interpersonal risk-taking. A psychologically safe team isn’t dominated by rank or hierarchy; it encourages different perspectives, irrespective of sex, gender expression, age, orientation, disability, physical attributes, culture, ethnicity, personal beliefs, and more.
Fear-based organizational cultures can be costly. Even deadly.
On October 29, 2018, a Boeing 737 MAX operated by Lion Air crashed thirteen minutes after takeoff, killing all 189 people aboard. Reports indicated a malfunctioning flight-control system had been disabled in an attempt to restore control. Barely four months later, another 737 MAX crashed six minutes after takeoff. None of the 157 survived.
Some suspected Boeing’s MCAS, the maneuvering characteristics augmentation system. The MCAS was designed to counteract the airplane’s tendency to push the nose up during certain maneuvers. As part of a larger attempt to shave off as much pilot training time as possible, Boeing removed the MCAS from the plane’s standard operating manual. It argued that the MCAS wasn’t designed for use during normal flight operations and should therefore be excluded. The FAA agreed. When the 737 MAX entered service in 2017, there wasn’t a word about the MCAS. Pilots began flying planes totally unaware of this “feature.” When the MCAS glitched, none of those pilots knew what was happening, much less how to fix it.
To make matters even more tragic: an ensuing investigation found that Boeing knew the MCAS sometimes operated erroneously long before the Lion Air crash. The investigators discovered that some people in the company understood the potential for disaster, but the information was buried in isolated pockets around the organization. And for those who did know there was a problem, the procedures for reporting such crucial issues were unclear.
To add a little context to this decision, consider the fact that Boeing was under enormous pressure to get the 737 MAX into service as quickly as possible. For the first time, American Airlines was considering buying a fleet of planes from the European manufacturer Airbus. To thwart this deal, Boeing modified its existing 737 design as a cheaper alternative, with one of the key savings being that American Airlines’ twelve thousand pilots wouldn’t have to be retrained on a whole new plane.
Contrast that environment to Toyota’s, where any employee on the line could pull the Andon cord, bringing production to a standstill for even a minor quality issue. In the case of the Boeing 737 MAX, hundreds of lives were at stake . . . yet no one pulled the Andon cord.
Does that sound like a psychologically safe place to work? Were people encouraged to voice their concerns and offer dissenting opinions? Did they have a metaphorical Andon cord to halt everything in order to fix a fatal flaw? I’ve never worked with Boeing, but my money says no.
“Break down barriers between departments. People in research, design, sales, and production must work as a team, to foresee problems of production and in use that may be encountered with the product or service.”
Goldratt talked about the mistake of isolated improvements—what he called local optima—that sacrificed what was best for the system as a whole. Another term for this is the inefficiency paradox, where optimizing the individual components of a system results in a suboptimal system.
In his book Free, Perfect, and Now, Deming protégé Bob Rodin of Marshall Industries shares a meeting where a number of managers came clean about the different tricks they used to game the company’s IT system. It wasn’t that they wanted the company to fail; it was just that they were incentivized for their department to win—regardless of the consequences to other departments.
When Deming wrote that managers needed to “break down barriers between departments,” he wasn’t speaking only about collaboration, though that’s a big benefit. He was trying to get executives to see their companies as a single entity, a cohesive system, with everyone focused on the same final output of an ever-higher quality product or service.
In Working Backwards: Insights, Stories, and Secrets from Inside Amazon, the authors reveal that, at the time, the company’s maximum base salary was $160,000. The company believed an incentive-based compensation model might create short-term goals at the expense of long-term value creation. They didn’t want to incentivize employees to reach departmental milestones, regardless of whether those milestones benefited the whole company or not.
I know exactly what that’s like in such a tech company. Cybersecurity specialists are often at odds with operations, business managers, and application developers. The opportunities for silo mentalities abound in these murky waters. The infamous 2017 Equifax breach that exposed the personal information of 147 million people and cost the company $5.3 billion in market cap serves as yet another great illustration. The US House of Representatives committee investigating the breach found significant organizational barriers between the chief information security officer (CISO) and the chief information officer (CIO): the CISO didn’t report to the CIO, as you might assume she would, given that both were responsible for the company’s tech. She didn’t even report to the CEO. The person responsible for the cybersecurity of a multibillion-dollar corporation reported to . . . a lawyer.
When the breach occurred, the security officer didn’t even report it to the CIO. Why? She testified before the House committee that she couldn’t recall a particular reason. When asked if she thought it peculiar that she, the CISO, didn’t report to the CIO, she said, “That structure was in place . . . at the time I arrived at Equifax. It was the structure that was there with the person that was my predecessor. And I knew that it was that structure going in. I didn’t question it.”
“I didn’t question it”: famous last words.
“Eliminate slogans, exhortations, and targets for the workforce asking for zero defects and new levels of productivity. Such exhortations only create adversarial relationships, as the bulk of the causes of low quality and low productivity belong to the system and thus lie beyond the power of the workforce.”
My jaw nearly dropped the first time I read Deming’s Point #10: he could have written it the day before I arrived at my new company, it was so relevant! In one of the early startups I worked for, whenever the CEO returned from a business trip, he would have a new set of motivational posters that he would plaster all over the halls. Another time, he bought us each a copy of Nassim Taleb’s business book Antifragile, thinking it would somehow solve all our problems. This CEO was operating from the old deterministic model of management, trying to offer simple solutions to complex problems.
To this day, plenty of executive managers still don’t understand the nature of the digital world and demand “zero defects” or “never fail” software. Dr. David Woods, Professor of Cognitive Systems Engineering and Human Systems Integration and founder of Adaptive Capacity Labs, introduced me to the idea of “Maginot Line thinking.” After World War I, France built a supposedly impenetrable line of fortifications and defenses stretching roughly from Luxembourg to Basel, Switzerland. Why? Because France was preparing to refight the previous war. In one sense, the preparations served their purpose: the Germans never broke the Maginot Line. Instead, Hitler’s armies simply went around it, north through Belgium, and conquered France that way. Maginot Line thinking is when you expend considerable resources to counteract a past threat.
Dr. Woods said this mentality is common in cybersecurity, where executives spend prodigious sums of money in a vain effort to counter vulnerabilities that have already been exploited. It’s the digital equivalent of locking the barn after the horse has escaped. He described cyberattacks as being like parasitic biological intrusions. Human beings will never rid the world of viruses; it’s a fool’s errand. We can’t stop them; we can’t control them. What we can do is be as resilient as possible: good hygiene, a healthy diet, exercise, regular medical checkups, and so on. The same goes for cybersecurity: we’ll always have security vulnerabilities. That’s a fact of digital life. What we can do is be as resilient an organization as possible.
Martin Casado, who did his pioneering networking research at Stanford, redefined networking and security in 2007 with his startup, Nicira. He challenged the security industry’s status quo by pointing out that organizations spend 80% of their resources protecting the organization’s “digital perimeter,” keeping the bad guys out. Managers want impenetrable fortresses protecting their IT systems. As we already discussed, the reality is that many breaches come from inside the network. (Think back to that bank hacker who waltzed through the side door and found an empty cubicle.) Casado said the security industry should assume that hackers will gain access to the system. That is, instead of trying to keep hackers out, we should plan for the inevitability of them getting in. Organizations, he argued, should develop more security measures inside the perimeter. Software is simply too complex and evolving too quickly to defend the Maginot Line forever.
Simple slogans won’t solve our problems. Never have. Never will.
“Eliminate work standards (quotas) on the factory floor. Substitute leadership. Eliminate management by objective. Eliminate management by numbers, numerical goals. Substitute leadership.”
Deming hated management by objective, results, quotas, numerical goals, and other such measures that oversimplified quality craftsmanship into neat little numbers managers could understand. Too often, managing by hard numbers sacrifices quality. When organizations care more about numbers than quality, they incur enormous costs.
I’ve seen this over and over again in the tech industry. Take one metric, for example: defects per KLOC, or defects per thousand lines of code. One defect per KLOC is considered good: out of a thousand lines of code, there was only one bug or error. From a deterministic perspective, it’s logical to want that number as close to zero as possible, meaning 100% clean code. Therefore, you would always want that number to be going down.
However, there are ways to improve the quality of software code that make that metric rise. A development team might decide to optimize its code base, perhaps deleting unused code or writing more efficient code, with the overall result being fewer lines of code. With fewer total lines, the same remaining defects produce a higher defects-per-KLOC figure. So although the code base is higher quality, the team gets penalized because the metric went up. That, obviously, is an incentive to stick with the status quo. Great-looking metric; status quo quality. This could be why Knight Capital’s old Power Peg program hadn’t been deleted off the eighth server. Who knows?
On the other hand, you could have a team of developers who write code that’s hard to read and maintain. Defects per KLOC can’t measure that facet of low-quality code; the team is rewarded for writing what is essentially a lot of dense code. Or, like a student trying to hit the page count on a high school paper, the team might pad the code with unnecessary extra lines, driving defects per KLOC down. Bad code; great-looking performance.
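The arithmetic behind this paradox is easy to demonstrate. Here is a toy example (all numbers invented) showing how the metric punishes cleanup and rewards padding:

```python
def defects_per_kloc(defects: int, lines: int) -> float:
    """Defect density: defects per thousand lines of code."""
    return defects / (lines / 1000)

# A codebase with 10 known defects in 10,000 lines of code.
print(defects_per_kloc(10, 10_000))  # 1.0 defect per KLOC

# The team deletes 3,000 lines of dead code; the same 10 defects remain.
# The codebase is objectively better, but the metric looks worse.
print(defects_per_kloc(10, 7_000))   # ~1.43 defects per KLOC

# Another team pads its code with 5,000 needless lines.
# Nothing improved, but the metric looks better.
print(defects_per_kloc(10, 15_000))  # ~0.67 defects per KLOC
```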
This point flows right into the next:
“Remove barriers that rob the hourly worker of his right to pride of workmanship. The responsibility of supervisors must be changed from sheer numbers to quality. Remove barriers that rob people in management and in engineering of their right to pride of workmanship. This means, inter alia, abolishment of the annual or merit rating and of management by objective.”
In his 1993 book, The New Economics, Deming wrote, “The aim of anybody, under the merit system, is to please the boss. The result is destruction of morale. Quality suffers.”
Someone once asked Deming how he would change the merit system. “Change it!?” he thundered. “Abolish it, for the love of God!”
He admonished managers to remove the barriers that robbed people of pride in their work. Deming believed—and study after study supports his belief—that, in general, people want to do good work. They want to be proud of what they do and what they have accomplished. Company practices, processes, and culture get in the way of that, demoralizing them.
For instance, I once consulted for a CIO on a project where a large bank had been acquired by an even larger bank. As part of the engagement, I interviewed several developers at the bank being purchased. One woman I talked to was moving an older application from their on-site servers into the acquiring bank’s cloud infrastructure. She expressed her frustration that the acquiring bank demanded so many regulatory controls before they would allow her to deploy the application on their cloud servers. When she asked why she couldn’t use a certain feature or process of theirs, they could never explain why; she just couldn’t. Meanwhile, her bosses at the bank being bought were screaming at her about overdue delivery dates. She couldn’t move the application until it met certain requirements, but the acquiring bank could never tell her what was needed or why. Ironically, the acquiring bank couldn’t explain its controls because it had never deployed this type of application; it was requiring controls it didn’t even know how to administer. She was trapped in this digital Catch-22, so I asked how she dealt with it.
“It’s simple,” she said. “I write crappy code and make them both happy.”
This was a developer who otherwise took pride in her work. But the work environment she found herself in wouldn’t allow her to write great code. She outright admitted it was shoddy workmanship. But caught between a rock and a hard place, she had to do what she had to do to survive and please her superiors.
Were he here, Deming would read those managers the riot act.
“Institute a vigorous program of education and self-improvement.”
In his Harvard Business Review article “Decoding the DNA of the Toyota Production System,” Dr. Steven Spear reveals the findings of his and his colleagues’ four-year study. The problem, they say, is that companies try to copy Toyota’s methods. What they should really do is understand Toyota’s mentality: “We found that, for outsiders, the key is to understand that the Toyota Production System creates a community of scientists. Whenever Toyota defines a specification, it is establishing sets of hypotheses that can then be tested.” Applying the scientific method in order to arrive at empirical knowledge—this is one of the four core principles of Deming’s System of Profound Knowledge, the theory of knowledge.
In other words, Toyota found a way for its employees to teach themselves. There is absolutely a place for top-down learning, where the organization teaches an individual. But more importantly, the organization needs to be a place where people can teach themselves.
Google’s famous “20% time” was an incredibly innovative way to achieve this. Google encouraged its people to perform self-directed experiments. It resulted in Gmail, Google Maps, and Google’s cash cow AdSense. Around 2006, Christophe Bisciglia, a senior engineer, was interviewing potential employees when he realized that even the brightest applicants were unaware of a new style of programming that Google had invented, which was ultimately the genesis of Hadoop, the software used to create the New York Times’ TimesMachine we discussed earlier. Bisciglia decided to use his own 20% time to create a class called “Google 101” for his alma mater, the University of Washington. IBM noticed the class and helped Google promote it throughout some of the most prestigious universities in the US, from MIT to Stanford to Berkeley. This course, probably more than anything else, would lead to the birth of the Big Data movement.
When Shannon Lietz was at Intuit, she ran a sixty-person group of internal hackers called a Red Team. Their primary aim was to find vulnerabilities in Intuit’s software like QuickBooks and TurboTax. As soon as her Red Team was able to breach the software’s security, they would immediately alert the developers of that software with an email that’d say something like, “Hey, we just captured ten million customer names, phone numbers, and social security numbers. Here’s the vulnerability. Please fix ASAP.”
When I related this story in a presentation to one of the largest banks in the world, an executive said, “Oh, Intuit can do that kind of thing. We can’t.” I pointed out that Intuit could justify an internal cybersecurity team of that size with just $15 billion in asset holdings. This bank, on the other hand, had something in the neighborhood of $2.5 trillion in assets at the time. I didn’t see how they could fail to justify a Red Team of their own.
Andrew Clay Shafer says, “You are either a learning organization or you are losing to one that is.”
“Put everybody in the company to work to accomplish the transformation. The transformation is everybody’s job.”
According to Deming, the commitment to continuous process improvement wasn’t an activity an organization could assign to a special task force or a particular role. It wasn’t a project or one-off event. It was an all-encompassing mentality that must be embraced by everyone in the organization, starting with the CEO and diffusing through every layer and level.
As I’ve studied Deming, I’ve come to realize that he chose his words quite carefully. When he titled the chapter containing these 14 Points “Principles for Transformation of Western Management,” he truly meant a transformation. In business, the word transformation gets bandied about a lot, but Deming selected this word with care. It means a thorough or dramatic change.
Over the last two decades, we’ve seen thorough and dramatic changes in the world: socially, environmentally, and technologically. These changes require a similarly thorough and dramatic transformation if we are to survive them. No one can save us. It’s up to every one of us to save all of us.
Deming already did his part.
As he would say at the end of each lecture, “You have heard the words; you must find the way. It will never be perfect. Perfection is not for this world; it is for some other world. I hope what you have heard here today will haunt you the rest of your life. Then I have done my best.”
John Willis has worked in the IT management industry for more than 35 years and is a prolific author whose books include Deming’s Journey to Profound Knowledge and The DevOps Handbook. His research covers DevOps, DevSecOps, IT risk, modern governance, and audit compliance. Previously he was an Evangelist at Docker Inc., VP of Solutions for Socketplane (sold to Docker) and Enstratius (sold to Dell), and VP of Training & Services at Opscode, where he formalized the training, evangelism, and professional services functions at the firm. Willis also founded Gulf Breeze Software, an award-winning IBM business partner specializing in deploying Tivoli technology for the enterprise. He has authored six IBM Redbooks on enterprise systems management and was the founder and chief architect at Chain Bridge Systems.