Over the past few years I have interviewed hundreds of DevOps practitioners and attended 30 or more DevOps meetups around the globe. This, combined with my own experience, has enabled me to come up with a good list of what I will call patterns of DevOps Kaizen habits.
The following is a list of patterns of Kaizen habits.
Sounds like it should be the easiest of all patterns. Of course, it isn’t. There have been many great books on this subject by people much smarter than me; however, I will try to illustrate a few of the successful patterns I have seen in the name of DevOps.
7 Ideas to Improve Communication
1. The Smiley Board
Stephen Nelson Smith (@lordcope) talks about how he sometimes uses “the smiley board” on consulting engagements. At the end of each work day, everyone is required to draw a picture of their face in one of three modes (happy face, blah face or sad face). At the end of each sprint, during the retrospective, the team tries to correlate the faces with work and flow. Never underestimate the power of capturing subjective and soft data.
2. Playing Games
Game playing is also a fantastic way to influence good communication behavior and bring people together. One of my favorite games is a Kanban simulation game (http://getkanban.com/). The game simulates a variable work flow for a SaaS-based company over a 13-day period and introduces all sorts of events. A team of 5 or 6 players rolls dice to move story cards with a goal of maximizing net profit by optimizing the flow of work. Huddle.com has a game they call the XP game. They create cross-functional teams from all aspects of the business (dev, ops, sales, marketing, and exec level) and they play a simulated software development life cycle; however, instead of creating software they have jobs like blowing up 20 balloons. The teams assign story points to the tasks. The games help all aspects of the business learn how to communicate in a common language for how to get work done.
3. Daily Standups
I am also a big fan of daily standup meetings, especially for managing remote teams. It’s important to have short daily discussions about the previous day’s and the new day’s work flow. This also creates a social contract for trust. Everyone gets to see and hear what everyone else is working on.
4. Hack Days
Hack days are also a great way to build good communication habits. Some companies plan a hack day once a month. Some even do it once a week. Everyone gets to work on whatever they want for the day. Java guys may try a Python project; the operations guy might sit in on a Java project to learn more about the framework. Typically the work done on a hack day focuses on the tasks that never seem to get done: the little things that, when done, help productivity. One company actually flies all of its employees to its headquarters city once a quarter and makes it a hack day contest. Peers vote on the final projects and prizes are given out.
I have run hack days in some large companies as part of a DevOps workshop and watched employees start reaching over a conference table to share information. One time there were two guys whose cubicles had been back to back for 10 years, and after they reached over the conference table to share a screen view they joked that they had never done that in the 10 years they had worked next to each other.
5. Putting Everyone in the Same Space
Another communication hack I recently heard about was a developer manager who decided to put his desk right smack in the middle of all the developers in an open floor space. DRW has a story where they have their traders actually pair with a developer. There is a famous story where the developer asks the online trader why he selected one button versus another. The trader responds that the other button is sometimes hard to see. The developer says “Hold on a second.” Then he commits the change (on a live trading system) and says “How about now?” The trader says “That’s great, but can you make it a little bluer?”
At DRW this strategy is working so well that it’s not uncommon for developers to start going to night school in order to become traders.
6. Chat Rooms
Another standard for creating good communication is to create chat rooms. At my current company we have group-specific chat rooms as well as one we call Watercooler. Watercooler is an open discussion where you can talk about anything you want. There are also cross-functional rooms, and we create war room channels when we need them. Some companies run bots in their chat rooms so that they can push out meaningful information like alerts, config runs, builds, etc. Some companies actually add clever code hacks to their chat rooms to keep the discussions lively. I have seen hacks that turn on red flashing lights or sound alarms. The sillier the better. Work needs to be fun.
Startups in Silicon Valley have a long history of juicing up their office spaces with pinball machines, ping pong tables and pool tables. It’s not uncommon for a dev and ops team to take their frustrations out in a fierce game of ping pong.
Like I said at the beginning of this section, I am no expert on this topic; however, there is one thing I do know: having a lot of fun at work goes a long way toward improving communication.
Local vs. Global Thinking
Another advantage of a startup is that most employees think globally.
The goals are more immediate and there are fewer procedures. The issues come when the organization grows and that global thinking is not nurtured. In larger organizations there can be years of local procedures and optimizations. Local procedures and optimizations may cause conflict with the overall goal of an organization.
Jesse Robbins, a founder of Opscode, tells a story from when he was at Amazon. Jesse ran operations at one point and he got into a scuffle with the product manager of a new Kindle product offering. The Kindle product manager wanted a certain release to go out on a specific date and Jesse refused to accept the code. Jesse’s point was that the new release would bring down the production system. The Kindle manager’s point was that the stock price would go up if they hit the proposed date for the launch.
This debate went very far upstairs and the final decision was “Stock Price Up”. Sure enough, the product shipped on time, the stock price went up and, yes, the system went down. Jesse’s team got dinged on their performance, which ultimately affected their bonuses. The Kindle team hit their targets and their bonuses and, to add insult to injury, they had a successful release party. The bad news was that Jesse’s team could not attend because they were busy fixing the production outage. You may know this as misalignment of incentives. Global, goal-oriented thinking is the best cure for misalignment of incentives.
It always seems like the simple things turn out to be the hardest. Back in my day, the first things you learned in kindergarten were the rules of respect. You had to raise your hand to ask a question. We had to let girls go first and we could not cut in line anymore. How does the saying go? The most important things I learned were the things I learned in kindergarten.
Over the last 30 years I have advised and trained professionals from over 1,000 Fortune 5K organizations, and I can tell you that a high percentage of them have forgotten the things they learned in kindergarten.
One of the things I love about startups is that respect is the foundation of most of the organization. At Opscode, a former employer of mine, one of the founders is a gentleman named Adam Jacob. Adam is a modern-day Will Rogers in my opinion. I have never seen Adam be disrespectful to another person. In fact, I have personally felt his wrath when one time, in the heat of a battle, I was disrespectful to another employee. Let me tell you, it was very hard to be disrespectful at Opscode when I worked there because Adam set the bar. When I build teams I try to seek out and destroy disrespect. I don’t accept it between peers, from cross-boundary managers or even from CEOs. It is easy to spot disrespect if you are looking for it in the right places, like IRC, Skype channels and emails. There is no room in a DevOps Kaizen culture for people who disrespect others.
Trust is a tough one. There are many situations where tension around trust should be promoted. However, there are good ways and bad ways to accomplish these goals.
When trust issues arise from impatience, they will most likely promote an unhealthy trust tension. Unhealthy trust tension also comes in the form of hoarding or job protection. This is what I call veil trust. These kinds of situations are sometimes hard to deal with. Everyone has seen your typical “Bob”. Bob has been there for ages. He knows where all the bodies are buried and no one knows that better than Bob. Bob is unlikely to let anyone do something on their own. His most famous excuse is “Let me do it, it will take longer to explain it.” Wikis are the “anti-Bobs”. Make documenting everything the Bobs do part of their bonus plan.
Healthy trust tension arises when senior personnel foster a mentoring environment, taking junior employees under their wings, so to speak. They feel proud that they just gave access to one of the new guys and that it looks like he or she can take on more work now.
One of the most common trust issues in a “dev” and “ops” environment is over who has root access to the production servers. There is no one binary answer to this question; however, there are a lot of good debates over the topic. If you give a developer root access on a production server, he or she can bring down the system. Actually, the real question might be: can they do more damage with the access and control they already have over the application? Remember, they’re the ones who write the code for the application and data. I was once told by a pilot that Washington DC’s Reagan National Airport is one of the safest places to land. Why? Because it is the hardest airport to land in. Pilots pay a lot more attention when landing in DC. Give each developer a pager or, even better, give developer managers a pager like Netflix does, and you might start seeing healthy trust relationships.
Jody Mulkey, the CIO of Shopzilla, says “In the war room the problem is the enemy.” Imagine that, a war room where the real enemy is the enemy instead of everyone blaming each other. DevOps Kaizen environments try to minimize discussions that include victims. Heavy emphasis on mentoring enables victims to become leaders. DevOps leaders look for clusters of events that identify potential victim behavior and apply change agents to improve.
The Smell Test
Chris Read (@cread), a DevOps leader, says if you hear someone asking “What does Bill do all day?” then your organization has failed the smell test. Bill is not the issue nor is the person asking the question. The organization is failing the smell test. It might be that Bill is doing great work but no one knows what he is doing. It might be that he is doing a horrible job. Either way there is a communication issue in the organization that needs to be addressed.
Conversations in which one group or person criticizes others are a smell test red flag. Look for bottlenecks between the groups (the human problems). Open up channels of communication throughout the organization that help groups feel or understand other groups’ pains. Cross-functional exercises can be helpful in promoting healthy communication in an organization. There are hundreds of hacks you can apply once you have figured out whether you have failed the smell test.
Slay the Dragon
Early in the life of a startup there is a sense of this “Slay the Dragon” mentality. Typically the dragon is the original goal to be successful. Sometimes the dragons are the competitors. In almost all cases a motivated startup will have dragons that need to be slain.
Back in the early days of Tivoli (now owned by IBM) they used to have “Computer Associates” labeled toilet paper. While I worked at Opscode, Puppet was the enemy whenever I was on a sales call. However, over time the startup begins to grow and the dragons start showing up inside the four walls of the organization. The new director of marketing gets to decide what conferences people can go to. The new head of HR starts getting involved in how many people the organization can hire. The dragons start to become internal organizations as opposed to the real business dragons (i.e., the original goal of the startup).
DevOps Kaizen culture always fosters the “there-be-dragons” attitude wherever possible. Some larger enterprises create siloed teams to promote the “there-be-dragons” attitude, sometimes on newer or greenfield projects.
Ever wonder why a lot of Wall Street financial wizards have physics degrees? I once asked someone why a lot of trading quants have physics degrees. I was told because they come out of school being fearless when it comes to computers. Most of the people who graduate with CS degrees learn a basic respect for computers by understanding their capacity. However, physics majors think they haven’t asked a hard enough question if the computer is not on fire.
Today’s modern distributed computing is all about learning how to fail. In fact, being really good at failure is a DevOps Kaizen virtue. The old days of “It can never ever ever happen” are gone. It will happen. It might happen three days in a row or it might happen once every 5 years. Being prepared to fix things when it does happen is far more important than spending years trying to prevent it from happening. In fact, as IT systems become more complex, “Black Swans” become more likely.
A new DevOps mantra has swapped the old MTTF (mean time to failure) for the new and improved MTTR (mean time to resolution). Remember the movie Apollo 13 when Ed Harris says “Not on my watch!”? Every one of his crew knows exactly what to do and where to go. No exhaustive meetings talking about how or why the failure occurred. It happened; let’s fix it. In fact, the Apollo space program was a poster child for learning how to react to failure. Imagine a legacy enterprise in the same situation. There would have been at least three meetings to decide whether to get the tin foil used for the cooling systems at Target vs Walmart. “Houston, we have a problem” is a model for MTTR.
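To make the two acronyms concrete, here is a minimal sketch of how a team might compute them from its own incident log. The incident timestamps and the function names below are made up for illustration; the only assumption is that each incident records when the failure was detected and when it was resolved.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (detected, resolved) timestamp pairs.
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 45)),
    (datetime(2024, 1, 4, 14, 0), datetime(2024, 1, 4, 14, 15)),
    (datetime(2024, 1, 5, 2, 30), datetime(2024, 1, 5, 3, 30)),
]


def mean_time_to_resolution(incidents):
    """Average time from detecting a failure to resolving it."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((resolved - detected for detected, resolved in incidents),
                timedelta())
    return total / len(incidents)


def mean_time_to_failure(incidents):
    """Average uptime between one resolution and the next detected failure."""
    ordered = sorted(incidents)
    gaps = [nxt[0] - prev[1] for prev, nxt in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two incidents")
    return sum(gaps, timedelta()) / len(gaps)


print("MTTR:", mean_time_to_resolution(incidents))
print("MTTF:", mean_time_to_failure(incidents))
```

A team that optimizes for MTTR watches the first number shrink over time, even if the second one never becomes as large as the old "it can never happen" mindset would like.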
Piercing the IT Veil
The IT veil. You have seen it many times. You can’t do x because of y. You can’t do cloud because of PCI. You can’t do DevOps because of SOX. Sometimes they’re right, but more often the “because” is just being used as a veil.
What in DevOps would violate SOX? A DevOps Kaizen culture doesn’t accept single acronyms as roadblocks. Follow-up questions need to be asked to see if the veil can be pierced. Many times the veil is a cultural roadblock to change. We are not allowed to do that. Why? Who says? Many times, by asking a few additional questions, you will find the veils are false or based on legacy that is no longer needed.
Most culture issues in IT arise from bad communication. Improving communication between teams in an organization is one of the highest goals of a DevOps Kaizen culture. Things like trust, respect, eliminating victims and piercing the IT veil are all great tools to improve communications; however, sometimes they are not quite enough.
One approach is to actually designate or, in some cases, actually create a position for a shaman. A shaman in a DevOps Kaizen culture is someone whose primary responsibility is to bridge communication gaps between teams. In Malcolm Gladwell’s “The Tipping Point” he describes certain people that he calls “connectors”. They span all sorts of common boundaries. A DevOps shaman should be the internal connector: the person who knows what everyone else is working on. It’s the person who knows the truth behind all of the legacy stories and the history of the organization. People are excited to converse with a DevOps shaman because they always leave the conversation with more information than they had before they spoke.
Some companies hand out a copy of Robert Sutton’s book “The No Asshole Rule” to all new employees. The theme of the book is simple: bullies suck. They create a toxic culture and they should be eradicated immediately from any work place.
Years ago I did some consulting for a mail sorting software company. In the old days, optimizing zip codes for mail sorts was a really hard problem to solve. The postal rules could change as often as once a week. There was one lead developer who was considered a god because he was the only one who knew where all the bodies were buried in a multimillion-line COBOL/assembler-based program. One day a junior dev left a sticky note on his desk asking an RTFM question. The “asshole” in this story responded by taking the sticky note and putting his 10-inch bowie knife through it on the seat of the junior dev’s chair.
One of my immediate suggestions after hearing that story was to recommend his firing. The organization explained to me that they couldn’t afford to fire him. I told them “Fine. Here’s my invoice and you guys can do whatever you want.” A few months later they brought me back in. This time they listened to me. They had thought they would go out of business if they fired him. In reality it was a significant bump in the road, but after about 3 or 4 months they were running better than they had since the inception of the business. Here is a list of Robert Sutton’s dirty dozen taken from Wikipedia…
Violation of personal space