September 28, 2016

CASE STUDY: Etsy, Sprouter and Conway’s Law

By Gene Kim

The DevOps Handbook is now available. This is one of over 40 case studies you will find in the book.

How we organize our teams affects how we perform our work.

Dr. Melvin Conway illustrated this with a famous experiment he performed in 1968 with a contract research organization whose eight people were commissioned to produce a COBOL compiler and an ALGOL compiler.

He observed, “After some initial estimates of difficulty and time, five people were assigned to the COBOL job and three to the ALGOL job. The resulting COBOL compiler ran in five phases, the ALGOL compiler ran in three.”

These observations led to what is now known as Conway’s Law, which states that:

“Organizations which design systems…are constrained to produce designs which are copies of the communication structures of these organizations…. The larger an organization is, the less flexibility it has and the more pronounced the phenomenon.”

Eric S. Raymond, author of the book The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary, crafted a simplified (and now, more famous) version of Conway’s Law in his Jargon File:

“The organization of the software and the organization of the software team will be congruent; commonly stated as ‘if you have four groups working on a compiler, you’ll get a 4-pass compiler.’”

In other words, how we organize our teams has a powerful effect on the software we produce, as well as our resulting architectural and production outcomes.

To get a fast flow of work from Development into Operations, with high quality and great customer outcomes, we must organize our teams and our work so that Conway’s Law works to our advantage. If we organize poorly, Conway’s Law will prevent teams from working safely and independently; instead, they will be tightly coupled, all waiting on each other for work to be done, with even small changes creating potentially global, catastrophic consequences.

An example of how Conway’s Law can either impede or reinforce our goals can be seen in a technology that was developed at Etsy called Sprouter.

Etsy, whose DevOps journey began in 2009, is one of the most admired DevOps organizations, with 2014 revenue of nearly $200 million and a successful IPO in 2015.

Originally developed in 2007, Sprouter connected people, processes, and technology in ways that created many undesired outcomes.

Sprouter, shorthand for “stored procedure router,” was originally designed to help make life easier for the developers and database teams. As Ross Snyder, a senior engineer at Etsy, said during his presentation at Surge 2011:

“Sprouter was designed to allow the Dev teams to write PHP code in the application, the DBAs to write SQL inside Postgres, with Sprouter helping them meet in the middle.”

Sprouter resided between their front-end PHP application and the Postgres database, centralizing access to the database and hiding the database implementation from the application layer.

The problem was that adding any changes to business logic resulted in significant friction between developers and the database teams.

As Snyder observed:

“For nearly any new site functionality, Sprouter required that the DBAs write a new stored procedure. As a result, every time developers wanted to add new functionality, they would need something from the DBAs, which often required them to wade through a ton of bureaucracy.”

In other words, developers creating new functionality had a dependency on the DBA team, which needed to be prioritized, communicated, and coordinated, resulting in work sitting in queues, meetings, longer lead times, and so forth.

This is because Sprouter created a tight coupling between the development and database teams, preventing developers from independently developing, testing, and deploying their code into production.

Also, the database stored procedures were tightly coupled to Sprouter—any time a stored procedure was changed, it required changes to Sprouter too.

The result was that Sprouter became an ever-larger single point of failure. Snyder explained that everything was so tightly coupled, and as a result required such a high level of synchronization, that almost every deployment caused a mini-outage.

Both the problems associated with Sprouter and their eventual solution can be explained by Conway’s Law.

Etsy initially had two teams, the developers and the DBAs, each responsible for one of the service’s two layers: the application logic layer and the stored procedure layer. Two teams working on two layers, as Conway’s Law predicts. Sprouter was intended to make life easier for both teams, but it didn’t work as expected. When business rules changed, instead of changing only two layers, they now needed to change three (the application, the stored procedures, and now Sprouter).
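To make that three-layer coupling concrete, here is a minimal Python sketch. It is not Etsy’s code (Sprouter sat between a PHP application and Postgres). Plain dictionaries stand in for the database and for the router’s knowledge of it, and every name in it (`SHOPS`, `STORED_PROCEDURES`, `ROUTER_KNOWN_PROCEDURES`, `route_call`, and the two application functions) is hypothetical.

```python
# Hypothetical illustration only: dictionaries stand in for Postgres stored
# procedures and for the router's list of procedures it can forward to.

SHOPS = {1: "Handmade Mugs", 2: "Vintage Maps"}

# Layer 1 (owned by the DBAs in this sketch): the stored procedures.
STORED_PROCEDURES = {
    "get_shop_name": lambda shop_id: SHOPS[shop_id],
}

# Layer 2 (the router): it must be told about every procedure it forwards to.
ROUTER_KNOWN_PROCEDURES = {"get_shop_name"}


def route_call(procedure_name, *args):
    """Forward an application call to a stored procedure the router knows."""
    if procedure_name not in ROUTER_KNOWN_PROCEDURES:
        raise LookupError(f"router does not know procedure {procedure_name!r}")
    return STORED_PROCEDURES[procedure_name](*args)


# Layer 3 (owned by the developers): application code.
def shop_page_title(shop_id):
    """Existing feature: works because layers 1 and 2 already support it."""
    return f"Shop: {route_call('get_shop_name', shop_id)}"


def shop_count_banner():
    """New feature: raises LookupError until layers 1 and 2 are both changed."""
    return f"{route_call('count_shops')} shops online"
```

In this toy, `shop_count_banner()` only starts working once someone adds a `count_shops` entry to `STORED_PROCEDURES` and registers it in `ROUTER_KNOWN_PROCEDURES`, so a one-line application change waits on two other layers.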

The resulting challenges of coordinating and prioritizing work across three teams significantly increased lead times and caused reliability problems. And then, in the spring of 2009, as part of what Snyder called “the great Etsy cultural transformation,” Chad Dickerson joined as their new CTO.

Dickerson set many things in motion, including a massive investment in site stability, having developers perform their own deployments into production, and a two-year journey to eliminate Sprouter.

To do this, the team decided to move all the business logic from the database layer into the application layer, removing the need for Sprouter. They created a small team that wrote a PHP Object Relational Mapping (ORM) layer, enabling the front-end developers to make calls directly to the database and reducing the number of teams required to change business logic from three teams down to one team.

As Snyder described:

“We started using the ORM for any new areas of the site and migrated small parts of our site from Sprouter to the ORM over time. It took us two years to migrate the entire site off of Sprouter. And even though we all grumbled about Sprouter the entire time, it remained in production throughout.”

By eliminating Sprouter, they also eliminated the problems associated with multiple teams needing to coordinate for business logic changes, decreased the number of handoffs, and significantly increased the speed and success of production deployments, improving site stability.

Furthermore, because small teams could independently develop and deploy their code without requiring another team to make changes in other areas of the system, developer productivity increased.

Among other things, an ORM abstracts a database, enabling developers to query and manipulate data as if it were merely another object in the programming language. Popular ORMs include Hibernate for Java, SQLAlchemy for Python, and ActiveRecord for Ruby on Rails.
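As a rough illustration of that idea, the sketch below hand-rolls a tiny mapper in Python over the standard library’s `sqlite3` module. It is a toy under stated assumptions, not Etsy’s PHP ORM and not the API of Hibernate, SQLAlchemy, or ActiveRecord. The `Shop` class, the `shops` table, and the `ShopRepository` methods are hypothetical names chosen for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shop:
    """A plain object that mirrors one row of the hypothetical shops table."""
    name: str
    id: Optional[int] = None


class ShopRepository:
    """Toy mapper: translates between Shop objects and SQL rows."""

    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection
        # Create the backing table if it does not exist yet.
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS shops "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, shop: Shop) -> Shop:
        """Insert the object as a row and record the generated primary key."""
        cursor = self.connection.execute(
            "INSERT INTO shops (name) VALUES (?)", (shop.name,)
        )
        shop.id = cursor.lastrowid
        return shop

    def get(self, shop_id: int) -> Optional[Shop]:
        """Load a row by primary key and return it as a Shop, or None."""
        row = self.connection.execute(
            "SELECT id, name FROM shops WHERE id = ?", (shop_id,)
        ).fetchone()
        return Shop(id=row[0], name=row[1]) if row else None


if __name__ == "__main__":
    repo = ShopRepository(sqlite3.connect(":memory:"))
    saved = repo.add(Shop(name="Handmade Mugs"))
    print(repo.get(saved.id))
```

Application code here deals only with `Shop` objects and the repository, so a change to business logic can stay inside the application layer, which is the property the Etsy team was after.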

Sprouter was finally removed from production and Etsy’s version control repositories in early 2011.

As Snyder and Etsy experienced, how we design our organization dictates how work is performed, and, therefore, the outcomes we achieve.

- About The Authors
Gene Kim

Gene Kim is a Wall Street Journal bestselling author, researcher, and multiple award-winning CTO. He has been studying high-performing technology organizations since 1999 and was the founder and CTO of Tripwire for 13 years. He is the author of six books, The Unicorn Project (2019), and co-author of the Shingo Publication Award winning Accelerate (2018), The DevOps Handbook (2016), and The Phoenix Project (2013). Since 2014, he has been the founder and organizer of DevOps Enterprise Summit, studying the technology transformations of large, complex organizations.
