Some observations on DevOps?

At Zynga, I had the extraordinary good fortune to lead an exceptional team that we might later have called DevOps.

We lacked the vocabulary that exists today, and we certainly lacked the insight that many people have since developed.

Instead of trying to explain what Zynga did, I thought it might be interesting to discuss what drove us to create a function that a lot of people call DevOps and we called systems engineering.

First, some observations.

Any web scale application has two parts:

Application code
Platform code

And the whole discussion of DevOps hinges on the definition of Platform.

In a traditional Web App, circa let’s say 2003, the platform was a LAMP stack. Deploying the platform was easy, debugging the platform was easy.

In 2008, as web apps routinely started to scale to millions of DAU and exceeded the capacity of a single node, platform innovation became de rigueur in the industry.

We had to innovate at every layer: from how the database is built, which resulted in a proliferation of NoSQL and SQL databases, to messaging, languages, monitoring, and development tools.

Layered on top of this innovation was the emergence of programmable infrastructure. In this day and age of AWS, the way the world was in 2004 may feel very alien. You didn’t have large pools of servers connected by very large, flexible networks that you could just transform with software packages into whatever kind of database, storage, or compute node you wanted. You had very specific boxes that had to be cabled, racked, and stacked. There was very little dynamic, flexible infrastructure outside of perhaps the Googleplex.

Because the platform was undergoing such a radical burst of change, and because of the programmable nature of the infrastructure itself, the basic relationship between developers and operations changed. Whereas before operations created a fairly boilerplate platform infrastructure that developers programmed to, now platform developers were creating a platform that application developers in turn programmed to.

At Zynga the platform and the application were often delivered at the same time. As we needed new features, we delivered new platform capabilities to our game developers.

In this environment, the platform developers could not very well just hand off the operation of new platform components to an operations team that had no clue how to manage them.

As a result, developers had to assume more responsibility for operations than they had in the past. Because the developers tended to have more sophisticated programming skills than the operations teams, operations tended to involve significantly more automation and programming than before. And as a result, a new kind of tool began to emerge.

Systems Engineering, at Zynga, was the peculiar merging of platform operations and platform development, or what we might call DevOps. The core goal of the systems engineering group was to minimize the impact of the changing platform on application developers.

What I observed is that as platform components became standardized, the need for those components to be managed by a pure DevOps team declined, and those components transitioned to a more traditional operations team. In fact, most DevOps teams worked really hard to get out of the DevOps business by enabling operations teams to manage their infrastructure.

We did blur the lines between Operations and Development, but that was out of necessity, not because it was efficient. In fact, we discovered that having highly trained and expensive engineers doing operations was a waste of time and money. Similarly, having weak programmers write complex software systems rarely got the result you wanted. The specialization and the separation of roles was a good thing and the blurring of lines was a good thing.

For me, DevOps is part of a complete engineering organization at scale: application developers, platform developers, DevOps, and operations are all required to win.

Comments

Anthony Hobbs says

November 3, 2014 at 5:45 am

Hey Kostadis, that was truly an extraordinary learning experience for us. It felt good to finally have a label for what we did at Zynga although I keep hearing different definitions of DevOps (Ask any 10 guys and you get 20 definitions). I don’t think it’s possible to come to a consensus now about what DevOps is (or isn’t) so I’ve started to avoid the term and go back to systems engineering or architecture.

lrajlich says

November 3, 2014 at 6:55 pm

I like the platform / application distinction. One thing is that team sizes are now smaller – with the emergence of cloud computing and modern web programming frameworks, you now have teams of a half dozen or so engineers building and running a web service – and in this sort of team scenario having a dedicated Sysadmin is extremely expensive, so a developer who’s mostly writing features and handling what’s traditionally a sysadmin’s job with part of their time is incredibly valuable. With SEG, I thought the impetus was to avoid repeats of FrontierVille by creating a dedicated team that specialized in launching and running a game service – since that skillset and those priorities are fundamentally different from producing a game and game content, it’s reasonable to have them as a different team.
