Build for operability

In a previous post, I mentioned something about a mullet model of production: operate a service with reliability and simplicity. I intend to expand about of the terms I’ve used there.

In a software-as-a service (SaaS), production refers to the ensemble of software used to deliver a service (e.g. an eCommerce site). If you are a web developer, this includes your code that you’ve written using some language, the database where your data is stored, and the other parts needed to run your service (e.g. hosting infrastructure, instrumentation, etc.).

Consider the Primary Function of your service. Any feature to be build must support that Primary Function. The job of an eCommerce SaaS is to facilitate orders. Customers must be able to visit the site add products to their cart, and collect payment. It is not enough to write the features: there has to be supporting software for these features to deliver its job well.

Operability refers to the degree to which a service can be supported as it performs its Primary Function. Operability varies a lot depending on the type of service. A few of my guide questions are: (1) Can you understand what the code does at 2am while running in production? (2) How long would it take to recall how a feature works after not making any changes for several months? (3) How difficult would it be to extend an existing feature to support a new requirement? These questions impose a lot on the software used to run the services and also its supporting tools. Having a simple, understandable codebase with sufficient test coverage helps a lot. Having a good suite of supporting tools (e.g. alert tracking, instrumentation, etc.) also helps.

Tools and techniques are not enough. Without a team skilled in building and operating what they’ve built, operability would be very difficult to achieve. The team ties everything together. There will be some specialist roles within a team, but everyone in the team has a good mental model of how production works.

Recommended resources

  1. Above the Line, Below the Line. Building reliable services requires a working understanding of the continuously shifting dependencies.
  2. The Soviet Union’s Philosophy of Weapons Design (Chapter 87 of Digest). Build tools with simplicity reliability in mind (e.g. AK-47).
  3. Charity Majors’ Twitter account.

Leave a Comment

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s