George Lutz

Our goal is to get products to our customers faster. This requires deploying small, modular changes frequently — deploying on a technical schedule, not just on a business-driven schedule.

The focus here is not that we want to do this, but how to do this. It requires two things: a modular system architecture, and iterative project and deployment planning — releasing on a schedule defined by technical needs, not by business needs alone.

In a modular system architecture, there may be many small components, each with a single responsibility: for example, a single-responsibility (micro)service, a single-responsibility database, single-responsibility queues, streams, and caches, a single-responsibility worker process, and so on. Each must have well-defined inputs and outputs. Each component must be independently tested and monitored. The purpose of this article is not to dive into modular system design, though. Let’s just assume we have that, which is good news, because that’s the hardest part.

Figure 1: A modular system to which three new components are added.

We have found, though, that the right architecture alone is insufficient for achieving the target result. Iterative planning for deployments on a technical schedule is also needed. This may include deployments of updates to existing infrastructure, of new components, or of entirely new products.

Use the diagram in Figure 1 above as an example. Suppose it’s an entirely new approach to an existing service, which must continue to work as always while we bring this change online. Here’s how it might be deployed on a technical schedule:

  1. Deploy the Message Queue. Then enqueue testing messages to confirm it’s working.

  2. Deploy the enqueuing logic to the existing service. Now the queue is actually being populated. This can be monitored.

  3. Deploy the Background Worker Process. This will dequeue messages and process them, but it may do nothing else with them. This can be monitored, and capacity can be right-sized based on actual production traffic, if needed.

  4. Deploy the Data Store. The worker process can now insert into the data store.
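Steps 2 and 3 above can be sketched in a few lines. This is only an illustration under stated assumptions: an in-process `queue.Queue` stands in for the real Message Queue, and `handle_request`, `enqueue_best_effort`, `drain_once`, and the `metrics` counters are hypothetical names, not part of any particular system. What it tries to show is that the existing service returns exactly what it returned before, while the new path is populated and counted on the side.

```python
import queue

# Stand-in for the real Message Queue (assumption: any broker could fill this role).
message_queue = queue.Queue(maxsize=1000)

# Hypothetical counters that a monitoring system could scrape.
metrics = {"enqueued": 0, "enqueue_failed": 0, "processed": 0}


def enqueue_best_effort(message):
    """Step 2: populate the queue without ever failing the caller."""
    try:
        message_queue.put_nowait(message)
        metrics["enqueued"] += 1
    except queue.Full:
        # The new path must not affect live customer results, so only count it.
        metrics["enqueue_failed"] += 1


def handle_request(request_id):
    """Existing service: same response as before, plus a side-channel enqueue."""
    response = f"result-for-{request_id}"  # placeholder for the unchanged behavior
    enqueue_best_effort({"request_id": request_id})
    return response


def drain_once():
    """Step 3: the worker dequeues and counts messages, but does nothing else yet."""
    while True:
        try:
            message_queue.get_nowait()
        except queue.Empty:
            return
        metrics["processed"] += 1
```

Comparing `metrics["enqueued"]` with `metrics["processed"]` over time is one simple way to judge whether the worker keeps up with production traffic before the Data Store is even deployed.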

Now each piece is deployed. Yet users experience no change so far. This can all be done on different days, or different weeks, and perhaps even by different people. None of it should impact live customer results. At this point, we can monitor the full system for performance. We can know that each component is deployed and working in isolation. That’s good. Assuming it all worked, let’s continue:

  1. Deploy an update to the Existing Service to actually pull content from the new Data Store.
  2. Validate, behind optional feature flags, that the data coming back from the new Data Store is healthy. This may be a simple internal A/B test.
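One way to picture this validation step is a read path that consults the new Data Store only for internal users and falls back to the existing result whenever the two disagree. The sketch below is an assumption-heavy illustration: the two dictionaries stand in for the existing content source and the new Data Store, and `INTERNAL_USERS`, `use_new_store`, `get_content`, and `mismatches` are hypothetical names.

```python
# Stand-ins for the real systems (assumptions for illustration only).
legacy_content = {"a": "alpha", "b": "beta"}  # what the existing service serves today
new_data_store = {"a": "alpha"}               # what the worker has inserted so far

# Hypothetical internal-only flag: the new read path is limited to these users.
INTERNAL_USERS = {"employee-1"}

# Keys where the new store disagreed with the existing result, for monitoring.
mismatches = []


def use_new_store(user_id):
    """Feature flag check: only internal users exercise the new path."""
    return user_id in INTERNAL_USERS


def get_content(user_id, key):
    """Serve from the new store only when flagged in and the data checks out."""
    legacy_value = legacy_content[key]
    if not use_new_store(user_id):
        return legacy_value
    new_value = new_data_store.get(key)
    if new_value != legacy_value:
        # Record the problem and keep serving the known-good result.
        mismatches.append(key)
        return legacy_value
    return new_value
```

In this sketch, external customers never touch the new store, and an empty `mismatches` list over a period of internal use is the signal that the data is healthy.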

Now, all of the components are deployed, running, and tested together in production. Let’s finish it:

  1. Simply enable the feature flags for partial traffic or for all traffic. In some cases, swapping rules on a load balancer may be the switch. In other cases, a simple application configuration setting can be used. Or perhaps the switch is going live with a new front-end app. (This may eventually bring us to more formal external A/B testing scenarios, which are beyond the scope here.)
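Enabling a flag for partial traffic is often done by placing each user in a stable bucket and comparing that bucket to a rollout percentage. The following is a minimal sketch of that idea, not a description of any particular feature-flag product; `bucket`, `is_enabled`, and the flag name used in the usage note are hypothetical.

```python
import hashlib


def bucket(user_id, flag_name):
    """Map a user to a stable bucket in the range 0..99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(user_id, flag_name, rollout_percent):
    """Return True if this user falls inside the current rollout percentage."""
    return bucket(user_id, flag_name) < rollout_percent
```

Because the bucket depends only on the user and the flag name, raising `rollout_percent` from 10 to 50 to 100 (for a flag named, say, `"new-data-store"`) only ever adds users, and setting it back to 0 acts as the rollback.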

All of this was done on a technical and iterative schedule. Not on a business schedule.

The business schedule is dangerous — it only tells us that on some important date, the changes must be live in production. The business schedule has no finesse and no safeguards — no interest in whether the entire deployment happens on that day or not.

The technical schedule must account for every piece of the system, not just the system as a whole. The technical deployment schedule must be an iterative one in which the parts move into production in isolation. This gives confidence in quality and monitoring, and shows that progress is continuous. It also keeps the complexity of each deployment low, which makes rollbacks trivial.

We must avoid deploying/monitoring/testing nothing until we can deploy/monitor/test everything. This is partially why modular design exists to begin with. Yet I have seen microservice architectures deployed as monoliths. That approach gets the hard part right and the easy part wrong.

Here’s a common red flag question: when is all of this going to production? The question should be: what is the plan for each piece to deploy into production? Then when the business deadline arrives, there is nothing to do but to flip a switch and go live.
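To make "the plan for each piece" concrete, the plan can be written down as data: one entry per component, each with its own verification. The sketch below reuses the components from the example above; the dictionary layout and the `next_step` and `ready_to_go_live` helpers are hypothetical, and a real plan would likely live in a tracker or pipeline rather than in a script.

```python
# One entry per piece, in deployment order, each with its own check.
plan = [
    {"component": "Message Queue",
     "verify": "enqueue test messages", "deployed": False},
    {"component": "Enqueuing logic in the existing service",
     "verify": "monitor that the queue is being populated", "deployed": False},
    {"component": "Background Worker Process",
     "verify": "monitor dequeuing and right-size capacity", "deployed": False},
    {"component": "Data Store",
     "verify": "confirm the worker can insert", "deployed": False},
    {"component": "Existing Service read path",
     "verify": "validate data behind feature flags", "deployed": False},
]


def next_step(plan):
    """Return the first piece that is not yet in production, or None."""
    for step in plan:
        if not step["deployed"]:
            return step
    return None


def ready_to_go_live(plan):
    """True once every piece is deployed; only the switch is left to flip."""
    return all(step["deployed"] for step in plan)
```

Asking for `next_step(plan)` at any point answers the better question directly, and `ready_to_go_live(plan)` should already be true well before the business deadline arrives.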

We must release on a technical schedule, not on a business-only schedule.