Designing a complex piece of software is hard. The requirements are vague, the options are many, and within an hour we are drowning in details: security, data flow, APIs, trade-offs. The hardest part is keeping the high-level vision aligned with the gritty technical details. How do we make sure we are asking the right questions before any code is written?
This article presents a framework for managing the design phase, drawn from practice and from reading how other engineers approach the problem. It is half checklist and half mental reset. It forces us to focus on structure: defining the boundaries, identifying the biggest technical questions, and systematically working through the trade-offs. We'll look at this framework one piece at a time.
A design document is a detailed plan for how a piece of software should be implemented. It lays out all the components and processes needed to deliver the product.
A design document serves several purposes:
There is no fixed template for a design document. The shape of the document depends on the design and the plan, both of which vary from project to project. However, the following sections show up in most design documents:
The section list is the easy part. The work that fills these sections is harder.
A useful way to get unstuck is to write a decision inventory before touching any of the section headers above. A decision inventory is a flat list of every unknown in the design. It captures every place where the answer is not clear, or where multiple defensible answers exist. One row per question.
A row in the inventory might look like this:
That is all the inventory pass requires. The goal is to name the problems, not to solve them yet.
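As a rough illustration, a decision inventory can be kept as a flat list of small records. The sketch below is only one possible shape: the `InventoryRow` class, its field names, and the two sample questions are hypothetical and not taken from any real project.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryRow:
    """One unknown in the design. Class and field names are illustrative."""
    question: str
    # Candidate answers known so far; empty when the answer is simply unclear.
    options: list[str] = field(default_factory=list)

    def kind(self) -> str:
        # Fewer than two options: the answer is not clear yet.
        # Two or more: several defensible answers exist.
        return "unclear" if len(self.options) < 2 else "multiple answers"


# Hypothetical rows, one per question. Nothing is solved at this stage.
inventory = [
    InventoryRow("How do services authenticate to each other?"),
    InventoryRow("Where is uploaded data stored?",
                 ["object storage", "database blobs"]),
]

for row in inventory:
    print(f"{row.question} [{row.kind()}]")
```

Keeping the list flat is deliberate in this sketch: grouping or ranking the rows is already a step toward solving them, which the inventory pass does not require.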
Once the inventory is built, we should do two things with it:
The most important part here is tying the rationale back to the initial constraints. If the rationale could apply to any project, it is not really rationale; it is preference. The constraint anchor is what makes the document useful six months later when someone asks why a decision was made.
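One way to keep the constraint anchor honest is to record, next to each decision, which constraint its rationale traces back to, and then flag any decision that cites none. This is a minimal sketch under assumptions: the `Decision` record, the `unanchored` helper, the constraint IDs, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """A resolved inventory row. Class and field names are illustrative."""
    question: str
    chosen: str
    rationale: str
    # IDs of the initial constraints the rationale traces back to.
    constraint_ids: tuple[str, ...] = ()


def unanchored(decisions, constraints):
    """Return decisions whose rationale cites no known constraint."""
    return [
        d for d in decisions
        if not any(cid in constraints for cid in d.constraint_ids)
    ]


# Hypothetical constraints and decisions, for illustration only.
constraints = {"G1": "Reads must stay fast under peak load"}
decisions = [
    Decision("Do we cache reads?", "yes",
             "Needed to keep reads fast at peak", ("G1",)),
    Decision("Which queue do we use?", "the one we know",
             "The team likes it"),
]

for d in unanchored(decisions, constraints):
    print(f"Preference, not rationale: {d.question}")
```

A check like this cannot judge whether the rationale is good. It only surfaces decisions that never point at a constraint, which is where preference tends to hide.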
Recently, we were redesigning the SSO setup of a product. The product has two front-ends: an internal admin platform for staff users, and an external customer-facing portal. Both are fronted by Keycloak. The realm topology we had in place was not giving us the session isolation we needed between the two.
The decision inventory for this design had several rows. Most of them were minor. The one that ended up structuring the rest was the question of how many Keycloak realms to run. The practical options were:
After mapping out the user populations and what each realm needed to enforce, three realms was the only configuration that fit without awkward exceptions. The rationale traced back to a single line in the goals section: hard session isolation between the admin platform and the portal. Without the decision inventory, we would likely have shipped the two-realm version and discovered its problems in production.
The whole framework is shown below as a flowchart. It serves as a quick reference for the order of steps.

A design document does not make the design correct. It makes the design legible. There will still be projects where we write the whole document, get sign-off, and discover a month into implementation that something foundational was missed. The decision inventory catches most of these gaps, not all of them. However, when something does break, we find out faster, and we know which assumption it was. That is what we are paying for with the hours spent on the document up front.