In my first job as a developer, I was hired to do web work.
That part made sense. I had just come out of a frontend bootcamp, I was programming every day, and I felt genuinely confident in my ability to figure things out. I was active in a Discord server full of people I’d gone through the program with, answering questions, trading snippets, and riding the high of finally being “in it.”
I was also the one who had gotten hired.
The role came through a VP at a financial firm where I was working as a financial advisor, which meant I understood the business problem unusually well for someone so green. I had lived it. When I noticed a clear gap and a need for an internal application, I threw myself at it and, with the confidence of someone who hadn’t yet been humbled by production systems, sold myself as the solution.
After all, how hard could it be?
I picked Go because it felt approachable. I worked through the tutorials on the Go website over the course of a couple of hours and remember thinking, this is pretty straightforward. It felt like Python, which I had used while teaching kids at a summer camp. Handlers, structs, concurrency primitives. Nothing about it felt intimidating. If anything, it felt validating.
I was the solo developer, the system was new, and I was extremely confident. A deadly combination.
The stakeholders were non-technical and enthusiastic. Every successful demo turned into another request. Another feature. Another “while you’re in there.” Feature rollout became a constant stream of small, reasonable asks that accumulated faster than any architectural pause could.
At the same time, I was hyper-focused on security. This was a regulated environment, and nothing would have been more embarrassing or more damaging than a breach. I obsessed over access controls. I wrote careful transactional queries. I made sure nothing leaked. In hindsight, I was so focused on preventing the worst possible failure that I barely thought about how the system would change, recover, or evolve.
And so it grew.
The prototype started at around three thousand lines of code, handling basic business logic and authentication. Before I really noticed what was happening, it had grown to fifty thousand. Still one service. Still one binary. Still “temporary.”
The software worked. It shipped. People used it. It stayed up.
What bothered me was not failure. Reliability was never the issue. It was that I no longer had a clear way to reason about the system. Feature rollout slowed and required short system-wide outages. Small changes carried unexpected weight. I could make things correct, but I was losing confidence in how the system would behave once something went wrong.
I just hadn’t learned how, or when, to ask the right questions.
Eventually I paid my debt. All of the code was revisited, refactored, and carefully split into multiple services. The system is still in use today, and it’s still a critical part of the business. It’s also much easier to reason about, and it’s more resilient to failure.
That experience changed how I think about monoliths, microservices, and decomposition. Not as architectural preferences to be defended, but as tools for deciding how much failure a system can absorb before it starts to surprise you.
Decomposition is not a style choice
When a system becomes hard to reason about, the instinct to decompose it is almost automatic.
Breaking things apart feels like progress. It promises clarity, ownership, and a way to regain control. In the absence of a clear failure model, decomposition often becomes a stand-in for understanding.
That instinct is understandable. It is also dangerous.
Decomposition does not reduce coupling by default. It reduces proximity. Whether it reduces risk depends entirely on what kind of failure the boundary is meant to contain.
If there is no failure story, there is no boundary. Just more moving parts.
Decomposition as failure analysis
The way I approach decomposition now starts from a single premise:
Decomposition is a tool for controlling failure, not for organizing code.
Everything else follows from that.
Rather than asking what should be a service, I ask the following:
1. What failure am I trying to contain?
This is always the first question.
If a component misbehaves, what breaks? Is the failure about correctness, availability, or both? Who notices first?
If I cannot name a failure mode that should be isolated, decomposition is speculative. A boundary without a failure story is just a seam.
2. If component B fails, what must component A do differently?
This question exposes most false independence.
Does A block? Retry? Degrade? Fail fast? Proceed optimistically? Produce incorrect state?
If A cannot proceed safely without B, separating them does not reduce coupling. It relocates it. Often into retries, timeouts, and assumptions that are harder to reason about than a function call ever was.
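To make that concrete, here is a minimal Go sketch of where the coupling goes. The pricing service, the timeout, and the cached fallback are all hypothetical, but the shape is the point: once B lives across a boundary, A has to pick a policy for B’s absence.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrUnavailable stands in for any timeout or transport failure
// from the downstream component.
var ErrUnavailable = errors.New("pricing service unavailable")

// fetchQuote simulates a call across the new boundary. In a real
// system this would be an HTTP or gRPC call with its own failure modes.
func fetchQuote(ctx context.Context, symbol string) (float64, error) {
	select {
	case <-time.After(50 * time.Millisecond): // pretend network latency
		return 101.25, nil
	case <-ctx.Done():
		return 0, ErrUnavailable
	}
}

// quoteOrDegrade is where the relocated coupling shows up: the caller
// must choose what to do when the dependency is gone. Here it degrades
// to a stale cached value instead of blocking or failing the request.
func quoteOrDegrade(ctx context.Context, symbol string, cached float64) float64 {
	ctx, cancel := context.WithTimeout(ctx, 20*time.Millisecond)
	defer cancel()

	price, err := fetchQuote(ctx, symbol)
	if err != nil {
		fmt.Printf("degrading %s: %v, serving cached price\n", symbol, err)
		return cached
	}
	return price
}

func main() {
	fmt.Println(quoteOrDegrade(context.Background(), "ABC", 99.80))
}
```

None of those choices existed when this was a function call inside one process. The boundary is what brings them into existence, and pretending otherwise only hides them in defaults.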
3. Where does authoritative state live?
This is where data integrity shows up.
Which component is allowed to say that something happened? Is that decision durable or ephemeral? Does another component infer state rather than observe it directly?
If multiple components need to agree on the order of events, not just their existence, the boundary deserves skepticism. Distributed agreement is not free, and pretending otherwise usually shows up later as correctness bugs or compensating logic that never quite works.
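A small Go sketch of what owning authoritative state can look like. The ledger and the transfer IDs are invented for illustration, and a real system would back this with a durable store, but the principle is that exactly one component gets to say a transfer settled, and everyone else observes that decision instead of inferring it from side effects.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Ledger is the only component allowed to say "this transfer settled".
// In production this would be backed by durable storage, not a map.
type Ledger struct {
	mu      sync.Mutex
	settled map[string]time.Time
}

func NewLedger() *Ledger {
	return &Ledger{settled: make(map[string]time.Time)}
}

// Settle is the single write path. The decision is made here, once.
func (l *Ledger) Settle(transferID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.settled[transferID] = time.Now()
}

// Settled lets other components observe the decision rather than
// infer it from notifications, emails, or downstream side effects.
func (l *Ledger) Settled(transferID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	_, ok := l.settled[transferID]
	return ok
}

func main() {
	ledger := NewLedger()
	ledger.Settle("tx-42")

	// A reporting component asks the authority instead of guessing.
	fmt.Println("tx-42 settled:", ledger.Settled("tx-42"))
	fmt.Println("tx-99 settled:", ledger.Settled("tx-99"))
}
```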
4. How do lifecycles interact across this boundary?
This is where orchestration and delivery meet architecture.
Can these components be restarted independently? Does one hold in-memory intent the other depends on? During shutdown, does one need the other to exit cleanly?
If graceful shutdown requires coordination across services, the boundary has added operational risk rather than contained it.
A useful way to sanity-check this is to imagine an urgent hotfix.
Not the tidy kind with a quiet off-hours window and a perfect cutover plan. The kind where you need to ship a change while the system is actively serving traffic, because waiting is worse. In practice, this is the same lifecycle problem from a different angle. Instances are replaced while work is in flight, and correctness depends on what happens during the handoff.
If the only safe way to deploy is “stop the world for a minute,” that is rarely a scheduling problem. It is usually a boundary problem.
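As a rough sketch of what restarting without ceremony looks like in Go, this is the shape I reach for. The port and the drain timeout are arbitrary; the point is that a rolling deploy can send SIGTERM and the process drains its in-flight requests on its own, with no stop-the-world window and no coordination with a neighbor.

```go
package main

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	// Listen for the termination signal a rolling deploy will send.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	srv := &http.Server{
		Addr: ":8080",
		Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Write([]byte("ok\n"))
		}),
	}

	go func() {
		if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
			log.Fatalf("serve: %v", err)
		}
	}()

	<-ctx.Done() // deploy in progress: stop accepting new work, start draining

	// Give in-flight requests a bounded window to finish, then exit.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		log.Printf("shutdown incomplete: %v", err)
	}
}
```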
5. What does this boundary do to debugging and recovery?
How many places do you have to look when something goes wrong? How many partial truths exist at once? Does the boundary make causality clearer, or does it fragment it?
Boundaries multiply context. Sometimes that is worth it. Sometimes it is not.
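One cheap way to keep the multiplied context navigable, sketched in Go: carry a request ID across the boundary so the partial truths at least share a key. The header name here is a convention I am assuming rather than a standard, and real tracing goes much further, but even this much changes what debugging feels like.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"log"
	"net/http"
)

type ctxKey struct{}

// withRequestID accepts an ID from the caller or mints one, so log
// lines on both sides of the boundary can be joined on the same key.
func withRequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-Request-ID")
		if id == "" {
			buf := make([]byte, 8)
			if _, err := rand.Read(buf); err != nil {
				log.Printf("request id generation failed: %v", err)
			}
			id = hex.EncodeToString(buf)
		}
		ctx := context.WithValue(r.Context(), ctxKey{}, id)
		w.Header().Set("X-Request-ID", id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

func handle(w http.ResponseWriter, r *http.Request) {
	id, _ := r.Context().Value(ctxKey{}).(string)
	log.Printf("request_id=%s processing order", id)
	w.Write([]byte("ok\n"))
}

func main() {
	http.Handle("/", withRequestID(http.HandlerFunc(handle)))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```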
6. How reversible is this decision?
This is the question that slows everything down.
Are we committing to contracts, schemas, or protocols that will be expensive to undo? Is this an experiment or a long-term commitment?
Decomposition is easy to do early and expensive to reverse later. That asymmetry should be respected.
Monoliths, services, and false binaries
None of this is an argument for or against monoliths.
A large binary can be perfectly reasonable when failure modes are shared, latency matters, or state needs to move together. A distributed system can be the right choice when failures need to be isolated and orthogonality is real.
What matters is not form. It is intent.
A modular monolith with clear failure boundaries can outperform a distributed system that pretends independence it does not have. A set of small services can be more fragile than a single process if they are tightly coupled through ephemeral state and retries.
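For what it’s worth, the modular-monolith version of a failure boundary usually starts as something this small in Go. The approvals domain and the names are made up, but the idea is that the call site depends on an interface, so the line is real today and could become a network boundary later without rewriting the callers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Approvals is the boundary. Today it is satisfied in-process; later it
// could be satisfied by a client for a separate service without the
// call sites changing.
type Approvals interface {
	Approve(ctx context.Context, requestID string) error
}

// inProcessApprovals is the current, in-binary implementation.
type inProcessApprovals struct{}

func (inProcessApprovals) Approve(ctx context.Context, requestID string) error {
	if requestID == "" {
		return errors.New("missing request id")
	}
	return nil // pretend the business rules run here
}

// submit is a call site that only knows about the boundary,
// not about what sits behind it.
func submit(ctx context.Context, approvals Approvals, requestID string) error {
	if err := approvals.Approve(ctx, requestID); err != nil {
		return fmt.Errorf("approval failed: %w", err)
	}
	fmt.Println("approved", requestID)
	return nil
}

func main() {
	_ = submit(context.Background(), inProcessApprovals{}, "req-7")
}
```

Whether that interface ever becomes a service is a decision you can defer until there is a failure story that demands it.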
Decomposition is a process, not a design pattern.
Closing perspective
When I built that first monolith in a coffee shop, I was solving the problem in front of me with the tools I had. The mistakes were not about carelessness or ambition. They were about scope outpacing understanding.
What changed over time was not my preference for one architecture over another. It was learning to take a moment before committing to form, and to ask different questions while the system was still small enough to answer them simply.
Today, I try to understand what actually has to hold when something breaks, what can be restarted without ceremony, and what needs to move together to avoid leaving the system in an in-between state. Once that picture is clear, boundaries tend to suggest themselves.
This way of thinking is not complicated, and it is not reserved for special cases. It just takes the discipline to ask these questions before drawing lines, and honesty about what failure would really look like if it showed up tomorrow.
I still think about that first system almost every time I begin a new project. Not with embarrassment, but as a reminder of how much easier the work becomes once you learn where to be deliberate.