The Case for Simplicity in Web Architecture

Architectural complexity has a carrying cost that compounds over time. The engineering teams who build systems that last make different tradeoffs than the ones building for demos and conference talks.

There’s a career incentive problem in software engineering: complex architecture is more impressive in job interviews than simple architecture. A resume that lists Kafka, Kubernetes, gRPC, event sourcing, CQRS, and distributed tracing sounds more senior than one describing a well-built Rails monolith that serves millions of requests. The incentive to add complexity exists independent of whether the complexity is warranted.

After 20 years of building and operating systems at scale — and watching many architectures get rebuilt from scratch after their initial complexity became unmanageable — I have a strong prior toward simplicity that has only gotten stronger with experience.

Complexity Has Carrying Costs

Every architectural component has an ongoing carrying cost. A distributed message queue requires operational management, monitoring, capacity planning, and oncall procedures. A microservices topology requires service discovery, network policies, distributed tracing, and coordination of deploys across multiple services. An event-sourcing system requires careful management of event schemas and replay logic.

These costs are real and they don’t diminish over time — they grow as the system evolves and as engineers who understand the original design decisions leave the team. The architecture review that chose Kafka for a use case that would have been fine with a database table requires every future engineer who works on that system to understand Kafka, debug Kafka issues, and maintain Kafka operations.

The question to ask before adding any architectural component: what problem does this solve that a simpler approach doesn’t? And: what’s the ongoing carrying cost, and does the benefit justify it?

Monolith vs. Microservices: The Real Question

The monolith vs. microservices debate has generated more conference talks than almost any other software engineering topic of the past decade. The actual question is much simpler: what are your scale and team structure requirements?

Microservices make sense when different parts of the system need to scale independently and when the team structure is large enough to have separate ownership of independent services. A company with 200 engineers where different product areas have different traffic patterns and different deployment cadences genuinely benefits from service decomposition.

A company with 5-15 engineers building a product that serves tens of thousands of users does not. The monolith is significantly easier to develop (no network calls between components), significantly easier to debug (one log stream, no distributed tracing required), significantly easier to deploy, and significantly easier for new engineers to understand.

The service decomposition that makes sense for small teams is vertical rather than horizontal: separate the web application from the background job processing. Extract services at the points where they need to scale differently or deploy independently. Don’t decompose further than that until you have evidence that you need to.
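That web/worker boundary can be sketched in a few lines. The snippet below is a minimal illustration, with Python's in-process queue.Queue and a thread standing in for a real job backend (in practice a database table or a broker), and a hypothetical welcome-email job as the workload:

```python
import queue
import threading

# The web process enqueues work and returns immediately; a separate
# worker consumes it. Here a thread stands in for the worker process
# and an in-memory queue stands in for a durable job store.
jobs: queue.Queue = queue.Queue()
sent = []

def worker() -> None:
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut the worker down
            break
        # The slow part (sending the email) happens outside the
        # request/response cycle.
        sent.append(f"welcome email to {job['email']}")

t = threading.Thread(target=worker)
t.start()

# The "web request": enqueue and return without waiting on delivery.
jobs.put({"email": "new-user@example.com"})
jobs.put(None)
t.join()
```

The seam is the queue itself: as long as the payload format is stable, the web side and the worker side can later scale and deploy independently without further decomposition.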

At Figment, we ran infrastructure management tooling as a relatively compact service despite operating at large scale — the management plane didn’t need microservice decomposition because the operational surface was clear and the team was small enough to maintain coherent ownership.

The Database Is Usually Not the Bottleneck

The most common architectural decision made on incorrect assumptions: decomposing to microservices or moving to a distributed database to “avoid database bottlenecks,” before determining whether the database is actually a bottleneck.

PostgreSQL on appropriate hardware handles a remarkable amount of load. Millions of transactions per day, tens of thousands of connections (with PgBouncer), complex analytical queries on billions of rows. The point where PostgreSQL is actually the bottleneck in a typical web application is much higher than teams expect.

Before reaching for horizontal database scaling or NoSQL alternatives, the correct sequence is:

  1. Instrument queries. Which queries are slow? What does EXPLAIN ANALYZE show?
  2. Add appropriate indexes. Missing indexes are the cause of most “database performance” problems.
  3. Add connection pooling (PgBouncer or similar). Each PostgreSQL connection is a separate server process, and that per-connection memory and startup overhead caps how many concurrent clients the database can sustain without pooling.
  4. Consider read replicas if read traffic specifically is the bottleneck.
  5. Cache frequently-read data that changes infrequently. Redis for hot paths.
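Steps 1 and 2 can be demonstrated end to end. This sketch uses Python's built-in SQLite as a stand-in for PostgreSQL (whose equivalent tool is EXPLAIN ANALYZE) and a hypothetical orders table: before the index, the planner does a full table scan; after it, an index search.

```python
import sqlite3

# In-memory SQLite as a stand-in for PostgreSQL; the table, data,
# and query are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(10_000)],
)

query = "SELECT total FROM orders WHERE user_id = ?"

# Step 1: read the plan. The detail column reports a full table scan.
before = conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall()

# Step 2: add the missing index and read the plan again. The scan
# becomes an index search.
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
after = conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall()

print(before[-1][3])  # plan detail mentions a scan of orders
print(after[-1][3])   # plan detail mentions idx_orders_user_id
```

The same loop — measure, index, measure again — is the cheapest scaling work available, and it is reversible in a way that an architecture migration is not.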

This sequence handles a significant fraction of “we need to scale the database” problems without architectural changes. The cases that require genuine horizontal scaling or NoSQL are real — but they’re at a scale that most applications never reach.

API Design That Outlasts the Product Roadmap

APIs are architectural commitments. Once a client is built against an API, changing the API breaks the client. The API decisions made in the first sprint of a project will be lived with for years.

The principle that has served me best: design for the client’s use case, not the server’s data model. An API that exposes the database schema directly is fragile — it requires clients to join data and understand internal relationships that should be server-side concerns. An API that exposes the use cases the client actually has is stable even as the underlying data model evolves.

Concrete example: a client needs to display a user profile page. The server has users, addresses, and preferences tables. The fragile API design returns separate endpoint responses for each. The stable design returns a single /users/{id}/profile endpoint that aggregates the data the profile page needs. The client shouldn’t have to know about the join.
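A sketch of that aggregation in Python, with hypothetical in-memory stand-ins for the three tables and a profile() function playing the role of the /users/{id}/profile handler. The server does the join, so the client sees one shape:

```python
# Hypothetical in-memory stand-ins for the users, addresses, and
# preferences tables.
USERS = {1: {"id": 1, "name": "Ada"}}
ADDRESSES = {1: {"city": "London", "country": "UK"}}
PREFERENCES = {1: {"theme": "dark"}}

def profile(user_id: int) -> dict:
    """Aggregate exactly what the profile page needs into one response.

    The joins live here; the client never learns there were three tables.
    """
    return {
        "name": USERS[user_id]["name"],
        "city": ADDRESSES[user_id]["city"],
        "theme": PREFERENCES[user_id]["theme"],
    }

print(profile(1))  # {'name': 'Ada', 'city': 'London', 'theme': 'dark'}
```

If the data model later splits addresses differently, profile() changes, but its response shape, which is the actual contract, does not.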

REST is fine. GraphQL is fine. gRPC is fine. The choice matters much less than the consistency of the design and the quality of the API contract documentation.

When to Ignore This Advice

The case for simplicity isn’t a case against all architectural sophistication. Some use cases genuinely require complex architecture:

  • High-frequency trading or order processing where latency matters at sub-millisecond resolution
  • Global content delivery where geographic distribution is the core value proposition
  • Real-time collaborative applications where synchronization complexity is fundamental to the product
  • Data processing at petabyte scale where distributed processing is genuinely necessary

The tell that distinguishes “we need this” from “we want this”: the simpler alternative has already been tried and has demonstrably failed at scale. Not “might fail” — has failed. If the simpler approach hasn’t been tried, there is no evidence that the complexity is necessary.

The systems I’ve seen rebuilt from unnecessary complexity are numerous. The systems that started simple and thoughtfully added complexity as scale justified it have rarely needed rebuilding at all. The carrying cost of unnecessary complexity almost always exceeds the cost of adding complexity incrementally as needed.

Our software development practice has a strong bias toward building the simplest thing that solves the actual problem — and a track record of systems built that way still running well years later. Related: if architectural decisions intersect with cloud infrastructure choices, the conversation about complexity extends to the operational layer. A simple application running on complex infrastructure is still a complex system.