The universal 4-step approach that applies to every single question — from URL Shortener to YouTube.
A system design interview is not a test of memorization. It is a test of structured thinking under ambiguity. The interviewer wants to see how you break down a vague problem, make justified trade-offs, and communicate complex ideas clearly. Engineers who dive into solutions immediately without clarifying requirements almost always lose marks — not because their solution was wrong, but because they solved the wrong problem. Follow the 4-step framework below for every single question, regardless of what it is.
Never assume. Ask questions to understand what you're actually building. Interviewers often intentionally leave the problem vague to see if you'll ask. The output of this phase is a written list of functional requirements (what the system must do) and non-functional requirements (how well it must do it).
Universal Clarifying Questions (ask for ANY problem)
- Scale: How many users? Daily Active Users (DAU)? Requests per second?
- Read vs Write ratio: Is this mostly reading data or writing data? (e.g., Twitter: 100:1 read-heavy; sensor logging: write-heavy)
- Latency requirements: What's the acceptable response time? Real-time (under 100ms)? Near-real-time (under 1s)? Batch (minutes)?
- Consistency vs Availability: Is it okay if users briefly see stale data? Or must it always be exactly correct?
- Geographic scope: Single region or global? Are users worldwide?
- Data size: How large is a typical record? How much total data do we expect over 5 years?
- Special constraints: GDPR/data privacy? Content moderation? Multi-tenant (serving many different companies)?
Back-of-envelope estimation tells you which problems you actually need to solve. A system handling 100 requests/second needs a completely different architecture from one handling 100,000/second. Estimation also signals to the interviewer that you understand real-world constraints. Always estimate: QPS (queries per second), storage needed per year, and bandwidth. Round aggressively — precision is not the point.
⚠️ Common Mistake: Skipping estimation and going straight to design. Without knowing scale, you can't justify any architectural decision. "We need a cache" — why? How much data? What hit rate? Estimation makes every design choice defensible.
Draw the big picture: clients, load balancer, API servers, databases, caches, queues. At this stage, don't over-engineer. Pick the simplest design that meets the requirements. Explain your major components and why they're there. This gives both you and the interviewer a shared map of the system before you drill into details.
💡 Pro tip: State your assumptions explicitly: "I'll assume users connect via HTTPS. I'll use a single-region setup first and then we can discuss multi-region if time permits." This shows you're in control of the scope.
Now zoom into the hardest parts. The interviewer will guide you toward specific areas of interest. Common deep-dive topics: database schema design, scaling bottlenecks, failure handling, data consistency. For every decision, explain the trade-off — "I chose Cassandra here instead of MySQL because our write volume is 50,000/second and Cassandra handles high write throughput better at the cost of eventual consistency, which is acceptable here." Always say "at the cost of" — that's the trade-off that shows mastery.
The 3 Universal Trade-offs You Will Always Face
| Tension | Option A | Option B | Example |
|---|---|---|---|
| Consistency vs Availability | Always show correct data (may be slow or unavailable during failure) | Always respond fast (may show slightly stale data) | Bank balance vs. Twitter like count |
| Latency vs Throughput | Optimize for single-request speed | Optimize for total requests/second | Real-time bidding vs. batch analytics |
| Simplicity vs Flexibility | Simple design, harder to change later | Complex design, more adaptable | Monolith vs. microservices |
End every interview with a summary of your design:
"This design is optimized for [primary constraint]. The major trade-off is [what we sacrificed] for [what we gained]. The biggest risk is [failure mode], which we mitigate by [mitigation]. To scale 10×, I would [next scaling step]."
| Mistake | Why it loses marks | Fix |
|---|---|---|
| Jumping straight to design | No shared understanding of requirements | Always spend 5–8 min on clarification |
| No estimation | Can't justify any design decisions | Estimate QPS, storage, bandwidth before designing |
| Over-engineering | Complexity without justification | Start simple, add complexity only when required |
| Single points of failure | Naive design crashes under load | Always ask "what happens if X fails?" |
| No trade-off discussion | Shows shallow thinking | Every decision should have "at the cost of" |