⚡ System Design & Machine Learning Playbook

An interactive reference guide for engineers preparing for System Design, Cloud, and ML interviews.

🌐 Open Interactive Site · 📚 Browse All Topics · 🎯 Start Here · 🤝 Contribute

✨ How To Use This Repo

Most interview resources are either too scattered or too theoretical. This repo is organized around three practical tracks:

Track	Best for	Start here
Core System Design	Distributed systems, cloud/platform, APIs, storage, scaling	docs/
AI & Machine Learning	ML system design, agents, classic ML, deep learning, LLMs	docs/machine-learning/README.md
Reference & Practice Appendix	Templates, cheat sheets, LeetCode patterns, LLD	docs/reference/README.md

Use the interactive site when you want navigation, quiz mode, and progress tracking. Use the Markdown docs when you want dense references you can skim before an interview.

🚀 What Makes It Useful

Feature	Description
🎯 48 interview-ready topics	Core system design, cloud/platform, AI/ML, security, and interview reference material
🌙 Dark / Light mode	Persisted preference, instant toggle with `d`
✅ Progress tracking	Mark topics as read. Your progress saves locally.
🔖 Bookmarks	Save topics to revisit. Accessible from any page.
🃏 Quiz / Flashcard mode	Randomized flashcard review across all 48 topics
📖 Inline reader	Read every topic without leaving the page — with prev/next navigation
⌨️ Keyboard-first	`/` search, `q` quiz, `b` bookmarks, `?` shortcuts
📊 Visual progress bar	See your overall completion at a glance
🗺️ 3 learning paths	Beginner, Mid-Level, and Advanced tracks
🔍 Live search	Searches title, category, summary, and tags
🎨 Category color coding	Every domain has its own visual identity
🚀 Zero setup	Open in browser. No install. No build step.

🗂️ Topic Coverage (48 Topics)

🟠 Foundation (4)

📐 The System Design Interview Framework — 4-step universal structure: Clarify → Estimate → Design → Deep Dive
🔢 Numbers Every Engineer Must Know — Latency hierarchy, scale reference points, back-of-envelope formulas
💾 IO Fundamentals: Read vs Write — Latency hierarchy, random vs sequential access, OS page cache, write amplification
🔌 Networking & Concurrency — TCP vs UDP, HTTP/1.1 vs HTTP/2 vs HTTP/3 (QUIC), event loop, goroutines
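
The back-of-envelope formulas listed under "Numbers Every Engineer Must Know" usually reduce to a few lines of arithmetic. The following is a minimal sketch under assumed inputs: the `estimate` helper, the 10 million daily active users, 20 requests per user, 10:1 read-to-write ratio, 1 KB per write, and 3x peak factor are all hypothetical and are not taken from this repo.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, requests_per_user, read_write_ratio, bytes_per_write, peak_factor=3):
    """Return rough average QPS, peak QPS, write QPS, and daily storage growth."""
    total_requests = dau * requests_per_user
    avg_qps = total_requests / SECONDS_PER_DAY
    # With a reads:writes ratio of R:1, writes are 1 / (R + 1) of all traffic.
    write_fraction = 1 / (read_write_ratio + 1)
    writes_per_day = total_requests * write_fraction
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
        "write_qps": avg_qps * write_fraction,
        "storage_gb_per_day": writes_per_day * bytes_per_write / 1e9,
    }


# Hypothetical workload, chosen only to illustrate the arithmetic.
result = estimate(
    dau=10_000_000,
    requests_per_user=20,
    read_write_ratio=10,
    bytes_per_write=1_000,
)
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

In an interview the exact figures matter less than stating each assumption out loud and rounding aggressively.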

🟣 Data Storage (5)

🗄️ Database Selection Guide — SQL vs NoSQL tension, 7 database types with when-to-use decision matrix
⚡ Caching Deep Dive — 5 cache layers, read/write patterns, eviction, cache invalidation strategies
📨 Message Queues & Event Streaming — Queue vs Kafka event log, delivery guarantees, DLQ, outbox pattern
🌐 Storage & CDN — Object/block/file storage, CDN pull vs push, cache invalidation
🔩 Database Internals — B-tree vs LSM, indexes, replication, CDC, sharding, ACID vs BASE, isolation levels

🔵 API & Networking (4)

🔌 API Design & API Gateway — REST vs gRPC vs GraphQL, gateway responsibilities, rate limiting algorithms
⚖️ Load Balancing & Networking — L4 vs L7, round-robin/least-connections/consistent hashing, health checks
🔴 Real-time Communication — Polling, SSE, WebSockets compared; scaling stateful WS servers with Redis pub/sub
🚦 Rate Limiting In Depth — Every algorithm compared, distributed Redis implementation, failure modes
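
As one illustration of the rate limiting algorithms named above, here is a minimal single-process token bucket. It is a sketch, not the repo's implementation: the `TokenBucket` class, its parameters, and the injectable `clock` are assumptions made for the example, and the distributed Redis variant covered in the topic is not shown.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the example is deterministic
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage with a fake clock: a burst of three requests against a bucket of two.
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]
fake_now[0] = 1.0  # one second later, one token has been refilled
after_refill = bucket.allow()
```

This version is not thread-safe; a shared limiter would need a lock or an atomic store-side script.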

☁️ Cloud & Platform (5)

☁️ Cloud Fundamentals & Shared Responsibility — Regions, availability zones, managed services, shared responsibility, environment boundaries
🖥️ Compute & Deployment Patterns — VMs vs containers vs Kubernetes vs serverless, autoscaling, canary/blue-green rollout
🌍 Cloud Networking & Traffic Management — VPCs, subnets, DNS, CDN/WAF, API gateways, service-to-service traffic
🪪 IAM, Secrets & Governance — Least privilege, workload identity, secret rotation, KMS, audit and guardrails
📉 Reliability, Observability & Cost — Multi-AZ vs multi-region, RTO/RPO, SLOs, budget alarms, cost-aware scaling

🟢 Distributed Systems (5)

🌐 Distributed System Fundamentals — CAP, consistency models, consistent hashing, Saga vs 2PC, quorum, vector clocks
🔄 Core Design Patterns — Fan-out (social feed), CQRS, event sourcing, outbox pattern, inventory contention
🧱 Microservices vs Monolith — When to decompose, service discovery, sync vs async communication
🛡️ Resilience Patterns — Timeouts, retries + jitter, circuit breaker, fallbacks, backpressure, load shedding
🔒 Distributed Locking — Why local locks fail, Redis Redlock, fencing tokens

🟡 Search & Analytics (4)

🔍 Search & Typeahead Systems — Inverted index, prefix trie autocomplete, relevance ranking (TF-IDF, BM25)
📊 Stream Processing & Top-K Systems — Count-Min Sketch, Lambda vs Kappa architecture, Flink, windowing
📍 Geo & Location Systems — Geohash, quadtree, proximity queries, Uber-style driver matching
🎲 Probabilistic Data Structures — Bloom filter, HyperLogLog, Count-Min Sketch at massive scale
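
To make the Bloom filter entry concrete, the sketch below shows the core idea: k hash positions per item, no false negatives, and a tunable false-positive rate. The class name, the 1024-bit size, and the three salted SHA-256 hashes are assumptions chosen for readability; real deployments typically size the filter from a target error rate and use faster non-cryptographic hashes.

```python
import hashlib


class BloomFilter:
    """Set-membership sketch: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions by salting one hash function with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
for user in ("alice", "bob"):
    seen.add(user)
```

A `False` from `might_contain` is definitive; a `True` still needs to be confirmed against the backing store.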

🟩 Scale & Reliability (6)

📡 Observability & Monitoring — Metrics, logs, traces (three pillars), SLOs, error budgets, OpenTelemetry
📈 High Availability & Auto Scaling — Active-passive vs active-active, autoscaling signals, multi-region patterns
🆔 Unique ID Generation — UUID v4/v7/ULID, Twitter Snowflake, ticket servers — when to use each
📄 API Pagination — Why offset pagination fails, cursor-based and keyset pagination at scale
🔔 Notification System Design — Multi-channel delivery, fan-out at scale, idempotency, retry + DLQ
🔁 Advanced Data Patterns — Pre-computation, materialized views, ETL vs ELT, hot spot problem, backfill
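
For the Snowflake entry above, the sketch below packs a millisecond timestamp, a machine ID, and a per-millisecond sequence into one 64-bit integer using the commonly described 41/10/12 bit layout. The `SnowflakeGenerator` class and the injectable `clock_ms` are assumptions for illustration, not code from this repo.

```python
import threading
import time

# Twitter's published custom epoch in milliseconds; any fixed past instant works.
EPOCH_MS = 1_288_834_974_657


class SnowflakeGenerator:
    """IDs laid out as 41 bits of time, 10 bits of machine ID, 12 bits of sequence."""

    def __init__(self, machine_id, clock_ms=lambda: int(time.time() * 1000)):
        if not 0 <= machine_id < 1024:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.clock_ms = clock_ms
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock_ms()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next.
                    while now <= self.last_ms:
                        now = self.clock_ms()
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - EPOCH_MS) << 22) | (self.machine_id << 12) | self.sequence


# Usage with a frozen clock so the two IDs differ only in the sequence bits.
gen = SnowflakeGenerator(machine_id=1, clock_ms=lambda: EPOCH_MS + 1000)
first, second = gen.next_id(), gen.next_id()
```

Because the timestamp occupies the high bits, IDs from one generator sort roughly by creation time, which is the main reason to prefer this family over UUID v4 for index-friendly keys.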

🔴 Security (4)

🔐 Security & Authentication — Sessions vs JWT, OAuth 2.0 flow, API security checklist
🪪 Authorization, SSO & MFA — RBAC/ABAC/ReBAC, OIDC vs SAML, step-up authentication, passkeys
🛡️ Privacy & Data Compliance — PII handling, encryption strategies, GDPR/CCPA, data residency
🔑 Secrets Management & Threat Modeling — Secret rotation, API keys, KMS/HSM, STRIDE, attack paths

🩷 AI & Machine Learning (5)

🤖 Machine Learning in System Design — Feature store, recommendation and ranking systems, rollout strategy, drift, serving latency, rollback
🧠 AI Agent System Design — Planner/reactor loops, function calling, retrieval, observability, agent benchmarks, model routing, budgets, safety
📈 Classic Machine Learning — Bias-variance, Naive Bayes, KNN, bagging vs boosting, SHAP/LIME, calibration, XGBoost, SVM, PCA
🔬 Deep Learning — Weight init, backprop, CNNs, LSTMs, full Transformer deep-dive, GANs, VAEs, diffusion, distillation, GQA/MQA
💬 LLM Interview Questions — Tokenization, RAG, LoRA/QLoRA, RLHF/DPO, scaling laws, MoE, multi-modal models, KV cache, CoT

🩵 Specialized Systems (2)

📝 Real-time Collaboration (Google Docs) — OT vs CRDT, operation logs, full Google Docs architecture
🎣 Webhooks System Design — Signed delivery, exponential retry, idempotency keys, full architecture
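
Signed delivery, mentioned in the webhooks entry above, usually means the sender attaches an HMAC of the payload and the receiver recomputes it. The sketch below assumes one possible scheme: HMAC-SHA256 over `<timestamp>.<body>` with a 5-minute replay window. The function names, the message format, and the example secret are hypothetical and are not taken from this repo or from any specific provider.

```python
import hashlib
import hmac


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 over '<timestamp>.<body>'."""
    message = timestamp.encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret, timestamp, body, signature, now, tolerance_s=300):
    """Check freshness first, then compare signatures in constant time."""
    if abs(now - int(timestamp)) > tolerance_s:
        return False  # stale delivery: possible replay
    return hmac.compare_digest(sign(secret, timestamp, body), signature)


# Hypothetical secret and payload, for illustration only.
secret = b"example-shared-secret"
body = b'{"event": "order.paid", "id": "evt_123"}'
ts = "1700000000"
sig = sign(secret, ts, body)

ok = verify(secret, ts, body, sig, now=1700000010)
tampered = verify(secret, ts, body + b" ", sig, now=1700000010)
replayed = verify(secret, ts, body, sig, now=1700009999)
```

Signing the timestamp together with the body is what lets the receiver reject replays; the receiver should still deduplicate on an idempotency key because legitimate retries will arrive more than once.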

🟦 Reference (4)

🎯 Common Scenarios & Solutions — 17 scenario cheat sheets covering classic patterns plus multi-tenant SaaS, webhooks, recommendation/ranking, and multi-region reliability
📋 Reusable Design Templates — 12 full blueprints with architecture diagrams: YouTube, Twitter, WhatsApp, Uber, TinyURL, Rate Limiter, Metrics, TicketMaster, AI Agent, Typeahead, Google Docs, LeetCode
🧩 LeetCode Question Patterns — 21 algorithm patterns with code templates: arrays, two pointers, sliding window, trees, graphs, DP, backtracking, tries, segment tree, and more
🏗️ Low-Level System Design (LLD) — SOLID principles, 10 design patterns with code, and 11 classic LLD questions, including LRU Cache, Parking Lot, Elevator, Rate Limiter, ATM, Tic-Tac-Toe, Logger, and Library

🛤️ Learning Paths

Pick a path based on your experience level, then use the interactive site to track your progress.

🌱 Beginner — Build your foundation (6 topics)

Interview Framework → Numbers to Know → Database Selection → Caching Deep Dive → API Design & Gateway → Rate Limiting

🚀 Mid-Level — Master distributed systems and platform basics (9 topics)

Distributed Fundamentals → Cloud Fundamentals → Compute & Deployment → Resilience Patterns → Observability → High Availability → Microservices → Notifications → Authorization / MFA

🏆 Advanced — Push beyond the standard interview (8 topics)

AI Agent System Design → ML System Design → Cloud Networking → IAM / Governance → Reliability, Observability & Cost → Real-time Collaboration → Probabilistic DS → DB Internals

⌨️ Keyboard Shortcuts

Open the interactive site and press `?` to see all shortcuts:

Key	Action
`/`	Focus search
`d`	Toggle dark mode
`q`	Start quiz / flashcard mode
`b`	Toggle bookmarks panel
`?`	Show all keyboard shortcuts
`Esc`	Close reader / clear search / close panel
`Space`	Reveal quiz answer
`→` / `←`	Next / previous quiz card or topic

🚀 Quick Start

Option A — Interactive site (recommended)

Open the full Interactive Website here 🌐

No install. Works offline after first load. Progress saves to your browser. Includes an inline reader, quiz/flashcard mode, dark mode, and bookmarks.

Option B — Run locally

git clone https://github.com/Ali-Meh619/System_Design_ML_Principles.git
cd System_Design_ML_Principles
# Open site/index.html in your browser — no server needed

Option C — Read on GitHub

Navigate to docs/ and click any topic. GitHub renders Markdown natively.

📁 Repository Structure

System_Design_ML_Principles/
├── site/                       # Interactive web app (no build step)
│   ├── index.html              # Main SPA — dark mode, quiz, bookmarks, inline reader
│   ├── styles.css              # Full design system with dark/light mode
│   ├── app.js                  # All interactive features
│   └── topics.js               # Topic registry with icons, difficulty, tags, paths
├── docs/                       # 48 topic documents
│   ├── foundation/             # Interview framework, estimation, I/O, networking
│   ├── api-networking/         # APIs, load balancing, rate limiting, realtime
│   ├── cloud-platform/         # Cloud foundations, deployment, networking, IAM, reliability
│   ├── data/                   # Databases, caching, queues, internals
│   ├── distributed/            # CAP, consistency, microservices, resilience, patterns
│   ├── search/                 # Full-text search, typeahead, geo, stream processing
│   ├── scale/                  # Observability, HA, ID gen, pagination, notifications
│   ├── security/               # Auth, AuthZ, privacy, secrets, threat modeling
│   ├── machine-learning/       # ML systems, agents, Classic ML, DL, LLMs
│   ├── specialized/            # Collaboration and webhook-heavy systems
│   └── reference/              # Templates, cheat sheets, LeetCode, LLD
└── assets/                     # Architecture diagram images

📖 Recommended Topic Structure

The strongest docs in this repo use a consistent interview-prep structure. Not every legacy page is identical yet, but new and upgraded docs aim to follow this pattern:

## Problem
What are we solving? When does this come up in an interview?

## Options
What are the main approaches? (with trade-off table)

## Recommended Default
What to pick and why, with the specific caveats.

## Failure Modes
What breaks? How do you detect and recover?

## Metrics
What do you measure to know it's working?

## Interview Answer Sketch
The concise 2-minute answer you'd give under time pressure.

🤝 Contributing

Contributions make this better for everyone:

Fork the repo and create a branch: `git checkout -b feat/your-topic`
Follow the recommended topic structure above — especially defaults, trade-offs, failure modes, and metrics
Add the topic to site/topics.js with an icon, difficulty, and tags
Open a PR using the provided template

Every substantial addition should include:

✅ When to use
❌ When NOT to use
💥 Common failure modes
📊 Measurable success metrics

See CONTRIBUTING.md for the full guide.

⭐ If this helped you

Star the repo — it helps others discover it
Share it with your team or study group
Open issues for topics you'd like to see covered
Submit PRs to improve existing content

📄 License

Built with ❤️ for engineers who take system design seriously.

⭐ Star on GitHub · 🌐 Open Interactive Site · 🐛 Report Issue
