Understanding Transformers in AI

A comprehensive interactive course on the Transformer architecture, from basic concepts to advanced topics. Designed for Python developers with no prior machine learning background.

🎯 Course Overview

This course teaches the complete Transformer architecture through:

  • Interactive math explanations with KaTeX rendering
  • Runnable Python code using Pyodide (runs in browser)
  • Interactive visualizations with D3.js and custom components
  • Step-by-step progression from linear algebra to cutting-edge research

Course Structure (12 Modules)

  1. Foundations - Linear algebra, neural networks, PyTorch primer
  2. Before Transformers - RNNs, LSTMs, seq2seq attention
  3. Attention Is All You Need - The groundbreaking 2017 paper
  4. Self-Attention - Query/Key/Value mechanism deep dive (see the NumPy sketch after this list)
  5. Multi-Head Architecture - Parallel attention and layer details
  6. Encoder-Decoder - Full architecture walkthrough
  7. Tokenization - BPE, WordPiece, SentencePiece strategies
  8. Landmark Models - BERT, GPT series, LLaMA, Mistral evolution
  9. Training Process - Pre-training, fine-tuning, RLHF, DPO
  10. Efficient Methods - LoRA, quantization, Flash Attention
  11. RAG & Applications - Retrieval-Augmented Generation
  12. Frontier Topics - MoE, Mamba, scaling laws, latest research
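
To give a taste of the runnable examples used throughout, here is a minimal NumPy sketch of the scaled dot-product attention that Module 4 dissects (the function name and toy inputs are illustrative, not lifted from the course materials):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V -- the core computation of self-attention
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

# Three tokens with 4-dimensional embeddings (toy numbers)
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(x, x, x))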

πŸš€ Tech Stack

  • Framework: Astro 5 + Starlight (static site generation)
  • Math: KaTeX for server-side equation rendering
  • Interactive Code: Pyodide (Python WASM runtime)
  • Visualizations: D3.js for charts and diagrams
  • Styling: Custom CSS with neutral color palette
  • Theme: Built-in dark/light mode toggle

πŸ“¦ Installation & Setup

Prerequisites

  • Node.js 18+
  • npm or yarn

Quick Start

# Clone the repository
git clone https://github.com/your-username/transformer-course.git
cd transformer-course

# Install dependencies
npm install

# Start development server
npm run dev

The site will be available at http://localhost:4321

Development Commands

# Start dev server with hot reload
npm run dev

# Build for production  
npm run build

# Preview production build
npm run preview

# Type check
npm run astro check

🎨 Interactive Components

PyodideRunner

Inline Python code execution with NumPy support:

<PyodideRunner 
  title="Try Matrix Multiplication"
  initialCode="import numpy as np
A = np.array([[1, 2], [3, 4]])
print('Matrix A:', A)"
/>

ThreeWayView

Math + Code + Visual explanations:

<ThreeWayView
  title="Dot Product"
  mathContent="$$\mathbf{a} \cdot \mathbf{b} = \sum_{i} a_i b_i$$"
  codeContent="dot_product = np.dot(a, b)"
/>

AttentionHeatmap

Interactive attention weight visualizations:

<AttentionHeatmap
  tokens={["The", "cat", "sat"]}
  attentionWeights={[[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]]}
/>

TokenizerPlayground

Compare different tokenization strategies:

<TokenizerPlayground 
  title="BPE vs WordPiece vs SentencePiece"
  initialText="The quick brown fox jumps over the lazy dog."
/>
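
As a toy illustration of what BPE does under the hood, the sketch below performs a single merge step: count adjacent symbol pairs over a tiny corpus, then fuse the most frequent pair. The helper names and corpus are invented for illustration; the playground component has its own implementation.

from collections import Counter

def most_frequent_pair(words):
    # count adjacent symbol pairs across the corpus, weighted by word frequency
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    # replace every occurrence of `pair` with a single fused symbol
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# toy corpus: word (split into characters) -> frequency
corpus = {tuple("low"): 5, tuple("lower"): 2, tuple("lowest"): 3}
pair = most_frequent_pair(corpus)   # ('l', 'o'), tied with ('o', 'w')
print(pair, merge_pair(corpus, pair))

Real tokenizers repeat this merge loop thousands of times until a target vocabulary size is reached.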

πŸ“ Project Structure

transformer-course/
├── astro.config.mjs          # Astro configuration
├── package.json              # Dependencies and scripts
├── tsconfig.json             # TypeScript configuration
├── src/
│   ├── assets/
│   │   ├── custom.css       # Theme and component styles
│   │   └── logo.svg         # Course logo
│   ├── components/
│   │   ├── AttentionHeatmap.astro
│   │   ├── PyodideRunner.astro
│   │   ├── ThreeWayView.astro
│   │   └── TokenizerPlayground.astro
│   └── content/
│       └── docs/            # Course content (MDX files)
│           ├── index.mdx    # Landing page
│           ├── 01-foundations/
│           ├── 02-before-transformers/
│           ├── 03-attention-is-all-you-need/
│           ├── 04-self-attention/
│           └── ... (modules 5-12)
└── public/
    └── images/              # Static diagrams and assets

🎨 Theme Customization

The course uses a neutral slate/zinc color palette instead of Starlight's default purple theme. Colors are defined in src/assets/custom.css:

:root {
  --sl-color-accent: #475569;        /* Slate-600 */
  --sl-color-accent-high: #334155;   /* Slate-700 */
  --sl-color-accent-low: #f1f5f9;    /* Slate-100 */
}

[data-theme='dark'] {
  --sl-color-accent: #94a3b8;        /* Slate-400 */
  --sl-color-accent-high: #cbd5e1;   /* Slate-300 */
}

πŸ“š External Resources

The course references and builds upon:

  • Papers: "Attention Is All You Need", BERT, GPT series, LLaMA, etc.
  • Books: "Hands-On Large Language Models" (Alammar & Grootendorst), "Build a Large Language Model (From Scratch)" (Raschka)
  • Blogs: Jay Alammar's "Illustrated Transformer", Lilian Weng's blog
  • Code: nanoGPT, Hugging Face Transformers, Annotated Transformer
  • Courses: Stanford CS224N, Hugging Face NLP Course

πŸš€ Deployment

Vercel (Recommended)

npm install -g vercel
vercel --prod

Netlify

npm run build
# Upload dist/ folder to Netlify

GitHub Pages

npm run build
# Push dist/ contents to gh-pages branch
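
If you want a single command for that push, the gh-pages npm package (an assumed convenience here, not a dependency of this project) automates it:

# hypothetical one-liner using the gh-pages package
npx gh-pages -d dist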

πŸ“ˆ Performance

  • Lighthouse scores: 95+ Performance, 100 Accessibility
  • Bundle size: ~200KB (Astro's zero-JS by default)
  • Math rendering: Server-side KaTeX (no client-side MathJax)
  • Interactive components: Loaded only when needed (Astro Islands)

🀝 Contributing

Contributions are welcome! Current areas for improvement:

  1. Additional visualizations (3D architecture explorer, training animations)
  2. More code exercises for each module
  3. Advanced topics (recent papers, new architectures)
  4. Accessibility improvements
  5. Mobile optimization

Development Guidelines

  • Use semantic HTML and ARIA labels
  • Test interactive components in both themes
  • Ensure math renders correctly in KaTeX
  • Keep bundle size minimal (leverage Astro Islands)
  • Follow the established visual design patterns

πŸ™ Acknowledgments

Special thanks to:

  • Jay Alammar for the "Illustrated Transformer" inspiration
  • Andrej Karpathy for his educational approach to AI
  • The Hugging Face team for democratizing transformers
  • Original transformer paper authors (Vaswani et al.)

Ready to understand the architecture that powers modern AI?
Run npm run dev and open http://localhost:4321 to start the course.
