Machine learning · Deep learning · RL · SWE
Deep mastery review for ML & SWE.
Six-stage study sessions that build durable understanding of hard technical topics — the intuition, the derivations, code that runs, and recall that holds under pressure. Interview prep is the sharpest use case: sessions go to the depth interviewers actually probe, and every thread saves so you can pick it back up.
157+
ML & SWE topics
6
Stages per session
2
Tracks
10+
Domains covered
Try it right here
Watch a full session run.
A recorded replay of a real six-stage session — the exact pauses where the tutor stops and waits, plus two follow-ups you can ask. No account, no model calls.
6 stages · the same math, code, and diagrams a live session renders · ~3 minutes
Six stages, in order, never compressed — the pause after each one is the point.
How it works
A better way to review.
Six stages, in order
Big picture, intuition, the math, implementation, calibrated interview questions, retrieval check.
The tutor stops and waits
After every stage you continue, ask a follow-up, or request a revision card — no walls of text.
Every session is saved
Reload any thread and pick up exactly where you stopped, days later.
Stage 6 makes you retrieve it
You reproduce the concept from memory before the session ends — that's the part that sticks.
Knowledge graphs
See how topics connect across the ML and SWE landscape.
Derivations that make sense
Step-by-step math with every term motivated.
Code, side by side
Reference implementations that connect theory to practice.
Spaced retrieval
Reinforce concepts at the moment they start to fade.
Tracks
Two tracks, one mastery loop.
Use LiminalML for interview prep, research refreshers, systems review, or rebuilding the fundamentals you only half-remember.
ML / Research
71 topics · 5 domains
For attention, transformers, optimization, RLHF, training systems, and the math behind modern models.
Useful for: Research Engineer · MLE · Research Scientist · Applied Scientist
Software Engineering
86 topics · 5 domains
For rendering models, distributed systems, data structures, caching, databases, and production tradeoffs.
Useful for: Frontend · Backend · Fullstack · System Design · UI/UX
Format · six stages
Every topic, in order, never compressed.
The same arc every session: overview, intuition, math or internals, implementation, calibrated questions, and retrieval check.
The concept in context: what problem it solves, where it appears in real systems, and the mental frame to keep before details arrive.
Core idea in plain language, then a structured diagram with tensor dimensions and data flow annotated. The diagram is mandatory for every DL architecture.
Step-by-step derivation with every term motivated. Not just what each symbol is, but what breaks if you remove it.
Production-quality PyTorch with type annotations, every non-obvious line commented, and an explicit test snippet at the end.
5 calibrated questions: conceptual, implementation, applied, systems-level, and failure modes. Interview prep is the first proving ground.
Retrieval drill. The tutor asks, waits for your answer, then tells you precisely what was right, wrong, or missing.
The pause is the point
After each stage, LiminalML stops and waits. You can continue, ask a clarifying question, or request a compact revision card. That pacing turns passive reading into active review.
See it first · no sign-up
Read a real session before you start.
These concept pages are written in the exact six-stage format you get in a session — intuition, the math, runnable code, the questions interviewers ask, and a retrieval check. Free to read, no account needed.
Deep Learning
Attention Mechanisms
How scaled dot-product and multi-head attention work — the soft key-value lookup at the heart of every Transformer — with the math, runnable PyTorch, and calibrated interview questions.
9 min read · 6 stages
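For a flavor of the implementation stage, here is a minimal sketch of the scaled dot-product attention named in the card above, softmax(QK^T / sqrt(d_k)) V, written in dependency-free Python rather than the PyTorch the sessions use. The function names and toy inputs are illustrative assumptions, not code taken from the concept page.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention over lists of vectors (no masking, no batching)."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        # Softmax turns similarities into lookup weights that sum to 1.
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
# A zero query matches both keys equally, so the output is the mean of the values.
uniform = scaled_dot_product_attention([[0.0, 0.0]], keys, values)
# A query strongly aligned with the first key retrieves (almost exactly) the first value.
sharp = scaled_dot_product_attention([[100.0, 0.0]], keys, values)
```

The two toy queries show the "soft lookup" reading: weights spread evenly when nothing matches better than anything else, and concentrate on one value when a key matches strongly.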
Deep Learning
Transformer Architecture
The Transformer block from the ground up — self-attention plus a position-wise feed-forward network, residuals and LayerNorm, and the encoder/decoder configurations — with the math, PyTorch, and calibrated interview questions.
9 min read · 6 stages
Deep Learning
Backpropagation
Backpropagation as reverse-mode autodiff — the chain rule over the computational graph, the gradients for a linear layer and ReLU, and why gradients vanish — with a runnable manual backward pass.
9 min read · 6 stages
Deep Learning
Batch Normalization
What Batch Norm normalizes and why, the critical train-vs-inference distinction, BN vs. Layer Norm, with the math and a from-scratch PyTorch implementation.
8 min read · 6 stages
Optimization
Gradient Descent (SGD, Momentum, Adam)
SGD, momentum, and Adam explained — the update rules, why mini-batching wins, Adam's bias correction, and when plain SGD generalizes better — with from-scratch implementations.
8 min read · 6 stages
LLMs
KV Cache
How the KV cache makes autoregressive LLM decoding affordable — what it stores and why reuse is valid, the memory cost, why decoding is memory-bandwidth-bound, and how MQA/GQA shrink it — with code.
8 min read · 6 stages
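The memory cost mentioned in the card above can be estimated on the back of an envelope: two tensors (K and V) per layer, each holding batch × KV heads × sequence length × head dimension elements. The sketch below assumes that standard layout. The function name and the model dimensions are hypothetical, chosen only to show how fewer KV heads (GQA, then MQA) shrink the cache.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch_size=1, bytes_per_elem=2):
    """Estimate KV cache size in bytes (bytes_per_elem=2 assumes fp16/bf16)."""
    # Elements stored for the keys (or the values) of a single layer.
    per_layer = batch_size * n_kv_heads * seq_len * head_dim
    # Factor of 2: one tensor for keys, one for values.
    return 2 * n_layers * per_layer * bytes_per_elem


GIB = 1024 ** 3

# Hypothetical 32-layer model, head_dim 128, 4096-token context, batch of 1.
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)  # every head has its own K/V
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=4096)   # 8 shared KV groups
mqa = kv_cache_bytes(n_layers=32, n_kv_heads=1, head_dim=128, seq_len=4096)   # one KV head for all queries

print(f"MHA: {mha / GIB} GiB, GQA: {gqa / GIB} GiB, MQA: {mqa / GIB} GiB")
```

Under these assumed dimensions the cache scales linearly with the number of KV heads, which is why sharing them is the lever MQA and GQA pull.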
Context
Personal enough to test your actual understanding.
Resume and background context make the review sharper. Interview prep uses that context for behavioral stories and grounded follow-ups, while concept review uses it to connect ideas to work you have actually done.
Resume context (optional)
Upload a resume to ground sessions and generate STAR stories — fully skippable, deletable anytime
Profile injection
Background and focus areas shape each review session
Grounded retrieval
Follow-ups can reference your actual projects and tradeoffs
Editable anytime
Update profile context before the next review
Memory
Continuity, by design.
Session history
Every session is saved so you can reload the full thread, continue from where you stopped, or revisit a topic after a few days.
Revision cards
Generate compact review cards with the core concept, key equations or tradeoffs, implementation details, and the highest-value follow-up questions.
Mastery catalog
71 topics, 5 domains.
Classical models to LLM internals to RL theory. Built for deep refreshers, derivations, and interview-grade recall.
Classical ML
16 topics
- Linear Regression
- Logistic Regression
- Support Vector Machines
- K-Nearest Neighbours
- K-Means Clustering
- Naive Bayes
- Decision Trees
- Random Forests
- Gradient Boosting / XGBoost
- Gradient Descent (SGD, mini-batch, momentum, Adam)
- Bias-Variance Tradeoff
- Regularization (L1, L2, Dropout)
- Cross-Validation
- Evaluation Metrics (precision, recall, F1, AUC-ROC)
- Feature Engineering
- Dimensionality Reduction (PCA, t-SNE)
Deep Learning
26 topics
Foundations
- Backpropagation and Computational Graphs
- Activation Functions
- Loss Functions
- Optimizers (Adam, AdamW, LR schedules)
- Batch Normalization and Layer Normalization
- Weight Initialization
Architectures
- Feedforward Networks
- CNNs
- RNNs, LSTMs, GRUs
- Transformers (encoder, decoder, encoder-decoder)
- Attention Mechanisms (scaled dot-product, multi-head, cross-attention, FlashAttention)
- Positional Encoding
- KV Cache
Language and LLMs
- Tokenization and BPE
- Embeddings (word2vec, learned)
- Pretraining vs Fine-tuning
- LoRA and PEFT
- RAG (Retrieval-Augmented Generation)
- Tool Calling
- Temperature Sampling, Top-k, Top-p, Beam Search
- RLHF (reward modeling, PPO fine-tuning loop, preference data)
- Inference Optimization
Multi-task and Transfer Learning
- Transfer Learning
- Multi-task Learning
- Negative Transfer
- Task Sampling Strategies
Reinforcement Learning
14 topics
Foundations
- Markov Decision Processes
- Bellman Equations
- Value Functions
- Policy-based vs Value-based vs Model-based RL
Tabular Methods
- Dynamic Programming
- Monte Carlo Control
- Temporal Difference Learning
- Q-Learning
- SARSA
Deep RL
- DQN
- Policy Gradients (REINFORCE)
- Actor-Critic Methods
- PPO (Proximal Policy Optimization)
- DPO (Direct Preference Optimization)
Training Engineering
8 topics
- Gradient Accumulation
- Gradient Checkpointing
- Mixed Precision Training (fp32, fp16, bfloat16)
- CUDA Setup and Device Management
- Distributed Training and DDP
- Training Config and YAML Structure
- Multi-Seed Evaluation and Reproducibility
- Experiment Tracking with MLflow
Systems and MLOps
7 topics
- Model Serving and Deployment
- Embeddings at Scale
- Vector Databases
- Batch vs Online Inference
- Latency vs Throughput Tradeoffs
- CI/CD for ML
- Experiment Tracking
Pricing
Pricing for serious review.
Start free. Upgrade when the monthly cap gets in the way of a focused review habit.
Free
Free.
A generous monthly tier for trying the mastery loop or reviewing a few priority topics.
- 20 sessions per month
- All 157+ ML and SWE topics
- Full 6-stage review format
- Sessions adapt to your background (optional resume)
- Session history + reload
- Revision cards
No card required.
Pro
7-day free trial
Pro.
For daily review across interview prep, research refreshers, systems review, and deep concept repair.
- Unlimited review sessions
- Everything in Free
- New topics ship to Pro first — request them directly
No charge for 7 days. Cancel anytime.
Begin
Start reviewing at depth.
Pick a topic, move through the stages, and leave with understanding you can retrieve under pressure.