Machine learning · Deep learning · RL · SWE

Deep mastery review for ML & SWE.

Six-stage study sessions that build durable understanding of hard technical topics — the intuition, the derivations, code that runs, and recall that holds under pressure. Interview prep is the sharpest use case: sessions go to the depth interviewers actually probe, and every thread saves so you can pick it back up.

Start a session · Watch a session run

157+ ML & SWE topics

6 stages per session

2 tracks

10+ domains covered

Demo session · Attention Mechanisms

Scripted replay · no model calls

Try it right here

Watch a full session run.

A recorded replay of a real six-stage session — the exact pauses where the tutor stops and waits, plus two follow-ups you can ask. No account, no model calls.

6 stages · the same math, code, and diagrams a live session renders · ~3 minutes

Six stages, in order, never compressed — the pause after each one is the point.

How it works

A better way to review.

Six stages, in order

Big picture, intuition, the math, implementation, interview questions, retrieval check.

The tutor stops and waits

After every stage you can continue, ask a follow-up, or request a revision card. No walls of text.

Every session is saved

Reload any thread and pick up exactly where you stopped, days later.

Stage 6 makes you retrieve it

You reproduce the concept from memory before the session ends — that's the part that sticks.

Knowledge graphs

See how topics connect across the ML and SWE landscape.

$\mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)$

Derivations that make sense

Step-by-step math with every term motivated.

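    import torch

    def attn(q, k, v):
        d_k = q.shape[-1]                     # query/key dimension
        s = q @ k.T                           # similarity scores, shape (n_q, n_k)
        s = s / d_k ** 0.5                    # scale by sqrt(d_k)
        return torch.softmax(s, dim=-1) @ v   # weighted sum of value rows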

Code, side by side

Reference implementations that connect theory to practice.

Spaced retrieval

Reinforce concepts at the moment they start to fade.

Tracks

Two tracks, one mastery loop.

Use LiminalML for interview prep, research refreshers, systems review, or rebuilding the fundamentals you only half-remember.

ML / Research

71 topics · 5 domains

For attention, transformers, optimization, RLHF, training systems, and the math behind modern models.

Classical ML · Deep Learning · RL · Training & MLOps

Useful for: Research Engineer · MLE · Research Scientist · Applied Scientist

Software Engineering

86 topics · 5 domains

For rendering models, distributed systems, data structures, caching, databases, and production tradeoffs.

Frontend · Backend · System Design · CS Fundamentals

Useful for: Frontend · Backend · Fullstack · System Design · UI/UX

Format · six stages

Every topic, in order, never compressed.

The same arc every session: overview, intuition, math or internals, implementation, calibrated questions, and retrieval check.

1. Big Picture

The concept in context: what problem it solves, where it appears in real systems, and the mental frame to keep before details arrive.

2. Intuition + Visual

Core idea in plain language, then a structured diagram with tensor dimensions and data flow annotated. Mandatory for all DL architectures.

3. The Math

Step-by-step derivation with every term motivated. Not just what each symbol is, but what breaks if you remove it.
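As one illustration of "what breaks if you remove it", consider the $\sqrt{d_k}$ factor in scaled dot-product attention. The sketch below assumes the components of a query $q$ and key $k$ are independent with zero mean and unit variance. That is an idealization introduced here for the example, not a claim from a session.

```latex
\begin{aligned}
q \cdot k &= \sum_{i=1}^{d_k} q_i k_i, \\
\mathbb{E}[q_i k_i] &= \mathbb{E}[q_i]\,\mathbb{E}[k_i] = 0, \qquad
\operatorname{Var}(q_i k_i) = \mathbb{E}[q_i^2]\,\mathbb{E}[k_i^2] = 1, \\
\operatorname{Var}(q \cdot k) &= \sum_{i=1}^{d_k} \operatorname{Var}(q_i k_i) = d_k, \\
\operatorname{Var}\!\left(\frac{q \cdot k}{\sqrt{d_k}}\right) &= \frac{d_k}{d_k} = 1.
\end{aligned}
```

Under these assumptions, dropping the scaling lets the score variance grow linearly with $d_k$, which tends to push the softmax toward near one-hot outputs with very small gradients.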

4. Implementation

Production-quality PyTorch with type annotations, every non-obvious line commented, and an explicit test snippet at the end.
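To make the Stage 4 format concrete, here is a minimal sketch of scaled dot-product attention with type annotations, comments, and a test snippet at the end. It is written in plain Python rather than PyTorch so it runs without dependencies. The names `softmax` and `attention` are illustrative assumptions, not the code a LiminalML session produces.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(q k^T / sqrt(d_k)) v."""
    d_k = len(k[0])
    out: Matrix = []
    for q_row in q:
        # One score per key: dot(q_row, k_row) / sqrt(d_k)
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d_k)
            for k_row in k
        ]
        weights = softmax(scores)
        # Output row is the attention-weighted average of the value rows
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v))
            for j in range(len(v[0]))
        ])
    return out


# Test snippet: identical keys give uniform weights, so the output
# is the plain mean of the value rows.
result = attention(
    q=[[1.0, 0.0]],
    k=[[1.0, 1.0], [1.0, 1.0]],
    v=[[0.0, 2.0], [4.0, 6.0]],
)
assert all(abs(a - b) < 1e-9 for a, b in zip(result[0], [2.0, 4.0]))
```

The loop-based version trades speed for readability. A tensor implementation would replace the inner loops with two matrix multiplications.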

5. Interview Questions

Five calibrated questions: conceptual, implementation, applied, systems-level, and failure modes. Interview prep is the first proving ground.

6. Retrieval Check

Retrieval drill. The AI asks, waits for your answer, then tells you precisely what was right, wrong, or missing.

The pause is the point

After each stage, LiminalML stops and waits. You can continue, ask a clarifying question, or request a compact revision card. That pacing turns passive reading into active review.

See it first · no sign-up

Read a real session before you start.

These concept pages are written in the exact six-stage format you get in a session — intuition, the math, runnable code, the questions interviewers ask, and a retrieval check. Free to read, no account needed.

Deep Learning

Attention Mechanisms

How scaled dot-product and multi-head attention work — the soft key-value lookup at the heart of every Transformer — with the math, runnable PyTorch, and calibrated interview questions.

9 min read · 6 stages

Deep Learning

Transformer Architecture

The Transformer block from the ground up — self-attention plus a position-wise feed-forward network, residuals and LayerNorm, and the encoder/decoder configurations — with the math, PyTorch, and calibrated interview questions.

9 min read · 6 stages

Deep Learning

Backpropagation

Backpropagation as reverse-mode autodiff — the chain rule over the computational graph, the gradients for a linear layer and ReLU, and why gradients vanish — with a runnable manual backward pass.

9 min read · 6 stages
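For a sense of what a manual backward pass through a linear layer and ReLU can look like, here is a small dependency-free sketch. The function names, the toy weights, and the choice of loss (the sum of the activations) are assumptions made for illustration, not the page's reference code.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def forward(W: Matrix, b: Vector, x: Vector) -> Tuple[Vector, Vector]:
    """Linear layer followed by ReLU; returns pre-activations z and outputs a."""
    z = [sum(w * x_j for w, x_j in zip(row, x)) + b_i for row, b_i in zip(W, b)]
    a = [max(0.0, z_i) for z_i in z]
    return z, a


def backward(
    W: Matrix, x: Vector, z: Vector, grad_a: Vector
) -> Tuple[Matrix, Vector, Vector]:
    """Chain rule in reverse: given dL/da, return dL/dW, dL/db, dL/dx."""
    # ReLU passes the gradient only where the pre-activation was positive
    grad_z = [g if z_i > 0 else 0.0 for g, z_i in zip(grad_a, z)]
    # dL/dW is the outer product of grad_z and x
    grad_W = [[g * x_j for x_j in x] for g in grad_z]
    # dz/db is the identity, so dL/db equals grad_z
    grad_b = list(grad_z)
    # dL/dx is W^T applied to grad_z
    grad_x = [
        sum(W[i][j] * grad_z[i] for i in range(len(W)))
        for j in range(len(x))
    ]
    return grad_W, grad_b, grad_x


W = [[1.0, -2.0], [0.5, 1.0]]
b = [0.0, -5.0]
x = [3.0, 1.0]
z, a = forward(W, b, x)  # z = [1.0, -2.5], a = [1.0, 0.0]
# Loss = sum(a), so dL/da is all ones
grad_W, grad_b, grad_x = backward(W, x, z, grad_a=[1.0, 1.0])
```

The second unit has a negative pre-activation, so ReLU zeroes its gradient and its row of `grad_W` is all zeros. When many units in a deep stack behave this way, little gradient reaches the early layers.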

Deep Learning

Batch Normalization

What Batch Norm normalizes and why, the critical train-vs-inference distinction, BN vs. Layer Norm, with the math and a from-scratch PyTorch implementation.

8 min read · 6 stages

Optimization

Gradient Descent (SGD, Momentum, Adam)

SGD, momentum, and Adam explained — the update rules, why mini-batching wins, Adam's bias correction, and when plain SGD generalizes better — with from-scratch implementations.

8 min read · 6 stages
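As a rough illustration of Adam's bias correction, the sketch below performs one update on a single scalar parameter. The function name `adam_step` and the default hyperparameters (the commonly used 0.9, 0.999, 1e-8) are assumptions for this example.

```python
import math
from typing import Tuple


def adam_step(
    param: float, grad: float, m: float, v: float, t: int,
    lr: float = 1e-3, beta1: float = 0.9, beta2: float = 0.999,
    eps: float = 1e-8,
) -> Tuple[float, float, float]:
    """One Adam update for a scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    # m and v start at 0, so early estimates are biased toward 0;
    # dividing by (1 - beta^t) undoes that bias.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# With bias correction, the first step has magnitude close to lr
# regardless of the gradient's scale.
p, m, v = adam_step(param=1.0, grad=250.0, m=0.0, v=0.0, t=1)
```

Without the two correction terms, the first step here would be roughly `lr * 25 / sqrt(62.5)`, about three times larger than `lr`, because `v` is shrunk toward zero more strongly than `m`.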

LLMs

KV Cache

How the KV cache makes autoregressive LLM decoding affordable — what it stores and why reuse is valid, the memory cost, why decoding is memory-bandwidth-bound, and how MQA/GQA shrink it — with code.

8 min read · 6 stages
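The memory cost mentioned above can be estimated with simple arithmetic. The sketch below uses a hypothetical model shape (32 layers, head dimension 128, a 4096-token context, a 2-byte cache dtype) chosen only for illustration; `kv_cache_bytes` is not a library function.

```python
def kv_cache_bytes(
    n_layers: int, n_kv_heads: int, head_dim: int,
    seq_len: int, batch_size: int = 1, bytes_per_elem: int = 2,
) -> int:
    """Bytes needed to cache keys and values for every layer and position."""
    # Factor of 2: one tensor for keys and one for values.
    return (
        2 * n_layers * n_kv_heads * head_dim
        * seq_len * batch_size * bytes_per_elem
    )


# Multi-head attention: every one of 32 query heads has its own K/V head.
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
# Grouped-query attention: 8 K/V heads, each shared by a group of query heads.
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=4096)
# Multi-query attention: a single K/V head shared by all query heads.
mqa = kv_cache_bytes(n_layers=32, n_kv_heads=1, head_dim=128, seq_len=4096)

print(mha // 2**20, gqa // 2**20, mqa // 2**20)  # MiB → 2048 512 64
```

The cache scales with the number of K/V heads, not the number of query heads, which is why MQA and GQA cut it by the sharing factor.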

Browse all 17 concepts

Context

Personal enough to test your actual understanding.

Resume and background context make the review sharper. Interview prep uses that context for behavioral stories and grounded follow-ups, while concept review uses it to connect ideas to work you have actually done.

Resume context (optional)

Upload a resume to ground sessions and generate STAR stories — fully skippable, deletable anytime

Profile injection

Background and focus areas shape each review session

Grounded retrieval

Follow-ups can reference your actual projects and tradeoffs

Editable anytime

Update profile context before the next review

Memory

Continuity, by design.

Session history

Every session is saved so you can reload the full thread, continue from where you stopped, or revisit a topic after a few days.

Revision cards

Generate compact review cards with the core concept, key equations or tradeoffs, implementation details, and the highest-value follow-up questions.

Mastery catalog

71 topics, 5 domains.

Classical models to LLM internals to RL theory. Built for deep refreshers, derivations, and interview-grade recall.

Classical ML

Linear Regression
Logistic Regression
Support Vector Machines
K-Nearest Neighbours
K-Means Clustering
Naive Bayes
Decision Trees
Random Forests
Gradient Boosting / XGBoost
Gradient Descent (SGD, mini-batch, momentum, Adam)
Bias-Variance Tradeoff
Regularization (L1, L2, Dropout)
Cross-Validation
Evaluation Metrics (precision, recall, F1, AUC-ROC)
Feature Engineering
Dimensionality Reduction (PCA, t-SNE)

Deep Learning

Foundations

Backpropagation and Computational Graphs
Activation Functions
Loss Functions
Optimizers (Adam, AdamW, LR schedules)
Batch Normalization and Layer Normalization
Weight Initialization

Architectures

Feedforward Networks
CNNs
RNNs, LSTMs, GRUs
Transformers (encoder, decoder, encoder-decoder)
Attention Mechanisms (scaled dot-product, multi-head, cross-attention, FlashAttention)
Positional Encoding
KV Cache

Language and LLMs

Tokenization and BPE
Embeddings (word2vec, learned)
Pretraining vs Fine-tuning
LoRA and PEFT
RAG (Retrieval-Augmented Generation)
Tool Calling
Temperature Sampling, Top-k, Top-p, Beam Search
RLHF (reward modeling, PPO fine-tuning loop, preference data)
Inference Optimization

Multi-task and Transfer Learning

Transfer Learning
Multi-task Learning
Negative Transfer
Task Sampling Strategies

Reinforcement Learning

Foundations

Markov Decision Processes
Bellman Equations
Value Functions
Policy-based vs Value-based vs Model-based RL

Tabular Methods

Dynamic Programming
Monte Carlo Control
Temporal Difference Learning
Q-Learning
SARSA

Deep RL

DQN
Policy Gradients (REINFORCE)
Actor-Critic Methods
PPO (Proximal Policy Optimization)
DPO (Direct Preference Optimization)

Training Engineering

Gradient Accumulation
Gradient Checkpointing
Mixed Precision Training (fp32, fp16, bfloat16)
CUDA Setup and Device Management
Distributed Training and DDP
Training Config and YAML Structure
Multi-Seed Evaluation and Reproducibility
Experiment Tracking with MLflow

Systems and MLOps

Model Serving and Deployment
Embeddings at Scale
Vector Databases
Batch vs Online Inference
Latency vs Throughput Tradeoffs
CI/CD for ML
Experiment Tracking

Pricing

Pricing for serious review.

Start free. Upgrade when the monthly cap gets in the way of a focused review habit.

Free

$0/month

A generous monthly tier for trying the mastery loop or reviewing a few priority topics.

20 sessions per month
All 157+ ML and SWE topics
Full 6-stage review format
Sessions adapt to your background (optional resume)
Session history + reload
Revision cards

No card required.

Pro

7-day free trial

$9/month

For daily review across interview prep, research refreshers, systems review, and deep concept repair.

Unlimited review sessions
Everything in Free
New topics ship to Pro first — request them directly

No charge for 7 days. Cancel anytime.

See full pricing and FAQ →

Begin

Start reviewing at depth.

Pick a topic, move through the stages, and leave with understanding you can retrieve under pressure.

Start a session