Study Guide Hub-Spoke Spec
Status: 2026-05-09, drafted while building the AP Calc AB end-to-end sample. All 24 subjects (12 AP + 7 IB + 5 CIE) follow this once the spec is signed off.
1. Information architecture (3-level nav, no deeper)
/guides/<track>/ — Track index (AP / IB / A-Level)
/guides/<track>/<subject_code>/ — Subject "outline index" page (EXPANDABLE TREE)
/study/<slug>/guide/ — Spoke page (one sub-topic per spoke)
The subject outline page IS the unit-level navigation. Units do NOT have their own URL. Each unit is rendered as a collapsible section on the subject page; clicking a sub-topic jumps to its spoke page.
[Subject page layout]
AP Calculus AB
├─ Unit 1: Limits and Continuity [▼]
│ Overview → /study/ap-calculus-ab-u1-overview/
│ 1.1 Defining limits → /study/ap-calculus-ab-u1-defining-limits/
│ 1.8 Squeeze theorem → /study/ap-calculus-ab-u1-squeeze-theorem/
│ ... (16 sub-topic links, expand on click)
├─ Unit 2: Differentiation Definition [▶]
...
2. Two spoke types
| Type | Per subject | Length | Purpose |
|---|---|---|---|
| Unit overview spoke | 8-10 (one per unit) | 1500-2500 words | Concept map of the unit + how sub-topics fit |
| Sub-topic spoke | 60-100 (one per CED/syllabus topic) | 1500-2500 words | Definition + worked example (real past paper) + pitfalls + practice |
Both types are first-class .md files under src/study-guides/. They sit
at the same URL depth (/study/<slug>/). The "hub" is the subject page,
not a separate URL layer.
3. Slug naming
<subject_code>-u<unit_number>-<topic_kebab> for sub-topic spokes.
<subject_code>-u<unit_number>-overview for unit overview spokes.
| Example slug | What it is |
|---|---|
| `ap-calculus-ab-u1-overview` | Unit 1 overview |
| `ap-calculus-ab-u1-squeeze-theorem` | Unit 1 sub-topic spoke |
| `cie-9709-p1-overview` | (9709 calls "P1" not "U1") |
| `cie-9709-p1-binomial-expansion-positive-n` | sub-topic spoke |
| `ibo-physics-hl-1-overview` | Unit 1 overview |
| `ibo-physics-hl-1-projectile-motion` | sub-topic spoke |
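The naming rule above can be sketched as a small helper. This is an illustrative sketch, not the project's actual code: `kebab` and `build_spoke_slug` are hypothetical names, and the unit token (`u1`, `p1`, `1`) is passed in as-is because the examples in the table use different tokens per exam board.

```python
import re
from typing import Optional


def kebab(text: str) -> str:
    """Lowercase the text and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def build_spoke_slug(subject_slug: str, unit_token: str,
                     topic: Optional[str] = None) -> str:
    """Return a spoke slug; leave `topic` out for a unit overview spoke."""
    tail = kebab(topic) if topic else "overview"
    return f"{subject_slug}-{unit_token}-{tail}"


print(build_spoke_slug("ap-calculus-ab", "u1"))
# ap-calculus-ab-u1-overview
print(build_spoke_slug("ap-calculus-ab", "u1", "Squeeze theorem"))
# ap-calculus-ab-u1-squeeze-theorem
print(build_spoke_slug("cie-9709", "p1", "Binomial expansion (positive n)"))
# cie-9709-p1-binomial-expansion-positive-n
```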
Migration: the 6 existing AP-Calc-AB .md files (e.g. ap-calculus-ab-derivatives.md)
become unit-overview spokes. Rename each file and redirect the old slug to the
new slug so existing search rankings carry over.
4. KP schema (knowledge_points table)
code e.g. AP_CALC_AB_U1_T08 (unit-level: AP_CALC_AB_U1)
exam_board 'AP' / 'IB' / 'CIE'
subject 'MATH' etc
subject_code 'AP_CALC_AB' / '9709' / 'IB_PHYS' ...
level '12' / 'A_LEVEL' / 'HL' / 'SL' ...
unit 'U1' / 'P1' / etc
parent_code the unit-level KP code (NULL for unit-level rows)
title_en 'Squeeze theorem'
title_zh '夹逼定理'
path ['Limits and Continuity', '1.8 Squeeze theorem']
aliases ['squeeze theorem', '夹逼定理', 'sandwich theorem']
frequency_tier 1-5 (rough exam frequency, populate later from tagged data)
target_question_count integer (how many practice Qs we want surfaced; later)
Two tiers of KP rows per subject:
- 8-10 unit rows (parent_code = NULL)
- 60-100 sub-topic rows (parent_code = the unit row's code)
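As a rough illustration of the two-tier rule, the sketch below models a KP row as a dataclass and checks that every sub-topic row points at a unit-level row. The field names follow the schema above; the Python types, the `tier_errors` helper, and the `U9` sample row are assumptions made for the example, not part of the spec.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KnowledgePoint:
    code: str
    exam_board: str
    subject: str
    subject_code: str
    level: str
    unit: str
    title_en: str
    parent_code: Optional[str] = None   # None marks a unit-level row
    title_zh: str = ""
    path: List[str] = field(default_factory=list)
    aliases: List[str] = field(default_factory=list)
    frequency_tier: Optional[int] = None          # populated later
    target_question_count: Optional[int] = None   # populated later


def tier_errors(rows: List[KnowledgePoint]) -> List[str]:
    """List sub-topic rows whose parent_code is not a unit-level row."""
    unit_codes = {r.code for r in rows if r.parent_code is None}
    return [
        f"{r.code}: parent_code {r.parent_code!r} is not a unit-level row"
        for r in rows
        if r.parent_code is not None and r.parent_code not in unit_codes
    ]


unit = KnowledgePoint("AP_CALC_AB_U1", "AP", "MATH", "AP_CALC_AB", "12", "U1",
                      "Limits and Continuity")
topic = KnowledgePoint("AP_CALC_AB_U1_T08", "AP", "MATH", "AP_CALC_AB", "12",
                       "U1", "Squeeze theorem", parent_code="AP_CALC_AB_U1",
                       title_zh="夹逼定理",
                       path=["Limits and Continuity", "1.8 Squeeze theorem"])
# Hypothetical bad row: its parent unit row was never seeded.
stray = KnowledgePoint("AP_CALC_AB_U9_T01", "AP", "MATH", "AP_CALC_AB", "12",
                       "U9", "Stray topic", parent_code="AP_CALC_AB_U9")

print(tier_errors([unit, topic]))   # []
print(tier_errors([unit, topic, stray]))
```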
5. .md frontmatter template
---
title: "<H1 + ' | ' + Subject>" # SEO: keyword first
description: "<140-160 char SEO description>"
slug: "<spoke slug>"
spoke_type: "unit_overview" | "sub_topic"
exam_board: "College Board" | "International Baccalaureate" | "CIE"
subject_code: "<KP subject_code>" # joins to knowledge_points.subject_code
unit: "U1" | "P1" | etc
parent_unit_overview_slug: "ap-calculus-ab-u1-overview" # for sub-topic spokes
kp_codes: ["AP_CALC_AB_U1_T08"] # one row per spoke usually; multi if cross-cuts
syllabus_reference: "AP Calculus AB CED Topic 1.8"
keywords: [<list>]
related_real_past_papers: ["AP_CALC_AB_2018_q3", ...] # auto-populated by tag pipeline
last_updated: "<YYYY-MM-DD>"
reading_time_min: <int>
---
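One way to enforce the template is a validator that runs on the parsed frontmatter. The sketch below assumes the frontmatter has already been loaded into a plain dict (the YAML parsing step is left out), and it treats `related_real_past_papers` as optional because the tag pipeline fills it in. The function name and error wording are hypothetical.

```python
from datetime import date
from typing import Any, Dict, List

REQUIRED_KEYS = (
    "title", "description", "slug", "spoke_type", "exam_board",
    "subject_code", "unit", "kp_codes", "syllabus_reference",
    "keywords", "last_updated", "reading_time_min",
)
SPOKE_TYPES = {"unit_overview", "sub_topic"}
EXAM_BOARDS = {"College Board", "International Baccalaureate", "CIE"}


def frontmatter_errors(meta: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the dict passes."""
    errors = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in meta]
    if meta.get("spoke_type") not in SPOKE_TYPES:
        errors.append("spoke_type must be 'unit_overview' or 'sub_topic'")
    if meta.get("exam_board") not in EXAM_BOARDS:
        errors.append("exam_board is not one of the template values")
    if (meta.get("spoke_type") == "sub_topic"
            and not meta.get("parent_unit_overview_slug")):
        errors.append("sub_topic spokes need parent_unit_overview_slug")
    if not 140 <= len(str(meta.get("description", ""))) <= 160:
        errors.append("description should be 140-160 characters")
    if not meta.get("kp_codes"):
        errors.append("kp_codes must list at least one KP code")
    try:
        date.fromisoformat(str(meta.get("last_updated")))
    except ValueError:
        errors.append("last_updated must be YYYY-MM-DD")
    return errors


example = {
    "title": "Squeeze theorem | AP Calculus AB",
    "description": "x" * 150,  # placeholder text of a valid length
    "slug": "ap-calculus-ab-u1-squeeze-theorem",
    "spoke_type": "sub_topic",
    "exam_board": "College Board",
    "subject_code": "AP_CALC_AB",
    "unit": "U1",
    "parent_unit_overview_slug": "ap-calculus-ab-u1-overview",
    "kp_codes": ["AP_CALC_AB_U1_T08"],
    "syllabus_reference": "AP Calculus AB CED Topic 1.8",
    "keywords": ["squeeze theorem"],
    "last_updated": "2026-05-09",
    "reading_time_min": 8,  # illustrative value
}
print(frontmatter_errors(example))  # []
```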
6. Spoke body skeleton (LLM-drafted, human-reviewed)
# <Topic name> — <Subject> Study Guide
> **For**: <subject> candidates.
> **Covers**: <one-line scope>
> **You should already know**: <prereq sub-topics, link to them>
## Why this matters
<150 words: where this fits in the exam, common question patterns>
## Definition / key idea
<formal statement + intuition; KaTeX OK>
## Worked example
<EXACTLY ONE past-paper question, lifted from `questions` table by KP match.
Include paper_id + year. Solve in 4-7 steps with reasoning shown.>
## Common pitfalls
<2-4 specific traps students fall into; ground each in a real wrong answer
pattern, not generic platitudes>
## Quick check
<2-3 short prompts the student answers themselves; reveals concept holes>
## See also
<links to related spoke slugs in the same/adjacent units>
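A human reviewer could be helped by a quick structural check on each drafted body. The sketch below assumes the six H2 headings must appear exactly as written in the skeleton and in that order, and it borrows the 1500-2500 word range from section 2; both helper names are hypothetical.

```python
import re
from typing import List

REQUIRED_SECTIONS = [
    "Why this matters",
    "Definition / key idea",
    "Worked example",
    "Common pitfalls",
    "Quick check",
    "See also",
]


def skeleton_errors(body: str) -> List[str]:
    """Compare the H2 headings in a spoke body with the skeleton order."""
    found = re.findall(r"^## +(.+?)\s*$", body, flags=re.MULTILINE)
    if found == REQUIRED_SECTIONS:
        return []
    errors = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in found]
    errors += [f"unexpected section: {s}" for s in found if s not in REQUIRED_SECTIONS]
    if not errors:
        errors.append("sections are out of order or duplicated")
    return errors


def within_length(body: str, low: int = 1500, high: int = 2500) -> bool:
    """Whitespace-delimited word count, a rough proxy for the length target."""
    return low <= len(body.split()) <= high


draft = "\n".join(f"## {name}\nplaceholder text" for name in REQUIRED_SECTIONS)
print(skeleton_errors(draft))   # []
print(within_length(draft))     # False: the placeholder draft is far too short
```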
7. Build pipeline (where each piece is generated)
| Artifact | Generated by | Frequency |
|---|---|---|
| `knowledge_points` rows | `kp_seed_<subject>.py` (one script per subject) | once per subject + on syllabus update |
| `<spoke>.md` draft | `llm_draft_spoke.py` consumes KP + tagged questions | bulk run per subject |
| `<spoke>.zh.md` | `llm_translate_spoke.py` (mirror of EN with Chinese terms standardised) | after EN audit |
| Subject outline page | extend `build-study-guides.mjs`: read `knowledge_points` DB → render unit accordion | every build |
| Real-paper worked-example matcher | `attach_worked_examples.py`: for each spoke pull top-1 question by `kp_codes` overlap | run after kp_tagging |
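The "top-1 question by kp_codes overlap" step in the last row could look roughly like the sketch below. It assumes each tagged question arrives as a dict with `id` and `kp_codes` keys, which is a guess at the shape of the `questions` table, and it breaks ties by keeping the first question seen. The question ids other than the one quoted in the frontmatter template are made up.

```python
from typing import Any, Dict, Iterable, Optional, Sequence


def pick_worked_example(spoke_kp_codes: Sequence[str],
                        questions: Iterable[Dict[str, Any]]
                        ) -> Optional[Dict[str, Any]]:
    """Return the question sharing the most KP codes with the spoke.

    Returns None when nothing overlaps, so the caller can flag the spoke
    rather than fall back to an invented example.
    """
    wanted = set(spoke_kp_codes)
    best, best_overlap = None, 0
    for question in questions:
        overlap = len(wanted & set(question["kp_codes"]))
        if overlap > best_overlap:   # strict '>' keeps the earliest on ties
            best, best_overlap = question, overlap
    return best


tagged = [
    {"id": "Q_HYPOTHETICAL_1", "kp_codes": ["AP_CALC_AB_U2_T01"]},
    {"id": "AP_CALC_AB_2018_q3", "kp_codes": ["AP_CALC_AB_U1_T08"]},
]
match = pick_worked_example(["AP_CALC_AB_U1_T08"], tagged)
print(match["id"] if match else None)   # AP_CALC_AB_2018_q3
```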
8. Quality gates before Jimmy ships a spoke
- Math correctness: `verify_spoke_math.py` parses every KaTeX block, checks formula identities and worked-example arithmetic via SymPy where possible
- CED/syllabus coverage: each unit's spokes cover every official sub-topic (no orphan, no duplicate)
- Real worked example: `worked_example` block must reference an actual `papers.id`, not LLM-fabricated
- Pitfalls grounded: each pitfall must cite either a common wrong-answer pattern in `questions.mark_scheme_guidance` or a known mis-step from the `messages` table
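The coverage gate lends itself to a set comparison. The sketch below is one possible reading of "no orphan, no duplicate": it reports official sub-topic codes that no spoke claims, codes claimed by more than one spoke, and codes a spoke claims that are not in the official list. The function name, the report keys, and the `T09`/`T99` sample codes are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def coverage_report(official_codes: Iterable[str],
                    spoke_kp_codes: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Compare official sub-topic KP codes with the codes spokes declare.

    `spoke_kp_codes` maps a spoke slug to its frontmatter `kp_codes` list.
    """
    counts = Counter(code for codes in spoke_kp_codes.values() for code in codes)
    official = set(official_codes)
    covered = set(counts)
    return {
        "uncovered": sorted(official - covered),
        "duplicated": sorted(code for code, n in counts.items() if n > 1),
        "unknown": sorted(covered - official),
    }


report = coverage_report(
    official_codes=["AP_CALC_AB_U1_T08", "AP_CALC_AB_U1_T09"],
    spoke_kp_codes={
        "ap-calculus-ab-u1-squeeze-theorem": ["AP_CALC_AB_U1_T08"],
        "ap-calculus-ab-u1-squeeze-theorem-copy": ["AP_CALC_AB_U1_T08"],
        "ap-calculus-ab-u1-stray": ["AP_CALC_AB_U1_T99"],
    },
)
print(report)
```

A unit would pass the gate when all three lists come back empty.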
9. Cross-subject consistency
All subjects use the same template, same spoke length range, same frontmatter
schema. Subject-specific deviations live ONLY in the kp_seed_<subject>.py
script (where the syllabus structure is encoded).