feat: guest play — allow singleplayer quiz without auth

- Add optionalAuth middleware: attaches session when present, never blocks (guests pass through)
- Make game endpoints (start/answer) accept optional auth
- GameSessionStore.userId: string → string | null
- Rate limiter falls back to IP for unauthenticated users
- Frontend: remove /play route guard, show 'Create account' CTA on score screen for guests
- Add tests for guest session creation, answer submission, and cross-user session isolation
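
The two rules this commit relies on, session ownership with a nullable `userId` and the rate-limit key fallback, can be sketched in isolation. This is a minimal illustration, not the code in the diff: `ownsSession` and `rateLimitKey` are hypothetical helper names, the type is a simplified stand-in for `GameSessionData`, and the key fallback leaves out the `ipKeyGenerator` normalization that the real limiters apply.

```typescript
// Simplified stand-in for GameSessionData as changed in this commit.
type GameSessionData = {
  answers: Map<string, { correctOptionId: number }>;
  userId: string | null; // null marks a guest session
};

// Hypothetical helper: the ownership rule evaluateAnswer applies,
// written as a pure function. Strict equality means null only matches null.
function ownsSession(
  session: GameSessionData,
  requesterId: string | null,
): boolean {
  return session.userId === requesterId;
}

// Hypothetical helper: user ID when signed in, client IP otherwise,
// and a fixed placeholder when neither is available.
function rateLimitKey(userId: string | null, ip: string | undefined): string {
  return userId ?? ip ?? "unknown";
}

const guestSession: GameSessionData = { answers: new Map(), userId: null };
const userSession: GameSessionData = { answers: new Map(), userId: "user-1" };

console.log(ownsSession(guestSession, null)); // true: guest answers a guest session
console.log(ownsSession(userSession, null)); // false: guest vs. authenticated session
console.log(ownsSession(guestSession, "user-1")); // false: user vs. guest session
console.log(rateLimitKey(null, "203.0.113.7")); // "203.0.113.7"
console.log(rateLimitKey("user-1", "203.0.113.7")); // "user-1"
```

One consequence of the strict-equality rule is that all guest sessions share `userId: null`, so this check alone does not tell two guests apart. Isolation between guests then depends on the session ID being hard to guess.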
wip
2026-05-31 21:28:08 +02:00 · 2026-05-30 03:47:59 +02:00 · 2026-05-30 03:47:52 +02:00 · 2026-05-25 01:04:49 +02:00 · 2026-05-16 01:59:43 +02:00 · 2026-05-15 23:12:55 +02:00
90 changed files with 7767 additions and 1056915 deletions
--- a/.gitignore
+++ b/.gitignore
@ -12,6 +12,11 @@ __pycache__/
 data-pipeline/archive/
 data-pipeline/stage-1-extract/output/
 data-pipeline/stage-1-extract/sources/
 data-pipeline/stage-2-annotate/output/
 data-pipeline/stage-3-enrich/output/
 data-pipeline/stage-4-merge/output/
 data-pipeline/db/pipeline.db
 data-pipeline/reports/
 data-pipeline/.env
 .aider*
--- a/README.md
+++ b/README.md
@ -1,170 +1,107 @@
-# lila
+# Lila
 **Learn words. Beat friends.**
-lila is a vocabulary trainer built around a Duolingo-style quiz loop: a word appears in one language, you pick the correct translation from four choices. It supports singleplayer and real-time multiplayer, and is designed to work across multiple language pairs without schema changes.
+Lila is a vocabulary trainer that turns the media you love into language practice. Learn vocabulary from a Shakira song, the first chapter of _Harry Potter_, or an episode of _Breaking Bad_ — then challenge your friends in real-time multiplayer quizzes.
 Live at [lilastudy.com](https://lilastudy.com).
 ---
 ## Quickstart
 ```bash
 # 1. Clone and install
 git clone git@git.lilastudy.com:forgejo-lila/lila.git
 cd lila
 pnpm install
 # 2. Environment
 cp .env.example .env
 # 3. Start local services (PostgreSQL, Valkey)
 docker compose up -d
 # 4. Build shared packages
 pnpm --filter @lila/shared build
 pnpm --filter @lila/db build
 # 5. Run migrations and seed data
 pnpm --filter @lila/db migrate
 pnpm --filter @lila/db seed
 # 6. Start dev servers
 pnpm dev
 ```
 API: `http://localhost:3000` · Web: `http://localhost:5173`
 See [DEPLOYMENT.md](DEPLOYMENT.md) for production infrastructure details.
 ---
 ## Documentation Index
 | Document                                     | What you'll find there                                                  |
 | -------------------------------------------- | ----------------------------------------------------------------------- |
 | [STATUS.md](STATUS.md)                       | Current state — what's working, what's blocked, what we're building now |
 | [BACKLOG.md](BACKLOG.md)                     | Prioritized task list: now / next / later / changelog                   |
 | [ARCHITECTURE.md](ARCHITECTURE.md)           | Monorepo structure, layered architecture, data flow                     |
 | [DECISIONS.md](DECISIONS.md)                 | Why we chose X over Y — tool choices, schema design, trade-offs         |
 | [DATA_PIPELINE.md](DATA_PIPELINE.md)         | Kaikki → CEFR enrichment → production PostgreSQL                        |
 | [MODEL_STRATEGY.md](MODEL_STRATEGY.md)       | LLM voter architecture for sense-disambiguated CEFR assignment          |
 | [LLM_SETUP.md](LLM_SETUP.md)                 | Local and cloud LLM provider configuration                              |
 | [DEPLOYMENT.md](DEPLOYMENT.md)               | Hetzner VPS, Caddy, Docker Compose, CI/CD, backups                      |
 | [design/GAME_MODES.md](design/GAME_MODES.md) | Planned multiplayer and singleplayer game modes                         |
 ---
 ## Stack
 | Layer      | Technology                                                    |
-| ------------ | ---------------------------------- |
+| ---------- | ------------------------------------------------------------- |
 | Monorepo   | pnpm workspaces                                               |
-| Frontend     | React 18, Vite, TypeScript         |
+| Frontend   | React 18, Vite, TanStack Router, TanStack Query, Tailwind CSS |
-| Routing      | TanStack Router                    |
+| Backend    | Node.js, Express, TypeScript, WebSockets (`ws`)               |
-| Server state | TanStack Query                     |
-| Styling      | Tailwind CSS                       |
-| Backend      | Node.js, Express, TypeScript       |
 | Database   | PostgreSQL + Drizzle ORM                                      |
-| Validation   | Zod (shared schemas)               |
 | Auth       | Better Auth (Google + GitHub)                                 |
-| Realtime     | WebSockets (`ws` library)          |
+| Validation | Zod (shared between frontend and backend)                     |
 | Testing    | Vitest, supertest                                             |
 | Deployment | Docker Compose, Caddy, Hetzner VPS                            |
 | CI/CD      | Forgejo Actions                                               |
 ---
 ## Current Status
 - ✅ Singleplayer quiz (5 language pairs: en↔it/de/es/fr)
 - ✅ Multiplayer lobby + real-time game (2–4 players, simultaneous answers, 15s timer)
 - ✅ Auth (Google + GitHub)
 - ✅ Live deployment with CI/CD
 - 🔄 Migrating vocabulary data from OpenWordNet to **Kaikki** (sense-disambiguated translations)
 - 🔄 Phase 7 hardening (rate limiting, error boundaries, monitoring)
 See [STATUS.md](STATUS.md) for the full picture.
 ---
 ## Repository Structure
-```tree
+```
 lila/
 ├── apps/
 │   ├── api/           — Express backend
 │   └── web/           — React frontend
 ├── packages/
-│   ├── shared/     — Zod schemas and types shared between frontend and backend
+│   ├── shared/        — Zod schemas + constants (API/web contract)
-│   └── db/         — Drizzle schema, migrations, models, seeding scripts
+│   └── db/            — Drizzle schema, migrations, models, seeding
-├── scripts/        — Python scripts for vocabulary data extraction
+├── data-pipeline/     — Kaikki extraction → enrichment → PostgreSQL sync
-└── documentation/  — Project docs
+├── documentation/   — Project docs (this directory)
-```
+└── Caddyfile, docker-compose.yml, etc.
 `packages/shared` is the contract between frontend and backend. All request/response shapes are defined there as Zod schemas and never duplicated.
 ---
 ## Architecture
 Requests flow through a strict layered architecture:
 ```text
 HTTP Request → Router → Controller → Service → Model → Database
 ```
 Each layer only talks to the layer directly below it. Controllers handle HTTP only. Services contain business logic only. Models contain database queries only. All database code lives in `packages/db` — the API never imports Drizzle directly for queries.
 ---
 ## Data Model
 Words are modelled as language-neutral concepts (`terms`) with per-language `translations`. Adding a new language requires no schema changes — only new rows. CEFR levels (A1–C2) are stored per translation for difficulty filtering.
 Core tables: `terms`, `translations`, `term_glosses`, `decks`, `deck_terms`
 Auth tables (managed by Better Auth): `user`, `session`, `account`, `verification`
 Vocabulary data is sourced from WordNet and the Open Multilingual Wordnet (OMW).
 ---
 ## API
 ```text
 POST /api/v1/game/start     — start a quiz session (auth required)
 POST /api/v1/game/answer    — submit an answer (auth required)
 GET  /api/v1/health         — health check (public)
 ALL  /api/auth/*            — Better Auth handlers (public)
 ```
 The correct answer is never sent to the frontend — all evaluation happens server-side.
 ---
 ## Multiplayer
 Rooms are created via REST, then managed over WebSockets. Messages are typed via a Zod discriminated union. The host starts the game; all players answer simultaneously with a 15-second server-enforced timer. Room state is held in-memory (Valkey deferred).
 ---
 ## Infrastructure
 ```tree
 Internet → Caddy (HTTPS)
            ├── lilastudy.com      → web (nginx, static files)
            ├── api.lilastudy.com  → api (Express)
            └── git.lilastudy.com  → Forgejo (git + registry)
 ```
 Deployed on a Hetzner VPS (Debian 13, ARM64). Images are built cross-compiled for ARM64 and pushed to the Forgejo container registry. CI/CD runs via Forgejo Actions on push to `main`. Daily database backups are synced to the dev laptop via rsync.
 See `documentation/deployment.md` for the full infrastructure setup.
 ---
 ## Local Development
 ### Prerequisites
 - Node.js 20+
 - pnpm 9+
 - Docker + Docker Compose
 ### Setup
 ```bash
 # Install dependencies
 pnpm install
 # Create your local env file (used by docker compose + the API)
 cp .env.example .env
 # Start local services (PostgreSQL, Valkey)
 docker compose up -d
 # Build shared packages
 pnpm --filter @lila/shared build
 pnpm --filter @lila/db build
 # Run migrations and seed data
 pnpm --filter @lila/db migrate
 pnpm --filter @lila/db seed
 # Start dev servers
 pnpm dev
 ```
 The API runs on `http://localhost:3000` and the frontend on `http://localhost:5173`.
 ---
 ## Testing
 ```bash
 # All tests
 pnpm test
 # API only
 pnpm --filter api test
 # Frontend only
 pnpm --filter web test
 ```
 ---
-## Roadmap
-| Phase | Description                                                            | Status |
-| ----- | ---------------------------------------------------------------------- | ------ |
-| 0     | Foundation — monorepo, tooling, dev environment                        | ✅     |
-| 1     | Vocabulary data pipeline + REST API                                    | ✅     |
-| 2     | Singleplayer quiz UI                                                   | ✅     |
-| 3     | Auth (Google + GitHub)                                                 | ✅     |
-| 4     | Multiplayer lobby (WebSockets)                                         | ✅     |
-| 5     | Multiplayer game (real-time, server timer)                             | ✅     |
-| 6     | Production deployment + CI/CD                                          | ✅     |
-| 7     | Hardening (rate limiting, error boundaries, monitoring, accessibility) | 🔄     |
-See `documentation/roadmap.md` for task-level detail.
+## License
+TBD
--- a/apps/api/src/controllers/gameController.test.ts
+++ b/apps/api/src/controllers/gameController.test.ts
@ -1,6 +1,7 @@
 import { describe, it, expect, vi, beforeEach } from "vitest";
 import request from "supertest";
 import type { GameSession, AnswerResult } from "@lila/shared";
+import { auth } from "../lib/auth.js";
 type SuccessResponse<T> = { success: true; data: T };
 type ErrorResponse = { success: false; error: string };
@ -48,6 +49,36 @@ vi.mock("better-auth/node", () => ({
  toNodeHandler: vi.fn().mockReturnValue(vi.fn()),
 }));
 const mockGetSession = vi.mocked(auth.api.getSession);
 function setAuthenticatedUser(userId: string) {
  mockGetSession.mockResolvedValue({
    session: {
      id: "session-1",
      userId,
      token: "fake-token",
      expiresAt: new Date(Date.now() + 1000 * 60 * 60),
      createdAt: new Date(),
      updatedAt: new Date(),
      ipAddress: null,
      userAgent: null,
    },
    user: {
      id: userId,
      name: "Test User",
      email: "test@test.com",
      emailVerified: false,
      image: null,
      createdAt: new Date(),
      updatedAt: new Date(),
    },
  });
 }
 function setGuestUser() {
  mockGetSession.mockResolvedValue(null);
 }
 import { getGameTerms, getDistractors } from "@lila/db";
 import { createApp } from "../app.js";
@ -64,13 +95,19 @@ const validBody = {
 };
 const fakeTerms = [
-  { termId: "t1", sourceText: "dog", targetText: "cane", sourceGloss: null },
-  { termId: "t2", sourceText: "cat", targetText: "gatto", sourceGloss: null },
-  { termId: "t3", sourceText: "house", targetText: "casa", sourceGloss: null },
+  { entryId: "t1", sourceText: "dog", targetText: "cane", sourceGloss: null },
+  { entryId: "t2", sourceText: "cat", targetText: "gatto", sourceGloss: null },
+  {
+    entryId: "t3",
+    sourceText: "house",
+    targetText: "casa",
+    sourceGloss: "a building for living in",
+  },
 ];
 beforeEach(() => {
  vi.clearAllMocks();
+  setAuthenticatedUser("user-1"); // default: authenticated
  mockGetGameTerms.mockResolvedValue(fakeTerms);
  mockGetDistractors.mockResolvedValue(["wrong1", "wrong2", "wrong3"]);
 });
@ -216,3 +253,87 @@ describe("POST /api/v1/game/answer", () => {
    expect(body.success).toBe(false);
  });
 });
 describe("POST /api/v1/game/start — guest", () => {
  beforeEach(() => {
    setGuestUser();
  });
  it("returns 200 for a guest with valid game settings", async () => {
    const res = await request(app).post("/api/v1/game/start").send(validBody);
    const body = res.body as GameStartResponse;
    expect(res.status).toBe(200);
    expect(body.success).toBe(true);
    expect(body.data.sessionId).toBeDefined();
    expect(body.data.questions).toHaveLength(3);
  });
  it("creates a session without userId for guests", async () => {
    const startRes = await request(app)
      .post("/api/v1/game/start")
      .send(validBody);
    const startBody = startRes.body as GameStartResponse;
    const { sessionId, questions } = startBody.data;
    // Guest can answer — no 404 from userId mismatch
    const res = await request(app)
      .post("/api/v1/game/answer")
      .send({
        sessionId,
        questionId: questions[0]!.questionId,
        selectedOptionId: 0,
      });
    expect(res.status).toBe(200);
  });
 });
 describe("POST /api/v1/game/answer — guest", () => {
  beforeEach(() => {
    setGuestUser();
  });
  it("allows a guest to submit an answer", async () => {
    const startRes = await request(app)
      .post("/api/v1/game/start")
      .send(validBody);
    const startBody = startRes.body as GameStartResponse;
    const { sessionId, questions } = startBody.data;
    const question = questions[0]!;
    const res = await request(app)
      .post("/api/v1/game/answer")
      .send({
        sessionId,
        questionId: question.questionId,
        selectedOptionId: 0,
      });
    const body = res.body as GameAnswerResponse;
    expect(res.status).toBe(200);
    expect(body.success).toBe(true);
    expect(body.data.questionId).toBe(question.questionId);
  });
  it("returns 404 when a guest tries to answer an authenticated user's session", async () => {
    // First: create session as authenticated user
    setAuthenticatedUser("user-1");
    const startRes = await request(app)
      .post("/api/v1/game/start")
      .send(validBody);
    const startBody = startRes.body as GameStartResponse;
    const { sessionId, questions } = startBody.data;
    // Then: try to answer as guest
    setGuestUser();
    const res = await request(app)
      .post("/api/v1/game/answer")
      .send({
        sessionId,
        questionId: questions[0]!.questionId,
        selectedOptionId: 0,
      });
    const body = res.body as ErrorResponse;
    expect(res.status).toBe(404);
    expect(body.success).toBe(false);
    expect(body.error).toContain("Game session not found");
  });
 });
--- a/apps/api/src/controllers/gameController.ts
+++ b/apps/api/src/controllers/gameController.ts
@ -16,10 +16,11 @@ export const createGameController = (store: GameSessionStore) => ({
      if (!gameSettings.success) {
        throw new ValidationError("Invalid game settings");
      }
+      const userId = req.session?.user.id ?? null;
      const gameQuestions = await createGameSession(
        gameSettings.data,
        store,
-        req.session.user.id,
+        userId,
      );
      res.json({ success: true, data: gameQuestions });
    } catch (error) {
@ -37,11 +38,8 @@ export const createGameController = (store: GameSessionStore) => ({
      if (!submission.success) {
        throw new ValidationError("Invalid answer submission");
      }
-      const result = await evaluateAnswer(
-        submission.data,
-        store,
-        req.session.user.id,
-      );
+      const userId = req.session?.user.id ?? null;
+      const result = await evaluateAnswer(submission.data, store, userId);
      res.json({ success: true, data: result });
    } catch (error) {
      next(error);
--- a/apps/api/src/gameSessionStore/GameSessionStore.ts
+++ b/apps/api/src/gameSessionStore/GameSessionStore.ts
@ -1,6 +1,6 @@
 export type GameSessionData = {
  answers: Map<string, { correctOptionId: number }>;
-  userId: string;
+  userId: string | null;
 };
 export interface GameSessionStore {
--- a/apps/api/src/middleware/authMiddleware.test.ts
+++ b/apps/api/src/middleware/authMiddleware.test.ts
@ -0,0 +1,103 @@
 import express from "express";
 import request from "supertest";
 import { describe, it, expect, vi, beforeEach } from "vitest";
 import type { Session, User } from "better-auth";
 vi.mock("../lib/auth.js", () => ({ auth: { api: { getSession: vi.fn() } } }));
 vi.mock("better-auth/node", () => ({
  fromNodeHeaders: vi.fn().mockReturnValue({}),
 }));
 import { auth } from "../lib/auth.js";
 import { requireAuth, optionalAuth } from "./authMiddleware.js";
 const mockGetSession = vi.mocked(auth.api.getSession);
 function createOptionalAuthApp() {
  const app = express();
  app.use(optionalAuth);
  app.get("/test", (req, res) => {
    res
      .status(200)
      .json({
        hasSession: !!req.session,
        userId: req.session?.user?.id ?? null,
      });
  });
  return app;
 }
 describe("optionalAuth", () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });
  it("allows the request through when no session exists (guest)", async () => {
    mockGetSession.mockResolvedValue(null);
    const app = createOptionalAuthApp();
    const res = await request(app).get("/test");
    expect(res.status).toBe(200);
    expect(res.body).toEqual({ hasSession: false, userId: null });
  });
  it("attaches session to req when user is authenticated", async () => {
    mockGetSession.mockResolvedValue({
      session: { id: "session-1" } as Session,
      user: { id: "user-1" } as User,
    });
    const app = createOptionalAuthApp();
    const res = await request(app).get("/test");
    expect(res.status).toBe(200);
    expect(res.body).toEqual({ hasSession: true, userId: "user-1" });
  });
  it("allows the request through even when getSession throws", async () => {
    mockGetSession.mockRejectedValue(new Error("auth service down"));
    const app = createOptionalAuthApp();
    const res = await request(app).get("/test");
    expect(res.status).toBe(200);
    expect(res.body).toEqual({ hasSession: false, userId: null });
  });
 });
 describe("requireAuth", () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });
  it("returns 401 when no session exists", async () => {
    mockGetSession.mockResolvedValue(null);
    const app = express();
    app.use(requireAuth);
    app.get("/test", (_req, res) => res.status(200).json({ ok: true }));
    const res = await request(app).get("/test");
    expect(res.status).toBe(401);
    expect(res.body).toEqual({ success: false, error: "Unauthorized" });
  });
  it("allows the request through when session exists", async () => {
    mockGetSession.mockResolvedValue({
      session: { id: "session-1" } as Session,
      user: { id: "user-1" } as User,
    });
    const app = express();
    app.use(requireAuth);
    app.get("/test", (req, res) => {
      res.status(200).json({ userId: req.session?.user?.id });
    });
    const res = await request(app).get("/test");
    expect(res.status).toBe(200);
    expect(res.body).toEqual({ userId: "user-1" });
  });
 });
--- a/apps/api/src/middleware/authMiddleware.ts
+++ b/apps/api/src/middleware/authMiddleware.ts
@ -20,3 +20,23 @@ export const requireAuth = async (
  next();
 };
 export const optionalAuth = async (
  req: Request,
  _res: Response,
  next: NextFunction,
 ) => {
  try {
    const session = await auth.api.getSession({
      headers: fromNodeHeaders(req.headers),
    });
    if (session) {
      req.session = session;
    }
  } catch (err) {
    console.warn("Auth check failed, continuing as guest:", err);
  }
  next();
 };
--- a/apps/api/src/middleware/rateLimiters.ts
+++ b/apps/api/src/middleware/rateLimiters.ts
@ -1,4 +1,4 @@
-import rateLimit from "express-rate-limit";
+import rateLimit, { ipKeyGenerator } from "express-rate-limit";
 import type { Request } from "express";
 // TODO: When Valkey is wired up, swap the default in-memory store for
@ -33,7 +33,8 @@ export const gameLimiter = rateLimit({
  limit: 150,
  standardHeaders: "draft-8",
  legacyHeaders: false,
-  keyGenerator: (req: Request) => req.session!.user.id,
+  keyGenerator: (req: Request) =>
+    req.session?.user.id ?? ipKeyGenerator(req.ip ?? "unknown"),
  message: {
    success: false,
    error: "Too many requests, please try again later.",
@ -45,7 +46,8 @@ export const lobbyLimiter = rateLimit({
  limit: 20,
  standardHeaders: "draft-8",
  legacyHeaders: false,
-  keyGenerator: (req: Request) => req.session!.user.id,
+  keyGenerator: (req: Request) =>
+    req.session?.user.id ?? ipKeyGenerator(req.ip ?? "unknown"),
  message: {
    success: false,
    error: "Too many requests, please try again later.",
--- a/apps/api/src/routes/gameRouter.ts
+++ b/apps/api/src/routes/gameRouter.ts
@ -1,7 +1,7 @@
 import express from "express";
 import type { Router } from "express";
 import { createGameController } from "../controllers/gameController.js";
-import { requireAuth } from "../middleware/authMiddleware.js";
+import { optionalAuth } from "../middleware/authMiddleware.js";
 import { gameLimiter } from "../middleware/rateLimiters.js";
 import type { GameSessionStore } from "../gameSessionStore/index.js";
@ -9,7 +9,7 @@ export const createGameRouter = (store: GameSessionStore): Router => {
  const router = express.Router();
  const controller = createGameController(store);
-  router.use(requireAuth);
+  router.use(optionalAuth);
  router.use(gameLimiter);
  router.post("/start", controller.createGame as express.RequestHandler);
--- a/apps/api/src/services/gameService.test.ts
+++ b/apps/api/src/services/gameService.test.ts
@ -19,10 +19,10 @@ const validRequest: GameRequest = {
 };
 const fakeTerms = [
-  { termId: "t1", sourceText: "dog", targetText: "cane", sourceGloss: null },
+  { entryId: "t1", sourceText: "dog", targetText: "cane", sourceGloss: null },
-  { termId: "t2", sourceText: "cat", targetText: "gatto", sourceGloss: null },
+  { entryId: "t2", sourceText: "cat", targetText: "gatto", sourceGloss: null },
  {
-    termId: "t3",
+    entryId: "t3",
    sourceText: "house",
    targetText: "casa",
    sourceGloss: "a building for living in",
@ -332,3 +332,89 @@ describe("evaluateAnswer", () => {
    ).rejects.toMatchObject({ statusCode: 422 });
  });
 });
 // Add to existing gameService.test.ts
 describe("createGameSession — guest", () => {
  let store: InMemoryGameSessionStore;
  beforeEach(() => {
    store = new InMemoryGameSessionStore();
  });
  it("creates a session with userId null for guests", async () => {
    const session = await createGameSession(validRequest, store, null);
    expect(session.sessionId).toBeDefined();
    expect(session.questions).toHaveLength(3);
  });
  it("stores userId as null in the session store", async () => {
    const session = await createGameSession(validRequest, store, null);
    const stored = await store.get(session.sessionId);
    expect(stored).not.toBeNull();
    expect(stored!.userId).toBeNull();
  });
 });
 describe("evaluateAnswer — guest", () => {
  let store: InMemoryGameSessionStore;
  beforeEach(() => {
    store = new InMemoryGameSessionStore();
  });
  it("allows a guest to answer their own session", async () => {
    const session = await createGameSession(validRequest, store, null);
    const question = session.questions[0]!;
    const correctText = fakeTerms[0]!.targetText;
    const correctOption = question.options.find((o) => o.text === correctText)!;
    const result = await evaluateAnswer(
      {
        sessionId: session.sessionId,
        questionId: question.questionId,
        selectedOptionId: correctOption.optionId,
      },
      store,
      null,
    );
    expect(result.isCorrect).toBe(true);
  });
  it("throws NotFoundError when guest tries to answer an authenticated session", async () => {
    const authSession = await createGameSession(validRequest, store, "user-1");
    const question = authSession.questions[0]!;
    await expect(
      evaluateAnswer(
        {
          sessionId: authSession.sessionId,
          questionId: question.questionId,
          selectedOptionId: 0,
        },
        store,
        null,
      ),
    ).rejects.toThrow("Game session not found");
  });
  it("throws NotFoundError when authenticated user tries to answer a guest session", async () => {
    const guestSession = await createGameSession(validRequest, store, null);
    const question = guestSession.questions[0]!;
    await expect(
      evaluateAnswer(
        {
          sessionId: guestSession.sessionId,
          questionId: question.questionId,
          selectedOptionId: 0,
        },
        store,
        "user-1",
      ),
    ).rejects.toThrow("Game session not found");
  });
 });
--- a/apps/api/src/services/gameService.ts
+++ b/apps/api/src/services/gameService.ts
@ -19,7 +19,7 @@ import { shuffleArray } from "../lib/utils.js";
 export const createGameSession = async (
  request: GameRequest,
  store: GameSessionStore,
-  userId: string,
+  userId: string | null,
 ): Promise<GameSession> => {
  const terms = await getGameTerms(
    request.source_language,
@ -38,8 +38,9 @@ export const createGameSession = async (
  const questions: GameQuestion[] = await Promise.all(
    terms.map(async (term) => {
      const distractorTexts = await getDistractors(
-        term.termId,
+        term.entryId,
        term.targetText,
        request.source_language,
        request.target_language,
        request.pos,
        request.difficulty,
@ -86,11 +87,15 @@ export const createGameSession = async (
 export const evaluateAnswer = async (
  submission: AnswerSubmission,
  store: GameSessionStore,
-  userId: string,
+  userId: string | null,
 ): Promise<AnswerResult> => {
  const session = await store.get(submission.sessionId);
-  if (!session || session.userId !== userId) {
+  if (!session) {
    throw new NotFoundError(`Game session not found: ${submission.sessionId}`);
  }
+  if (session.userId !== userId) {
+    throw new NotFoundError(`Game session not found: ${submission.sessionId}`);
+  }
--- a/apps/api/src/services/multiplayerGameService.test.ts
+++ b/apps/api/src/services/multiplayerGameService.test.ts
@ -9,10 +9,10 @@ const mockGetGameTerms = vi.mocked(getGameTerms);
 const mockGetDistractors = vi.mocked(getDistractors);
 const fakeTerms = [
-  { termId: "t1", sourceText: "dog", targetText: "cane", sourceGloss: null },
+  { entryId: "t1", sourceText: "dog", targetText: "cane", sourceGloss: null },
-  { termId: "t2", sourceText: "cat", targetText: "gatto", sourceGloss: null },
+  { entryId: "t2", sourceText: "cat", targetText: "gatto", sourceGloss: null },
  {
-    termId: "t3",
+    entryId: "t3",
    sourceText: "house",
    targetText: "casa",
    sourceGloss: "a building for living in",
--- a/apps/api/src/services/multiplayerGameService.ts
+++ b/apps/api/src/services/multiplayerGameService.ts
@ -44,8 +44,9 @@ export const generateMultiplayerQuestions = async (): Promise<
  const questions: MultiplayerQuestion[] = await Promise.all(
    correctAnswers.map(async (correctAnswer) => {
      const distractorTexts = await getDistractors(
-        correctAnswer.termId,
+        correctAnswer.entryId,
        correctAnswer.targetText,
        MULTIPLAYER_DEFAULTS.sourceLanguage,
        MULTIPLAYER_DEFAULTS.targetLanguage,
        MULTIPLAYER_DEFAULTS.pos,
        MULTIPLAYER_DEFAULTS.difficulty,
--- a/apps/api/src/types/express.d.ts
+++ b/apps/api/src/types/express.d.ts
@ -18,3 +18,7 @@ declare module "ws" {
 export type AuthenticatedRequest = Request & {
  session: { session: Session; user: User };
 };
 export type GuestRequest = Request & {
  session?: { session: Session; user: User };
 };
--- a/apps/web/README.md
+++ b/apps/web/README.md
@ -1,73 +0,0 @@
 # React + TypeScript + Vite
 This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.
 Currently, two official plugins are available:
 - [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react) uses [Oxc](https://oxc.rs)
 - [@vitejs/plugin-react-swc](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react-swc) uses [SWC](https://swc.rs/)
 ## React Compiler
 The React Compiler is not enabled on this template because of its impact on dev & build performances. To add it, see [this documentation](https://react.dev/learn/react-compiler/installation).
 ## Expanding the ESLint configuration
 If you are developing a production application, we recommend updating the configuration to enable type-aware lint rules:
 ```js
 export default defineConfig([
  globalIgnores(["dist"]),
  {
    files: ["**/*.{ts,tsx}"],
    extends: [
      // Other configs...
      // Remove tseslint.configs.recommended and replace with this
      tseslint.configs.recommendedTypeChecked,
      // Alternatively, use this for stricter rules
      tseslint.configs.strictTypeChecked,
      // Optionally, add this for stylistic rules
      tseslint.configs.stylisticTypeChecked,
      // Other configs...
    ],
    languageOptions: {
      parserOptions: {
        project: ["./tsconfig.node.json", "./tsconfig.app.json"],
        tsconfigRootDir: import.meta.dirname,
      },
      // other options...
    },
  },
 ]);
 ```
 You can also install [eslint-plugin-react-x](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-x) and [eslint-plugin-react-dom](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-dom) for React-specific lint rules:
 ```js
 // eslint.config.js
 import reactX from "eslint-plugin-react-x";
 import reactDom from "eslint-plugin-react-dom";
 export default defineConfig([
  globalIgnores(["dist"]),
  {
    files: ["**/*.{ts,tsx}"],
    extends: [
      // Other configs...
      // Enable lint rules for React
      reactX.configs["recommended-typescript"],
      // Enable lint rules for React DOM
      reactDom.configs.recommended,
    ],
    languageOptions: {
      parserOptions: {
        project: ["./tsconfig.node.json", "./tsconfig.app.json"],
        tsconfigRootDir: import.meta.dirname,
      },
      // other options...
    },
  },
 ]);
 ```
--- a/apps/web/src/components/game/ScoreScreen.tsx
+++ b/apps/web/src/components/game/ScoreScreen.tsx
@ -1,9 +1,17 @@
 import type { AnswerResult } from "@lila/shared";
 import { ConfettiBurst } from "../ui/ConfettiBurst";
-type ScoreScreenProps = { results: AnswerResult[]; onPlayAgain: () => void };
+type ScoreScreenProps = {
+  results: AnswerResult[];
+  onPlayAgain: () => void;
+  showAuthPrompt?: boolean;
+};
-export const ScoreScreen = ({ results, onPlayAgain }: ScoreScreenProps) => {
+export const ScoreScreen = ({
+  results,
+  onPlayAgain,
+  showAuthPrompt = false,
+}: ScoreScreenProps) => {
  const score = results.filter((r) => r.isCorrect).length;
  const total = results.length;
  const percentage = Math.round((score / total) * 100);
@ -58,12 +66,34 @@ export const ScoreScreen = ({ results, onPlayAgain }: ScoreScreenProps) => {
        ))}
      </div>
      {showAuthPrompt && (
        <div className="w-full rounded-2xl border border-(--color-primary-light) bg-white/60 dark:bg-black/10 backdrop-blur p-6 text-center">
          <p className="text-sm text-(--color-text-muted) mb-3">
            Want to save your progress and compete with friends?
          </p>
          <a
            href="/?modal=auth&redirect=/play"
            className="inline-block w-full py-3 rounded-xl text-base font-bold bg-linear-to-r from-pink-400 to-purple-500 text-white shadow-sm hover:shadow-md hover:-translate-y-0.5 active:translate-y-0 transition-all duration-200"
          >
            Create an account
          </a>
          <button
            onClick={onPlayAgain}
            className="mt-3 text-sm text-(--color-text-muted) hover:text-(--color-primary) transition-colors cursor-pointer"
          >
            Continue as guest →
          </button>
        </div>
      )}
      {!showAuthPrompt && (
        <button
          onClick={onPlayAgain}
          className="w-full py-3 px-10 rounded-2xl text-lg font-black bg-(--color-primary) text-white shadow-sm hover:shadow-md hover:bg-(--color-primary-dark) hover:-translate-y-0.5 active:translate-y-0 transition-all duration-200 cursor-pointer"
        >
          Play again
        </button>
      )}
    </div>
  );
 };
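Note on the score calculation in `ScoreScreen`: `percentage` divides by `results.length`, so an empty `results` array would produce `NaN`. Whether the screen can be reached with zero results is not clear from this diff. The sketch below is only a possible guard; `toPercentage` is a hypothetical helper name, not part of the commit.

```typescript
// Hypothetical helper: returns 0 instead of NaN when there are no results.
function toPercentage(score: number, total: number): number {
  if (total <= 0) return 0;
  return Math.round((score / total) * 100);
}

console.log(toPercentage(7, 10)); // 70
console.log(toPercentage(0, 0)); // 0
console.log(Math.round((0 / 0) * 100)); // NaN, which is what the unguarded expression yields
```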
--- a/apps/web/src/routes/play.tsx
+++ b/apps/web/src/routes/play.tsx
@ -1,5 +1,5 @@
-import { createFileRoute, redirect } from "@tanstack/react-router";
+import { createFileRoute } from "@tanstack/react-router";
-import { useState, useCallback } from "react";
+import { useState, useCallback, useEffect } from "react";
 import type { GameSession, GameRequest, AnswerResult } from "@lila/shared";
 import { QuestionCard } from "../components/game/QuestionCard";
 import { ScoreScreen } from "../components/game/ScoreScreen";
@ -18,6 +18,13 @@ function Play() {
  const [currentQuestionIndex, setCurrentQuestionIndex] = useState(0);
  const [results, setResults] = useState<AnswerResult[]>([]);
  const [currentResult, setCurrentResult] = useState<AnswerResult | null>(null);
  const [isAuthenticated, setIsAuthenticated] = useState<boolean | null>(null);
  useEffect(() => {
    void authClient.getSession().then(({ data }) => {
      setIsAuthenticated(!!data);
    });
  }, []);
  const startGame = useCallback(async (settings: GameRequest) => {
    setIsLoading(true);
@ -100,7 +107,11 @@ function Play() {
  if (currentQuestionIndex >= gameSession.questions.length) {
    return (
      <div className="min-h-screen bg-linear-to-b from-purple-100 to-pink-50 flex items-center justify-center p-6">
-        <ScoreScreen results={results} onPlayAgain={resetToSetup} />
+        <ScoreScreen
          results={results}
          onPlayAgain={resetToSetup}
          showAuthPrompt={isAuthenticated === false}
        />
      </div>
    );
  }
@ -129,10 +140,4 @@ function Play() {
 export const Route = createFileRoute("/play")({
  component: Play,
  errorComponent: RouteError,
  beforeLoad: async () => {
    const { data: session } = await authClient.getSession();
    if (!session) {
      throw redirect({ to: "/", search: { modal: "auth", redirect: "/play" } });
    }
  },
 });
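Note on the tri-state `isAuthenticated` flag in `play.tsx`: the state starts as `null` and only becomes `true` or `false` once `authClient.getSession()` resolves, and the prompt is shown only for `false`. A minimal sketch of that decision as a pure function follows; `AuthState` and `shouldShowAuthPrompt` are illustrative names that do not appear in the commit.

```typescript
// null = session lookup still pending, true = signed in, false = guest.
type AuthState = boolean | null;

// Only a confirmed guest sees the prompt; a pending lookup falls back to the plain button.
function shouldShowAuthPrompt(isAuthenticated: AuthState): boolean {
  return isAuthenticated === false;
}

const states: AuthState[] = [null, true, false];
console.log(states.map((s) => shouldShowAuthPrompt(s))); // [false, false, true]
```

Using strict equality with `false` rather than `!isAuthenticated` avoids flashing the "Create an account" prompt at signed-in users while the session check is still in flight.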
--- a/data-pipeline/.env.example
+++ b/data-pipeline/.env.example
@ -0,0 +1,7 @@
 # OpenRouter API key — required for OpenRouter providers
 # Get one at https://openrouter.ai/keys
 OPENROUTER_API_KEY=
 # Anthropic API key — required for Anthropic provider (reference baseline only)
 # Get one at https://console.anthropic.com/
 ANTHROPIC_API_KEY=
--- a/data-pipeline/audit.ts
+++ b/data-pipeline/audit.ts
@ -0,0 +1,87 @@
 import Database from "better-sqlite3";
 import path from "node:path";
 import fs from "node:fs";
 import { fileURLToPath } from "node:url";
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const DB_PATH = path.join(__dirname, "db/pipeline.db");
 const db = new Database(DB_PATH, { readonly: true });
 // Pull 50 random synsets that have at least one German translation
 // (the query below does not balance the sample per POS)
 const synsets = db
  .prepare(
    `
    SELECT DISTINCT s.source_id, s.pos
    FROM synsets s
    JOIN translations t ON t.source_id = s.source_id
    WHERE t.language = 'de'
    ORDER BY RANDOM()
    LIMIT 50
  `,
  )
  .all() as { source_id: string; pos: string }[];
 const results: string[] = [];
 let index = 0;
 for (const synset of synsets) {
  index++;
  const glosses = db
    .prepare("SELECT language, text FROM glosses WHERE source_id = ?")
    .all(synset.source_id) as { language: string; text: string }[];
  const enGloss = glosses.find((g) => g.language === "en")?.text ?? "—";
  const deGloss = glosses.find((g) => g.language === "de")?.text ?? "—";
  const deTranslations = db
    .prepare(
      "SELECT word FROM translations WHERE source_id = ? AND language = 'de'",
    )
    .all(synset.source_id) as { word: string }[];
  const enTranslations = db
    .prepare(
      "SELECT word FROM translations WHERE source_id = ? AND language = 'en'",
    )
    .all(synset.source_id) as { word: string }[];
  const deWords = deTranslations.map((t) => t.word);
  const enWords = enTranslations.map((t) => t.word);
  results.push(
    [
      `${String(index).padStart(2, " ")}. [${synset.pos}] ${synset.source_id}`,
      `    EN gloss: ${enGloss}`,
      `    DE gloss: ${deGloss}`,
      `    EN words: ${enWords.join(", ")}`,
      `    DE words: ${deWords.join(", ")}`,
      `    QUALITY:  ___`,
      ``,
    ].join("\n"),
  );
 }
 const output = [
  "# OMW German Translation Quality Audit",
  "",
  "Instructions: for each entry, check if the German translations",
  "match the meaning described by the English gloss.",
  "",
  "Mark QUALITY as:",
  "  OK    — all German translations fit the meaning",
  "  PARTIAL — some fit, some don't",
  "  BAD   — none of the German translations fit",
  "  USELESS — translations are correct but useless for learners",
  "",
  "---",
  "",
  ...results,
 ].join("\n");
 const outPath = path.join(__dirname, "audit.md");
 fs.writeFileSync(outPath, output, "utf-8");
 console.log(`Wrote ${synsets.length} entries → ${outPath}`);
 db.close();
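Note on sampling in `audit.ts`: the query orders by `RANDOM()` and takes 50 rows, so the split across parts of speech is left to chance. If a roughly even split is wanted, one option is to fetch the shuffled rows without a `LIMIT` and cap them per POS in code. The sketch below assumes that approach; `takePerPos` is a hypothetical helper and the sample rows are made up.

```typescript
type SynsetRow = { source_id: string; pos: string };

// Keeps at most `perPos` rows for each part of speech, preserving input order.
// Assumes `rows` is already shuffled (for example by ORDER BY RANDOM()).
function takePerPos(rows: SynsetRow[], perPos: number): SynsetRow[] {
  const taken = new Map<string, number>();
  return rows.filter((row) => {
    const count = taken.get(row.pos) ?? 0;
    if (count >= perPos) return false;
    taken.set(row.pos, count + 1);
    return true;
  });
}

// Made-up rows for illustration only.
const sample: SynsetRow[] = [
  { source_id: "a", pos: "n" },
  { source_id: "b", pos: "n" },
  { source_id: "c", pos: "v" },
  { source_id: "d", pos: "n" },
];
console.log(takePerPos(sample, 2).map((r) => r.source_id)); // ["a", "b", "c"]
```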
--- a/data-pipeline/db/import.ts
+++ b/data-pipeline/db/import.ts
@ -0,0 +1,154 @@
 import fs from "node:fs/promises";
 import path from "node:path";
 import { fileURLToPath } from "node:url";
 import { SUPPORTED_LANGUAGE_CODES } from "@lila/shared";
 import { openDb } from "./index.js";
 import type { ExtractedSense } from "../stage-1-extract/scripts/extract.js";
 // ── Paths ─────────────────────────────────────────────────────────────────────
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const OUTPUT_DIR = path.resolve(__dirname, "../stage-1-extract/output");
 // ── Import ────────────────────────────────────────────────────────────────────
 export async function importKaikki(): Promise<void> {
  const db = openDb();
  const insertEntry = db.prepare(`
    INSERT INTO entries (headword, language, pos, sense_index, gloss, examples)
    VALUES (?, ?, ?, ?, ?, ?)
    ON CONFLICT (headword, language, pos, sense_index)
    DO UPDATE SET
      gloss    = excluded.gloss,
      examples = excluded.examples
    RETURNING id
  `);
  const insertTranslation = db.prepare(`
    INSERT INTO translations (entry_id, target_lang, word, sense_hint)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (entry_id, target_lang, word) DO NOTHING
  `);
  let totalEntries = 0;
  let totalTranslations = 0;
  let totalSkipped = 0;
  for (const lang of SUPPORTED_LANGUAGE_CODES) {
    const filePath = path.join(OUTPUT_DIR, `${lang}.json`);
    let senses: ExtractedSense[];
    try {
      const raw = await fs.readFile(filePath, "utf-8");
      senses = JSON.parse(raw) as ExtractedSense[];
    } catch {
      console.warn(`  Warning: no output file found for ${lang}, skipping`);
      continue;
    }
    console.log(
      `  Importing ${lang}: ${senses.length.toLocaleString()} senses...`,
    );
    // Track next available sense_index per (headword, pos) to handle
    // the same word appearing in multiple JSONL entries with the same POS.
    const senseIndexMap = new Map<string, number>();
    const importLang = db.transaction(() => {
      let entries = 0;
      let translations = 0;
      let skipped = 0;
      for (const sense of senses) {
        const key = `${sense.headword}|${sense.pos}`;
        const nextIndex = senseIndexMap.get(key) ?? 0;
        senseIndexMap.set(key, nextIndex + 1);
        const row = insertEntry.get(
          sense.headword,
          sense.language,
          sense.pos,
          nextIndex,
          sense.gloss ?? null,
          JSON.stringify(sense.examples),
        ) as { id: number } | undefined;
        if (!row) {
          skipped++;
          continue;
        }
        entries++;
        for (const t of sense.translations) {
          insertTranslation.run(
            row.id,
            t.target_lang,
            t.word,
            t.sense_hint ?? null,
          );
          translations++;
        }
      }
      return { entries, translations, skipped };
    });
    const counts = importLang();
    totalEntries += counts.entries;
    totalTranslations += counts.translations;
    totalSkipped += counts.skipped;
    console.log(
      `    entries: ${counts.entries.toLocaleString()}, translations: ${counts.translations.toLocaleString()}, skipped: ${counts.skipped.toLocaleString()}`,
    );
  }
  db.close();
  console.log(`\nImport complete:`);
  console.log(`  Total entries:      ${totalEntries.toLocaleString()}`);
  console.log(`  Total translations: ${totalTranslations.toLocaleString()}`);
  console.log(`  Total skipped:      ${totalSkipped.toLocaleString()}`);
 }
 // ── Check if already imported ─────────────────────────────────────────────────
 export function isImported(): boolean {
  const db = openDb();
  const row = db.prepare("SELECT COUNT(*) as count FROM entries").get() as {
    count: number;
  };
  db.close();
  return row.count > 0;
 }
 // ── Main ─────────────────────────────────────────────────────────────────────
 async function main(): Promise<void> {
  const db = openDb();
  const row = db.prepare("SELECT COUNT(*) as count FROM entries").get() as {
    count: number;
  };
  db.close();
  if (row.count > 0) {
    console.log(
      `pipeline.db already contains ${row.count.toLocaleString()} entries — skipping import.`,
    );
    console.log("Delete pipeline.db and re-run db:init to start fresh.");
    process.exit(0);
  }
  console.log("Importing Kaikki data into pipeline.db...");
  await importKaikki();
 }
 if (import.meta.url === `file://${process.argv[1]}`) {
  main().catch((err) => {
    console.error(err);
    process.exit(1);
  });
 }
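Note on `senseIndexMap` in `importKaikki`: each `(headword, pos)` pair gets a running counter, so repeated senses of the same word receive `sense_index` 0, 1, 2, and so on in file order. The same bookkeeping, isolated from the database, can be sketched as below. `Sense` is a trimmed-down stand-in for `ExtractedSense` and `assignSenseIndexes` is an illustrative name.

```typescript
// Hypothetical stand-in for ExtractedSense; only the fields used for the key.
type Sense = { headword: string; pos: string };

// Assigns a running sense index per (headword, pos) pair, in input order.
function assignSenseIndexes(
  senses: Sense[],
): Array<Sense & { senseIndex: number }> {
  const next = new Map<string, number>();
  return senses.map((sense) => {
    const key = `${sense.headword}|${sense.pos}`;
    const senseIndex = next.get(key) ?? 0;
    next.set(key, senseIndex + 1);
    return { ...sense, senseIndex };
  });
}

const indexed = assignSenseIndexes([
  { headword: "run", pos: "verb" },
  { headword: "run", pos: "noun" },
  { headword: "run", pos: "verb" },
]);
console.log(indexed.map((s) => s.senseIndex)); // [0, 0, 1]
```

Because the index depends on input order, re-importing a reordered file would map senses to different rows under the `ON CONFLICT ... DO UPDATE` clause.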
--- a/data-pipeline/db/index.ts
+++ b/data-pipeline/db/index.ts
@ -0,0 +1,24 @@
 import path from "node:path";
 import { fileURLToPath } from "node:url";
 import Database from "better-sqlite3";
 // ── Paths ─────────────────────────────────────────────────────────────────────
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const DB_PATH = path.join(__dirname, "pipeline.db");
 // ── Types ─────────────────────────────────────────────────────────────────────
 export type Db = InstanceType<typeof Database>;
 // ── Open ──────────────────────────────────────────────────────────────────────
 export function openDb(): Db {
  const db = new Database(DB_PATH);
  db.pragma("journal_mode = WAL");
  db.pragma("foreign_keys = ON");
  return db;
 }
--- a/data-pipeline/db/init.ts
+++ b/data-pipeline/db/init.ts
@ -0,0 +1,42 @@
 import fs from "node:fs/promises";
 import path from "node:path";
 import { fileURLToPath } from "node:url";
 import Database from "better-sqlite3";
 // ── Paths ─────────────────────────────────────────────────────────────────────
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const PATHS = {
  schema: path.join(__dirname, "schema.sql"),
  db: path.join(__dirname, "pipeline.db"),
 };
 // ── Init ──────────────────────────────────────────────────────────────────────
 export async function initDb(): Promise<void> {
  const schema = await fs.readFile(PATHS.schema, "utf-8");
  const db = new Database(PATHS.db);
  db.pragma("journal_mode = WAL");
  db.pragma("foreign_keys = ON");
  db.exec(schema);
  db.close();
  console.log(`  pipeline.db initialised → ${PATHS.db}`);
 }
 // ── Main ─────────────────────────────────────────────────────────────────────
 async function main(): Promise<void> {
  console.log("Initialising pipeline.db...");
  await initDb();
 }
 // Run main() only when this file is executed directly, not when imported.
 if (import.meta.url === `file://${process.argv[1]}`) {
  main().catch((err) => {
    console.error(err);
    process.exit(1);
  });
 }
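Note on the entry-point guard used in `init.ts` and `import.ts`: comparing `import.meta.url` with a hand-built `file://${process.argv[1]}` string works for simple POSIX paths, but the two can differ when the path contains characters that URLs percent-encode (such as spaces) or on Windows. A sketch of a comparison based on Node's `pathToFileURL` follows. `isMainModule` is a hypothetical helper, and it assumes `process.argv[1]` holds the script path, which depends on how the runner invokes the file.

```typescript
import path from "node:path";
import { pathToFileURL } from "node:url";

// Returns true when the module at `metaUrl` is the script Node was started with.
// Hypothetical helper name; not defined anywhere in this commit.
function isMainModule(metaUrl: string, argv1: string | undefined): boolean {
  if (!argv1) return false;
  return metaUrl === pathToFileURL(path.resolve(argv1)).href;
}

// Intended call site: if (isMainModule(import.meta.url, process.argv[1])) { ... }
const scriptPath = path.resolve("db dir", "init.ts");
console.log(isMainModule(pathToFileURL(scriptPath).href, scriptPath)); // true
// The naive string differs because the space is percent-encoded in the real URL.
console.log(`file://${scriptPath}` === pathToFileURL(scriptPath).href); // false
```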
--- a/data-pipeline/db/pipeline.db-shm
+++ b/data-pipeline/db/pipeline.db-shm
--- a/data-pipeline/db/pipeline.db-wal
+++ b/data-pipeline/db/pipeline.db-wal
--- a/data-pipeline/db/reset.ts
+++ b/data-pipeline/db/reset.ts
@ -0,0 +1,41 @@
 import path from "node:path";
 import { fileURLToPath } from "node:url";
 import Database from "better-sqlite3";
 // ── Paths ─────────────────────────────────────────────────────────────────────
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const DB_PATH = path.join(__dirname, "pipeline.db");
 // ── Main ──────────────────────────────────────────────────────────────────────
 function main(): void {
  const mode = process.argv[2];
  if (!mode || (mode !== "round1" && mode !== "all")) {
    console.error("Usage: pnpm db:reset round1 | all");
    console.error("  round1 — delete all round1 sub-stage rows");
    console.error("  all    — delete all run_status rows except reverse_link");
    process.exit(1);
  }
  const db = new Database(DB_PATH);
  let result: { changes: number };
  if (mode === "round1") {
    result = db
      .prepare("DELETE FROM run_status WHERE stage LIKE 'round1%'")
      .run();
    console.log(`Deleted ${result.changes} round1 rows from run_status`);
  } else {
    result = db
      .prepare("DELETE FROM run_status WHERE stage NOT IN ('reverse_link')")
      .run();
    console.log(`Deleted ${result.changes} rows from run_status`);
  }
  db.close();
 }
 main();
--- a/data-pipeline/db/schema.sql
+++ b/data-pipeline/db/schema.sql
@ -0,0 +1,164 @@
 -- ── Base data ─────────────────────────────────────────────────────────────────
 -- Imported from Kaikki on first run. Never mutated after import.
 CREATE TABLE IF NOT EXISTS entries (
  id          INTEGER PRIMARY KEY,
  headword    TEXT    NOT NULL,
  language    TEXT    NOT NULL,
  pos         TEXT    NOT NULL,
  sense_index INTEGER NOT NULL DEFAULT 0,
  gloss       TEXT,
  examples    TEXT    NOT NULL DEFAULT '[]', -- JSON array of strings
  source      TEXT    NOT NULL DEFAULT 'kaikki',
  UNIQUE (headword, language, pos, sense_index)
 );
 CREATE TABLE IF NOT EXISTS translations (
  id          INTEGER PRIMARY KEY,
  entry_id    INTEGER NOT NULL REFERENCES entries(id),
  target_lang TEXT    NOT NULL,
  word        TEXT    NOT NULL,
  sense_hint  TEXT,
  source      TEXT    NOT NULL DEFAULT 'kaikki',
  UNIQUE (entry_id, target_lang, word)
 );
 -- ── Status tracking ───────────────────────────────────────────────────────────
 -- One row per entry per model per stage. Drives resumability.
 -- Sentinel rows use entry_id = 0 for one-time pipeline steps.
 -- stage:  reverse_link | round1_* | compile_candidates | round2 | compile_votes | merge | tiebreak | compare
 -- status: pending | complete | needs_review | flagged
 CREATE TABLE IF NOT EXISTS run_status (
  id         INTEGER PRIMARY KEY,
  entry_id   INTEGER NOT NULL,
  model_name TEXT    NOT NULL,
  stage      TEXT    NOT NULL,
  status     TEXT    NOT NULL,
  created_at TEXT    NOT NULL DEFAULT (datetime('now')),
  updated_at TEXT    NOT NULL DEFAULT (datetime('now')),
  UNIQUE (entry_id, model_name, stage)
 );
 -- ── Round 1 output ────────────────────────────────────────────────────────────
 -- Written atomically per entry per model.
 -- Unique constraints enforce one model one vote.
 CREATE TABLE IF NOT EXISTS model_entry_cefr_votes (
  id         INTEGER PRIMARY KEY,
  entry_id   INTEGER NOT NULL REFERENCES entries(id),
  model_name TEXT    NOT NULL,
  cefr_level TEXT    NOT NULL,
  UNIQUE (entry_id, model_name)
 );
 CREATE TABLE IF NOT EXISTS model_translation_cefr_votes (
  id             INTEGER PRIMARY KEY,
  translation_id INTEGER NOT NULL REFERENCES translations(id),
  model_name     TEXT    NOT NULL,
  cefr_level     TEXT    NOT NULL,
  UNIQUE (translation_id, model_name)
 );
 CREATE TABLE IF NOT EXISTS model_translation_rejections (
  id             INTEGER PRIMARY KEY,
  translation_id INTEGER NOT NULL REFERENCES translations(id),
  model_name     TEXT    NOT NULL,
  UNIQUE (translation_id, model_name)
 );
 CREATE TABLE IF NOT EXISTS generated_glosses (
  id         INTEGER PRIMARY KEY,
  entry_id   INTEGER NOT NULL REFERENCES entries(id),
  model_name TEXT    NOT NULL,
  text       TEXT    NOT NULL,
  UNIQUE (entry_id, model_name)
 );
 CREATE TABLE IF NOT EXISTS generated_examples (
  id         INTEGER PRIMARY KEY,
  entry_id   INTEGER NOT NULL REFERENCES entries(id),
  model_name TEXT    NOT NULL,
  text       TEXT    NOT NULL,
  UNIQUE (entry_id, model_name)
 );
 CREATE TABLE IF NOT EXISTS generated_translations (
  id          INTEGER PRIMARY KEY,
  entry_id    INTEGER NOT NULL REFERENCES entries(id),
  model_name  TEXT    NOT NULL,
  target_lang TEXT    NOT NULL,
  word        TEXT    NOT NULL,
  UNIQUE (entry_id, model_name, target_lang)
 );
 -- ── Round 2 output ────────────────────────────────────────────────────────────
 -- Each row represents one model voting for one candidate.
 -- The candidate with the most votes wins in merge.
 CREATE TABLE IF NOT EXISTS gloss_candidate_votes (
  id         INTEGER PRIMARY KEY,
  gloss_id   INTEGER NOT NULL REFERENCES generated_glosses(id),
  model_name TEXT    NOT NULL,
  UNIQUE (gloss_id, model_name)
 );
 CREATE TABLE IF NOT EXISTS example_candidate_votes (
  id         INTEGER PRIMARY KEY,
  example_id INTEGER NOT NULL REFERENCES generated_examples(id),
  model_name TEXT    NOT NULL,
  UNIQUE (example_id, model_name)
 );
 CREATE TABLE IF NOT EXISTS translation_candidate_votes (
  id             INTEGER PRIMARY KEY,
  translation_id INTEGER NOT NULL REFERENCES generated_translations(id),
  model_name     TEXT    NOT NULL,
  UNIQUE (translation_id, model_name)
 );
 -- ── Resolved output ───────────────────────────────────────────────────────────
 -- Written by merge. Never updated after writing.
 -- Only fully resolved records are written here — no nulls.
 -- Absence of a row means unresolved. Flagged status tracked in run_status.
 -- source: kaikki | model_name
 CREATE TABLE IF NOT EXISTS resolved_entry_cefr (
  id         INTEGER PRIMARY KEY,
  entry_id   INTEGER NOT NULL REFERENCES entries(id),
  cefr_level TEXT    NOT NULL,
  difficulty TEXT    NOT NULL,
  UNIQUE (entry_id)
 );
 CREATE TABLE IF NOT EXISTS resolved_translation_cefr (
  id             INTEGER PRIMARY KEY,
  translation_id INTEGER NOT NULL REFERENCES translations(id),
  cefr_level     TEXT    NOT NULL,
  difficulty     TEXT    NOT NULL,
  UNIQUE (translation_id)
 );
 CREATE TABLE IF NOT EXISTS resolved_glosses (
  id         INTEGER PRIMARY KEY,
  entry_id   INTEGER NOT NULL REFERENCES entries(id),
  text       TEXT    NOT NULL,
  source     TEXT    NOT NULL,
  UNIQUE (entry_id)
 );
 CREATE TABLE IF NOT EXISTS resolved_examples (
  id         INTEGER PRIMARY KEY,
  entry_id   INTEGER NOT NULL REFERENCES entries(id),
  text       TEXT    NOT NULL,
  source     TEXT    NOT NULL
 );
 CREATE TABLE IF NOT EXISTS resolved_generated_translations (
  id          INTEGER PRIMARY KEY,
  entry_id    INTEGER NOT NULL REFERENCES entries(id),
  target_lang TEXT    NOT NULL,
  word        TEXT    NOT NULL,
  source      TEXT    NOT NULL,
  UNIQUE (entry_id, target_lang)
 );
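Note on the round 2 tables: the schema comment says the candidate with the most votes wins in merge, but the merge step itself is still a TODO in `pipeline.ts`. The sketch below shows one way such a tally could work over rows shaped like `gloss_candidate_votes`. Treating a tie as unresolved (returning `null`) is an assumption made here, as are the names `CandidateVote` and `pickWinner`.

```typescript
// Hypothetical row shape: one row per model vote for one candidate.
type CandidateVote = { candidateId: number; modelName: string };

// Returns the candidate id with the most votes, or null when there are
// no votes or the top count is shared (assumed here to mean "unresolved").
function pickWinner(votes: CandidateVote[]): number | null {
  const tally = new Map<number, number>();
  for (const vote of votes) {
    tally.set(vote.candidateId, (tally.get(vote.candidateId) ?? 0) + 1);
  }
  let best: number | null = null;
  let bestCount = 0;
  let tied = false;
  for (const [id, count] of Array.from(tally)) {
    if (count > bestCount) {
      best = id;
      bestCount = count;
      tied = false;
    } else if (count === bestCount) {
      tied = true;
    }
  }
  return tied ? null : best;
}

console.log(
  pickWinner([
    { candidateId: 1, modelName: "model-a" },
    { candidateId: 1, modelName: "model-b" },
    { candidateId: 2, modelName: "model-c" },
  ]),
); // 1
```

A `null` result would presumably correspond to a `flagged` row in `run_status` that the tiebreak stage picks up later, though the commit does not spell that out.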
--- a/data-pipeline/package.json
+++ b/data-pipeline/package.json
@ -3,7 +3,16 @@
  "version": "1.0.0",
  "private": true,
  "type": "module",
-  "scripts": {},
+  "scripts": {
    "db:reset": "tsx db/reset.ts",
    "extract": "tsx stage-1-extract/scripts/extract.ts",
    "reverse-link": "tsx stage-2-reverse-link/scripts/reverse-link.ts",
    "db:import": "tsx db/import.ts",
    "db:init": "tsx db/init.ts",
    "test": "vitest run",
    "test:watch": "vitest",
    "pipeline:run": "tsx --env-file .env pipeline.ts"
  },
  "dependencies": {
    "@lila/shared": "workspace:*",
    "better-sqlite3": "^12.9.0"
@ -12,6 +21,7 @@
    "@types/better-sqlite3": "^7.6.13",
    "@types/node": "^24.12.0",
    "tsx": "^4.21.0",
-    "typescript": "^5.9.3"
+    "typescript": "^5.9.3",
    "vitest": "^4.1.0"
  }
 }
--- a/data-pipeline/pipeline.ts
+++ b/data-pipeline/pipeline.ts
@ -0,0 +1,616 @@
 import fs from "node:fs/promises";
 import path from "node:path";
 import { fileURLToPath } from "node:url";
 import { initDb } from "./db/init.js";
 import { isImported, importKaikki } from "./db/import.js";
 import { openDb } from "./db/index.js";
 import { reverseLink } from "./stage-2-reverse-link/scripts/reverse-link.js";
 import { ALL_PROVIDERS, validateProviderKey } from "./stage-3-enrich/config.js";
 import type { ProviderConfig } from "./stage-3-enrich/config.js";
 import { enrich } from "./stage-3-enrich/scripts/enrich.js";
 // ── Types ─────────────────────────────────────────────────────────────────────
 type RunStage =
  | "round1"
  | "compile_candidates"
  | "round2"
  | "compile_votes"
  | "merge"
  | "tiebreak"
  | "compare";
 type StageStatus = "complete" | "pending" | "in_progress";
 type RunStats = {
  startedAt: Date;
  stoppedAt: Date | null;
  recordsProcessed: number;
  recordsSkipped: number;
  needsReview: number;
  modelsRun: string[];
  currentStage: RunStage | null;
 };
 // ── Constants ─────────────────────────────────────────────────────────────────
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const PATHS = {
  extractedEn: path.join(__dirname, "stage-1-extract/output/en.json"),
  db: path.join(__dirname, "db/pipeline.db"),
  reports: path.join(__dirname, "reports"),
  llamaHealth: "http://127.0.0.1:8080/health",
 };
 const SENTINEL = { entryId: 0, modelName: "system" };
 // ── Startup checks ────────────────────────────────────────────────────────────
 async function checkExtractedFilesExist(): Promise<void> {
  try {
    await fs.access(PATHS.extractedEn);
  } catch {
    console.error("\n  ERROR: stage-1-extract/output/en.json not found.");
    console.error("  Run the stage 1 extraction script first:");
    console.error("    pnpm extract\n");
    process.exit(1);
  }
 }
 async function checkAndInitDb(): Promise<void> {
  try {
    await fs.access(PATHS.db);
  } catch {
    console.log("  pipeline.db not found — initialising...");
    await initDb();
  }
 }
 async function checkAndImportDb(): Promise<void> {
  if (!isImported()) {
    console.log("  Base tables empty — importing Kaikki data...");
    await importKaikki();
  }
 }
 async function checkLlamaServer(): Promise<boolean> {
  try {
    const res = await fetch(PATHS.llamaHealth);
    return res.ok;
  } catch {
    return false;
  }
 }
 function isLocalProvider(provider: ProviderConfig): boolean {
  return provider.apiKey === "none";
 }
 async function checkProviderReady(provider: ProviderConfig): Promise<void> {
  if (isLocalProvider(provider)) {
    const running = await checkLlamaServer();
    if (!running) {
      console.error("\n  ERROR: llama.cpp server is not running.");
      console.error("  Start the server before running the pipeline:");
      console.error(
        "    ./build/bin/llama-server --model models/<model>.gguf \\",
      );
      console.error("      --port 8080 --host 127.0.0.1");
      console.error("  See llm-setup.md for full instructions.\n");
      process.exit(1);
    }
  } else {
    validateProviderKey(provider);
  }
 }
 // ── Run name generation ───────────────────────────────────────────────────────
 async function generateRunName(): Promise<string> {
  await fs.mkdir(PATHS.reports, { recursive: true });
   const date = new Date().toISOString().slice(0, 10);
  const files = await fs.readdir(PATHS.reports);
  const todaysRuns = files.filter(
    (f) => f.startsWith(date) && f.endsWith(".json"),
  ).length;
  return `${date}_run-${todaysRuns + 1}`;
 }
 // ── Shutdown handler ──────────────────────────────────────────────────────────
 let shutdownRequested = false;
 function registerShutdownHandler(stats: RunStats): void {
  const handler = (): void => {
    if (shutdownRequested) return;
    shutdownRequested = true;
    stats.stoppedAt = new Date();
    console.log("\n\n  Shutdown requested — finishing current record...");
  };
  process.on("SIGINT", handler);
  process.on("SIGTERM", handler);
 }
 // ── Stage status helpers ──────────────────────────────────────────────────────
 function getSentinelStatus(stage: RunStage): StageStatus {
  const db = openDb();
  const row = db
    .prepare(
      `SELECT status FROM run_status
       WHERE entry_id = ? AND model_name = ? AND stage = ?`,
    )
    .get(SENTINEL.entryId, SENTINEL.modelName, stage) as
    | { status: string }
    | undefined;
  db.close();
  return row?.status === "complete" ? "complete" : "pending";
 }
 function markSentinelComplete(stage: RunStage): void {
  const db = openDb();
  db.prepare(
    `INSERT INTO run_status (entry_id, model_name, stage, status)
     VALUES (?, ?, ?, 'complete')
     ON CONFLICT (entry_id, model_name, stage)
     DO UPDATE SET status = 'complete', updated_at = datetime('now')`,
  ).run(SENTINEL.entryId, SENTINEL.modelName, stage);
  db.close();
 }
 function getModelRound1Status(modelName: string): StageStatus {
  const db = openDb();
  const total = (
    db
      .prepare("SELECT COUNT(*) as count FROM entries WHERE language = 'en'")
      .get() as { count: number }
  ).count;
  const complete = (
    db
      .prepare(
        `SELECT COUNT(*) as count FROM run_status
         WHERE model_name = ? AND stage = 'round1_gloss'
         AND status = 'complete'`,
      )
      .get(modelName) as { count: number }
  ).count;
  db.close();
  if (complete === 0) return "pending";
  if (complete >= total) return "complete";
  return "in_progress";
 }
 function getModelRound2Status(modelName: string): StageStatus {
  const db = openDb();
  const total = (
    db
      .prepare("SELECT COUNT(*) as count FROM entries WHERE language = 'en'")
      .get() as { count: number }
  ).count;
  const complete = (
    db
      .prepare(
        `SELECT COUNT(*) as count FROM run_status
         WHERE model_name = ? AND stage = 'round2' AND status = 'complete'`,
      )
      .get(modelName) as { count: number }
  ).count;
  db.close();
  if (complete === 0) return "pending";
  if (complete >= total) return "complete";
  return "in_progress";
 }
 function isReverseLinkDone(): boolean {
  const db = openDb();
  const row = db
    .prepare(
      `SELECT status FROM run_status
       WHERE entry_id = ? AND model_name = ? AND stage = 'reverse_link'`,
    )
    .get(SENTINEL.entryId, SENTINEL.modelName) as
    | { status: string }
    | undefined;
  db.close();
  return row?.status === "complete";
 }
 function markReverseLinkComplete(): void {
  const db = openDb();
  db.prepare(
    `INSERT INTO run_status (entry_id, model_name, stage, status)
     VALUES (?, ?, 'reverse_link', 'complete')
     ON CONFLICT (entry_id, model_name, stage)
     DO UPDATE SET status = 'complete', updated_at = datetime('now')`,
  ).run(SENTINEL.entryId, SENTINEL.modelName);
  db.close();
 }
 // ── Stage runners ─────────────────────────────────────────────────────────────
 function runReverseLinkStage(): void {
  if (isReverseLinkDone()) {
    console.log("\n  [reverse link] Already complete, skipping");
    return;
  }
  console.log("\n  [reverse link] Syncing reverse translation links...");
  reverseLink();
  markReverseLinkComplete();
 }
 async function runRound1(
  provider: ProviderConfig,
  stats: RunStats,
 ): Promise<void> {
  console.log(`\n  [round 1] Running ${provider.name}...`);
  const counts = await enrich(provider);
  stats.recordsProcessed += counts.processed;
  stats.recordsSkipped += counts.skipped;
  stats.needsReview += counts.needsReview;
  stats.modelsRun.push(provider.name);
 }
 function compileCandidates(): void {
  console.log("\n  [compile candidates] Compiling round 1 output...");
  // TODO: implement compile candidates script
  console.log("  [compile candidates] not yet implemented");
  markSentinelComplete("compile_candidates");
 }
 function runRound2(provider: ProviderConfig, stats: RunStats): void {
  console.log(`\n  [round 2] Running ${provider.name}...`);
  // TODO: implement round 2 enrich script
  console.log(`  [round 2] ${provider.name} — not yet implemented`);
  stats.modelsRun.push(provider.name);
 }
 function compileVotes(): void {
  console.log("\n  [compile votes] Compiling round 2 votes...");
  // TODO: implement compile votes script
  console.log("  [compile votes] not yet implemented");
  markSentinelComplete("compile_votes");
 }
 function runMerge(): void {
  console.log("\n  [merge] Resolving votes...");
  // TODO: implement merge script
  console.log("  [merge] not yet implemented");
  markSentinelComplete("merge");
 }
 function runTiebreak(stats: RunStats): void {
  console.log("\n  [tiebreak] Resolving flagged entries...");
  // TODO: implement tiebreak logic
  console.log("  [tiebreak] not yet implemented");
  stats.currentStage = "tiebreak";
 }
 function runCompare(): void {
  console.log("\n  [compare] Generating COVERAGE.md...");
  // TODO: implement compare script
  console.log("  [compare] not yet implemented");
  markSentinelComplete("compare");
 }
 // ── Report generation ─────────────────────────────────────────────────────────
 async function generateReport(runName: string, stats: RunStats): Promise<void> {
  const db = openDb();
  const totalEntries = (
    db.prepare("SELECT COUNT(*) as count FROM entries").get() as {
      count: number;
    }
  ).count;
  const resolvedEntries = (
    db.prepare("SELECT COUNT(*) as count FROM resolved_entry_cefr").get() as {
      count: number;
    }
  ).count;
  const flaggedEntries = (
    db
      .prepare(
        `SELECT COUNT(*) as count FROM run_status
         WHERE stage = 'merge' AND status = 'flagged'`,
      )
      .get() as { count: number }
  ).count;
  const needsReview = (
    db
      .prepare(
        `SELECT COUNT(*) as count FROM run_status
         WHERE status = 'needs_review'`,
      )
      .get() as { count: number }
  ).count;
  db.close();
  const stoppedAt = stats.stoppedAt ?? new Date();
  const durationMs = stoppedAt.getTime() - stats.startedAt.getTime();
  const durationMin = Math.round(durationMs / 60_000);
  const isFinal =
    getSentinelStatus("compare") === "complete" && flaggedEntries === 0;
  const report = {
    runName,
    generatedAt: stoppedAt.toISOString(),
    durationMinutes: durationMin,
    isFinal,
    progress: {
      totalEntries,
      resolvedEntries,
      flaggedEntries,
      needsReview,
      recordsProcessedThisRun: stats.recordsProcessed,
      recordsSkippedThisRun: stats.recordsSkipped,
    },
    modelsRun: stats.modelsRun,
    stages: {
      reverseLink: isReverseLinkDone() ? "complete" : "pending",
      round1: ALL_PROVIDERS.map((p) => ({
        model: p.name,
        status: getModelRound1Status(p.name),
      })),
      compileCandidates: getSentinelStatus("compile_candidates"),
      round2: ALL_PROVIDERS.map((p) => ({
        model: p.name,
        status: getModelRound2Status(p.name),
      })),
      compileVotes: getSentinelStatus("compile_votes"),
      merge: getSentinelStatus("merge"),
      compare: getSentinelStatus("compare"),
    },
  };
  await fs.mkdir(PATHS.reports, { recursive: true });
  const jsonPath = path.join(PATHS.reports, `${runName}.json`);
  const mdPath = path.join(PATHS.reports, `${runName}.md`);
  await fs.writeFile(jsonPath, JSON.stringify(report, null, 2), "utf-8");
  const md = [
    `# Pipeline run: ${runName}`,
    ``,
    `Generated: ${stoppedAt.toISOString()}`,
    `Duration: ${durationMin} minutes`,
    isFinal
      ? `**Status: FINAL — pipeline complete**`
      : `**Status: In progress**`,
    ``,
    `## Progress`,
    ``,
    `| Metric | Value |`,
    `| ------ | ----- |`,
    `| Total entries | ${totalEntries.toLocaleString()} |`,
    `| Resolved entries | ${resolvedEntries.toLocaleString()} |`,
    `| Flagged entries | ${flaggedEntries.toLocaleString()} |`,
    `| Needs review | ${needsReview.toLocaleString()} |`,
    `| Records processed this run | ${stats.recordsProcessed.toLocaleString()} |`,
    `| Records skipped this run | ${stats.recordsSkipped.toLocaleString()} |`,
    ``,
    `## Stage status`,
    ``,
    `### Reverse link: ${report.stages.reverseLink}`,
    ``,
    `### Round 1`,
    ``,
    ...report.stages.round1.map(
      (s) =>
        `- ${s.status === "complete" ? "✅" : s.status === "in_progress" ? "🔄" : "🔲"} ${s.model}`,
    ),
    ``,
    `### Compile candidates: ${report.stages.compileCandidates}`,
    ``,
    `### Round 2`,
    ``,
    ...report.stages.round2.map(
      (s) =>
        `- ${s.status === "complete" ? "✅" : s.status === "in_progress" ? "🔄" : "🔲"} ${s.model}`,
    ),
    ``,
    `### Compile votes: ${report.stages.compileVotes}`,
    ``,
    `### Merge: ${report.stages.merge}`,
    ``,
    `### Compare: ${report.stages.compare}`,
    ``,
    `## Models run this session`,
    ``,
    stats.modelsRun.length > 0
      ? stats.modelsRun.map((m) => `- ${m}`).join("\n")
      : "_none_",
  ].join("\n");
  await fs.writeFile(mdPath, md, "utf-8");
  console.log(`\n  Report written → ${jsonPath}`);
  console.log(`  Report written → ${mdPath}`);
 }
 // ── Main ──────────────────────────────────────────────────────────────────────
 async function main(): Promise<void> {
  console.log("lila data pipeline\n");
  // ── Startup checks
  console.log("Checking prerequisites...");
  await checkExtractedFilesExist();
  await checkAndInitDb();
  await checkAndImportDb();
  console.log("  Prerequisites OK");
  // ── Run name
  const runName = await generateRunName();
  console.log(`\n  Run: ${runName}`);
  // ── Stats
  const stats: RunStats = {
    startedAt: new Date(),
    stoppedAt: null,
    recordsProcessed: 0,
    recordsSkipped: 0,
    needsReview: 0,
    modelsRun: [],
    currentStage: null,
  };
  registerShutdownHandler(stats);
  // ── Stage 2 — Reverse link
  runReverseLinkStage();
  if (shutdownRequested) {
    await generateReport(runName, stats);
    process.exit(0);
  }
  // ── Round 1
  console.log("\nRound 1 — generation");
  for (const provider of ALL_PROVIDERS) {
    if (shutdownRequested) break;
    const status = getModelRound1Status(provider.name);
    if (status === "complete") {
      console.log(`  [round 1] ${provider.name} — already complete, skipping`);
      continue;
    }
    await checkProviderReady(provider);
    stats.currentStage = "round1";
    if (status === "in_progress") {
      console.log(`  [round 1] ${provider.name} — resuming...`);
    }
    await runRound1(provider, stats);
  }
  if (shutdownRequested) {
    await generateReport(runName, stats);
    process.exit(0);
  }
  // ── Compile candidates
  if (getSentinelStatus("compile_candidates") === "complete") {
    console.log("\n  [compile candidates] Already complete, skipping");
  } else {
    stats.currentStage = "compile_candidates";
    compileCandidates();
  }
  if (shutdownRequested) {
    await generateReport(runName, stats);
    process.exit(0);
  }
  // ── Round 2
  console.log("\nRound 2 — voting");
  for (const provider of ALL_PROVIDERS) {
    if (shutdownRequested) break;
    const status = getModelRound2Status(provider.name);
    if (status === "complete") {
      console.log(`  [round 2] ${provider.name} — already complete, skipping`);
      continue;
    }
    await checkProviderReady(provider);
    stats.currentStage = "round2";
    if (status === "in_progress") {
      console.log(`  [round 2] ${provider.name} — resuming...`);
    }
    await runRound2(provider, stats);
  }
  if (shutdownRequested) {
    await generateReport(runName, stats);
    process.exit(0);
  }
  // ── Compile votes
  if (getSentinelStatus("compile_votes") === "complete") {
    console.log("\n  [compile votes] Already complete, skipping");
  } else {
    stats.currentStage = "compile_votes";
    compileVotes();
  }
  if (shutdownRequested) {
    await generateReport(runName, stats);
    process.exit(0);
  }
  // ── Merge
  if (getSentinelStatus("merge") === "complete") {
    console.log("\n  [merge] Already complete, skipping");
  } else {
    stats.currentStage = "merge";
    runMerge();
  }
  if (shutdownRequested) {
    await generateReport(runName, stats);
    process.exit(0);
  }
  // ── Tiebreak
  const db = openDb();
  const flagged = (
    db
      .prepare(
        `SELECT COUNT(*) as count FROM run_status
         WHERE stage = 'merge' AND status = 'flagged'`,
      )
      .get() as { count: number }
  ).count;
  db.close();
  if (flagged > 0) {
    stats.currentStage = "tiebreak";
    runTiebreak(stats);
  }
  if (shutdownRequested) {
    await generateReport(runName, stats);
    process.exit(0);
  }
  // ── Compare
  if (getSentinelStatus("compare") === "complete") {
    console.log("\n  [compare] Already complete, skipping");
  } else {
    stats.currentStage = "compare";
    runCompare();
  }
  // ── Report (disabled until full pipeline is implemented)
  // stats.stoppedAt = new Date();
  // await generateReport(runName, stats);
  console.log("\nPipeline complete.");
 }
 main().catch((err) => {
  console.error(err);
  process.exit(1);
 });
--- a/data-pipeline/sample/output/sample.json
+++ b/data-pipeline/sample/output/sample.json
--- a/data-pipeline/sample/scripts/sample.ts
+++ b/data-pipeline/sample/scripts/sample.ts
@ -154,7 +154,7 @@ async function loadAnnotated(): Promise<AnnotatedRecord[]> {
      for (const [l, examples] of Object.entries(record.examples)) {
        const lang = l as SupportedLanguageCode;
        if (!base.examples[lang]) {
-          base.examples[lang] = examples as Example[];
+          base.examples[lang] = examples;
        }
      }
    }
--- a/data-pipeline/stage-1-extract/scripts/extract.py
+++ b/data-pipeline/stage-1-extract/scripts/extract.py
@ -1,204 +0,0 @@
 """
 data-pipeline/stage-1-extract/scripts/extract.py
 Extract all synsets from the Open Multilingual Wordnet (OMW) for all
 supported languages and parts of speech.
 Output: one JSON file per language, written to stage-1-extract/output/
  en.json, it.json, es.json, de.json, fr.json
 Each file is a JSON array of synset records:
  {
    "source_id": "ili:i12345",
    "pos": "noun",
    "translations": { "en": ["dog", "canine"], "it": ["cane"] },
    "glosses":      { "en": ["a domesticated animal..."] },
    "examples":     { "en": ["the dog barked at the stranger"] }
  }
 Usage:
  python stage-1-extract/scripts/extract.py
  python stage-1-extract/scripts/extract.py --sample
 Prerequisites:
  pip install wn
  python -m wn download omw-en:1.4
  python -m wn download omw-it:1.4
  python -m wn download omw-de:1.4
  python -m wn download omw-es:1.4
  python -m wn download omw-fr:1.4
 """
 import json
 import sys
 from pathlib import Path
 import wn
 SUPPORTED_LANGUAGE_CODES: list[str] = ["en", "it", "es", "de", "fr"]
 POS_MAP: dict[str, str] = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective",  # adjective satellite — collapsed into adjective
    "r": "adverb",
 }
 def extract_all(
    output_dir: str = "stage-1-extract/output", sample: bool = False
 ) -> None:
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    sample_size = 100 if sample else None
    # Load one Wordnet object per language up front.
    print("Loading wordnets...")
    wordnets: dict[str, wn.Wordnet] = {}
    for lang in SUPPORTED_LANGUAGE_CODES:
        try:
            wordnets[lang] = wn.Wordnet(lang=lang)
            synset_count = len(wordnets[lang].synsets())
            print(f"  {lang}: {synset_count:,} total synsets")
        except wn.Error as e:
            print(f"  ERROR loading {lang}: {e}")
            print(f"  Run: python -m wn download omw-{lang}:1.4")
            sys.exit(1)
    # Collect per-ILI data across all languages and POS.
    print("\nExtracting synsets...")
    by_ili: dict[str, dict] = {}
    for lang, wnet in wordnets.items():
        for omw_pos, pos_label in POS_MAP.items():
            synsets = wnet.synsets(pos=omw_pos)
            covered = 0
            for synset in synsets:
                ili = synset.ili
                if not ili:
                    continue
                covered += 1
                lemmas = [str(lemma) for lemma in synset.lemmas()]
                defns = [d for d in synset.definitions() if d]
                examples = [e for e in synset.examples() if e]
                if ili not in by_ili:
                    by_ili[ili] = {"pos": pos_label}
                if lang not in by_ili[ili]:
                    by_ili[ili][lang] = {
                        "lemmas": lemmas,
                        "glosses": defns,
                        "examples": examples,
                    }
                else:
                    # ILI already exists for this language — merge data.
                    # Happens when 'a' and 's' both map to adjective for the
                    # same ILI. Deduplicate to avoid repeated entries.
                    existing = by_ili[ili][lang]
                    existing["lemmas"] = list(
                        dict.fromkeys(existing["lemmas"] + lemmas)
                    )
                    existing["glosses"] = list(
                        dict.fromkeys(existing["glosses"] + defns)
                    )
                    existing["examples"] = list(
                        dict.fromkeys(existing["examples"] + examples)
                    )
            print(f"  {lang} {pos_label}: {covered:,} synsets with ILI")
    # Build records and write single combined output file.
    print("\nBuilding records...")
    ilis = sorted(by_ili.keys())
    if sample_size:
        ilis = ilis[:sample_size]
    records: list[dict] = []
    for ili in ilis:
        data = by_ili[ili]
        record: dict = {
            "source_id": f"ili:{ili}",
            "pos": data["pos"],
            "translations": {},
            "glosses": {},
            "examples": {},
        }
        for key, value in data.items():
            if key == "pos":
                continue
            lang = key
            if value["lemmas"]:
                record["translations"][lang] = value["lemmas"]
            if value["glosses"]:
                record["glosses"][lang] = value["glosses"]
            if value["examples"]:
                record["examples"][lang] = value["examples"]
        records.append(record)
    output_file = out / "omw.json"
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, ensure_ascii=False)
    print(f"\nWrote {len(records):,} synsets → {output_file}")
    _print_coverage(records)
 def _print_coverage(records: list[dict]) -> None:
    """Print per-language translation, gloss, and example counts."""
    lang_stats: dict[str, dict[str, int]] = {}
    for lang in SUPPORTED_LANGUAGE_CODES:
        lang_stats[lang] = {"translations": 0, "glosses": 0, "examples": 0}
    pos_stats: dict[str, int] = {}
    for r in records:
        pos = r["pos"]
        pos_stats[pos] = pos_stats.get(pos, 0) + 1
        for lang, lemmas in r["translations"].items():
            if lang in lang_stats:
                lang_stats[lang]["translations"] += len(lemmas)
        for lang, gloss_list in r["glosses"].items():
            if lang in lang_stats:
                lang_stats[lang]["glosses"] += len(gloss_list)
        for lang, example_list in r["examples"].items():
            if lang in lang_stats:
                lang_stats[lang]["examples"] += len(example_list)
    print("\nPOS breakdown:")
    for pos, count in sorted(pos_stats.items()):
        print(f"  {pos}: {count:,}")
    print("\nCoverage per language:")
    for lang, counts in lang_stats.items():
        t = counts["translations"]
        g = counts["glosses"]
        e = counts["examples"]
        total = len(records)
        print(
            f"  {lang}: {t:,} translations, {g:,} glosses, {e:,} examples (avg {(t / total):.1f} translations/synset)"
        )
 if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description="Extract OMW data to JSON")
    parser.add_argument(
        "--output-dir",
        default="stage-1-extract/output",
        help="Output directory for JSON files",
    )
    parser.add_argument(
        "--sample",
        action="store_true",
        help="Extract only 100 synsets per language for inspection",
    )
    args = parser.parse_args()
    extract_all(output_dir=args.output_dir, sample=args.sample)
--- a/data-pipeline/stage-1-extract/scripts/extract.ts
+++ b/data-pipeline/stage-1-extract/scripts/extract.ts
@ -0,0 +1,257 @@
 import fs from "node:fs";
 import path from "node:path";
 import readline from "node:readline";
 import { fileURLToPath } from "node:url";
 import { SUPPORTED_LANGUAGE_CODES } from "@lila/shared";
 import type { SupportedLanguageCode, SupportedPos } from "@lila/shared";
 // ── Types ─────────────────────────────────────────────────────────────────────
 type KaikkiTranslation = {
  code?: string;
  lang_code?: string;
  word?: string;
  sense?: string;
 };
 type KaikkiSense = {
  glosses?: string[];
  examples?: { text?: string }[];
  translations?: KaikkiTranslation[];
 };
 type KaikkiEntry = {
  word?: string;
  pos?: string;
  lang_code?: string;
  senses?: KaikkiSense[];
 };
 export type ExtractedSense = {
  headword: string;
  language: SupportedLanguageCode;
  pos: SupportedPos;
  sense_index: number;
  gloss: string | null;
  examples: string[];
  translations: {
    target_lang: SupportedLanguageCode;
    word: string;
    sense_hint: string | null;
  }[];
 };
 // ── Constants ─────────────────────────────────────────────────────────────────
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const SOURCES_DIR = path.resolve(__dirname, "../sources");
 const OUTPUT_DIR = path.resolve(__dirname, "../output");
 const LANG_TO_FILE: Record<SupportedLanguageCode, string> = {
  en: "kaikki.org-dictionary-English.jsonl",
  de: "kaikki.org-dictionary-German.jsonl",
  it: "kaikki.org-dictionary-Italian.jsonl",
  fr: "kaikki.org-dictionary-French.jsonl",
  es: "kaikki.org-dictionary-Spanish.jsonl",
 };
 const POS_MAP: Record<string, SupportedPos> = {
  noun: "noun",
  verb: "verb",
  adj: "adjective",
  adv: "adverb",
 };
 const SUPPORTED_LANG_SET = new Set<string>(SUPPORTED_LANGUAGE_CODES);
 // ── Helpers ───────────────────────────────────────────────────────────────────
 function mapPos(kaikkiPos: string): SupportedPos | null {
  return POS_MAP[kaikkiPos] ?? null;
 }
 function isAbbreviation(gloss: string): boolean {
  return gloss.toLowerCase().startsWith("abbreviation of");
 }
 function extractTranslations(
  sense: KaikkiSense,
  sourceLang: SupportedLanguageCode,
 ): ExtractedSense["translations"] {
  const seen = new Set<string>();
  const result: ExtractedSense["translations"] = [];
  for (const t of sense.translations ?? []) {
    const code = t.code ?? t.lang_code;
    if (!code || !SUPPORTED_LANG_SET.has(code)) continue;
    if (code === sourceLang) continue; // skip same-language translations
    if (!t.word?.trim()) continue;
    const key = `${code}:${t.word.trim()}`;
    if (seen.has(key)) continue;
    seen.add(key);
    result.push({
      target_lang: code as SupportedLanguageCode,
      word: t.word.trim(),
      sense_hint: t.sense?.trim() || null, // an empty hint becomes null
    });
  }
  return result;
 }
 function extractExamples(sense: KaikkiSense): string[] {
  return (sense.examples ?? [])
    .map((e) => e.text?.trim())
    .filter((t): t is string => !!t);
 }
 function processEntry(
  entry: KaikkiEntry,
  sourceLang: SupportedLanguageCode,
 ): Omit<ExtractedSense, "sense_index">[] {
  const pos = mapPos(entry.pos ?? "");
  if (!pos) return [];
  if (!entry.word?.trim()) return [];
  // For non-English files, only process entries in the target language
  const entryLang = entry.lang_code;
  if (sourceLang !== "en" && entryLang !== sourceLang) return [];
  const headword = entry.word.trim();
  const results: Omit<ExtractedSense, "sense_index">[] = [];
  for (const sense of entry.senses ?? []) {
    const gloss = sense.glosses?.[0]?.trim() ?? null;
    if (gloss && isAbbreviation(gloss)) continue;
    if (sourceLang === "en") {
      // English: require translations in supported languages
      const translations = extractTranslations(sense, sourceLang);
      if (translations.length === 0) continue;
      results.push({
        headword,
        language: sourceLang,
        pos,
        gloss,
        examples: extractExamples(sense),
        translations,
      });
    } else {
      // Non-English: just extract the entry, no translations needed
      results.push({
        headword,
        language: sourceLang,
        pos,
        gloss,
        examples: extractExamples(sense),
        translations: [],
      });
    }
  }
  return results;
 }
 // ── Extract ───────────────────────────────────────────────────────────────────
 export async function extract(
  lang: SupportedLanguageCode,
  sampleLimit?: number,
 ): Promise<void> {
  const filename = LANG_TO_FILE[lang];
  const sourcePath = path.join(SOURCES_DIR, filename);
  const outputPath = path.join(OUTPUT_DIR, `${lang}.json`);
  console.log(`\nExtracting ${lang}...`);
  console.log(`  Source: ${sourcePath}`);
  if (sampleLimit) console.log(`  Sample mode: ${sampleLimit} entries`);
  await fs.promises.mkdir(OUTPUT_DIR, { recursive: true });
  const fileStream = fs.createReadStream(sourcePath);
  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity,
  });
  const senses: ExtractedSense[] = [];
  const senseIndexMap = new Map<string, number>();
  let linesRead = 0;
  let entriesProcessed = 0;
  let entriesSkipped = 0;
  for await (const line of rl) {
    if (sampleLimit && entriesProcessed >= sampleLimit) break;
    linesRead++; // count blank lines too so warnings report the real line number
    if (!line.trim()) continue;
    let entry: KaikkiEntry;
    try {
      entry = JSON.parse(line) as KaikkiEntry;
    } catch {
      console.warn(`  Warning: failed to parse line ${linesRead}, skipping`);
      continue;
    }
    const extracted = processEntry(entry, lang);
    if (extracted.length === 0) {
      entriesSkipped++;
      continue;
    }
    for (const sense of extracted) {
      const key = `${sense.headword}|${sense.pos}`;
      const senseIndex = senseIndexMap.get(key) ?? 0;
      senseIndexMap.set(key, senseIndex + 1);
      senses.push({ ...sense, sense_index: senseIndex });
    }
    entriesProcessed++;
    if (entriesProcessed % 10_000 === 0) {
      console.log(
        `  Processed ${entriesProcessed.toLocaleString()} entries...`,
      );
    }
  }
  await fs.promises.writeFile(
    outputPath,
    JSON.stringify(senses, null, 2),
    "utf-8",
  );
  console.log(`  Lines read:        ${linesRead.toLocaleString()}`);
  console.log(`  Entries processed: ${entriesProcessed.toLocaleString()}`);
  console.log(`  Entries skipped:   ${entriesSkipped.toLocaleString()}`);
  console.log(`  Senses extracted:  ${senses.length.toLocaleString()}`);
  console.log(`  Output:            ${outputPath}`);
 }
 // ── Main ─────────────────────────────────────────────────────────────────────
 async function main(): Promise<void> {
  // Hardcoded sample limit for development — remove for full extraction
  const SAMPLE = 500;
  for (const lang of SUPPORTED_LANGUAGE_CODES) {
    await extract(lang, SAMPLE);
  }
  console.log("\nExtraction complete.");
 }
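 // Compare resolved file paths; a raw `file://` string comparison breaks on
 // relative or URL-encoded paths.
 if (
  process.argv[1] &&
  fileURLToPath(import.meta.url) === path.resolve(process.argv[1])
 ) {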
  main().catch((err) => {
    console.error(err);
    process.exit(1);
  });
 }
--- a/data-pipeline/stage-2-annotate/scripts/annotate.ts
+++ b/data-pipeline/stage-2-annotate/scripts/annotate.ts
@ -1,227 +0,0 @@
 import fs from "node:fs/promises";
 import path from "node:path";
 import { SUPPORTED_LANGUAGE_CODES } from "@lila/shared";
 import type { SupportedLanguageCode, SupportedPos } from "@lila/shared";
 // ── Types ────────────────────────────────────────────────────────────────────
 type OmwExample = { text: string; source: "omw" };
 type CefrExample = { text: string; source: "cefr" };
 type Example = OmwExample | CefrExample;
 type OmwRecord = {
  source_id: string;
  pos: SupportedPos;
  translations: Partial<Record<SupportedLanguageCode, string[]>>;
  glosses: Partial<Record<SupportedLanguageCode, string[]>>;
  examples: Partial<Record<SupportedLanguageCode, string[]>>;
 };
 type AnnotatedRecord = {
  source_id: string;
  pos: SupportedPos;
  translations: Partial<Record<SupportedLanguageCode, string[]>>;
  glosses: Partial<Record<SupportedLanguageCode, string[]>>;
  examples: Partial<Record<SupportedLanguageCode, Example[]>>;
  votes: Partial<
    Record<SupportedLanguageCode, Record<string, { cefr_source: string }>>
  >;
 };
 type CefrSourceEntry = {
  word: string;
  pos: string;
  cefr_level: string;
  example_sentence_native?: string;
 };
 type ConflictEntry = {
  word: string;
  pos: string;
  language: SupportedLanguageCode;
  levels: string[];
 };
 // ── Constants ─────────────────────────────────────────────────────────────────
 const POS_NORMALIZE: Record<string, SupportedPos> = {
  noun: "noun",
  n: "noun",
  nom: "noun", // French
  verb: "verb",
  verbs: "verb",
  v: "verb",
  v1: "verb",
  adjective: "adjective",
  adjektiv: "adjective", // German
  adj: "adjective",
  adverb: "adverb",
  adverbs: "adverb",
  adv: "adverb",
 };
 const CEFR_LEVELS = new Set(["A1", "A2", "B1", "B2", "C1", "C2"]);
 const PATHS = {
  omw: "stage-1-extract/output/omw.json",
  cefrDir: "stage-2-annotate/sources/cefr",
  outputDir: "stage-2-annotate/output",
 };
 // ── CEFR source loading ───────────────────────────────────────────────────────
 type CefrIndex = Map<string, { level: string; example?: string }>;
 async function loadCefrSource(
  lang: SupportedLanguageCode,
 ): Promise<{ index: CefrIndex; conflicts: ConflictEntry[] }> {
  const filepath = path.join(PATHS.cefrDir, `${lang}.json`);
  const raw = await fs.readFile(filepath, "utf-8");
  const entries = JSON.parse(raw) as CefrSourceEntry[];
  // First pass — detect conflicts.
  // Structure: "word|pos" -> Set of CEFR levels seen
  const seen = new Map<string, Set<string>>();
  for (const entry of entries) {
    const pos = POS_NORMALIZE[entry.pos.toLowerCase().trim()];
    if (!pos) continue;
    if (!CEFR_LEVELS.has(entry.cefr_level)) continue;
    const key = `${entry.word.toLowerCase().trim()}|${pos}`;
    if (!seen.has(key)) seen.set(key, new Set());
    seen.get(key)!.add(entry.cefr_level);
  }
  const conflicts: ConflictEntry[] = [];
  for (const [key, levels] of seen.entries()) {
    if (levels.size > 1) {
      const [word, pos] = key.split("|") as [string, string];
      conflicts.push({ word, pos, language: lang, levels: [...levels] });
    }
  }
  // Second pass — build index, skip conflicting entries.
  const conflictKeys = new Set(conflicts.map((c) => `${c.word}|${c.pos}`));
  const index: CefrIndex = new Map();
  for (const entry of entries) {
    const pos = POS_NORMALIZE[entry.pos.toLowerCase().trim()];
    if (!pos) continue;
    if (!CEFR_LEVELS.has(entry.cefr_level)) continue;
    const key = `${entry.word.toLowerCase().trim()}|${pos}`;
    if (conflictKeys.has(key)) continue;
    index.set(key, {
      level: entry.cefr_level,
      ...(entry.example_sentence_native
        ? { example: entry.example_sentence_native }
        : {}),
    });
  }
  return { index, conflicts };
 }
 // ── Annotation ────────────────────────────────────────────────────────────────
 async function annotate(): Promise<void> {
  // Load OMW records
  console.log("Reading OMW extract...");
  const raw = await fs.readFile(PATHS.omw, "utf-8");
  const omwRecords = JSON.parse(raw) as OmwRecord[];
  console.log(`  Loaded ${omwRecords.length.toLocaleString()} synsets`);
  // Load CEFR sources for all languages
  console.log("\nLoading CEFR source files...");
  const cefrIndexes = new Map<SupportedLanguageCode, CefrIndex>();
  const allConflicts: ConflictEntry[] = [];
  for (const lang of SUPPORTED_LANGUAGE_CODES) {
    const { index, conflicts } = await loadCefrSource(lang);
    cefrIndexes.set(lang, index);
    allConflicts.push(...conflicts);
    console.log(
      `  ${lang}: ${index.size.toLocaleString()} entries, ${conflicts.length} conflicts`,
    );
  }
  // Write conflicts file
  await fs.mkdir(PATHS.outputDir, { recursive: true });
  await fs.writeFile(
    path.join(PATHS.outputDir, "conflicts.json"),
    JSON.stringify(allConflicts, null, 2),
    "utf-8",
  );
  console.log(
    `\nWrote ${allConflicts.length} conflicts → ${PATHS.outputDir}/conflicts.json`,
  );
  // Annotate and write one file per language
  console.log("\nAnnotating...");
  for (const lang of SUPPORTED_LANGUAGE_CODES) {
    const index = cefrIndexes.get(lang)!;
    const records: AnnotatedRecord[] = [];
    let matched = 0;
    for (const record of omwRecords) {
      const annotated: AnnotatedRecord = {
        source_id: record.source_id,
        pos: record.pos,
        translations: record.translations,
        glosses: record.glosses,
        examples: {},
        votes: {},
      };
      // Convert OMW examples to typed format
      for (const [l, exList] of Object.entries(record.examples)) {
        annotated.examples[l as SupportedLanguageCode] = exList.map((text) => ({
          text,
          source: "omw" as const,
        }));
      }
      // Match translations for this language against CEFR index
      const langTranslations = record.translations[lang] ?? [];
      for (const word of langTranslations) {
        const key = `${word.toLowerCase().trim()}|${record.pos}`;
        const cefrEntry = index.get(key);
        if (!cefrEntry) continue;
        matched++;
        // Add CEFR vote
        if (!annotated.votes[lang]) annotated.votes[lang] = {};
        annotated.votes[lang]![word] = { cefr_source: cefrEntry.level };
        // Add native example if present
        if (cefrEntry.example) {
          if (!annotated.examples[lang]) annotated.examples[lang] = [];
          annotated.examples[lang]!.push({
            text: cefrEntry.example,
            source: "cefr" as const,
          });
        }
      }
      records.push(annotated);
    }
    const outputFile = path.join(PATHS.outputDir, `${lang}.json`);
    await fs.writeFile(outputFile, JSON.stringify(records, null, 2), "utf-8");
    console.log(
      `  ${lang}: ${matched.toLocaleString()} matches → ${outputFile}`,
    );
  }
 }
 // ── Main ─────────────────────────────────────────────────────────────────────
 annotate().catch((err) => {
  console.error(err);
  process.exit(1);
 });
--- a/data-pipeline/stage-2-annotate/sources/cefr/de.json
+++ b/data-pipeline/stage-2-annotate/sources/cefr/de.json
--- a/data-pipeline/stage-2-annotate/sources/cefr/en.json
+++ b/data-pipeline/stage-2-annotate/sources/cefr/en.json
--- a/data-pipeline/stage-2-annotate/sources/cefr/es.json
+++ b/data-pipeline/stage-2-annotate/sources/cefr/es.json
--- a/data-pipeline/stage-2-annotate/sources/cefr/fr.json
+++ b/data-pipeline/stage-2-annotate/sources/cefr/fr.json
--- a/data-pipeline/stage-2-annotate/sources/cefr/it.json
+++ b/data-pipeline/stage-2-annotate/sources/cefr/it.json
--- a/data-pipeline/stage-2-reverse-link/scripts/reverse-link.ts
+++ b/data-pipeline/stage-2-reverse-link/scripts/reverse-link.ts
@ -0,0 +1,109 @@
 import { openDb } from "../../db/index.js";
 // ── Types ─────────────────────────────────────────────────────────────────────
 type TranslationRow = {
  translation_id: number;
  entry_id: number;
  entry_language: string;
  entry_headword: string;
  target_lang: string;
  word: string;
  sense_hint: string | null;
 };
 type EntryRow = { id: number };
 // ── Sync ──────────────────────────────────────────────────────────────────────
 export function reverseLink(): void {
  const db = openDb();
  // Find all translations and their source entry details
  const translations = db
    .prepare(
      `SELECT
        t.id          AS translation_id,
        t.entry_id,
        e.language    AS entry_language,
        e.headword    AS entry_headword,
        t.target_lang,
        t.word,
        t.sense_hint
       FROM translations t
       JOIN entries e ON e.id = t.entry_id`,
    )
    .all() as TranslationRow[];
  console.log(
    `  Found ${translations.length.toLocaleString()} translations to check`,
  );
  const findEntry = db.prepare(
    `SELECT id FROM entries WHERE headword = ? AND language = ? LIMIT 1`,
  );
  const insertReverseLink = db.prepare(
    `INSERT INTO translations (entry_id, target_lang, word, sense_hint, source)
     VALUES (?, ?, ?, ?, 'reverse_link')
     ON CONFLICT (entry_id, target_lang, word) DO NOTHING`,
  );
  const sync = db.transaction(() => {
    let inserted = 0;
    let skipped = 0;
    let noEntry = 0;
    for (const t of translations) {
      // Look for an entry in the target language with the translation word as headword
      const targetEntry = findEntry.get(t.word, t.target_lang) as
        | EntryRow
        | undefined;
      if (!targetEntry) {
        noEntry++;
        continue;
      }
      // Insert reverse link: target entry → source language → source headword
      const result = insertReverseLink.run(
        targetEntry.id,
        t.entry_language,
        t.entry_headword,
        t.sense_hint ?? null,
      );
      if (result.changes > 0) {
        inserted++;
      } else {
        skipped++;
      }
    }
    return { inserted, skipped, noEntry };
  });
  const counts = sync();
  db.close();
  console.log(`  Inserted: ${counts.inserted.toLocaleString()} reverse links`);
  console.log(
    `  Skipped:  ${counts.skipped.toLocaleString()} (already existed)`,
  );
  console.log(
    `  No entry: ${counts.noEntry.toLocaleString()} (target word not in entries)`,
  );
 }
 // ── Main ─────────────────────────────────────────────────────────────────────
 function main(): void {
  console.log("Running reverse link sync...");
  reverseLink();
  console.log("\nReverse link sync complete.");
 }
 if (import.meta.url === `file://${process.argv[1]}`) {
  main();
 }
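 // ── Illustrative sketch (not part of the pipeline) ───────────────────────────
 // A minimal in-memory sketch of the reverse-link rule used above, assuming
 // plain arrays instead of pipeline.db. `SketchEntry`, `SketchTranslation` and
 // `sketchReverseLinks` are hypothetical names introduced only for illustration.
 type SketchEntry = { id: number; language: string; headword: string };
 type SketchTranslation = { entry_id: number; target_lang: string; word: string };
 function sketchReverseLinks(
  entries: SketchEntry[],
  translations: SketchTranslation[],
 ): SketchTranslation[] {
  const byId = new Map<number, SketchEntry>();
  const byKey = new Map<string, SketchEntry>();
  for (const e of entries) {
    byId.set(e.id, e);
    const key = `${e.language}|${e.headword}`;
    // Keep a single entry per (language, headword), like the LIMIT 1 lookup.
    if (!byKey.has(key)) byKey.set(key, e);
  }
  // Existing rows play the role of the ON CONFLICT ... DO NOTHING guard.
  const existing = new Set(
    translations.map((t) => `${t.entry_id}|${t.target_lang}|${t.word}`),
  );
  const added: SketchTranslation[] = [];
  for (const t of translations) {
    const source = byId.get(t.entry_id);
    const target = byKey.get(`${t.target_lang}|${t.word}`);
    if (!source || !target) continue; // target word has no entry of its own
    // Reverse link: target entry -> source language -> source headword.
    const reverse: SketchTranslation = {
      entry_id: target.id,
      target_lang: source.language,
      word: source.headword,
    };
    const key = `${reverse.entry_id}|${reverse.target_lang}|${reverse.word}`;
    if (existing.has(key)) continue; // already linked
    existing.add(key);
    added.push(reverse);
  }
  return added;
 }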
--- a/data-pipeline/stage-3-enrich/config.ts
+++ b/data-pipeline/stage-3-enrich/config.ts
@ -0,0 +1,123 @@
 // ── Provider configuration ────────────────────────────────────────────────────
 //
 // Each provider + model combination counts as one vote in the final majority.
 // Running the same model twice is not supported — one model, one vote.
 // The `name` field is used as the model identifier in pipeline.db and must
 // be unique across all runs.
 //
 // The pipeline iterates through ALL_PROVIDERS in order, skipping models that
 // have already completed a full run and resuming models with partial progress.
 //
 // See llm-setup.md for full setup instructions and model recommendations.
 export type ProviderConfig = {
  name: string; // unique model identifier — stored in pipeline.db
  baseURL: string;
  apiKey: string;
  model: string;
  maxTokens: number;
 };
 // ── Local llama.cpp ───────────────────────────────────────────────────────────
 export const LOCAL_QWEN35_4B: ProviderConfig = {
  name: "local-qwen3.5-4b",
  baseURL: "http://127.0.0.1:8080/v1",
  apiKey: "none",
  model: "qwen3.5-4b",
  maxTokens: 1024, // no reasoning overhead so 1024 is enough
 };
 export const LOCAL_GEMMA4: ProviderConfig = {
  name: "local-gemma4-e4b",
  baseURL: "http://127.0.0.1:8080/v1",
  apiKey: "none", // llama.cpp ignores this
  model: "gemma4-e4b", // llama.cpp ignores model name, uses loaded model
  maxTokens: 2048,
 };
 export const LOCAL_QWEN7B: ProviderConfig = {
  name: "local-qwen2.5-7b",
  baseURL: "http://127.0.0.1:8080/v1",
  apiKey: "none",
  model: "qwen2.5-7b",
  maxTokens: 512,
 };
 // ── OpenRouter — free tier ────────────────────────────────────────────────────
 export const OR_QWEN3_480B: ProviderConfig = {
  name: "or-qwen3-480b",
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env["OPENROUTER_API_KEY"] ?? "",
  model: "qwen/qwen3-coder:free",
  maxTokens: 512,
 };
 export const OR_GEMMA4_31B: ProviderConfig = {
  name: "or-gemma4-31b",
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env["OPENROUTER_API_KEY"] ?? "",
  model: "google/gemma-4-31b-it:free",
  maxTokens: 512,
 };
 export const OR_QWEN3_80B: ProviderConfig = {
  name: "or-qwen3-80b",
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env["OPENROUTER_API_KEY"] ?? "",
  model: "qwen/qwen3-next-80b-a3b-instruct:free",
  maxTokens: 512,
 };
 export const OR_NEMOTRON: ProviderConfig = {
  name: "or-nemotron-120b",
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env["OPENROUTER_API_KEY"] ?? "",
  model: "nvidia/nemotron-3-super-120b-a12b:free",
  maxTokens: 512,
 };
 // ── Anthropic — reference baseline ───────────────────────────────────────────
 // Note: Anthropic uses a different API format. An adapter is required.
 // See llm-setup.md for details.
 export const ANTHROPIC_SONNET: ProviderConfig = {
  name: "anthropic-sonnet-4",
  baseURL: "https://api.anthropic.com/v1",
  apiKey: process.env["ANTHROPIC_API_KEY"] ?? "",
  model: "claude-sonnet-4-6",
  maxTokens: 512,
 };
 // ── All configured providers ──────────────────────────────────────────────────
 // The pipeline runs through these in order — local models first, then cloud.
 // Add new providers here to include them in the voting pool.
 export const ALL_PROVIDERS: ProviderConfig[] = [
  LOCAL_QWEN35_4B,
  // LOCAL_GEMMA4,
  // LOCAL_QWEN7B,
  // OR_QWEN3_480B,
  // OR_GEMMA4_31B,
  // OR_QWEN3_80B,
  // OR_NEMOTRON,
  // ANTHROPIC_SONNET,
 ];
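 /*
  Illustrative sketch, not part of this change. The header comment says each
  provider + model combination counts as one vote in the final majority, but the
  tallying code is not in this diff. Assuming a strict majority over distinct
  model names, with ties left undecided, a tally could look roughly like this.
  `Vote` and `majorityVote` are hypothetical names.

```typescript
// Hypothetical types and helper: the real tallying code is not shown in this diff.
type Vote = { model: string; value: string };

function majorityVote(votes: Vote[]): string | null {
  // Keep only the first vote per model name so a repeated model cannot count twice.
  const perModel = new Map<string, string>();
  for (const v of votes) {
    if (!perModel.has(v.model)) perModel.set(v.model, v.value);
  }

  // Count how many distinct models chose each value.
  const counts = new Map<string, number>();
  perModel.forEach((value) => {
    counts.set(value, (counts.get(value) ?? 0) + 1);
  });

  // A strict majority needs more than half of the distinct models.
  const needed = Math.floor(perModel.size / 2) + 1;
  let winner: string | null = null;
  counts.forEach((count, value) => {
    if (count >= needed) winner = value;
  });
  return winner;
}

const cefrWinner = majorityVote([
  { model: "local-qwen3.5-4b", value: "B1" },
  { model: "or-gemma4-31b", value: "B1" },
  { model: "or-nemotron-120b", value: "B2" },
]);
console.log(cefrWinner);
```

  Returning null on a tie is an assumption made here; it would leave the entry
  undecided instead of picking a value arbitrarily.
 */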
 // ── Key validation ────────────────────────────────────────────────────────────
 // Placeholder apiKey values used by local providers that need no real key.
 const LOCAL_API_KEYS = new Set(["none"]);
 export function validateProviderKey(provider: ProviderConfig): void {
  if (LOCAL_API_KEYS.has(provider.apiKey)) return;
  if (!provider.apiKey) {
    const keyName = provider.name.startsWith("anthropic")
      ? "ANTHROPIC_API_KEY"
      : "OPENROUTER_API_KEY";
    console.error(`\n  ERROR: ${keyName} is not set in .env`);
    console.error(`  Provider "${provider.name}" requires this key to run.\n`);
    process.exit(1);
  }
 }
--- a/data-pipeline/stage-3-enrich/scripts/enrich.ts
+++ b/data-pipeline/stage-3-enrich/scripts/enrich.ts
@ -0,0 +1,877 @@
 import { openDb } from "../../db/index.js";
 import type { ProviderConfig } from "../config.js";
 import { CEFR_LEVELS, SUPPORTED_LANGUAGE_CODES } from "@lila/shared";
 import type { SupportedLanguageCode } from "@lila/shared";
 // ── Types ─────────────────────────────────────────────────────────────────────
 type EntryRow = {
  id: number;
  headword: string;
  language: SupportedLanguageCode;
  pos: string;
  sense_index: number; // read by buildGlossPrompt; assumes the entries table has this column
  gloss: string | null;
  examples: string; // JSON array string
 };
 type TranslationRow = {
  id: number;
  target_lang: SupportedLanguageCode;
  word: string;
 };
 type GlossResult = { status: "ok" } | { status: "improved"; gloss: string };
 type ExampleResult = { status: "ok" } | { status: "improved"; example: string };
 type TranslationResult = {
  translations: Partial<
    Record<SupportedLanguageCode, Record<string, "ok" | "reject">>
  >;
  generated?: Partial<Record<SupportedLanguageCode, string>>;
 };
 type CefrResult = {
  headword_cefr: string;
  translation_cefr: Partial<
    Record<SupportedLanguageCode, Record<string, string>>
  >;
 };
 type SubStage =
  | "round1_gloss"
  | "round1_example"
  | "round1_translations"
  | "round1_cefr";
 // ── Constants ─────────────────────────────────────────────────────────────────
 const SUPPORTED_LANG_SET = new Set<string>(SUPPORTED_LANGUAGE_CODES);
 const CEFR_SET = new Set<string>(CEFR_LEVELS);
 // ── Shutdown ──────────────────────────────────────────────────────────────────
 let shutdownRequested = false;
 let currentCallController: AbortController | null = null;
 export function registerEnrichShutdown(): void {
  const handler = (): void => {
    if (shutdownRequested) return;
    shutdownRequested = true;
    console.log("\n\n  Shutdown requested — aborting current LLM call...");
    currentCallController?.abort();
  };
  process.on("SIGINT", handler);
  process.on("SIGTERM", handler);
 }
 // ── Prompt builders ───────────────────────────────────────────────────────────
 function buildGlossPrompt(entry: EntryRow): string {
  const glossText = entry.gloss ?? "none";
  const examples: string[] = JSON.parse(entry.examples) as string[];
  const examplesText =
    examples.length > 0 ? examples.map((e) => `  - ${e}`).join("\n") : "  none";
  return `You are a language learning expert.
 Review this gloss for the ${entry.pos} "${entry.headword}" (sense ${entry.sense_index}).
 Gloss: "${glossText}"
 Examples of this specific sense:
 ${examplesText}
 Is this gloss clear, accurate for this specific sense, and suitable for a language learner?
 - If yes, respond with: {"status": "ok"}
 - If no or if gloss is "none", respond with: {"status": "improved", "gloss": "your improved gloss here"}
 IMPORTANT: Your improved gloss must describe THIS SPECIFIC SENSE shown by the examples above,
 not a more common or general meaning of the word.
 Respond ONLY with valid JSON and nothing else.`;
 }
 function buildTranslationsPrompt(
  entry: EntryRow,
  translations: TranslationRow[],
  verifiedGloss: string,
 ): string {
  const byLang = new Map<SupportedLanguageCode, string[]>();
  for (const t of translations) {
    if (!byLang.has(t.target_lang)) byLang.set(t.target_lang, []);
    byLang.get(t.target_lang)!.push(t.word);
  }
  const coveredLangs = new Set(byLang.keys());
  const missingLangs = SUPPORTED_LANGUAGE_CODES.filter(
    (l) => l !== entry.language && !coveredLangs.has(l),
  );
  const translationsText =
    byLang.size > 0
      ? [...byLang.entries()]
          .map(([lang, words]) => `  ${lang}: ${words.join(", ")}`)
          .join("\n")
      : "  none";
  const missingText =
    missingLangs.length > 0 ? missingLangs.join(", ") : "none";
  const exampleResponse: Record<string, unknown> = {
    translations: {
      de: { frei: "ok", "-frei": "reject" },
      it: { libero: "ok", free: "reject" },
    },
  };
  if (missingLangs.length > 0) {
    exampleResponse["generated"] = { es: "libre", fr: "libre" };
  }
  return `You are a language learning expert.
 For the ${entry.language} ${entry.pos} "${entry.headword}" (meaning: "${verifiedGloss}"), review these translations:
 ${translationsText}
 For each translation:
 - Write "ok" if it is a valid translation for this specific meaning
 - Write "reject" if it is wrong, a suffix (starts with -), garbled text, or the wrong language
 Examples of correct behaviour:
 - "free" listed as Italian → "reject" (it is English, not Italian)
 - "-frei" listed as German → "reject" (it is a suffix, not a standalone word)
 - "libre" listed as Spanish → "ok" (it is a valid Spanish word)
 ${missingLangs.length > 0 ? `Also generate the single best translation for these missing languages: ${missingText}` : ""}
 Respond ONLY with valid JSON and nothing else:
 ${JSON.stringify(exampleResponse, null, 2)}`;
 }
 function buildCefrPrompt(
  entry: EntryRow,
  verifiedGloss: string,
  validatedTranslations: Map<SupportedLanguageCode, string[]>,
 ): string {
  const translationsText =
    validatedTranslations.size > 0
      ? [...validatedTranslations.entries()]
          .map(([lang, words]) => `  ${lang}: ${words.join(", ")}`)
          .join("\n")
      : "  none";
  return `You are a language learning expert.
 Assign CEFR levels (A1, A2, B1, B2, C1, or C2) to this word and its validated translations.
 Base your levels on how commonly a language learner at that level would encounter this specific sense.
 Consider register — slang, technical, and archaic words should be rated higher.
 WORD: ${entry.headword} (${entry.pos})
 MEANING: ${verifiedGloss}
 VALIDATED TRANSLATIONS:
 ${translationsText}
 Respond ONLY with valid JSON and nothing else:
 {
  "headword_cefr": "B1",
  "translation_cefr": {
    "de": { "frei": "A2" },
    "it": { "libero": "A2" }
  }
 }`;
 }
 // ── Validation ────────────────────────────────────────────────────────────────
 function validateGloss(raw: string): GlossResult | null {
  try {
    const obj = JSON.parse(raw) as Record<string, unknown>;
    if (obj["status"] === "ok") return { status: "ok" };
    if (
      obj["status"] === "improved" &&
      typeof obj["gloss"] === "string" &&
      obj["gloss"].trim()
    ) {
      return { status: "improved", gloss: obj["gloss"].trim() };
    }
    return null;
  } catch {
    return null;
  }
 }
 function validateExample(raw: string): ExampleResult | null {
  try {
    const obj = JSON.parse(raw) as Record<string, unknown>;
    if (obj["status"] === "ok") return { status: "ok" };
    if (
      obj["status"] === "improved" &&
      typeof obj["example"] === "string" &&
      obj["example"].trim()
    ) {
      return { status: "improved", example: obj["example"].trim() };
    }
    return null;
  } catch {
    return null;
  }
 }
 function validateTranslations(
  raw: string,
  translations: TranslationRow[],
 ): TranslationResult | null {
  try {
    const obj = JSON.parse(raw) as Record<string, unknown>;
    if (typeof obj["translations"] !== "object" || obj["translations"] === null)
      return null;
    const result: TranslationResult = { translations: {} };
    const translationsObj = obj["translations"] as Record<string, unknown>;
    // Validate each language's votes
    for (const [lang, votes] of Object.entries(translationsObj)) {
      if (!SUPPORTED_LANG_SET.has(lang)) continue;
      if (typeof votes !== "object" || votes === null) continue;
      result.translations[lang as SupportedLanguageCode] = {};
      for (const [word, status] of Object.entries(
        votes as Record<string, unknown>,
      )) {
        if (status === "ok" || status === "reject") {
          result.translations[lang as SupportedLanguageCode]![word] = status;
        }
      }
    }
    // Validate generated translations
    if (obj["generated"] !== undefined && obj["generated"] !== null) {
      if (typeof obj["generated"] !== "object") return null;
      result.generated = {};
      for (const [lang, word] of Object.entries(
        obj["generated"] as Record<string, unknown>,
      )) {
        if (!SUPPORTED_LANG_SET.has(lang)) continue;
        if (typeof word === "string" && word.trim()) {
          result.generated[lang as SupportedLanguageCode] = word.trim();
        }
      }
    }
    // Check all translations got a vote
    const byLang = new Map<string, Set<string>>();
    for (const t of translations) {
      if (!byLang.has(t.target_lang)) byLang.set(t.target_lang, new Set());
      byLang.get(t.target_lang)!.add(t.word);
    }
    for (const [lang, words] of byLang.entries()) {
      const votes = result.translations[lang as SupportedLanguageCode];
      if (!votes) return null;
      for (const word of words) {
        if (!votes[word]) return null;
      }
    }
    return result;
  } catch {
    return null;
  }
 }
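 /*
  Illustrative sketch, not part of this change. validateTranslations rejects a
  response unless every candidate translation received an "ok" or "reject" vote.
  The simplified check below shows that coverage rule on its own. `Candidate`,
  `Votes` and `findMissingVotes` are hypothetical names, and language codes are
  plain strings here instead of SupportedLanguageCode.

```typescript
// Hypothetical, simplified types: the real code uses SupportedLanguageCode.
type Candidate = { target_lang: string; word: string };
type Votes = Record<string, Record<string, "ok" | "reject">>;

// Returns "lang:word" for every candidate the model did not vote on.
function findMissingVotes(candidates: Candidate[], votes: Votes): string[] {
  const missing: string[] = [];
  for (const c of candidates) {
    const langVotes = votes[c.target_lang];
    const vote = langVotes ? langVotes[c.word] : undefined;
    if (vote !== "ok" && vote !== "reject") {
      missing.push(c.target_lang + ":" + c.word);
    }
  }
  return missing;
}

const candidates: Candidate[] = [
  { target_lang: "de", word: "frei" },
  { target_lang: "de", word: "-frei" },
  { target_lang: "it", word: "libero" },
];

// The model voted on both German words but skipped Italian entirely.
const partialVotes: Votes = { de: { frei: "ok", "-frei": "reject" } };
const missingVotes = findMissingVotes(candidates, partialVotes);
console.log(missingVotes);
```

  Listing the missing pairs, rather than only returning null, is a design choice
  of this sketch; it could make needs_review entries easier to diagnose.
 */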
 function validateCefr(
  raw: string,
  validatedTranslations: Map<SupportedLanguageCode, string[]>,
 ): CefrResult | null {
  try {
    const obj = JSON.parse(raw) as Record<string, unknown>;
    if (typeof obj["headword_cefr"] !== "string") return null;
    if (!CEFR_SET.has(obj["headword_cefr"])) return null;
    if (
      typeof obj["translation_cefr"] !== "object" ||
      obj["translation_cefr"] === null
    )
      return null;
    const translationCefr = obj["translation_cefr"] as Record<string, unknown>;
    // Verify all validated translations have a CEFR vote
    for (const [lang, words] of validatedTranslations.entries()) {
      const votes = translationCefr[lang] as Record<string, string> | undefined;
      if (!votes) return null;
      for (const word of words) {
        if (!votes[word] || !CEFR_SET.has(votes[word])) return null;
      }
    }
    return {
      headword_cefr: obj["headword_cefr"],
      translation_cefr: translationCefr as Partial<
        Record<SupportedLanguageCode, Record<string, string>>
      >,
    };
  } catch {
    return null;
  }
 }
 // ── LLM call ──────────────────────────────────────────────────────────────────
 async function callLlm(
  prompt: string,
  provider: ProviderConfig,
 ): Promise<string> {
  currentCallController = new AbortController();
  const timeout = setTimeout(() => currentCallController?.abort(), 120_000);
  let response: Response;
  try {
    response = await fetch(`${provider.baseURL}/chat/completions`, {
      method: "POST",
      signal: currentCallController.signal,
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${provider.apiKey}`,
      },
      body: JSON.stringify({
        model: provider.model,
        max_tokens: provider.maxTokens,
        messages: [{ role: "user", content: prompt }],
        temperature: 0.1,
      }),
    });
  } finally {
    clearTimeout(timeout);
    currentCallController = null;
  }
  if (!response.ok) {
    throw new Error(`LLM API error: ${response.status} ${response.statusText}`);
  }
  const data = (await response.json()) as {
    choices?: { message?: { content?: string } }[];
  };
  const content = data.choices?.[0]?.message?.content;
  if (!content) throw new Error("LLM returned empty response");
  return content
    .replace(/```json\n?/g, "")
    .replace(/```\n?/g, "")
    .trim();
 }
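 /*
  Illustrative sketch, not part of this change. callLlm strips Markdown code
  fences before returning, because models often wrap JSON in a fenced block even
  when asked for raw JSON. The helper below roughly mirrors that cleanup with
  plain string operations instead of regular expressions. `stripCodeFences` and
  `FENCE` are hypothetical names.

```typescript
// Three backticks, built by concatenation so this example stays fence-safe.
const FENCE = "`" + "`" + "`";

// Hypothetical helper approximating the cleanup at the end of callLlm.
function stripCodeFences(content: string): string {
  return content
    .split(FENCE + "json").join("") // drop opening fences tagged as json
    .split(FENCE).join("") // drop any remaining fences
    .trim();
}

// A typical wrapped reply: fence, JSON payload, closing fence.
const wrapped = FENCE + 'json\n{"status": "ok"}\n' + FENCE;
const parsed = JSON.parse(stripCodeFences(wrapped)) as { status: string };
console.log(parsed.status);
```

  Unwrapped replies pass through unchanged apart from trimming, so the same
  helper can be applied to every response.
 */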
 // ── Status helpers ────────────────────────────────────────────────────────────
 function getSubStageStatus(
  entryId: number,
  modelName: string,
  stage: SubStage,
 ): "complete" | "needs_review" | "pending" {
  const db = openDb();
  const row = db
    .prepare(
      `SELECT status FROM run_status
       WHERE entry_id = ? AND model_name = ? AND stage = ?`,
    )
    .get(entryId, modelName, stage) as { status: string } | undefined;
  db.close();
  if (!row) return "pending";
  if (row.status === "complete") return "complete";
  if (row.status === "needs_review") return "needs_review";
  return "pending";
 }
 function markSubStage(
  entryId: number,
  modelName: string,
  stage: SubStage,
  status: "complete" | "needs_review",
 ): void {
  const db = openDb();
  db.prepare(
    `INSERT INTO run_status (entry_id, model_name, stage, status)
     VALUES (?, ?, ?, ?)
     ON CONFLICT (entry_id, model_name, stage)
     DO UPDATE SET status = ?, updated_at = datetime('now')`,
  ).run(entryId, modelName, stage, status, status);
  db.close();
 }
 // ── Write helpers ─────────────────────────────────────────────────────────────
 function writeGloss(
  entryId: number,
  modelName: string,
  result: GlossResult,
 ): void {
  if (result.status === "improved") {
    const db = openDb();
    db.prepare(
      `INSERT INTO generated_glosses (entry_id, model_name, text)
       VALUES (?, ?, ?)
       ON CONFLICT (entry_id, model_name) DO NOTHING`,
    ).run(entryId, modelName, result.gloss);
    db.close();
  }
 }
 function writeExample(
  entryId: number,
  modelName: string,
  result: ExampleResult,
 ): void {
  if (result.status === "improved") {
    const db = openDb();
    db.prepare(
      `INSERT INTO generated_examples (entry_id, model_name, text)
       VALUES (?, ?, ?)
       ON CONFLICT (entry_id, model_name) DO NOTHING`,
    ).run(entryId, modelName, result.example);
    db.close();
  }
 }
 function writeTranslations(
  entryId: number,
  modelName: string,
  result: TranslationResult,
  translations: TranslationRow[],
 ): void {
  const db = openDb();
  db.transaction(() => {
    // Write rejections
    for (const t of translations) {
      const vote = result.translations[t.target_lang]?.[t.word];
      if (vote === "reject") {
        db.prepare(
          `INSERT INTO model_translation_rejections (translation_id, model_name)
           VALUES (?, ?)
           ON CONFLICT (translation_id, model_name) DO NOTHING`,
        ).run(t.id, modelName);
      }
    }
    // Write generated translations
    if (result.generated) {
      for (const [lang, word] of Object.entries(result.generated)) {
        db.prepare(
          `INSERT INTO generated_translations (entry_id, model_name, target_lang, word)
           VALUES (?, ?, ?, ?)
           ON CONFLICT (entry_id, model_name, target_lang) DO NOTHING`,
        ).run(entryId, modelName, lang, word);
      }
    }
  })();
  db.close();
 }
 function writeCefr(
  entryId: number,
  modelName: string,
  result: CefrResult,
  translations: TranslationRow[],
 ): void {
  const db = openDb();
  db.transaction(() => {
    // Headword CEFR
    db.prepare(
      `INSERT INTO model_entry_cefr_votes (entry_id, model_name, cefr_level)
       VALUES (?, ?, ?)
       ON CONFLICT (entry_id, model_name) DO NOTHING`,
    ).run(entryId, modelName, result.headword_cefr);
    // Translation CEFR votes
    for (const t of translations) {
      const level = result.translation_cefr[t.target_lang]?.[t.word];
      if (level && CEFR_SET.has(level)) {
        db.prepare(
          `INSERT INTO model_translation_cefr_votes (translation_id, model_name, cefr_level)
           VALUES (?, ?, ?)
           ON CONFLICT (translation_id, model_name) DO NOTHING`,
        ).run(t.id, modelName, level);
      }
    }
  })();
  db.close();
 }
 // ── Progress ──────────────────────────────────────────────────────────────────
 function updateProgress(
  processed: number,
  needsReview: number,
  total: number,
  llmMs: number,
  startTime: number,
 ): void {
  const totalProcessed = processed + needsReview;
  const pct = ((totalProcessed / total) * 100).toFixed(1);
  const elapsed = (Date.now() - startTime) / 1000;
  const rate = elapsed > 0 ? totalProcessed / elapsed : 0;
  const remaining = rate > 0 ? (total - totalProcessed) / rate : 0;
  const eta =
    rate === 0
      ? "calculating..."
      : remaining < 60
        ? `${Math.round(remaining)}s`
        : `${Math.round(remaining / 60)}m`;
  const totalElapsedStr =
    elapsed < 60
      ? `${Math.floor(elapsed)}s`
      : `${Math.floor(elapsed / 60)}m ${Math.floor(elapsed % 60)}s`;
  process.stdout.write(
    `\r    ${totalProcessed}/${total} (${pct}%) — entry: ${(llmMs / 1000).toFixed(1)}s — total: ${totalElapsedStr} — ETA: ${eta}    `,
  );
 }
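 /*
  Illustrative sketch, not part of this change. updateProgress derives an ETA
  from the observed processing rate and prints durations as seconds or minutes.
  The two helpers below isolate that arithmetic. `formatDuration` and
  `estimateRemaining` are hypothetical names, and the zero-padded seconds are a
  formatting choice of this sketch, not of the pipeline.

```typescript
// Hypothetical helper: formats a duration in seconds as "42s" or "3m 07s".
function formatDuration(totalSeconds: number): string {
  // Round once, up front, so the seconds part can never display as 60.
  const whole = Math.max(0, Math.round(totalSeconds));
  if (whole < 60) return whole + "s";
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  return minutes + "m " + (seconds < 10 ? "0" : "") + seconds + "s";
}

// Hypothetical helper: remaining seconds at the observed rate,
// or null while no rate can be computed yet.
function estimateRemaining(
  done: number,
  total: number,
  elapsedSeconds: number,
): number | null {
  if (done <= 0 || elapsedSeconds <= 0) return null;
  const rate = done / elapsedSeconds; // entries per second
  return (total - done) / rate;
}

const remainingSeconds = estimateRemaining(10, 50, 20);
console.log(
  remainingSeconds === null ? "calculating..." : formatDuration(remainingSeconds),
);
```

  Returning null while no rate exists keeps "no estimate yet" distinct from
  "zero seconds left".
 */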
 // ── Main enrich function ──────────────────────────────────────────────────────
 export async function enrich(
  provider: ProviderConfig,
 ): Promise<{ processed: number; skipped: number; needsReview: number }> {
  registerEnrichShutdown();
  const db = openDb();
  const allEntries = db
    .prepare(`SELECT * FROM entries WHERE language = 'en'`)
    .all() as EntryRow[];
  // Only round1_gloss is active for now, so an entry counts as complete once
  // that sub-stage is complete. Extend this query when sub-stages 2-4 are enabled.
  const completeEntries = db
    .prepare(
      `SELECT entry_id FROM run_status
       WHERE model_name = ? AND stage = 'round1_gloss'
       AND status = 'complete'`,
    )
    .all(provider.name) as { entry_id: number }[];
  const completeIds = new Set(completeEntries.map((r) => r.entry_id));
  // Cap each run at the first 50 pending entries.
  const pending = allEntries.filter((e) => !completeIds.has(e.id)).slice(0, 50);
  db.close();
  console.log(`\n  Model: ${provider.name}`);
  console.log(`  Total entries: ${allEntries.length.toLocaleString()}`);
  console.log(`  Already complete: ${completeIds.size.toLocaleString()}`);
  console.log(`  Pending: ${pending.length.toLocaleString()}`);
  if (pending.length === 0) {
    console.log("  Nothing to process.");
    return { processed: 0, skipped: completeIds.size, needsReview: 0 };
  }
  let processedCount = 0;
  let needsReviewCount = 0;
  let llmMs = 0;
  const startTime = Date.now();
  for (const entry of pending) {
    if (shutdownRequested) break;
    const db2 = openDb();
    const translations = db2
      .prepare(
        `SELECT id, target_lang, word FROM translations WHERE entry_id = ? AND source = 'kaikki'`,
      )
      .all(entry.id) as TranslationRow[];
    db2.close();
    let entryFailed = false;
    // ── Sub-stage 1: Gloss ────────────────────────────────────────────────────
    let verifiedGloss = entry.gloss ?? "";
    if (
      getSubStageStatus(entry.id, provider.name, "round1_gloss") !== "complete"
    ) {
      try {
        const llmStart = Date.now();
        const raw = await callLlm(buildGlossPrompt(entry), provider);
        llmMs = Date.now() - llmStart;
        const result = validateGloss(raw);
        if (!result) {
          markSubStage(entry.id, provider.name, "round1_gloss", "needs_review");
          console.warn(
            `\n    needs_review: entry ${entry.id} round1_gloss — invalid response`,
          );
          entryFailed = true;
        } else {
          writeGloss(entry.id, provider.name, result);
          if (result.status === "improved") verifiedGloss = result.gloss;
          markSubStage(entry.id, provider.name, "round1_gloss", "complete");
        }
      } catch (err) {
        llmMs = 0;
        const message = err instanceof Error ? err.message : String(err);
        markSubStage(entry.id, provider.name, "round1_gloss", "needs_review");
        console.warn(
          `\n    needs_review: entry ${entry.id} round1_gloss — ${message}`,
        );
        entryFailed = true;
      }
    }
    if (entryFailed) {
      needsReviewCount++;
      updateProgress(
        processedCount,
        needsReviewCount,
        pending.length,
        llmMs,
        startTime,
      );
      continue;
    }
    /*
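    // ── Sub-stages 2, 3, 4 — not yet active ──────────────────────────────────
    // Before re-enabling: buildExamplePrompt is not defined in this file yet, and
    // sub-stage 4 already increments processedCount, so the unconditional
    // processedCount++ after this block would count each entry twice.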
    // ── Sub-stage 2: Example ──────────────────────────────────────────────────
    if (
      getSubStageStatus(entry.id, provider.name, "round1_example") !==
      "complete"
    ) {
      try {
        const llmStart = Date.now();
        const raw = await callLlm(
          buildExamplePrompt(entry, verifiedGloss),
          provider,
        );
        llmMs = Date.now() - llmStart;
        const result = validateExample(raw);
        if (!result) {
          markSubStage(
            entry.id,
            provider.name,
            "round1_example",
            "needs_review",
          );
          console.warn(
            `\n    needs_review: entry ${entry.id} round1_example — invalid response`,
          );
          entryFailed = true;
        } else {
          writeExample(entry.id, provider.name, result);
          markSubStage(entry.id, provider.name, "round1_example", "complete");
        }
      } catch (err) {
        llmMs = 0;
        const message = err instanceof Error ? err.message : String(err);
        markSubStage(entry.id, provider.name, "round1_example", "needs_review");
        console.warn(
          `\n    needs_review: entry ${entry.id} round1_example — ${message}`,
        );
        entryFailed = true;
      }
    }
    if (entryFailed) {
      needsReviewCount++;
      updateProgress(
        processedCount,
        needsReviewCount,
        pending.length,
        llmMs,
        startTime,
      );
      continue;
    }
    // ── Sub-stage 3: Translations ─────────────────────────────────────────────
    const validatedTranslations = new Map<SupportedLanguageCode, string[]>();
    if (
      getSubStageStatus(entry.id, provider.name, "round1_translations") !==
      "complete"
    ) {
      try {
        const llmStart = Date.now();
        const raw = await callLlm(
          buildTranslationsPrompt(entry, translations, verifiedGloss),
          provider,
        );
        llmMs = Date.now() - llmStart;
        const result = validateTranslations(raw, translations);
        if (!result) {
          markSubStage(
            entry.id,
            provider.name,
            "round1_translations",
            "needs_review",
          );
          console.warn(
            `\n    needs_review: entry ${entry.id} round1_translations — invalid response`,
          );
          entryFailed = true;
        } else {
          writeTranslations(entry.id, provider.name, result, translations);
          markSubStage(
            entry.id,
            provider.name,
            "round1_translations",
            "complete",
          );
          // Build validated translations map for CEFR sub-stage
          // Include kaikki translations that were ok'd + generated translations
          for (const t of translations) {
            const vote = result.translations[t.target_lang]?.[t.word];
            if (vote === "ok") {
              if (!validatedTranslations.has(t.target_lang)) {
                validatedTranslations.set(t.target_lang, []);
              }
              validatedTranslations.get(t.target_lang)!.push(t.word);
            }
          }
          if (result.generated) {
            for (const [lang, word] of Object.entries(result.generated)) {
              const l = lang as SupportedLanguageCode;
              if (!validatedTranslations.has(l))
                validatedTranslations.set(l, []);
              validatedTranslations.get(l)!.push(word);
            }
          }
        }
      } catch (err) {
        llmMs = 0;
        const message = err instanceof Error ? err.message : String(err);
        markSubStage(
          entry.id,
          provider.name,
          "round1_translations",
          "needs_review",
        );
        console.warn(
          `\n    needs_review: entry ${entry.id} round1_translations — ${message}`,
        );
        entryFailed = true;
      }
    } else {
      // Already complete — rebuild validated translations from db
      const db3 = openDb();
      const rejections = new Set(
        (
          db3
            .prepare(
              `SELECT translation_id FROM model_translation_rejections WHERE model_name = ?`,
            )
            .all(provider.name) as { translation_id: number }[]
        ).map((r) => r.translation_id),
      );
      for (const t of translations) {
        if (!rejections.has(t.id)) {
          if (!validatedTranslations.has(t.target_lang)) {
            validatedTranslations.set(t.target_lang, []);
          }
          validatedTranslations.get(t.target_lang)!.push(t.word);
        }
      }
      const generated = db3
        .prepare(
          `SELECT target_lang, word FROM generated_translations WHERE entry_id = ? AND model_name = ?`,
        )
        .all(entry.id, provider.name) as {
        target_lang: SupportedLanguageCode;
        word: string;
      }[];
      for (const g of generated) {
        if (!validatedTranslations.has(g.target_lang))
          validatedTranslations.set(g.target_lang, []);
        validatedTranslations.get(g.target_lang)!.push(g.word);
      }
      db3.close();
    }
    if (entryFailed) {
      needsReviewCount++;
      updateProgress(
        processedCount,
        needsReviewCount,
        pending.length,
        llmMs,
        startTime,
      );
      continue;
    }
    // ── Sub-stage 4: CEFR ─────────────────────────────────────────────────────
    if (
      getSubStageStatus(entry.id, provider.name, "round1_cefr") !== "complete"
    ) {
      try {
        const llmStart = Date.now();
        const raw = await callLlm(
          buildCefrPrompt(entry, verifiedGloss, validatedTranslations),
          provider,
        );
        llmMs = Date.now() - llmStart;
        const result = validateCefr(raw, validatedTranslations);
        if (!result) {
          markSubStage(entry.id, provider.name, "round1_cefr", "needs_review");
          console.warn(
            `\n    needs_review: entry ${entry.id} round1_cefr — invalid response`,
          );
          needsReviewCount++;
        } else {
          // Get translation rows for validated words only
          const validatedRows = translations.filter((t) => {
            return validatedTranslations.get(t.target_lang)?.includes(t.word);
          });
          writeCefr(entry.id, provider.name, result, validatedRows);
          markSubStage(entry.id, provider.name, "round1_cefr", "complete");
          processedCount++;
        }
      } catch (err) {
        llmMs = 0;
        const message = err instanceof Error ? err.message : String(err);
        markSubStage(entry.id, provider.name, "round1_cefr", "needs_review");
        console.warn(
          `\n    needs_review: entry ${entry.id} round1_cefr — ${message}`,
        );
        needsReviewCount++;
      }
    } else {
      processedCount++;
    }
    */
    processedCount++;
    updateProgress(
      processedCount,
      needsReviewCount,
      pending.length,
      llmMs,
      startTime,
    );
  }
  process.stdout.write("\n");
  const totalMs = Date.now() - startTime;
  const totalMin = Math.floor(totalMs / 60_000);
  const totalSec = Math.floor((totalMs % 60_000) / 1000);
  console.log(`  Total time: ${totalMin}m ${totalSec}s`);
  console.log(
    `  Avg per entry: ${(totalMs / Math.max(processedCount + needsReviewCount, 1) / 1000).toFixed(1)}s`,
  );
  console.log(`  Processed: ${processedCount.toLocaleString()}`);
  console.log(`  Needs review: ${needsReviewCount.toLocaleString()}`);
  return {
    processed: processedCount,
    skipped: completeIds.size,
    needsReview: needsReviewCount,
  };
 }
--- a/data-pipeline/tests/validation/db-import.validation.test.ts
+++ b/data-pipeline/tests/validation/db-import.validation.test.ts
@ -0,0 +1,230 @@
 import fs from "node:fs/promises";
 import path from "node:path";
 import { describe, it, expect, beforeAll } from "vitest";
 import { SUPPORTED_LANGUAGE_CODES } from "@lila/shared";
 import type { SupportedLanguageCode, SupportedPos } from "@lila/shared";
 // ── Types ─────────────────────────────────────────────────────────────────────
 type ExtractedSense = {
  headword: string;
  language: SupportedLanguageCode;
  pos: SupportedPos;
  sense_index: number;
  gloss: string | null;
  examples: string[];
  translations: {
    target_lang: SupportedLanguageCode;
    word: string;
    sense_hint: string | null;
  }[];
 };
 // ── Paths ─────────────────────────────────────────────────────────────────────
 const DB_PATH = path.resolve("db/pipeline.db");
 const OUTPUT_DIR = path.resolve("stage-1-extract/output");
 // ── Helpers ───────────────────────────────────────────────────────────────────
 async function dbExists(): Promise<boolean> {
  try {
    await fs.access(DB_PATH);
    return true;
  } catch {
    return false;
  }
 }
 // ── Tests ─────────────────────────────────────────────────────────────────────
 describe("pipeline.db — import validation", () => {
  let db: import("better-sqlite3").Database;
  let expectedEntriesByLang: Map<SupportedLanguageCode, number>;
  let expectedTotalTranslations: number;
  beforeAll(async () => {
    if (!(await dbExists())) return;
    const Database = (await import("better-sqlite3")).default;
    db = new Database(DB_PATH, { readonly: true });
    db.pragma("foreign_keys = ON");
    expectedEntriesByLang = new Map();
    expectedTotalTranslations = 0;
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      try {
        const raw = await fs.readFile(
          path.join(OUTPUT_DIR, `${lang}.json`),
          "utf-8",
        );
        const senses = JSON.parse(raw) as ExtractedSense[];
        expectedEntriesByLang.set(lang, senses.length);
        if (lang === "en") {
          for (const sense of senses) {
            expectedTotalTranslations += sense.translations.length;
          }
        }
      } catch {
        expectedEntriesByLang.set(lang, 0);
      }
    }
  }, 30_000);
  it("pipeline.db exists (the remaining tests no-op when it is missing)", async () => {
    const exists = await dbExists();
    if (!exists) {
      console.warn(
        "\n  pipeline.db not found — run pnpm db:init and pnpm db:import first\n",
      );
    }
    expect(exists).toBe(true);
  });
  it("entry count per language matches source files", () => {
    if (!db) return;
    const errors: string[] = [];
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      const expected = expectedEntriesByLang.get(lang) ?? 0;
      const row = db
        .prepare("SELECT COUNT(*) as count FROM entries WHERE language = ?")
        .get(lang) as { count: number };
      if (row.count !== expected) {
        errors.push(`${lang}: expected ${expected} entries, got ${row.count}`);
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("translation count matches source files plus reverse links", () => {
    if (!db) return;
    const row = db
      .prepare("SELECT COUNT(*) as count FROM translations")
      .get() as { count: number };
    const reverseLinks = db
      .prepare(
        "SELECT COUNT(*) as count FROM translations WHERE source = 'reverse_link'",
      )
      .get() as { count: number };
    expect(row.count).toBe(expectedTotalTranslations + reverseLinks.count);
  });
  it("every translation references a valid entry", () => {
    if (!db) return;
    const rows = db
      .prepare(
        `SELECT t.id, t.entry_id
         FROM translations t
         LEFT JOIN entries e ON e.id = t.entry_id
         WHERE e.id IS NULL`,
      )
      .all() as { id: number; entry_id: number }[];
    const errors = rows.map(
      (r) => `translation ${r.id}: references missing entry ${r.entry_id}`,
    );
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("every entry has a valid language code", () => {
    if (!db) return;
    const validLangs = SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", ");
    const rows = db
      .prepare(
        `SELECT id, headword, language FROM entries
         WHERE language NOT IN (${validLangs})`,
      )
      .all() as { id: number; headword: string; language: string }[];
    const errors = rows.map(
      (r) => `entry ${r.id} "${r.headword}": invalid language "${r.language}"`,
    );
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("every entry has a valid pos", () => {
    if (!db) return;
    const rows = db
      .prepare(
        `SELECT id, headword, pos FROM entries
         WHERE pos NOT IN ('noun', 'verb', 'adjective', 'adverb')`,
      )
      .all() as { id: number; headword: string; pos: string }[];
    const errors = rows.map(
      (r) => `entry ${r.id} "${r.headword}": invalid pos "${r.pos}"`,
    );
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("sense_index is unique per headword, language, pos", () => {
    if (!db) return;
    const rows = db
      .prepare(
        `SELECT headword, language, pos, sense_index, COUNT(*) as c
         FROM entries
         GROUP BY headword, language, pos, sense_index
         HAVING c > 1`,
      )
      .all() as {
      headword: string;
      language: string;
      pos: string;
      sense_index: number;
      c: number;
    }[];
    const errors = rows.map(
      (r) =>
        `"${r.headword}" (${r.language} ${r.pos}): duplicate sense_index ${r.sense_index} (${r.c} rows)`,
    );
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("non-English entries have no Kaikki translations", () => {
    if (!db) return;
    const nonEnLangs = SUPPORTED_LANGUAGE_CODES.filter((l) => l !== "en")
      .map((l) => `'${l}'`)
      .join(", ");
    const rows = db
      .prepare(
        `SELECT e.headword, e.language, COUNT(t.id) as c
         FROM entries e
         JOIN translations t ON t.entry_id = e.id
         WHERE e.language IN (${nonEnLangs})
         AND t.source = 'kaikki'
         GROUP BY e.id`,
      )
      .all() as { headword: string; language: string; c: number }[];
    const errors = rows.map(
      (r) =>
        `"${r.headword}" (${r.language}): unexpected ${r.c} Kaikki translations`,
    );
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("all Kaikki translation target languages are supported and not English", () => {
    if (!db) return;
    const validLangs = SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", ");
    const rows = db
      .prepare(
        `SELECT t.id, t.target_lang
         FROM translations t
         WHERE t.source = 'kaikki'
         AND (t.target_lang NOT IN (${validLangs}) OR t.target_lang = 'en')`,
      )
      .all() as { id: number; target_lang: string }[];
    const errors = rows.map(
      (r) => `translation ${r.id}: invalid target_lang "${r.target_lang}"`,
    );
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
 });
--- a/data-pipeline/tests/validation/stage-1.validation.test.ts
+++ b/data-pipeline/tests/validation/stage-1.validation.test.ts
@ -0,0 +1,192 @@
 import fs from "node:fs/promises";
 import path from "node:path";
 import { describe, it, expect, beforeAll } from "vitest";
 import { SUPPORTED_LANGUAGE_CODES, SUPPORTED_POS } from "@lila/shared";
 import type { SupportedLanguageCode, SupportedPos } from "@lila/shared";
 // ── Types ─────────────────────────────────────────────────────────────────────
 type ExtractedSense = {
  headword: string;
  language: SupportedLanguageCode;
  pos: SupportedPos;
  sense_index: number;
  gloss: string | null;
  examples: string[];
  translations: {
    target_lang: SupportedLanguageCode;
    word: string;
    sense_hint: string | null;
  }[];
 };
 // ── Paths ─────────────────────────────────────────────────────────────────────
 const OUTPUT_DIR = path.resolve("stage-1-extract/output");
 // ── Tests ─────────────────────────────────────────────────────────────────────
 describe("stage 1 — Kaikki extraction output validation", () => {
  const sensesByLang = new Map<SupportedLanguageCode, ExtractedSense[]>();
  beforeAll(async () => {
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      const filePath = path.join(OUTPUT_DIR, `${lang}.json`);
      const raw = await fs.readFile(filePath, "utf-8");
      sensesByLang.set(lang, JSON.parse(raw) as ExtractedSense[]);
    }
  }, 30_000);
  it("all five language output files exist", async () => {
    const errors: string[] = [];
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      try {
        await fs.access(path.join(OUTPUT_DIR, `${lang}.json`));
      } catch {
        errors.push(`missing: ${lang}.json`);
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("every language file is a non-empty array", () => {
    const errors: string[] = [];
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      const senses = sensesByLang.get(lang)!;
      if (!Array.isArray(senses)) errors.push(`${lang}: not an array`);
      else if (senses.length === 0) errors.push(`${lang}: empty array`);
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("every sense has required fields", () => {
    const errors: string[] = [];
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      for (const sense of sensesByLang.get(lang)!) {
        if (!sense.headword) errors.push(`${lang}: sense missing headword`);
        if (!sense.language)
          errors.push(`${lang} ${sense.headword}: missing language`);
        if (!sense.pos) errors.push(`${lang} ${sense.headword}: missing pos`);
        if (sense.sense_index === undefined)
          errors.push(`${lang} ${sense.headword}: missing sense_index`);
        if (!Array.isArray(sense.examples))
          errors.push(`${lang} ${sense.headword}: examples not an array`);
        if (!Array.isArray(sense.translations))
          errors.push(`${lang} ${sense.headword}: translations not an array`);
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("every sense has a valid pos", () => {
    const errors: string[] = [];
    const validPos = new Set(SUPPORTED_POS);
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      for (const sense of sensesByLang.get(lang)!) {
        if (!validPos.has(sense.pos)) {
          errors.push(`${lang} ${sense.headword}: invalid pos "${sense.pos}"`);
        }
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("every sense language code matches its file", () => {
    const errors: string[] = [];
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      for (const sense of sensesByLang.get(lang)!) {
        if (sense.language !== lang) {
          errors.push(
            `${lang} ${sense.headword}: language field "${sense.language}" does not match file`,
          );
        }
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("no abbreviation senses in output", () => {
    const errors: string[] = [];
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      for (const sense of sensesByLang.get(lang)!) {
        if (sense.gloss?.toLowerCase().startsWith("abbreviation of")) {
          errors.push(
            `${lang} ${sense.headword}: abbreviation sense not filtered`,
          );
        }
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("English senses all have at least one translation", () => {
    const errors: string[] = [];
    for (const sense of sensesByLang.get("en")!) {
      if (sense.translations.length === 0) {
        errors.push(
          `en ${sense.headword} (sense ${sense.sense_index}): no translations`,
        );
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("non-English senses have no translations", () => {
    const errors: string[] = [];
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      if (lang === "en") continue;
      for (const sense of sensesByLang.get(lang)!) {
        if (sense.translations.length > 0) {
          errors.push(
            `${lang} ${sense.headword}: unexpected translations in non-English file`,
          );
        }
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("all translation target languages are supported and not English", () => {
    const errors: string[] = [];
    const validLangs = new Set(SUPPORTED_LANGUAGE_CODES);
    for (const sense of sensesByLang.get("en")!) {
      for (const t of sense.translations) {
        if (!validLangs.has(t.target_lang)) {
          errors.push(
            `en ${sense.headword}: unsupported translation language "${t.target_lang}"`,
          );
        }
        if (t.target_lang === "en") {
          errors.push(
            `en ${sense.headword}: translation to same language "en"`,
          );
        }
        if (!t.word?.trim()) {
          errors.push(
            `en ${sense.headword}: empty translation word for ${t.target_lang}`,
          );
        }
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
  it("sense_index is unique per headword and pos within each language", () => {
    const errors: string[] = [];
    for (const lang of SUPPORTED_LANGUAGE_CODES) {
      const seen = new Map<string, Set<number>>();
      for (const sense of sensesByLang.get(lang)!) {
        const key = `${sense.headword}|${sense.pos}`;
        if (!seen.has(key)) seen.set(key, new Set());
        const indexes = seen.get(key)!;
        if (indexes.has(sense.sense_index)) {
          errors.push(
            `${lang} ${sense.headword} (${sense.pos}): duplicate sense_index ${sense.sense_index}`,
          );
        }
        indexes.add(sense.sense_index);
      }
    }
    expect(errors, `\n${errors.join("\n")}`).toHaveLength(0);
  });
 });
--- a/data-pipeline/tsconfig.json
+++ b/data-pipeline/tsconfig.json
@ -8,5 +8,5 @@
    "types": ["node"]
  },
  "references": [{ "path": "../packages/shared" }],
-  "include": ["./**/*"]
+  "include": ["./**/*", "vitest.config.ts"]
 }
--- a/data-pipeline/vitest.config.ts
+++ b/data-pipeline/vitest.config.ts
@ -0,0 +1,11 @@
 import { defineConfig } from "vitest/config";
 export default defineConfig({
  test: {
    environment: "node",
    globals: true,
    include: ["tests/**/*.test.ts"],
    exclude: ["**/dist/**", "**/node_modules/**"],
    testTimeout: 60_000,
  },
 });
--- a/documentation/ARCHITECTURE.md
+++ b/documentation/ARCHITECTURE.md
@ -0,0 +1,229 @@
 # Architecture
 > How Lila is structured, how data flows, and why the boundaries are where they are.
 ---
 ## Monorepo Layout
 ```
 lila/
 ├── apps/
 │   ├── api/              — Express backend (HTTP + WebSocket)
 │   └── web/              — React frontend (Vite, TanStack Router)
 ├── packages/
 │   ├── shared/           — Zod schemas + constants (API/web contract)
 │   └── db/               — Drizzle schema, migrations, models, seeding
 ├── data-pipeline/        — Kaikki extraction → enrichment → PostgreSQL sync
 ├── documentation/        — Project docs
 ├── Caddyfile             — Reverse proxy routing
 ├── docker-compose.yml    — Local dev stack
 └── pnpm-workspace.yaml   — Workspace definition
 ```
 **Package boundaries:**
 | Package           | Owns                                                              | Consumed by                           |
 | ----------------- | ----------------------------------------------------------------- | ------------------------------------- |
 | `packages/shared` | Zod schemas, constants, derived TypeScript types                  | `apps/api`, `apps/web`, `packages/db` |
 | `packages/db`     | Drizzle schema, DB connection, all model/query functions          | `apps/api`                            |
 | `apps/api`        | Router, controllers, services, error handling, WebSocket handlers | —                                     |
 | `apps/web`        | React components, routes, client-side state                       | —                                     |
 **Rule:** `apps/api` never imports `drizzle-orm` for queries. It only calls functions exported from `packages/db`.
 ---
 ## Layered Architecture (HTTP)
 ```
 HTTP Request
     ↓
  Router        — maps URL + HTTP method to a controller
     ↓
 Controller     — handles HTTP only: validates input (Zod safeParse),
                  calls service, sends response or next(error)
     ↓
  Service       — business logic only: no HTTP, no direct DB access
     ↓
  Model         — database queries only: no business logic
     ↓
  Database      — PostgreSQL via Drizzle ORM
 ```
 **The rule:** each layer only talks to the layer directly below it.
 - **Controller** never touches the database.
 - **Service** never reads `req.body`.
 - **Model** never knows what a quiz is.
 ### Error Flow
 ```
 Controller throws ValidationError (400) or calls next(error)
     ↓
 Central errorHandler middleware in app.ts
     ↓
 Maps AppError subclasses to HTTP status codes
     ↓
 Unknown errors → 500
 ```
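The mapping step in the flow above can be sketched as a pure function. This is a minimal illustration, not the handler in `app.ts`: `AppError` and `ValidationError` are named in this document, while `NotFoundError`, the `statusCode` field, and `toErrorResponse` are assumptions made for the example.

```typescript
// Base class for errors that carry an HTTP status (field name is an assumption).
class AppError extends Error {
  statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
    this.statusCode = statusCode;
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

// Hypothetical subclass, included only to show a second mapping.
class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

type ErrorResponse = { status: number; body: { error: string } };

// Known AppError subclasses keep their status code; anything else becomes
// a generic 500 so internal details are not leaked to the client.
function toErrorResponse(err: unknown): ErrorResponse {
  if (err instanceof AppError) {
    return { status: err.statusCode, body: { error: err.message } };
  }
  return { status: 500, body: { error: "Internal server error" } };
}
```

An Express error-handling middleware would call a function like this and pass the result to `res.status(...).json(...)`.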
 ---
 ## WebSocket Architecture
 The WebSocket server is attached to the same Express HTTP server. It upgrades connections on the `/ws` path.
 ```
 WS Connection Upgrade
     ↓
 Auth middleware — validates Better Auth session from cookie
     ↓
 Message Router — dispatches by `type` field (Zod discriminated union)
     ↓
 Handler (lobby or game) — business logic, broadcasts state
     ↓
 In-memory stores (lobby game state, game session state)
 ```
 **Message protocol:** All WebSocket messages are validated against Zod schemas defined in `packages/shared/src/schemas/lobby.ts` and `packages/shared/src/schemas/game.ts`. The `type` field is a discriminated union — the router switches on it and validates the payload against the corresponding schema.
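A rough sketch of the dispatch-by-`type` idea follows. The real code validates with Zod schemas from `packages/shared`; here a hand-written parser stands in so the example is self-contained. The message names `lobby:join`, `lobby:start`, and `game:answer` come from this document, but every payload field (`code`, `questionId`, `selectedOptionId` as a number) is an assumption.

```typescript
// Payload shapes are illustrative assumptions, not the shared schemas.
type ClientMessage =
  | { type: "lobby:join"; code: string }
  | { type: "lobby:start" }
  | { type: "game:answer"; questionId: string; selectedOptionId: number };

// Stand-in for schema validation: returns null for anything that is not
// a well-formed message of a known type.
function parseMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const msg = data as Record<string, unknown>;
  switch (msg.type) {
    case "lobby:join":
      return typeof msg.code === "string"
        ? { type: "lobby:join", code: msg.code }
        : null;
    case "lobby:start":
      return { type: "lobby:start" };
    case "game:answer":
      return typeof msg.questionId === "string" &&
        typeof msg.selectedOptionId === "number"
        ? {
            type: "game:answer",
            questionId: msg.questionId,
            selectedOptionId: msg.selectedOptionId,
          }
        : null;
    default:
      return null;
  }
}

// Router: switching on `type` narrows the union, so each branch only sees
// the fields that belong to its message.
function route(msg: ClientMessage): string {
  switch (msg.type) {
    case "lobby:join":
      return `join lobby ${msg.code}`;
    case "lobby:start":
      return "start game";
    case "game:answer":
      return `answer ${msg.selectedOptionId} for ${msg.questionId}`;
  }
}
```

Because the union is closed, adding a new message type without handling it in `route` becomes a compile-time error rather than a silently dropped message.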
 **State storage:**
 - **Lobby membership** — stored in PostgreSQL (`lobbies`, `lobby_players` tables) for durability
 - **Game/room state** — stored in-memory (`InMemoryLobbyGameStore`, `InMemoryGameSessionStore`). Valkey migration is planned.
 ---
 ## Database Schema (Core)
 **Concept:** Words are language-neutral concepts (`terms`) with per-language `translations`. Adding a new language requires no schema changes — only new rows.
 ### Core Tables
 | Table          | Purpose                                                                          |
 | -------------- | -------------------------------------------------------------------------------- |
 | `terms`        | Language-neutral concept: `id`, `pos` (noun/verb/adj/adv), `source`, `source_id` |
 | `translations` | Per-language word: `term_id` (FK), `language_code`, `text`, `cefr_level` (A1–C2) |
 | `term_glosses` | Per-language definition: `term_id` (FK), `language_code`, `text`                 |
 | `decks`        | Curated wordlists: `source_language`, `validated_languages`, frequency tier      |
 | `deck_terms`   | Junction: which terms belong to which deck                                       |
 ### Auth Tables (managed by Better Auth)
 | Table          | Purpose                                                                           |
 | -------------- | --------------------------------------------------------------------------------- |
 | `user`         | Account: `id`, `name`, `email`, `image`                                           |
 | `session`      | Active sessions: `id`, `user_id`, `token`, `expires_at`                           |
 | `account`      | Social provider links: `user_id`, `provider` (google/github), `providerAccountId` |
 | `verification` | Email verification tokens (unused for social-only auth)                           |
 **Key constraints:**
 - `language_code` is CHECK-constrained against `SUPPORTED_LANGUAGE_CODES` (`en`, `it`, `de`, `es`, `fr`)
 - `pos` is CHECK-constrained against `SUPPORTED_POS` (`noun`, `verb`, `adjective`, `adverb`)
 - `cefr_level` is nullable `varchar(2)` with CHECK `A1`–`C2`
 - `translations` has UNIQUE `(term_id, language_code, text)` — allows synonyms, prevents exact duplicates
 ---
 ## Data Flow: Quiz Session
 ### Singleplayer
 ```
 User clicks "Start Quiz"
     ↓
 POST /api/v1/game/start  (GameRequestSchema: source_lang, target_lang, pos, difficulty, rounds)
     ↓
 gameController.validate → gameService.createGameSession
     ↓
 termModel.getGameTerms(filters) + termModel.getDistractors(filters)
     ↓
 Service shuffles options, stores session in GameSessionStore
     ↓
 Returns GameSession { sessionId, questions[] } — correct answer NEVER sent to frontend
     ↓
 User answers → POST /api/v1/game/answer (AnswerSubmissionSchema)
     ↓
 Service evaluates server-side, returns AnswerResult { isCorrect, correctOptionId, selectedOptionId }
 ```
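The key property of this flow, that the correct answer stays on the server, can be sketched as below. This is an illustration under assumptions, not the actual `gameService`: the `answerKey` map, `buildQuestion`, `evaluateAnswer`, and the numeric option IDs are hypothetical; only the `AnswerResult` field names come from the flow above.

```typescript
type AnswerOption = { optionId: number; text: string };
type PublicQuestion = { questionId: string; prompt: string; options: AnswerOption[] };
type AnswerResult = { isCorrect: boolean; correctOptionId: number; selectedOptionId: number };

// Server-only state: question ID -> correct option ID. Never sent to the client.
const answerKey = new Map<string, number>();

// Fisher-Yates shuffle on a copy, so the caller's array is left untouched.
function shuffle<T>(items: T[]): T[] {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}

// Builds the client-facing question and records the correct option separately.
function buildQuestion(
  questionId: string,
  prompt: string,
  correct: string,
  distractors: string[],
): PublicQuestion {
  const texts = shuffle([correct, ...distractors]);
  const options = texts.map((text, optionId) => ({ optionId, text }));
  answerKey.set(questionId, texts.indexOf(correct));
  return { questionId, prompt, options };
}

// Evaluates a submission against the server-side key.
// Returns null when the question is unknown (for example, an expired session).
function evaluateAnswer(questionId: string, selectedOptionId: number): AnswerResult | null {
  const correctOptionId = answerKey.get(questionId);
  if (correctOptionId === undefined) return null;
  return { isCorrect: selectedOptionId === correctOptionId, correctOptionId, selectedOptionId };
}
```

The object returned by `buildQuestion` contains nothing that identifies the correct option, so it can be serialised to the frontend as-is.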
 ### Multiplayer
 ```
 Host creates lobby → POST /api/v1/lobbies → returns room code
     ↓
 Players join via code → POST /api/v1/lobbies/:code/join
     ↓
 All players connect WebSocket → send lobby:join with room code
     ↓
 Server broadcasts lobby:state (player list) to all connections in room
     ↓
 Host clicks "Start" → WS lobby:start
     ↓
 Server generates questions via MultiplayerGameService, broadcasts game:question
     ↓
 Players submit answers via WS game:answer within 15s server timer
     ↓
 On all-answered or timeout → evaluate, broadcast game:answer_result
     ↓
 After N rounds → broadcast game:finished with final scores
 ```
 ---
 ## The `packages/shared` Contract
 `packages/shared` is the **single source of truth** for all data shapes crossing the API boundary.
 **What lives here:**
 - `constants.ts` — `SUPPORTED_LANGUAGE_CODES`, `SUPPORTED_POS`, `DIFFICULTY_LEVELS`, `CEFR_LEVELS`, `GAME_ROUNDS`
 - `schemas/game.ts` — `GameRequestSchema`, `GameSessionSchema`, `GameQuestionSchema`, `AnswerOptionSchema`, `AnswerSubmissionSchema`, `AnswerResultSchema`
 - `schemas/lobby.ts` — `LobbyCreateSchema`, `LobbyJoinSchema`, `LobbyStateSchema`, `WebSocketMessageSchema` (discriminated union)
 - `schemas/auth.ts` — Auth-related shared types
 **Why this matters:** If a shape changes, TypeScript compilation fails in both `apps/api` and `apps/web` at the same time, so drift between the two is caught at build time rather than at runtime.
 ---
 ## GameSessionStore Abstraction
 The service layer stores session state through an interface, not a concrete implementation:
 ```typescript
 interface GameSessionStore {
  createSession(session: GameSession): Promise<void>;
  getSession(sessionId: string): Promise<GameSession | null>;
  // ...
 }
 ```
 **Current:** `InMemoryGameSessionStore` — Map-based, lives in `apps/api` process memory. Lost on restart.
 **Planned:** `ValkeyGameSessionStore` — Redis-compatible, persists across restarts, enables horizontal scaling.
 The same pattern applies to `LobbyGameStore` (lobby state).
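A minimal sketch of the Map-based variant is shown below. The two interface methods are the ones listed above; the `GameSession` shape is a simplified assumption (the real type comes from `packages/shared`), and the class body is an illustration rather than the code in `apps/api`.

```typescript
// Simplified session shape, assumed for illustration only.
type GameSession = { sessionId: string; userId: string | null; questionIds: string[] };

interface GameSessionStore {
  createSession(session: GameSession): Promise<void>;
  getSession(sessionId: string): Promise<GameSession | null>;
}

// Map-based store: state lives in process memory and is lost on restart.
// The methods are async so that a network-backed store can implement the
// same interface without changing any caller.
class InMemoryGameSessionStore implements GameSessionStore {
  private sessions = new Map<string, GameSession>();

  async createSession(session: GameSession): Promise<void> {
    this.sessions.set(session.sessionId, session);
  }

  async getSession(sessionId: string): Promise<GameSession | null> {
    return this.sessions.get(sessionId) ?? null;
  }
}
```

Services that depend only on `GameSessionStore` can be given this class in tests and local development, and a Valkey-backed class later, without code changes.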
 ---
 ## Key Design Decisions (Quick Reference)
 | Decision                          | Where it's explained          |
 | --------------------------------- | ----------------------------- |
 | Why Drizzle over Prisma           | `DECISIONS.md` → ORM          |
 | Why `ws` over Socket.io           | `DECISIONS.md` → WebSocket    |
 | Why server-side answer evaluation | `DECISIONS.md` → Architecture |
 | Why Better Auth over Keycloak     | `DECISIONS.md` → Auth         |
 | Why terms/translations schema     | `DECISIONS.md` → Data Model   |
 | Why Caddy over Nginx/Traefik      | `DECISIONS.md` → Deployment   |
 ---
 ## Further Reading
 - [DATA_PIPELINE.md](DATA_PIPELINE.md) — How vocabulary data gets from Kaikki into PostgreSQL
 - [DEPLOYMENT.md](DEPLOYMENT.md) — Production infrastructure and ops
 - [MODEL_STRATEGY.md](MODEL_STRATEGY.md) — LLM voter architecture for CEFR assignment
 - [design/GAME_MODES.md](design/GAME_MODES.md) — Planned multiplayer modes
--- a/documentation/BACKLOG.md
+++ b/documentation/BACKLOG.md
@ -8,15 +8,29 @@ Labels: `[feature]` `[infra]` `[security]` `[ux]` `[debt]`
 Things that are actively in progress or should be picked up immediately. Mostly operational risk and the remaining phase 7 hardening work.
 - **Hetzner domain migration check** `[infra]`
  Verify whether the lilastudy.com domain needs to be migrated following a Hetzner DNS change. Check Hetzner dashboard for any pending migration notice.
 ---
 ## next
 Clearly planned work, not yet started. No hard ordering — sequence based on what unblocks real users first.
 - **404 handling for unknown subdomains and routes** `[ux]`
  Unknown subdomains (e.g., `foo.lilastudy.com`) and client-side routes return raw errors or blank pages. Add catch-all 404 handling: Caddy-level redirect for unknown subdomains, frontend catch-all route for unknown paths.
 - How to update Forgejo regularly?
 - Stop the SQL backup script on the dev laptop until the database has moved from openwordnet to Kaikki
 - Stop db-backup on the VPS until the database has moved from openwordnet to Kaikki
 - Monitor disk space on the VPS (and other resources?)
 - Admin dashboard to see users and their status
 - Refactor frontend: play => learn alone, multiplayer => learn together, also multiplayer => public and private games
 - If logged out, navbar should not show play and multiplayer
 - **Batch distractor queries to eliminate N+1** `[debt]`
  `createGameSession` calls `getDistractors` once per term in parallel: 3 queries for 3 rounds, 10 for 10. Each query uses `ORDER BY RANDOM()`, which can't use an index and gets slower as the translations table grows. Fix: add a `getDistractorsForTerms(termIds[], ...)` function to `@lila/db` that batches all distractor fetches into a single query and returns results grouped by term. The service distributes the results per question. Prerequisite: none. Blocked by: nothing, but coordinate with any ongoing `@lila/db` changes.
--- a/documentation/DATA_PIPELINE.md
+++ b/documentation/DATA_PIPELINE.md
@ -0,0 +1,489 @@
 # Lila data pipeline
 This pipeline extracts vocabulary data from Wiktionary via the Kaikki dataset, enriches it with CEFR levels and fills content gaps using local LLMs, and produces authoritative output in `pipeline.db`. This database is consumed by the sync script to populate the production database with vocabulary entries, translations, glosses, CEFR levels, and difficulty ratings.
 ## Overview
 ```mermaid
 flowchart LR
    kaikki[(Kaikki JSONL)]
    extract[Extract]
    reverselink[Reverse Link Sync]
    enrich[Enrich]
    pipelinedb[(pipeline.db)]
    merge[Merge]
    tiebreak[Tiebreak]
    compare[Compare]
    sync[Sync]
    db[(PostgreSQL)]
    kaikki --> extract
    extract --> pipelinedb
    pipelinedb --> reverselink
    reverselink --> pipelinedb
    pipelinedb --> enrich
    enrich --> pipelinedb
    pipelinedb --> merge
    merge --> pipelinedb
    pipelinedb --> tiebreak
    tiebreak --> pipelinedb
    pipelinedb --> compare
    pipelinedb --> sync
    sync --> db
 ```
 Each stage is a standalone script that reads from and writes to `pipeline.db`. The pipeline is fully resumable — interrupted overnight runs pick up from the last processed record without losing work.
 Stage 1 is a manual prerequisite and is not run by the pipeline orchestrator. See **Stage 1 — Extract** for instructions.
 The enrich stage is designed to run overnight, one model at a time. Each model processes every entry and writes results to `pipeline.db` atomically per record.
 Only fully resolved records reach the production database. Records where LLMs could not reach a majority vote are handled automatically by the tiebreaker stage before syncing.
 ## pipeline.db
 All pipeline state is stored in `pipeline.db` — a SQLite database in `data-pipeline/db/`. It is created automatically on first run and is not committed to git.
 The database serves three purposes:
 - **Resumability** — every record is written atomically with a status. Interrupted overnight runs resume from the last pending record without losing work.
 - **Vote tracking** — all model votes for CEFR levels and generated content are stored per model per record, giving full auditability of how every decision was reached.
 - **Resolved output** — the final resolved records live here and are read by the sync script to seed the production database.
 The schema is defined in `data-pipeline/db/schema.sql`. Never edit `pipeline.db` directly — all writes go through the pipeline scripts.
 On first run the orchestrator initialises `pipeline.db` automatically and imports the stage 1 output into the base tables. This happens once — subsequent runs skip the import if the base tables are already populated.
 ## Common commands
 ### Starting llama.cpp
 ```bash
 cd ~/Downloads/llama.cpp
 ./build/bin/llama-server \
  --model models/qwen3.5-4b-q4_k_m.gguf \
  --port 8080 \
  --ctx-size 4096 \
  --n-gpu-layers 999 \
  --host 127.0.0.1 \
  --chat-template-kwargs '{"enable_thinking":false}' \
  --reasoning-budget 0
 ```
 Verify the server is running:
 ```bash
 curl http://127.0.0.1:8080/health
 ```
 ### Running the pipeline
 ```bash
 pnpm --filter @lila/pipeline pipeline:run
 ```
 The pipeline auto-generates a run name from the date and a counter. It picks up where it left off — completed stages are skipped automatically.
 ### Stage 1 — Extract
 ```bash
 pnpm --filter @lila/pipeline extract
 ```
 Runs in sample mode (500 entries per language) by default. Remove the hardcoded limit in `stage-1-extract/scripts/extract.ts` for a full run.
 ### Stage 2 — Reverse link sync
 ```bash
 pnpm --filter @lila/pipeline reverse-link
 ```
 ### Initialising and importing the database
 ```bash
 # Initialise pipeline.db from schema
 pnpm --filter @lila/pipeline db:init
 # Import stage 1 output into pipeline.db
 pnpm --filter @lila/pipeline db:import
 ```
 ### Resetting the database
 ```bash
 # Full reset — delete and reinitialise
 rm data-pipeline/db/pipeline.db
 pnpm --filter @lila/pipeline db:init
 pnpm --filter @lila/pipeline db:import
 pnpm --filter @lila/pipeline reverse-link
 ```
 ### Resetting enrich stage progress
 ```bash
 # Reset round 1 only
 pnpm --filter @lila/pipeline db:reset round1
 # Reset all stages except reverse link
 pnpm --filter @lila/pipeline db:reset all
 ```
 ### Checking pipeline progress
 ```bash
 node -e "
 const Database = require('better-sqlite3');
 const db = new Database('data-pipeline/db/pipeline.db', { readonly: true });
 const total = db.prepare('SELECT COUNT(*) as c FROM entries WHERE language = \\'en\\'').get().c;
 const complete = db.prepare(\"SELECT COUNT(*) as c FROM run_status WHERE stage = 'round1' AND status = 'complete'\").get().c;
 const needsReview = db.prepare(\"SELECT COUNT(*) as c FROM run_status WHERE stage = 'round1' AND status = 'needs_review'\").get().c;
 console.log('Total English entries:', total);
 console.log('Round 1 complete:', complete);
 console.log('Needs review:', needsReview);
 console.log('Pending:', total - complete - needsReview);
 db.close();
 "
 ```
 ## Data source
 ### Kaikki (Wiktionary)
 The pipeline uses pre-extracted Wiktionary data from [kaikki.org](https://kaikki.org), built with the [wiktextract](https://github.com/tatuylonen/wiktextract) tool. This data is updated weekly from the English Wiktionary dump and is freely available under the same license as Wiktionary (CC-BY-SA).
 **Why Kaikki instead of OMW:**
 Kaikki is structured per word sense. Each headword has multiple senses, and translations are linked to a specific sense rather than a general concept. This prevents the sense disambiguation problems found in OMW, where a single concept entry could contain translations from entirely different meanings of a word.
 Each Kaikki entry provides:
 - A headword in the entry language
 - One or more senses, each with a gloss and examples
 - Per-sense translations to other languages with sense hints
 - IPA pronunciations and audio file references (deferred — see **Further extensions**)
 - Inflected forms (deferred — see **Further extensions**)
 The pipeline uses the English Wiktionary edition (`enwiktionary`), which contains entries for all five supported languages with glosses in English.
 ### CEFR levels
 CEFR levels are assigned entirely by LLM majority vote. Each model receives the headword, gloss, and an example sentence and votes on the appropriate level (A1–C2). There are no curated source files — the LLMs are the sole source of CEFR annotations.
 If no majority is reached after all model runs, the entry is handled automatically by the tiebreaker stage.
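The vote resolution can be sketched roughly as follows. This assumes "majority" means strictly more than half of the votes cast, which this document does not specify; `resolveCefr` and the `Resolution` type are hypothetical names, and `needs_review` mirrors the status value used in the progress query above.

```typescript
type CefrLevel = "A1" | "A2" | "B1" | "B2" | "C1" | "C2";

type Resolution =
  | { status: "resolved"; level: CefrLevel }
  | { status: "needs_review" };

// Assumption: a level wins only with strictly more than half of all votes.
// Anything else is flagged so a later stage can collect more votes.
function resolveCefr(votes: CefrLevel[]): Resolution {
  const counts = new Map<CefrLevel, number>();
  for (const vote of votes) {
    counts.set(vote, (counts.get(vote) ?? 0) + 1);
  }

  let winner: CefrLevel | null = null;
  counts.forEach((count, level) => {
    if (count * 2 > votes.length) winner = level;
  });

  return winner === null
    ? { status: "needs_review" }
    : { status: "resolved", level: winner };
}
```

With three models, two matching votes resolve the entry. With two models that disagree, or three that all differ, the entry stays flagged.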
 ## Setup
 ### Kaikki data files
 Download the pre-extracted Kaikki JSONL files for each language. These are large files — download them to `stage-1-extract/sources/` which is not committed to git.
 ```bash
 mkdir -p stage-1-extract/sources
 cd stage-1-extract/sources
 # English entries (contains translations to all other languages)
 wget https://kaikki.org/dictionary/English/kaikki.org-dictionary-English.jsonl.gz
 # Per-language files (for entries written in those languages)
 wget https://kaikki.org/dictionary/German/kaikki.org-dictionary-German.jsonl.gz
 wget https://kaikki.org/dictionary/Italian/kaikki.org-dictionary-Italian.jsonl.gz
 wget https://kaikki.org/dictionary/French/kaikki.org-dictionary-French.jsonl.gz
 wget https://kaikki.org/dictionary/Spanish/kaikki.org-dictionary-Spanish.jsonl.gz
 # Decompress
 gunzip *.gz
 ```
 ### LLM setup
 See `llm-setup.md`.
 ## Pipeline stages
 | Stage           | What it does                                                             |
 | --------------- | ------------------------------------------------------------------------ |
 | 1. Extract      | Parses Kaikki JSONL, imports entries into `pipeline.db`                  |
 | 2. Reverse link | Inserts missing reverse translations between language pairs              |
 | 3. Enrich       | LLMs fill translation gaps, improve glosses/examples, assign CEFR levels |
 | 4. Merge        | Resolves LLM votes into final values                                     |
 | 4b. Tiebreak    | Runs unused models on flagged entries until majority is reached          |
 | 5. Compare / QA | Generates `COVERAGE.md` with detailed quality report                     |
 | 6. Sync         | Upserts resolved records into production PostgreSQL                      |
 ### 1. Extract
 Parses the Kaikki JSONL files for all five languages and imports them into the base tables of `pipeline.db`. Filters to the four supported parts of speech: noun, verb, adjective, adverb. Each Kaikki sense becomes one row in `vocabulary_entries`. Translations are stored in `entry_translations` with their sense hints.
 **Input:** `stage-1-extract/sources/*.jsonl`
 **Output:** `pipeline.db` — `vocabulary_entries` and `entry_translations` tables populated
 ```bash
 pnpm --filter @lila/pipeline extract
 ```
 Add `--sample 100` to import only 100 entries per language for inspection before running the full import.
 Each entry in `pipeline.db` looks like this:
 ```json
 {
  "headword": "thrill",
  "language": "en",
  "pos": "verb",
  "sense_index": 0,
  "gloss": "To suddenly excite someone, or to give them great pleasure.",
  "examples": ["The movie thrilled the audience."],
  "translations": [
    { "language": "de", "word": "begeistern", "sense_hint": "suddenly excite" },
    {
      "language": "fr",
      "word": "enthousiasmer",
      "sense_hint": "suddenly excite"
    },
    { "language": "it", "word": "entusiasmare" },
    { "language": "es", "word": "emocionar" }
  ]
 }
 ```
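The entry shape above can be written down as a TypeScript type. This is only a sketch: the names `PipelineEntry`, `EntryTranslation`, and `missingLanguages` are hypothetical and not taken from the pipeline code. `sense_hint` is marked optional because the Italian and Spanish translations in the example omit it.

```typescript
// Hypothetical types mirroring the JSON example; names are illustrative only.
type LanguageCode = "en" | "it" | "de" | "es" | "fr";
type PartOfSpeech = "noun" | "verb" | "adjective" | "adverb";

type EntryTranslation = {
  language: LanguageCode;
  word: string;
  sense_hint?: string; // omitted on some translations in the example
};

type PipelineEntry = {
  headword: string;
  language: LanguageCode;
  pos: PartOfSpeech;
  sense_index: number;
  gloss: string;
  examples: string[];
  translations: EntryTranslation[];
};

const ALL_LANGUAGES: readonly LanguageCode[] = ["en", "it", "de", "es", "fr"];

// Returns the target languages for which the entry has no translation yet.
function missingLanguages(entry: PipelineEntry): LanguageCode[] {
  const covered = entry.translations.map((t) => t.language);
  return ALL_LANGUAGES.filter(
    (lang) => lang !== entry.language && !covered.includes(lang),
  );
}

// A trimmed copy of the example entry with only the German translation.
const partial: PipelineEntry = {
  headword: "thrill",
  language: "en",
  pos: "verb",
  sense_index: 0,
  gloss: "To suddenly excite someone, or to give them great pleasure.",
  examples: ["The movie thrilled the audience."],
  translations: [
    { language: "de", word: "begeistern", sense_hint: "suddenly excite" },
  ],
};
```

A helper like `missingLanguages(partial)` would list the languages a later stage still has to cover for that entry.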
 > **Note:** Stage 1 is a manual prerequisite. It is not run by the pipeline orchestrator (`pipeline.ts`). Run it once before running the orchestrator for the first time, and re-run it manually if the Kaikki source files are updated.
 ### 2. Reverse link sync
 A pure script stage — no LLMs. For each translation pair in `entry_translations`, checks whether the reverse link exists. If English _thrill → begeistern_ exists and the German entry _begeistern_ exists in `vocabulary_entries` but lacks the English back-link, it is inserted automatically.
 This runs before the enrich stage so that LLMs only generate translations that are genuinely missing — not translations that would be found by a simple reverse lookup.
 **Input:** `pipeline.db` — populated `vocabulary_entries` and `entry_translations`
 **Output:** `pipeline.db` — missing reverse links inserted into `entry_translations`
 ```bash
 pnpm --filter @lila/pipeline reverse-link
 ```
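The reverse lookup described in this stage can be sketched in memory as follows. The `Entry` and `Link` shapes and the function name are assumptions for illustration; the real script works against the tables in `pipeline.db`, and this sketch matches by headword only, ignoring senses.

```typescript
// Hypothetical in-memory shapes standing in for the pipeline.db tables.
type Entry = { id: number; headword: string; language: string };
type Link = { entryId: number; language: string; word: string };

// Returns the reverse links that do not exist yet in `links`.
function findMissingReverseLinks(entries: readonly Entry[], links: readonly Link[]): Link[] {
  const byId = new Map<number, Entry>();
  entries.forEach((e) => byId.set(e.id, e));

  const key = (l: Link): string => `${l.entryId}|${l.language}|${l.word}`;
  const existing = new Set<string>(links.map(key));
  const missing: Link[] = [];

  links.forEach((link) => {
    const source = byId.get(link.entryId);
    if (source === undefined) return;
    // Entries in the target language whose headword equals the translated word.
    const targets = entries.filter(
      (e) => e.language === link.language && e.headword === link.word,
    );
    targets.forEach((target) => {
      const reverse: Link = {
        entryId: target.id,
        language: source.language,
        word: source.headword,
      };
      if (!existing.has(key(reverse))) {
        existing.add(key(reverse)); // avoid emitting the same link twice
        missing.push(reverse);
      }
    });
  });
  return missing;
}
```

If the target headword has no entry in `entries`, nothing is produced for it, which mirrors the condition that the German entry must already exist.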
 ### 3. Enrich
 > **Note:** Before running this stage, ensure the llama.cpp server is running
 > locally. The orchestrator checks for a running server at
 > `http://127.0.0.1:8080/health` and exits with instructions if it is not
 > reachable. See `llm-setup.md` for setup instructions.
 The enrich stage runs in four ordered sub-stages per entry, designed to build context progressively.
 **Sub-stage order:**
 1. **`round1_gloss`** — the LLM reviews the existing gloss. If it is clear and learner-friendly, it confirms it. If not, it generates a better one.
 2. **`round1_example`** — the LLM reviews the existing examples. If they are natural and suitable, it confirms them. If not, it generates one better example sentence in the entry language.
 3. **`round1_translations`** — using the verified gloss as context, the LLM reviews each existing translation. Valid translations are confirmed. Invalid ones (wrong language, suffixes, garbled text, wrong sense) are explicitly rejected. Missing languages get a generated translation.
 4. **`round1_cefr`** — using only the validated translations from the previous sub-stage, the LLM votes on the CEFR level for the headword and for each confirmed translation. Rejected translations never reach this sub-stage.
 This ordering ensures the CEFR voting sub-stage only sees clean, verified data.
 All output is written to `pipeline.db` atomically per sub-stage per entry. Interrupted runs resume from the last incomplete sub-stage without losing work. Each model is run once — one model, one vote per sub-stage.
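Resuming from the last incomplete sub-stage could look roughly like this. The function name and the idea of passing in a list of completed sub-stage names are assumptions; only the four sub-stage names and their order come from the description above.

```typescript
// Sub-stage names and order as listed above.
const SUB_STAGES: readonly string[] = [
  "round1_gloss",
  "round1_example",
  "round1_translations",
  "round1_cefr",
];

// Given the sub-stages one model has already completed for one entry,
// return the next sub-stage to run, or null when the entry is done.
function nextSubStage(completed: readonly string[]): string | null {
  for (const stage of SUB_STAGES) {
    if (!completed.includes(stage)) return stage;
  }
  return null;
}
```

Because the first missing sub-stage is always returned, an interrupted run restarts at the earliest sub-stage that was not written, and later sub-stages never run ahead of their context.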
 **Input:** `pipeline.db` — entries after reverse link sync
 **Output:** `pipeline.db` — gloss votes, example votes, translation votes, CEFR votes per entry per model
 > **Note:** The tiebreaker is not a standalone script. It runs automatically as part of the pipeline orchestrator after merge completes.
 ### 4. Merge
 Reads all LLM votes from `pipeline.db` and resolves the final value for every field. Writes resolved entries back to `pipeline.db`.
 **Merge rules:**
 - Kaikki source data wins automatically and is never overridden by LLM output
 - For CEFR levels: the level with the most votes wins. If no majority is reached, the entry is flagged for the tiebreaker
 - For LLM-generated text fields: the candidate with the most votes wins. If no majority is reached, the tiebreaker runs
 **Difficulty mapping:**
 | CEFR   | Difficulty   |
 | ------ | ------------ |
 | A1, A2 | easy         |
 | B1, B2 | intermediate |
 | C1, C2 | hard         |
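The vote resolution and the difficulty mapping can be sketched together. This assumes "majority" means strictly more than half of the votes cast, which the rules above do not spell out; `resolveVotes` and `difficultyFor` are hypothetical names.

```typescript
type CefrLevel = "A1" | "A2" | "B1" | "B2" | "C1" | "C2";
type Difficulty = "easy" | "intermediate" | "hard";

type Resolution =
  | { status: "resolved"; value: string }
  | { status: "flagged"; counts: Record<string, number> };

// Mapping taken from the difficulty table above.
const CEFR_TO_DIFFICULTY: Record<CefrLevel, Difficulty> = {
  A1: "easy",
  A2: "easy",
  B1: "intermediate",
  B2: "intermediate",
  C1: "hard",
  C2: "hard",
};

// Assumption: a value wins only with a strict majority of all votes.
function resolveVotes(votes: readonly string[]): Resolution {
  const counts: Record<string, number> = {};
  for (const vote of votes) {
    counts[vote] = (counts[vote] ?? 0) + 1;
  }
  for (const vote of votes) {
    if ((counts[vote] ?? 0) * 2 > votes.length) {
      return { status: "resolved", value: vote };
    }
  }
  // No majority: keep the vote split so the tiebreaker can report it.
  return { status: "flagged", counts };
}

function difficultyFor(level: CefrLevel): Difficulty {
  return CEFR_TO_DIFFICULTY[level];
}
```

Under this assumption a 2-1-1 split among four voters is flagged rather than resolved, since no value holds more than half of the votes.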
 **Input:** `pipeline.db` — LLM votes
 **Output:** `pipeline.db` — entries updated with resolved values or flagged status
 ### 4b. Tiebreak
 Runs automatically after merge if any entries remain flagged. The script queries `pipeline.db` for flagged entries, identifies which configured models have not yet voted on each entry, and runs those models on the flagged subset only. Merge is re-run after each tiebreaker pass. This repeats until all flagged entries are resolved or no unused models remain.
 If unused models are exhausted and flagged entries remain, the script logs a detailed report showing the exact vote split for each unresolved entry and lists available models from OpenRouter that have not been used. Syncing is blocked until all entries are resolved. To continue, add one or more models to the config and re-run the pipeline — the tiebreaker will pick up automatically.
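Planning a single tiebreaker pass might look like the sketch below. The `FlaggedEntry` and `TiebreakPlan` shapes are hypothetical; the sketch only works out which configured models have not yet voted on each flagged entry and which entries have no unused models left.

```typescript
// Hypothetical shape: `votes` maps a model name to that model's vote.
type FlaggedEntry = { entryId: number; votes: Record<string, string> };

// `runs` maps a model name to the entry IDs it should process in this pass;
// `exhausted` lists entries on which every configured model has already voted.
type TiebreakPlan = { runs: Record<string, number[]>; exhausted: number[] };

function planTiebreak(
  configuredModels: readonly string[],
  flagged: readonly FlaggedEntry[],
): TiebreakPlan {
  const plan: TiebreakPlan = { runs: {}, exhausted: [] };
  for (const entry of flagged) {
    const voted = Object.keys(entry.votes);
    const unused = configuredModels.filter((m) => !voted.includes(m));
    if (unused.length === 0) {
      plan.exhausted.push(entry.entryId);
      continue;
    }
    for (const model of unused) {
      (plan.runs[model] ??= []).push(entry.entryId);
    }
  }
  return plan;
}
```

A non-empty `exhausted` list corresponds to the case where sync stays blocked until another model is added to the config.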
 > **Note:** The tiebreaker is not a standalone script. It runs automatically as part of the pipeline orchestrator after merge completes.
 ### 5. Compare / QA
 Read-only. Generates `COVERAGE.md` with a full breakdown of pipeline output quality per language. Run this after merge to verify output before syncing to the database.
 **Input:** `pipeline.db` — entries with status `final`
 **Output:** `COVERAGE.md`
 `COVERAGE.md` reports the following per language:
 - Total entries extracted
 - POS breakdown — entry counts for noun, verb, adjective, adverb
 - Translation coverage — how many entries have translations in each other language
 - CEFR coverage — how many entries have a resolved CEFR level, broken down by level
 - Difficulty breakdown — entry counts for easy, intermediate, hard
 - Gloss coverage — how many entries have a gloss, broken down by source (Kaikki vs LLM-generated)
 - Example coverage — same breakdown as glosses
 - LLM model contribution — how many CEFR votes and text candidates each anonymised model contributed
 ### 6. Sync
 The sync script transfers all entries with status `final` in `pipeline.db` to the production PostgreSQL database. It is upsert-based and never wipes existing data. For each entry it checks whether a matching record already exists in the target database:
 - **Missing** → insert
 - **Present but changed** → update
 - **Present and unchanged** → skip
 Run this after all entries are resolved and Compare / QA has been reviewed.
 ```bash
 pnpm --filter @lila/pipeline sync
 ```
 The sync script requires a connection string to the target database. Set `DATABASE_URL` in your `.env` file before running.
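The insert / update / skip decision can be sketched as a pure function. The `SyncRecord` fields and the function name are assumptions for illustration; the real comparison depends on the production schema.

```typescript
type SyncAction = "insert" | "update" | "skip";

// Hypothetical subset of fields compared during sync.
type SyncRecord = { gloss: string; cefr: string; difficulty: string };

// `existing` is undefined when no matching record was found in the target database.
function decideSyncAction(
  incoming: SyncRecord,
  existing: SyncRecord | undefined,
): SyncAction {
  if (existing === undefined) return "insert";
  const fields = Object.keys(incoming) as (keyof SyncRecord)[];
  const changed = fields.some((field) => incoming[field] !== existing[field]);
  return changed ? "update" : "skip";
}
```

Since unchanged records are skipped and nothing is deleted, running the decision twice over the same data leaves the target untouched the second time.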
 ## Reports
 The pipeline generates a report at the end of every run. Reports are written to `data-pipeline/reports/` as a JSON file and a markdown file with the same name. The markdown is generated from the JSON and contains identical data.
 ```
 data-pipeline/reports/
  2026-05-03_run-1.json
  2026-05-03_run-1.md
 ```
 The run name is auto-generated from the date and a counter. Reports are not committed to git.
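One way to derive the run name from the date and a counter is sketched below. It assumes the counter restarts each day and that the date is taken in UTC; neither detail is stated above, and `nextRunName` is a hypothetical helper.

```typescript
// Builds a name like "2026-05-03_run-1" from a date and the names of existing reports.
function nextRunName(date: Date, existingNames: readonly string[]): string {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD, in UTC
  const prefix = `${day}_run-`;
  let highest = 0;
  for (const name of existingNames) {
    if (!name.startsWith(prefix)) continue;
    // parseInt stops at the first non-digit, so "1.json" still yields 1.
    const counter = Number.parseInt(name.slice(prefix.length), 10);
    if (Number.isInteger(counter) && counter > highest) highest = counter;
  }
  return `${prefix}${highest + 1}`;
}
```

Passing the file names from `data-pipeline/reports/` as `existingNames` would give the next free name for the current day.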
 **Nightly report** contains:
 - Entries processed this run vs total
 - Entries remaining per stage
 - Average processing speed and estimated nights remaining
 - `needs_review` count — entries that failed structural validation
 - Per-model progress breakdown
 **Final report** (generated when all entries are processed) additionally contains:
 - Full vote breakdown per model
 - Flagged entries with exact vote splits
 - Available unused models from OpenRouter for tiebreaking
 - Per-model quality metrics — CEFR agreement rate, field coverage, JSON parse rate
 ## Adding a new language
 1. Add the language code to `SUPPORTED_LANGUAGE_CODES` in `packages/shared/src/constants.ts`
 2. Build shared: `pnpm --filter @lila/shared build`
 3. Generate and run a DB migration: `pnpm --filter @lila/db generate` then `pnpm --filter @lila/db migrate`
 4. Download the Kaikki JSONL file for the language from kaikki.org
 5. Re-run the full pipeline
 ## Constants and constraints
 These values are defined in `packages/shared/src/constants.ts` and enforced by database check constraints. The pipeline filters out any entries that violate them.
 | Constant        | Values                                |
 | --------------- | ------------------------------------- |
 | Languages       | `en`, `it`, `de`, `es`, `fr`          |
 | Parts of speech | `noun`, `verb`, `adjective`, `adverb` |
 | CEFR levels     | `A1`, `A2`, `B1`, `B2`, `C1`, `C2`    |
 | Difficulty      | `easy`, `intermediate`, `hard`        |
 Adding a new value to any of these requires a constants update and a database migration before re-running the pipeline. See **Adding a new language** for the full steps — the same process applies for new parts of speech.
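A filter over these constants could be sketched as follows. `SUPPORTED_LANGUAGE_CODES` is the name used in this document; `SUPPORTED_POS` and `isAllowedEntry` are hypothetical names chosen for the example, and the real definitions live in `packages/shared/src/constants.ts`.

```typescript
// Values copied from the table above.
const SUPPORTED_LANGUAGE_CODES: readonly string[] = ["en", "it", "de", "es", "fr"];
// Hypothetical constant name; the values come from the table above.
const SUPPORTED_POS: readonly string[] = ["noun", "verb", "adjective", "adverb"];

// Returns true when the entry satisfies the language and part-of-speech constraints.
function isAllowedEntry(entry: { language: string; pos: string }): boolean {
  return (
    SUPPORTED_LANGUAGE_CODES.includes(entry.language) &&
    SUPPORTED_POS.includes(entry.pos)
  );
}
```

Entries for which `isAllowedEntry` returns `false` would be dropped before import rather than failing a check constraint later.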
 ## Further extensions
 These are not part of the current pipeline but are worth considering as the dataset matures:
 - **IPA pronunciations** — Kaikki includes IPA transcriptions for most entries. Could be extracted and stored in an `entry_pronunciations` table and displayed in the quiz UI.
 - **Audio files** — kaikki.org provides bulk audio file downloads (~20GB) for pronunciations. Could be stored as static files and served alongside the quiz UI.
 - **Inflected forms** — Kaikki provides conjugation and declension tables in a `forms` array. Useful for a future grammar-focused quiz mode.
 - **Grammatical gender** — Kaikki includes grammatical gender for nouns. Could be stored per entry and used as an additional quiz mechanic.
 - **Frequency data** — Word frequency rankings per language from sources like the Google Ngram dataset. Useful for smarter difficulty calibration beyond CEFR levels alone.
 - **Additional languages** — The pipeline is language-agnostic. Adding a new language requires downloading its Kaikki JSONL file, a constants update, and a database migration. See **Adding a new language**.
 ## Roadmap
 **Current state:** Stage 1 extraction and stage 2 reverse link sync complete and verified on sample data. Stage 3 enrich script written and tested — redesigning to sub-stage architecture for better data quality. llama.cpp running with Qwen3.5-4B.
 **Next action:** Rewrite enrich script for sub-stage design.
 | Stage           | Status         |
 | --------------- | -------------- |
 | 1. Extract      | 🔄 in progress |
 | 2. Reverse link | 🔄 in progress |
 | 3. Enrich       | 🔄 in progress |
 | 4. Merge        | 🔲 not started |
 | 4b. Tiebreak    | 🔲 not started |
 | 5. Compare / QA | 🔲 not started |
 | 6. Sync         | 🔲 not started |
 ### Stage 1 — Extract `🔄 in progress`
 - [x] Download Kaikki JSONL files for all 5 languages
 - [x] Write extraction script
 - [x] Write stage 1 validation tests
 - [x] Write db schema, init, and import scripts
 - [x] Write db import validation tests
 - [x] Run sample extraction → `stage-1-extract/output/{lang}.json`
 - [ ] Remove sample limit and run full extraction
 - [ ] Re-run full import → `pipeline.db`
 ### Stage 2 — Reverse link sync `🔄 in progress`
 - [x] Write reverse link sync script
 - [x] Run reverse link sync on sample data → 141 links inserted
 - [ ] Run reverse link sync on full data after full extraction
 ### Stage 3 — Enrich `🔄 in progress`
 **Next action:** Rewrite enrich script for sub-stage design.
 - [x] Write initial enrich script (single-prompt design)
 - [x] Install llama.cpp and verify server
 - [x] Smoke test with sample entries
 - [ ] Rewrite enrich script for sub-stage design (round1_gloss, round1_example, round1_translations, round1_cefr)
 - [ ] Write tests for enrich sub-stages
 - [ ] Run full sample, collect metrics
 - [ ] Compare providers (local vs OpenRouter free models)
 - [ ] Production run — all entries, all models
 ### Stage 4 — Merge `🔲 not started`
 - [ ] Write merge script
 - [ ] Write tests
 - [ ] Run merge → `pipeline.db`
 - [ ] Confirm tiebreaker resolves all flagged entries
 ### Stage 4b — Tiebreak `🔲 not started`
 - [ ] Write tiebreak logic
 - [ ] Run tiebreaker for all flagged entries
 - [ ] Confirm no flagged entries remain before syncing
 ### Stage 5 — Compare / QA `🔲 not started`
 - [ ] Write compare script
 - [ ] Write tests
 - [ ] Run compare → `COVERAGE.md`
 - [ ] Review output quality before syncing
 ### Stage 6 — Sync `🔲 not started`
 - [ ] Write sync script
 - [ ] Write tests
 - [ ] Configure `DATABASE_URL` in `.env`
 - [ ] Run sync → production PostgreSQL
 - [ ] Verify seeded data in production
 ### Utilities
 **`sample/`** — Runs the pipeline against a small sample to produce human-readable output for a quick sanity check before committing to a full run. Run this after any script change before running the full pipeline.
--- a/documentation/DECISIONS.md
+++ b/documentation/DECISIONS.md
--- a/documentation/DEPLOYMENT.md
+++ b/documentation/DEPLOYMENT.md
--- a/documentation/LLM_SETUP.md
+++ b/documentation/LLM_SETUP.md
@ -1,9 +1,12 @@
 # LLM Setup — lila pipeline
-This document covers the LLM infrastructure for stage 3 (enrich) of the lila
+This document covers the LLM infrastructure for stage 3 (enrich) of the lila data pipeline. It documents the hardware constraints, supported providers, model recommendations, and how to configure and swap providers in the test and production scripts.
-data pipeline. It documents the hardware constraints, supported providers,
+
-model recommendations, and how to configure and swap providers in the test
+---
-and production scripts.
+
 ## Provider model
 Each provider + model combination counts as one vote in the final majority. Running the same model twice is not supported — one model, one vote. To increase vote confidence, add more models rather than re-running existing ones.
 ---
@ -16,17 +19,13 @@ and production scripts.
 | GPU       | NVIDIA GeForce GTX 950M — 4 GB VRAM (Maxwell, CUDA compute 5.0) |
 | OS        | Debian GNU/Linux 13 (trixie) x86_64                             |
-**Local inference verdict:** viable for small/quantized models, not for
+**Local inference verdict:** viable for small/quantized models, not for production runs. See the [Local inference](#local-inference-llamacpp) section for details.
 production runs. See the [Local inference](#local-inference-llamacpp) section
 for details.
 ---
 ## Provider overview
-The enrich script uses a single, swappable provider config. All providers
+The enrich script uses a single, swappable provider config. All providers except Anthropic expose an OpenAI-compatible API, so the same client code works across all of them — only `baseURL`, `apiKey`, and `model` change.
 except Anthropic expose an OpenAI-compatible API, so the same client code
 works across all of them — only `baseURL`, `apiKey`, and `model` change.
 | Provider               | Use case                                      | Cost               | Rate limits            |
 | ---------------------- | --------------------------------------------- | ------------------ | ---------------------- |
@ -41,20 +40,13 @@ works across all of them — only `baseURL`, `apiKey`, and `model` change.
 ### Why local inference is worth testing
-Time is not a constraint — the pipeline scripts are fully resumable. The
+Time is not a constraint — the pipeline scripts are fully resumable. The laptop can run overnight for multiple nights. The only question is output quality, which the test script evaluates empirically.
 laptop can run overnight for multiple nights. The only question is output
 quality, which the test script evaluates empirically.
 ### Hardware constraints
-The GTX 950M has 4 GB VRAM and Maxwell architecture (CUDA compute 5.0).
+The GTX 950M has 4 GB VRAM and Maxwell architecture (CUDA compute 5.0). llama.cpp supports Maxwell via CUDA backend but newer builds may require the `--cuda-no-kv-offload` flag depending on the version.
 llama.cpp supports Maxwell via CUDA backend but newer builds may require
 the `--cuda-no-kv-offload` flag depending on the version.
-llama.cpp splits model layers between GPU and CPU automatically via
+llama.cpp splits model layers between GPU and CPU automatically via `--n-gpu-layers`. You set how many layers go on the GPU; the rest run on CPU/RAM. This means a model larger than VRAM is not a dead end — it runs in hybrid mode, slower than full-GPU but much faster than pure CPU.
 `--n-gpu-layers`. You set how many layers go on the GPU; the rest run on
 CPU/RAM. This means a model larger than VRAM is not a dead end — it runs
 in hybrid mode, slower than full-GPU but much faster than pure CPU.
 Practical estimates for this hardware (~3.5 GB VRAM usable after drivers):
@ -67,24 +59,19 @@ Practical estimates for this hardware (~3.5 GB VRAM usable after drivers):
 ### Recommended local models
-Two candidates worth testing, covering different points on the size/quality
+Two candidates worth testing, covering different points on the size/quality tradeoff:
 tradeoff:
 **Gemma 4 E4B Instruct (Q4 / UD-Q4_K_XL)**
 - GGUF file: `gemma-4-E4B-it-UD-Q4_K_XL.gguf` (~2.5 GB)
 - Source: https://huggingface.co/unsloth/gemma-4-E4B-it-GGUF
- Runs fully on GPU. Brand new (April 2025), built for edge hardware, 140+
+- Runs fully on GPU. Brand new (April 2025), built for edge hardware, 140+ language support including all five pipeline languages. First candidate to test.
  language support including all five pipeline languages. First candidate
  to test.
 **Qwen2.5 7B Instruct (Q4_K_M)**
 - GGUF file: `Qwen2.5-7B-Instruct-Q4_K_M.gguf` (~4.5 GB)
 - Source: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-GGUF
- Runs in hybrid mode (~26 of 32 layers on GPU, rest on CPU), ~8–12 tok/s.
+- Runs in hybrid mode (~26 of 32 layers on GPU, rest on CPU), ~8–12 tok/s. Stronger multilingual generation than any 3–4B model. Second candidate, for comparison against the smaller Gemma 4 E4B.
  Stronger multilingual generation than any 3–4B model. Second candidate,
  for comparison against the smaller Gemma 4 E4B.
 ### Installation
@ -190,16 +177,17 @@ Set `Authorization: Bearer <OPENROUTER_API_KEY>` in the request headers.
 ---
-## Provider configuration in the test script
+## Provider configuration in the enrich script
-The enrich test script reads a single config object. To switch providers,
+The enrich script reads a single config object. To switch providers,
-change this object and re-run.
+change this object and re-run. The `name` field is used as the model
 identifier in `pipeline.db` — it must be unique across all runs.
 ```typescript
 // config.ts
 export type ProviderConfig = {
-  name: string; // used for output folder naming
+  name: string; // used as model identifier in pipeline.db — must be unique
  baseURL: string;
  apiKey: string;
  model: string;
@ -243,14 +231,9 @@ export const ANTHROPIC_SONNET: ProviderConfig = {
 };
 ```
-Output from each run lands in:
+All output is written to `pipeline.db`. Each record is stored with the
-
+model name as identifier so results from different providers can be
-```
+compared and compiled into votes.
 stage-3-enrich/test/output/{provider.name}/results.json
 stage-3-enrich/test/output/{provider.name}/metrics.json
 ```
 The evaluate script compares all `metrics.json` files side by side.
 ---
@ -297,5 +280,6 @@ The test script measures the following per provider run:
   production. If not, use the cloud model that passed.
 5. **Production run**
-   Full 117k records. Resume-safe — the script checkpoints after each
+   Full 117k records. Resume-safe — each record is written to `pipeline.db`
-   record so overnight runs can be stopped and continued.
+   atomically as it is processed. Overnight runs can be stopped and
   continued at any time without losing work.
--- a/documentation/MODEL_STRATEGY.md
+++ b/documentation/MODEL_STRATEGY.md
@ -0,0 +1,181 @@
 # Model Strategy
 ## The problem
 The pipeline requires LLMs to perform four tasks per vocabulary entry:
 1. **Gloss review** — confirm or improve the existing gloss
 2. **Example review** — confirm or improve existing examples
 3. **Translation validation** — confirm valid translations, reject bad data, generate missing ones
 4. **CEFR assignment** — assign A1-C2 to the headword and each translation
 The core challenge is that vocabulary entries have **multiple senses**. The word "cat" appears five times in the database — as an animal, as slang for "guy", as a nautical term, as a verb meaning "to vomit", and as a verb meaning "to hoist an anchor". Each sense requires a different CEFR level and different translations. A model that only knows "cat" is A1 gets four out of five wrong.
 This makes CEFR assignment fundamentally a **sense-disambiguation problem**, not just a vocabulary lookup. Specialized CEFR classifiers (like `cefrpy` or `dksysd/cefr-classifier`) operate at the word or sentence level and cannot distinguish between senses of the same word. General LLMs handle sense disambiguation well but introduce quality and reliability problems that depend heavily on model size.
 The secondary challenge is **hardware constraints**. The available local hardware (GTX 950M, 4GB VRAM) can only run models up to approximately 4B parameters fully in GPU memory. Larger models run in hybrid CPU/GPU mode which is significantly slower. Free cloud API tiers are generous enough for the sample dataset but have daily limits that make processing 100k+ entries across multiple sub-stages a multi-day or multi-week operation.
 ## What we tried and why it failed or worked
 ### Single-prompt design (abandoned)
 The first enrich script sent one large prompt per entry covering all four tasks at once: gloss improvement, example improvement, translation validation (including generating missing translations), and CEFR voting. This produced the following problems:
 - The model skipped translations it considered invalid rather than explicitly rejecting them, causing validation failures
 - Bad data in the translation table (`it:free`, `de:-frei`, `es:de fai`) caused consistent validation failures because the model refused to vote on them even when explicitly instructed
 - The combined prompt was large enough to trigger reasoning mode on Gemma 4 E4B, consuming all available tokens on thinking before producing output
 - 20% of entries required manual review
 ### Sub-stage design (current)
 Splitting into four ordered sub-stages fixed the reasoning and validation problems:
 1. `round1_gloss` — LLM reviews the gloss in isolation
 2. `round1_example` — LLM reviews examples with verified gloss as context
 3. `round1_translations` — LLM validates translations with verified gloss as context
 4. `round1_cefr` — LLM assigns CEFR levels only to validated translations
 This ordering ensures the CEFR sub-stage never sees bad data. The smaller, focused prompts eliminated reasoning mode triggering and reduced per-entry time from ~120 seconds to ~25 seconds.
 ### Gloss quality (ongoing)
 Testing on 50 entries with Qwen3.5-4B showed ~80% good quality. The 20% failures fall into three categories:
 - **Category header glosses** — Kaikki occasionally uses "Terms relating to people." or "Terms relating to things." as a gloss instead of a real definition. No model handles these correctly because there is no real meaning to improve.
 - **Rare/obscure senses** — slang, archaic, and theological senses that a 4B model does not have enough knowledge to handle (e.g. "cat" meaning "to vomit", "word" meaning "Logos, Christ").
 - **Short ambiguous glosses** — one or two word glosses with no example context cause hallucination.
 ### Gemma 4 E4B (rejected)
 Gemma 4 E4B is a hybrid reasoning model. Disabling thinking via `--reasoning-budget 0` or `--chat-template-kwargs '{"enable_thinking":false}'` does not work reliably in llama.cpp for the E4B variant — the model either puts reasoning into the content field as plain text or returns empty content with reasoning in `reasoning_content`. Per-entry time exceeded 100 seconds making it impractical.
 ### Qwen3.5-4B (current local model)
 Non-thinking by default for the small series. Runs fully in 4GB VRAM at ~5 seconds per sub-stage. Acceptable quality for common vocabulary (A1-B2) but struggles with rare and specialized senses. Used as the primary local voter.
 ### Specialized CEFR classifiers (rejected for primary use)
 HuggingFace hosts several CEFR text classifiers (`dksysd/cefr-classifier`, `AbdulSami/bert-base-cased-cefr`) and the `cefrpy` Python library maps individual words to CEFR levels. These operate at the word or sentence level and cannot distinguish between senses. "cat" would always be assigned A1 regardless of whether the sense is the animal or obscure nautical slang. Useful only as a sanity check signal, not as a primary voter.
 ## Available free resources
 | Resource                     | Type               | Requests/day      | Quality   | Notes                                                                  |
 | ---------------------------- | ------------------ | ----------------- | --------- | ---------------------------------------------------------------------- |
 | Local Qwen3.5-4B Q4_K_M      | Local model        | Unlimited         | Decent    | Non-thinking by default, fits in 4GB VRAM, ~5s per sub-stage           |
 | Local Qwen3.5-9B Q4_K_M      | Local model        | Unlimited         | Good      | Hybrid CPU/GPU mode on 4GB VRAM, slower but better quality             |
 | Local Llama 3.1 8B Q4_K_M    | Local model        | Unlimited         | Decent    | ~4.3GB, fits in VRAM or light hybrid, different architecture from Qwen |
 | Groq — Llama 3.3 70B         | Cloud API          | 1,000             | Excellent | Best free quality available, 5-10x with batching                       |
 | Groq — Llama 3.1 8B          | Cloud API          | 14,400            | Decent    | High volume, similar quality to local 4B                               |
 | Google Gemini AI Studio      | Cloud API          | 1,500             | Very good | Google account required, 5-10x with batching                           |
 | OpenRouter free rotation     | Cloud API          | 50–1,000          | Varies    | Rotates between free models automatically via `openrouter/free`        |
 | Wiktionary API               | Context enrichment | Unlimited         | N/A       | Structured vocabulary data, directly related to Kaikki source          |
 | `cefrpy` Python library      | Word lookup        | Unlimited         | Limited   | Deterministic English word CEFR lookup, no sense disambiguation        |
 | HuggingFace CEFR classifiers | Text classifier    | Unlimited (local) | Limited   | Sentence-level difficulty, not sense-aware                             |
 ### Batching
 All cloud APIs support sending multiple entries in a single request. Sending 5 entries per request multiplies effective daily capacity by 5:
 - Groq Llama 3.3 70B: 1,000 requests → ~5,000 entries/day
 - Gemini: 1,500 requests → ~7,500 entries/day
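The capacity arithmetic and the grouping of entries into requests can be sketched as below. `chunk` and `entriesPerDay` are hypothetical helpers, and the sketch assumes every request carries a full batch.

```typescript
// Splits a list of entries into batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Effective daily capacity when every request carries a full batch.
function entriesPerDay(requestsPerDay: number, batchSize: number): number {
  return requestsPerDay * batchSize;
}
```

With a batch size of 5, `entriesPerDay(1000, 5)` gives 5,000 and `entriesPerDay(1500, 5)` gives 7,500, matching the figures in the list above.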
 ### Multiple accounts
 Prohibited by the terms of service of all providers listed above.
 ## Final approach per sub-stage
 The pipeline runs multiple models as independent voters. Each model processes every entry once and writes its votes to `pipeline.db`. The merge stage resolves disagreements by majority vote. A tiebreaker runs additional models on flagged entries where no majority was reached.
 ### round1_gloss and round1_example
 These sub-stages require a model that understands sense context from examples. Specialized classifiers cannot help here — only general LLMs can evaluate whether a gloss correctly describes a specific sense.
 **Primary voter:** Local Qwen3.5-9B Q4_K_M — runs overnight, unlimited, handles common vocabulary well.
 **Secondary voter:** Groq Llama 3.3 70B with 5-entry batching — higher quality, catches errors the local model makes on rare or specialized senses.
 **Tertiary voter:** Gemini AI Studio with 5-entry batching — third independent opinion, different training data from both Groq and local model.
 **Context enrichment via Wiktionary API:** Before calling any model for the gloss or example sub-stage, the pipeline queries the Wiktionary API for the headword. The API returns the full Wiktionary entry including all senses, usage notes, and examples. This structured data is added to the prompt as additional context, giving the model a much clearer picture of which specific sense it is working with.
 This directly fixes the two hardest failure cases:
 - **Category header glosses** ("Terms relating to people.") — the Wiktionary entry contains the real definition which the model can use to generate a proper gloss
 - **Short ambiguous glosses** — the additional sense context prevents the model from guessing the wrong meaning
 The Wiktionary API is free, has no rate limits for reasonable use, and is directly related to the Kaikki data source since Kaikki extracts from Wiktionary.
 ### round1_translations
 Same voter stack as gloss/example. The few-shot examples in the prompt (showing that `it:free` → reject and `de:-frei` → reject) handle the bad data cases that caused validation failures in the single-prompt design.
 ### round1_cefr
 This sub-stage only receives translations that survived the validation step. All bad data is already excluded.
 **Primary voter:** Local Qwen3.5-9B Q4_K_M.
 **Secondary voter:** Groq Llama 3.3 70B with 5-entry batching.
 **Tertiary voter:** Gemini AI Studio with 5-entry batching.
 **Sanity check:** `cefrpy` provides a deterministic English word CEFR level as a reference signal. If the majority LLM vote disagrees significantly (e.g. LLMs vote C2 for "cat" the animal), the entry is flagged for human review. `cefrpy` does not vote — it only triggers review flags.
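The review flag could be computed from the distance between the two levels. The threshold of two levels is an assumption made for this sketch, since "significantly" is not defined above, and `needsReview` is a hypothetical name.

```typescript
const CEFR_ORDER: readonly string[] = ["A1", "A2", "B1", "B2", "C1", "C2"];

// Flags an entry when the LLM majority and the reference level are more than
// `maxGap` levels apart. The default of 2 is an assumed threshold.
function needsReview(llmLevel: string, referenceLevel: string, maxGap = 2): boolean {
  const llmIndex = CEFR_ORDER.indexOf(llmLevel);
  const referenceIndex = CEFR_ORDER.indexOf(referenceLevel);
  // Assumption: an unrecognised level is itself a reason for review.
  if (llmIndex === -1 || referenceIndex === -1) return true;
  return Math.abs(llmIndex - referenceIndex) > maxGap;
}
```

The reference level never changes the resolved value here; it only decides whether a human looks at the entry.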
 ### Voter summary
 | Sub-stage           | Voter 1            | Voter 2            | Voter 3 |
 | ------------------- | ------------------ | ------------------ | ------- |
 | round1_gloss        | Qwen3.5-9B (local) | Groq Llama 3.3 70B | Gemini  |
 | round1_example      | Qwen3.5-9B (local) | Groq Llama 3.3 70B | Gemini  |
 | round1_translations | Qwen3.5-9B (local) | Groq Llama 3.3 70B | Gemini  |
 | round1_cefr         | Qwen3.5-9B (local) | Groq Llama 3.3 70B | Gemini  |
 Three voters means a correct majority requires at least two models to agree. Even if the local model gets a difficult sense wrong, the two cloud models will likely agree on the correct answer and outvote it.
 ## Open questions
 ### Wiktionary API context extraction
 The Wiktionary API returns the full entry for a word including all senses. For a word like "free" with 8+ senses, dumping the entire entry into the prompt wastes tokens and may confuse the model. The open question is how to extract only the relevant sense — options include matching by sense_index, fuzzy-matching the Kaikki gloss against Wiktionary glosses, or letting the model see all senses and identify the correct one itself.
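The fuzzy-matching option could start from simple token overlap. This is a rough sketch, not a tested approach: it uses Jaccard similarity over lowercase ASCII words, so it only suits English glosses, and all function names are hypothetical.

```typescript
// Lowercase word set, ignoring words of one or two letters.
function tokens(text: string): Set<string> {
  return new Set(
    text.toLowerCase().split(/[^a-z]+/).filter((word) => word.length > 2),
  );
}

// Jaccard similarity: shared tokens divided by the size of the union.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared += 1;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Returns the index of the Wiktionary gloss closest to the Kaikki gloss,
// or -1 when no gloss shares any token with it.
function bestSenseIndex(kaikkiGloss: string, wiktionaryGlosses: readonly string[]): number {
  const target = tokens(kaikkiGloss);
  let bestIndex = -1;
  let bestScore = 0;
  wiktionaryGlosses.forEach((gloss, index) => {
    const score = jaccard(target, tokens(gloss));
    if (score > bestScore) {
      bestScore = score;
      bestIndex = index;
    }
  });
  return bestIndex;
}
```

A result of `-1` would be the natural point to fall back to one of the other options, such as matching by `sense_index`.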
 ### Batching prompt design
 Batching 5-10 entries per API call multiplies effective daily capacity significantly. The prompt and validation logic for batched requests is more complex — the model must return a structured JSON object keyed by entry ID, and partial failures (one entry in a batch fails validation) need careful handling. Not yet designed or tested.
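Handling partial failures in a batched response might look like this. The response format is not designed yet, so the sketch assumes a JSON object mapping entry ID to a CEFR level; `splitBatchResponse` is a hypothetical name.

```typescript
type BatchResult = { valid: Record<string, string>; retry: string[] };

const VALID_LEVELS: readonly string[] = ["A1", "A2", "B1", "B2", "C1", "C2"];

// Separates entries with a usable answer from entries that must be retried.
// Assumed response shape: {"<entryId>": "<CEFR level>", ...}
function splitBatchResponse(requestedIds: readonly string[], rawJson: string): BatchResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    // Unparseable output: every entry in the batch is retried.
    return { valid: {}, retry: [...requestedIds] };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { valid: {}, retry: [...requestedIds] };
  }
  const byId = parsed as Record<string, unknown>;
  const result: BatchResult = { valid: {}, retry: [] };
  for (const id of requestedIds) {
    const value = byId[id];
    if (typeof value === "string" && VALID_LEVELS.includes(value)) {
      result.valid[id] = value;
    } else {
      result.retry.push(id); // missing or invalid answer for this entry only
    }
  }
  return result;
}
```

Keeping the good answers and retrying only the failed IDs avoids spending another request on entries that already validated.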
 ### Groq and Gemini API integration
 Neither Groq nor Gemini is integrated into the pipeline yet. Both use OpenAI-compatible APIs so integration is straightforward — add provider configs to `stage-3-enrich/config.ts` and set API keys in `.env`. The batching prompt design needs to be finalised first.
 ### OpenRouter free model rotation
 OpenRouter's `openrouter/free` router selects a model at random from available free models. This means output style and quality vary between requests, which complicates round 2 voting where models review each other's candidates. May need to pin specific free models rather than using the router.
 ### Qwen3.5-9B performance on hard cases
 The 9B model has not yet been tested. It is expected to handle rare and specialized senses better than the 4B model but this has not been verified. Needs a test run against the same 50 entries used to evaluate the 4B model.
 ### llama.cpp Gemma 4 bug
 The llama.cpp chat template bug preventing reliable JSON output from Gemma 4 E4B may be fixed in a future release. The model fits in 4GB VRAM and would be a useful additional local voter if the bug is resolved. Worth checking periodically.
 ### Full dataset scale
 The current pipeline runs on a 500-entry sample per language. The full Kaikki English file contains approximately 1.3 million entries, of which only a fraction will pass the POS and translation filters. The exact count and the time required to run all sub-stages across all models at full scale are not yet known.
 ### Category header glosses
 Kaikki occasionally uses category headers ("Terms relating to people.", "Terms relating to things.") as glosses. These are not real definitions and no model produces useful output for them. Options include pre-filtering them before the gloss sub-stage and generating a gloss purely from examples, or flagging them as a special case for human review.
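A pre-filter for these headers might look like the sketch below. The pattern is derived only from the two examples quoted above, so it is an assumption that other category headers follow the same "Terms relating to ..." wording; the `Entry` shape and function names are hypothetical.

```typescript
// Hypothetical minimal entry shape for this sketch.
interface Entry {
  id: string;
  gloss: string;
}

// Matches glosses such as "Terms relating to people."
// Assumption: category headers always start with this phrase.
const CATEGORY_HEADER = /^terms relating to\b/i;

function isCategoryHeaderGloss(gloss: string): boolean {
  return CATEGORY_HEADER.test(gloss.trim());
}

// Splits entries into those usable by the gloss sub-stage and those to flag.
function splitCategoryHeaders(entries: Entry[]): { usable: Entry[]; flagged: Entry[] } {
  const usable: Entry[] = [];
  const flagged: Entry[] = [];
  for (const entry of entries) {
    (isCategoryHeaderGloss(entry.gloss) ? flagged : usable).push(entry);
  }
  return { usable, flagged };
}
```

The flagged list can then feed either option: gloss generation from examples, or a human review queue.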
 wget -O models/llama-3.1-8b-instruct-q4_k_m.gguf \
 "https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf"
 # Q4_K_M (5.68GB — hybrid mode, better quality)
 wget -O models/qwen3.5-9b-q4_k_m.gguf \
 "https://huggingface.co/unsloth/Qwen3.5-9B-GGUF/resolve/main/Qwen3.5-9B-Q4_K_M.gguf"
 # Q3_K_S (4.32GB — might fit fully in VRAM)
 wget -O models/qwen3.5-9b-q3_k_s.gguf \
 "https://huggingface.co/unsloth/Qwen3.5-9B-GGUF/resolve/main/Qwen3.5-9B-Q3_K_S.gguf"
--- a/documentation/STARTUP_ROADMAP.md
+++ b/documentation/STARTUP_ROADMAP.md
@ -0,0 +1,356 @@
 # Lila — Feature & Startup Strategy Roadmap
 > **Context for any LLM reading this:** Lila is a language learning/vocabulary app with two core differentiators: (1) **media-based practice** — users learn vocabulary extracted from real media they love (e.g., a Shakira song, the first chapter of _Harry Potter_, or an episode of _Breaking Bad_), and (2) **multiplayer modes** — users practice vocabulary together or competitively in real-time sessions. The app is currently at an early MVP stage. The existing MVP was built around OpenWordNet, which is being replaced because it produces unreliable translations (sense-disambiguation issues). The team is migrating the data pipeline to **Kaikki**, which structures entries per word sense and links translations to specific senses rather than vague general concepts. This migration is the current technical priority. The project is a TypeScript monorepo (pnpm workspaces) with an Express/WebSocket API (`apps/api`), a React frontend using TanStack Router (`apps/web`), a data ingestion pipeline (`data-pipeline`) backed by SQLite/Drizzle, shared packages (`packages/db`, `packages/shared`), and Docker-based deployment orchestrated with Caddy. Documentation restructuring (human-readable vs. AI-optimized docs) is being handled in a separate parallel workstream.
 ---
 ## Current State (Ground Truth — 2026-05-15)
 ### What Works Today ✅
 - **Singleplayer quiz** — Duolingo-style, 5 language pairs (en↔it/de/es/fr), 3 or 10 rounds, POS + difficulty filters
 - **Multiplayer** — Create/join lobby by room code, 2–4 players, simultaneous answers, 15s server timer, live scoring, winner screen
 - **Auth** — Google + GitHub via Better Auth
 - **Deployment** — Live at lilastudy.com, Hetzner VPS, Caddy HTTPS, Docker Compose, CI/CD via Forgejo Actions
 - **Database** — PostgreSQL with Drizzle ORM, daily backups
 ### What's In Progress / Blocked 🚧
 - **Kaikki data pipeline migration** — Stage 1 (extract) and Stage 2 (reverse link) complete on sample data. Stage 3 (enrich) being rewritten for sub-stage architecture. Stages 4–6 not started.
 - **Guest play** — No try-before-signup flow yet. Auth required for all game routes.
 - **Game session store** — Still in-memory. Valkey container exists locally but not wired up.
 - **Media ingestion** — Not started. No pipeline for subtitles/lyrics → vocab extraction yet.
 ### The Strategic Gap
 The app is currently a **generic vocabulary quiz**. The media-based practice feature (the differentiator) does not exist yet. It depends on:
 1. Kaikki pipeline reaching production (fixes translation quality)
 2. A media ingestion prototype (subtitles/lyrics → text → vocab extraction → quiz)
 ---
 ## Stream 1: Documentation Restructure (Parallel Track)
 **Status:** ✅ Complete. Human-readable branch (README, STATUS, ARCHITECTURE, BACKLOG, DECISIONS, DEPLOYMENT, DATA_PIPELINE, MODEL_STRATEGY, LLM_SETUP, design/GAME_MODES) and AI-context branch (00–06, prompts/meta.md, 99-current-task.md) are live in `documentation/`.
 ---
 ## Stream 2: Feature Roadmap (Three Lanes)
 ### Lane A — Attract & Keep Users
 **Goal:** A user lands on Lila, understands the value in 10 seconds, and completes a satisfying vocabulary practice session in under 2 minutes.
 **Current Reality Check:**
 - Singleplayer and multiplayer quizzes are **already working and deployed**.
 - The app is functional but **not differentiated** — it's a generic vocabulary quiz right now.
 - The "wow" moment requires the **media-based practice feature**, which does not exist yet.
 **Must-Haves for First Users:**
 1. **Guest Play (Zero-Friction Onboarding)** `[in backlog next]`
   - No signup required for first session.
   - Capture email or OAuth only after the user experiences value.
   - Critical for viral loops and investor demos.
   - **Status:** Planned in BACKLOG.md. Not yet implemented.
 2. **One Polished Media Demo** `[not started]`
   - Pick **ONE** piece of media and make it flawless end-to-end: subtitles/lyrics → Kaikki-based vocab extraction with sense-disambiguated translations → playable quiz with timestamps/context.
   - **Language pair:** en→es (biggest market, most content)
   - **Media candidates:** _Breaking Bad S01E01_, a Shakira song, or _Harry Potter and the Sorcerer's Stone Ch. 1_.
   - This is the primary "wow" moment. Differentiates Lila from all other vocabulary apps.
   - **Blocker:** Requires (a) Kaikki pipeline in production, and (b) a media ingestion prototype.
 3. **One Additional Multiplayer Mode** `[design exists, not implemented]`
   - Proves the mode-agnostic lobby architecture works and adds variety beyond the current simultaneous-answer flow.
   - **Recommended first mode:** Race to the Top (target score, no round limit) — simplest to implement, changes only scoring logic.
   - Alternative: TV Quiz Show (buzzer — first to press answers) — most visually distinct, but requires new answer flow.
   - **Status:** Lobby infrastructure is mode-agnostic. Each mode adds game logic only. See `design/GAME_MODES.md` for full designs.
   - **Why it matters:** Duolingo has no multiplayer. Anki has no multiplayer. Real-time modes are a genuine differentiator even without media.
 4. **Social Proof / Shareable Output** `[not started]`
   - Post-game card: "I learned 12 words from _La Tortura_ — can you beat my score?"
   - Image export or copy-paste text for Reddit, Discord, Twitter.
   - This is the organic growth engine.
   - **Blocker:** Requires media demo to exist first.
 **Already Shipped (Don't Rebuild):**
 - ✅ Singleplayer quiz (5 languages, POS/difficulty filters)
 - ✅ Multiplayer lobby + real-time game (2–4 players, simultaneous answers, 15s timer, scoring)
 - ✅ Auth (Google + GitHub)
 - ✅ Live deployment with CI/CD
 **Nice-to-Haves (Post-Launch):**
 - Additional multiplayer modes (Chain Link, Elimination Round, Cooperative Challenge)
 - Leaderboards
 - Spaced repetition review queue
 ---
 ### Lane B — Investor-Ready
 **Goal:** Walk into a pitch with engagement metrics and a defensibility story tied to Lila's unique data pipeline.
 **Checklist:**
 1. **Metrics Instrumentation** `[not started]`
   - Track: DAU/MAU, session length, quiz completion rate, multiplayer match completion rate, Day 1 / Day 7 retention.
   - Tool: PostHog, Mixpanel, or Plausible (self-hosted).
   - Need 4–6 weeks of real-user data.
   - **Note:** The app is live but has no analytics. This is a prerequisite for any investor conversation.
 2. **Growth Mechanic** `[not started]`
   - The shareable card (Lane A.4) must be live and instrumented.
   - Measure k-factor (viral coefficient). Even 0.3 is a story.
   - **Blocker:** Requires media demo.
 3. **Defensibility Story** `[partially true, not yet proven]`
   - **Data moat:** Lila's Kaikki → media mapping pipeline produces sense-disambiguated vocabulary tied to specific media timestamps. Competitors using generic word lists or OpenWordNet-style dumps cannot match the precision.
   - **Current reality:** The Kaikki pipeline exists but is not in production. The media mapping pipeline does not exist yet.
   - **What investors would ask:** "You have a quiz app. Where's the media feature you pitched?"
   - **Requirement:** Media demo + Kaikki production data must be live before investor conversations.
 4. **Monetization Hypothesis** `[deferred to business co-founder]`
   - Not the technical founder's priority right now.
   - Will be owned by business co-founder or advisor after traction.
   - Options to test later: freemium (free curated media, premium for uploads/unlimited multiplayer/stats), B2B schools, affiliate links to streaming/books/music.
 **Investor Timeline:**
 - **Now → Month 2:** Finish Kaikki pipeline + ship media demo + add metrics.
 - **Month 2–3:** Soft launch to 100 strangers, gather retention data.
 - **Month 3+:** Investor-ready if retention curves look good.
 ---
 ### Lane C — Co-Founder-Ready
 **Goal:** A potential co-founder looks at Lila and thinks, "This person can build, and there's a real product here."
 **Checklist:**
 1. **Clean Codebase + Documentation** `[in progress]`
   - Documentation restructure is complete.
   - README must get a new dev from `git clone` to `docker compose up` in < 5 minutes.
   - **Status:** Docs are done. Code cleanliness is ongoing (BACKLOG.md `next`/`later` items).
 2. **Live Demo with Real Users** `[partially done]`
   - App is live at lilastudy.com with real auth and multiplayer.
   - **Gap:** No real users yet. The current app is a generic quiz — not compelling enough for strangers to stick around.
   - **Requirement:** Media demo must be live before pitching to potential co-founders.
 3. **Clear Vision Doc** `[not written]`
   - 1-page: What Lila is, what it isn't, and the 18-month arc.
   - Include: target languages, target media types, target user persona, and what "success" looks like at 6 / 12 / 18 months.
 **Co-Founder Search: Deferred**
 - **Not needed now.** No savings, no traction, no differentiated product. A business co-founder can't raise money or design monetization from a generic quiz.
 - **Revisit in Month 6+** after media demo + 100 users + retention data.
 - **Exception:** If Reaktor.berlin accepts solo founder, take it. If they require a team, evaluate then — but don't rush a bad match.
 ---
 ## Stream 3: Building the Startup (Technical Founder Journey)
 ### Phase 0 — Runway Acquisition (Now → June)
 **Goal:** Secure full-time building capacity.
 **Profile:** EU citizen, Berlin-based, no savings, currently employed but being fired May 28. Eligible for Arbeitslosengeld I (24 months employment history).
 **Primary Path:**
 1. **Register at Arbeitsamt** — May 28 (immediately after firing)
 2. **Apply for Arbeitslosengeld I** — Same day
 3. **Apply for Gründungszuschuss** — Within 4 weeks of starting self-employment
   - Requires: business plan (1–2 pages), viability check by counselor
   - Provides: 9–12 months basic income (~€1,500–2,000/month)
   - **Best case:** Full-time building, no equity cost, no co-founder needed
   - **Probability:** High (70–80%) — low burn rate, working app, clear technical path
 **Parallel Path:**
 - **Reaktor.berlin** — Already applied. Solo founders are eligible, though at lower odds. €25K for 6 months, 2.5% equity.
   - If accepted + Gründungszuschuss approved: Choose Gründungszuschuss (no equity, longer runway)
   - If accepted + Gründungszuschuss rejected: Take Reaktor, build solo for 6 months
   - If rejected: Continue with Gründungszuschuss or Arbeitslosengeld I
 **Backup Path:**
 - **Arbeitslosengeld I only** — If Gründungszuschuss rejected. €1,000–1,500/month. Continue building full-time.
 - **Part-time job** — If Arbeitslosengeld I insufficient. Slower progress but sustainable.
 **Not Pursuing:**
 - Full-time job + nights/weekends — Burnout risk, Lila stalls
 - Freelance/consulting — No current skill set or client network, too time-consuming to build
 - EXIST/INVEST/YC/Techstars — Wrong stage, too competitive, too long timeline
 ---
 ### Phase 1 — Differentiate the MVP (Month 1–3)
 **Duration:** 2–3 months  
 **Rule:** Build first, measure second, align third.
 **Assumption:** Gründungszuschuss approved (full-time building). If not, same tasks but slower.
 **Tasks:**
 1. **Finish Kaikki pipeline** (Stages 3–6)
   - Complete enrich sub-stage rewrite
   - Run full sample, validate quality
   - Production sync to PostgreSQL
   - **Timeline:** 2–4 weeks
 2. **Build media ingestion prototype**
   - Pick ONE media piece (Breaking Bad S01E01, Shakira song, or Harry Potter Ch. 1)
   - Pipeline: subtitles/lyrics → text extraction → vocabulary identification → Kaikki sense-matching → quiz generation
   - UI: media selection → quiz with context ("This word appears at 00:04:23")
   - **Language pair:** en→es
   - **Timeline:** 2–4 weeks (parallel with Kaikki pipeline)
 3. **Ship guest play**
   - Make auth optional on game routes
   - "Try without account" button on landing page
   - Capture email/OAuth after first session
   - **Timeline:** 1 week
 4. **Add one additional multiplayer mode**
   - Race to the Top recommended (simplest: target score, no round limit)
   - Changes only scoring logic, reuses existing lobby infrastructure
   - **Timeline:** 1–2 weeks
 5. **Add metrics instrumentation**
   - PostHog or Plausible
   - Track: signups, quiz starts, completions, multiplayer matches, retention
   - **Timeline:** 1 week
 6. **Soft launch to 100 strangers**
   - Reddit (r/languagelearning, r/Anki, r/Refold), language-learning Discords, Hacker News Show HN
   - Collect qualitative feedback
   - **Timeline:** 1 week (after media demo is live)
 ---
 ### Phase 2 — Validate & Measure (Month 3–4)
 **Goal:** Prove that the media feature resonates and that retention curves exist.
 **Tasks:**
 - Analyze metrics: Do users who try media-based practice return more than singleplayer-only users?
 - Iterate on media selection and quiz UX based on feedback
 - Polish shareable output (social cards)
 - Fix hardening items from BACKLOG.md Phase 7
 **Decision gate:** If 100 users show positive retention signals (Day 1 > 30%, Day 7 > 10%), proceed to Phase 3. If not, iterate on media feature or pivot.
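The gate can be written down as a small check. This is only a sketch: it assumes the classic definition of Day N retention (the share of a signup cohort that is active exactly N days after signing up), and the `UserActivity` shape and function names are hypothetical.

```typescript
// Hypothetical activity record: day indices are whole days, e.g. days since launch.
interface UserActivity {
  signupDay: number;
  activeDays: number[];
}

// Share of users who were active exactly `offset` days after signup.
function dayNRetention(users: UserActivity[], offset: number): number {
  if (users.length === 0) return 0;
  const returned = users.filter(
    (user) => user.activeDays.indexOf(user.signupDay + offset) !== -1,
  ).length;
  return returned / users.length;
}

// Thresholds taken from the decision gate: Day 1 > 30% and Day 7 > 10%.
function passesDecisionGate(users: UserActivity[]): boolean {
  return dayNRetention(users, 1) > 0.3 && dayNRetention(users, 7) > 0.1;
}
```

With 100 users, each returning user moves a rate by one percentage point, so the result is best read as a signal rather than a precise measurement.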
 ---
 ### Phase 3 — Funding or Revenue (Month 4–6)
 **Goal:** Secure runway beyond Gründungszuschuss/Arbeitslosengeld I.
 **If metrics are positive:**
 - Apply to accelerators: Reaktor.berlin (next batch), Y Combinator, Techstars Berlin
 - Angel outreach: Berlin ed-tech angels, former founders in language learning
 - EU grants: EXIST (if university partnership), INVEST (Berlin-specific)
 **If metrics are weak:**
 - Iterate on media feature or pivot value proposition
 - Consider B2B angle (schools, language institutes) if consumer traction is low
 - Part-time work to extend runway while iterating
 **Co-founder search:** Not a priority. If funding requires a team (some accelerators), evaluate then. Otherwise, hire contractors or employees for specific gaps.
 ---
 ### Phase 4 — Scale (Month 6+)
 **Goal:** Grow user base, build team, secure Series A or sustainable revenue.
 **Only relevant if Phase 3 succeeds.** Otherwise, continue iterating.
 ---
 ## Suggested Execution Order
 ### Month 0 (Now → May 28)
 - **Week 1:** Finish Kaikki Stage 3 enrich sub-stage rewrite. Run full sample, validate quality.
 - **Week 2:** Register at Arbeitsamt (May 28). Apply for Arbeitslosengeld I. Ask about Gründungszuschuss.
 ### Month 1 (June)
 - **Week 1–2:** Submit Gründungszuschuss application. Prepare business plan (1–2 pages).
 - **Week 3:** Start media ingestion prototype (parallel). Pick one media piece, get text extraction working.
 - **Week 4:** Continue Kaikki pipeline + media prototype.
 ### Month 2 (July)
 - **Week 1–2:** Complete media ingestion prototype. End-to-end: media → quiz.
 - **Week 3:** Ship guest play. Add additional multiplayer mode (Race to the Top).
 - **Week 4:** Add metrics (PostHog/Plausible). Polish shareable output.
 ### Month 3 (August)
 - **Week 1:** Soft launch to 100 strangers. Gather feedback.
 - **Week 2–3:** Iterate based on feedback. Fix hardening items from BACKLOG.md.
 - **Week 4:** Analyze metrics. Decision gate: proceed or iterate?
 ### Month 4–6 (September–November)
 - If metrics positive: Apply to accelerators, angels, grants.
 - If metrics weak: Iterate or pivot.
 - If Gründungszuschuss expires: Transition to Arbeitslosengeld I, part-time work, or funding.
 ---
 ## Open Questions (Answered)
 ### Product Reality Check
 - [x] What actually works today? — Singleplayer quiz, multiplayer, auth, deployment
 - [x] What is broken or placeholder? — Kaikki pipeline Stage 3, guest play, in-memory session store
 - [x] What language pairs are supported? — en↔it/de/es/fr
 - [x] Exact blocker on Kaikki Stage 3? — No blocker, work in progress
 ### Target Audience
 - [x] Who is the ideal first user? — Immersion learner (Netflix watcher) + social learner
 - [x] What languages for launch? — en→es for media demo (biggest market), all 5 pairs live
 ### Business Model
 - [x] Monetization hypothesis? — **Deferred** to business co-founder or advisor after traction
 - [x] Unit economics? — **Deferred** until product-market fit
 ### Competitive Landscape
 - [x] Direct competitors? — Duolingo, Anki, LingQ, FluentU, Quizlet
 - [x] What they do poorly? — No real media integration, no real-time multiplayer, no sense-disambiguated translations
 ### Runway & Constraints
 - [x] Full-time or nights/weekends? — Transitioning to full-time via Gründungszuschuss
 - [x] Funding/savings? — No savings. EU citizen, Berlin-based, eligible for Arbeitslosengeld I
 - [x] Hard deadline? — None. Self-paced. Gründungszuschuss application deadline: within 4 weeks of May 28
 ### Co-Founder Search
 - [x] Local or remote? — Berlin-based, but open
 - [x] What do they do? — **Not needed now.** Revisit Month 6+ after traction
 - [x] Known candidates? — None. Starting from zero
 - [x] Equity mindset? — **Deferred** until co-founder search begins
 ---
 _End of Lila feature & startup strategy doc._
--- a/documentation/STATUS.md
+++ b/documentation/STATUS.md
@ -0,0 +1,46 @@
 # Status — 2026-05-15
 > Last updated: 2026-05-15. Update this file after every deploy or when switching tasks.
 ## What Works Today ✅
 - **Singleplayer quiz** — Duolingo-style, 5 language pairs (en↔it/de/es/fr), 3 or 10 rounds, POS + difficulty filters
 - **Multiplayer** — Create/join lobby by room code, 2–4 players, simultaneous answers, 15s server timer, live scoring, winner screen
 - **Auth** — Google + GitHub via Better Auth, cross-subdomain cookies, session middleware on protected routes
 - **Deployment** — Live at lilastudy.com, Hetzner VPS, Caddy HTTPS, Docker Compose, CI/CD via Forgejo Actions
 - **Database** — PostgreSQL with Drizzle ORM, daily backups, idempotent seeding
 ## What's Broken / Blocked 🚧
 - **Data quality** — Production still uses OpenWordNet/OMW translations. Kaikki pipeline (sense-disambiguated) is in progress but not yet synced to production.
 - **Guest play** — Auth is required for all game routes. No try-before-signup flow.
 - **Game session store** — Still in-memory (`InMemoryGameSessionStore`). Valkey container exists in local dev but not wired up.
 - **Rate limiting** — Partially implemented on auth endpoints; game endpoints not yet covered.
 - **React error boundaries** — Not implemented; runtime crashes take down the whole app.
 - **Monitoring** — No uptime alerts or centralized logging on the VPS.
 ## What I'm Working On Now 🔄
 **Primary:** Rewriting the Kaikki data pipeline enrich script for sub-stage architecture (round1_gloss → round1_example → round1_translations → round1_cefr).
 **Secondary:** Phase 7 hardening backlog items (see BACKLOG.md `next` section).
 ## Next 2-Week Goal 🎯
 Finish Kaikki Stage 3 (enrich) sub-stage rewrite → run full sample → compare quality → decide on production sync timeline.
 ## The Big Picture
 Lila is a **deployed, working vocabulary quiz app**. The core loop (singleplayer + multiplayer) is solid. The next strategic milestone is **media-based practice** (learn vocab from a song/TV episode/book chapter), but that depends on:
 1. Kaikki data pipeline reaching production (fixes translation quality)
 2. A media ingestion prototype (subtitles/lyrics → text → vocab extraction → quiz)
 Until then, the app is a generic vocabulary quiz — functional but not differentiated.
 ## Quick Links
 - [BACKLOG.md](BACKLOG.md) — Prioritized tasks (`now` / `next` / `later`)
 - [DATA_PIPELINE.md](DATA_PIPELINE.md) — Pipeline stages and current progress
 - [DEPLOYMENT.md](DEPLOYMENT.md) — Infrastructure ops
--- a/documentation/ai-context/00-project-overview.md
+++ b/documentation/ai-context/00-project-overview.md
@ -0,0 +1,116 @@
 # 00 — Project Overview
 > **Purpose:** Give any LLM instant context on what Lila is, what makes it different, and what's currently built vs. planned. Concatenate this file with domain-specific files (01–06) and 99-current-task.md before handing to an LLM.
 > **Last updated:** 2026-05-15
 > **Depends on:** Nothing (this is the entry point)
 ---
 ## What Lila Is
 Lila is a vocabulary learning app with two core differentiators:
 1. **Media-based practice** — Users learn vocabulary extracted from real media they love: a Shakira song, the first chapter of _Harry Potter_, an episode of _Breaking Bad_. The app extracts vocabulary from subtitles/lyrics/text and turns it into quiz questions.
 2. **Multiplayer modes** — Users practice vocabulary together or competitively in real-time sessions (2–4 players, simultaneous answers, live scoring).
 The core learning loop is Duolingo-style: a word appears in one language, the user picks the correct translation from four choices.
 Live at [lilastudy.com](https://lilastudy.com).
 ---
 ## Current State (2026-05-15)
 ### What Works Today
 - **Singleplayer quiz** — 5 language pairs (en↔it/de/es/fr), 3 or 10 rounds, POS + difficulty filters
 - **Multiplayer** — Create/join lobby by room code, 2–4 players, simultaneous answers, 15s server timer, live scoring, winner screen
 - **Auth** — Google + GitHub via Better Auth
 - **Deployment** — Live on Hetzner VPS, Caddy HTTPS, Docker Compose, CI/CD via Forgejo Actions
 - **Database** — PostgreSQL with Drizzle ORM, daily backups
 ### What's In Progress / Blocked
 - **Kaikki data pipeline migration** — Replacing OpenWordNet/OMW with sense-disambiguated Kaikki data. Stage 1 (extract) and Stage 2 (reverse link) complete on sample data. Stage 3 (enrich) being rewritten for sub-stage architecture.
 - **Guest play** — No try-before-signup flow yet. Auth required for all game routes.
 - **Game session store** — Still in-memory. Valkey container exists locally but not wired up.
 - **Media ingestion** — Not started. No pipeline for subtitles/lyrics → vocab extraction yet.
 ### The Strategic Gap
 The app is currently a **generic vocabulary quiz**. The media-based practice feature (the differentiator) does not exist yet. It depends on:
 1. Kaikki pipeline reaching production (fixes translation quality)
 2. A media ingestion prototype (subtitles/lyrics → text → vocab extraction → quiz)
 ---
 ## Tech Stack
 | Layer         | Technology                                                     |
 | ------------- | -------------------------------------------------------------- |
 | Monorepo      | pnpm workspaces                                                |
 | Frontend      | React 18, Vite, TanStack Router, TanStack Query, Tailwind CSS  |
 | Backend       | Node.js, Express, TypeScript, WebSockets (`ws` library)        |
 | Database      | PostgreSQL + Drizzle ORM                                       |
 | Auth          | Better Auth (Google + GitHub)                                  |
 | Validation    | Zod (shared between frontend and backend in `packages/shared`) |
 | Testing       | Vitest, supertest                                              |
 | Deployment    | Docker Compose, Caddy, Hetzner VPS                             |
 | CI/CD         | Forgejo Actions                                                |
 | Data Pipeline | Kaikki (Wiktionary) → SQLite (`pipeline.db`) → PostgreSQL      |
 ---
 ## Repository Structure
 ```
 lila/
 ├── apps/
 │   ├── api/              — Express backend (HTTP + WebSocket)
 │   └── web/              — React frontend (Vite, TanStack Router)
 ├── packages/
 │   ├── shared/           — Zod schemas + constants (API/web contract)
 │   └── db/               — Drizzle schema, migrations, models, seeding
 ├── data-pipeline/        — Kaikki extraction → enrichment → PostgreSQL sync
 └── documentation/        — Project docs (human + AI-context branches)
 ```
 **Key rule:** `packages/shared` is the single source of truth for all data shapes crossing the API boundary. Both frontend and backend import from it. If a schema changes incompatibly, TypeScript compilation fails in every app that still relies on the old shape, so the two sides cannot drift apart silently.
 ---
 ## Key Architecture Principles
 1. **Layered architecture** — Router → Controller → Service → Model → Database. Each layer only talks to the layer below it.
 2. **Server-side answer evaluation** — The correct answer is never sent to the frontend. All evaluation happens server-side.
 3. **Zod discriminated unions for WebSockets** — All WS messages are typed via Zod schemas in `packages/shared`. The router switches on the `type` field.
 4. **GameSessionStore abstraction** — Session state is stored through an interface (`InMemoryGameSessionStore` now, `ValkeyGameSessionStore` planned).
 5. **Language-neutral data model** — `terms` are concepts; `translations` are per-language words. Adding a language requires no schema changes.
 ---
 ## Key Decisions (Summary)
 | Topic       | Decision                    | Why                                           |
 | ----------- | --------------------------- | --------------------------------------------- |
 | ORM         | Drizzle, not Prisma         | No binary, no engine, closer to SQL           |
 | WebSocket   | `ws` library, not Socket.io | 2–4 players, explicit Zod protocol sufficient |
 | Auth        | Better Auth, not Keycloak   | Embedded middleware, no separate service      |
 | Answer eval | Server-side only            | Correct answer never sent to frontend         |
 | Data source | Kaikki, not OMW             | Sense-disambiguated translations              |
 ---
 ## Further Reading (AI-Context Files)
 | File                                                 | What it covers                                               |
 | ---------------------------------------------------- | ------------------------------------------------------------ |
 | [01-architecture.md](01-architecture.md)             | Monorepo structure, layered architecture, data flow diagrams |
 | [02-data-model.md](02-data-model.md)                 | Database schema, tables, relationships, constraints          |
 | [03-api-contract.md](03-api-contract.md)             | REST endpoints, request/response schemas, Zod types          |
 | [04-websocket-protocol.md](04-websocket-protocol.md) | WS message types, game flow, auth, state management          |
 | [05-data-pipeline.md](05-data-pipeline.md)           | Kaikki pipeline stages, enrich sub-stages, sync              |
 | [06-deployment.md](06-deployment.md)                 | Docker, Caddy, CI/CD, backups                                |
 | [prompts/meta.md](prompts/meta.md)                   | How to work with LLMs on this codebase                       |
 | [99-current-task.md](99-current-task.md)             | Template: fill this out before giving a task to an LLM       |
--- a/documentation/ai-context/01-architecture.md
+++ b/documentation/ai-context/01-architecture.md
@ -0,0 +1,156 @@
 # 01 — Architecture
 > **Purpose:** Give an LLM the structural context needed to navigate the codebase and understand data flow. Concatenate with 00-project-overview.md and 99-current-task.md.
 > **Last updated:** 2026-05-15
 > **Depends on:** 00-project-overview.md
 ---
 ## Monorepo Boundaries
 ```
 lila/
 ├── apps/
 │   ├── api/              — Express backend: routers, controllers, services, WS handlers
 │   └── web/              — React frontend: routes, components, hooks, client state
 ├── packages/
 │   ├── shared/           — Zod schemas, constants, derived types. THE CONTRACT.
 │   └── db/               — Drizzle schema, migrations, models (termModel, lobbyModel), seeding
 ├── data-pipeline/        — Kaikki extraction → enrichment → sync to PostgreSQL
 └── documentation/        — Human docs + ai-context/
 ```
 **Critical rule:** `apps/api` never imports `drizzle-orm` for queries. It only calls functions exported from `packages/db`. All database code lives in `packages/db`.
 ---
 ## Layered Architecture (HTTP)
 ```
 HTTP Request
     ↓
  Router        — maps URL + method to controller (Express Router)
     ↓
  Controller    — validates input (Zod safeParse), calls service, sends response
                  or next(error) for errorHandler middleware
     ↓
  Service       — business logic only. No HTTP, no direct DB access.
                  Calls model functions from packages/db.
     ↓
  Model         — database queries only. No business logic.
                  Lives in packages/db/src/models/
     ↓
  Database      — PostgreSQL via Drizzle ORM
 ```
 **Error flow:** Controller throws `ValidationError` (400) or `NotFoundError` (404) → caught by `errorHandler` middleware in `app.ts` → mapped to HTTP status. Unknown errors → 500.
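The error flow can be illustrated with a small mapping function. This is a sketch under assumptions, not the code in `app.ts`: the `AppError` base class, the `statusCode` field, the response body shape, and `mapErrorToResponse` are hypothetical; only the error names and status codes come from the description above.

```typescript
// Hypothetical base class carrying the HTTP status for known errors.
class AppError extends Error {
  readonly statusCode: number;
  constructor(message: string, statusCode: number) {
    super(message);
    this.statusCode = statusCode;
    // Keeps instanceof checks working even when compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

// Assumed response shape for this sketch.
interface ErrorResponse {
  status: number;
  body: { error: string };
}

// Known errors keep their status and message; anything else becomes a generic 500.
function mapErrorToResponse(err: unknown): ErrorResponse {
  if (err instanceof AppError) {
    return { status: err.statusCode, body: { error: err.message } };
  }
  // Do not leak internal details for unknown errors.
  return { status: 500, body: { error: "Internal server error" } };
}
```

In an Express error-handling middleware, which takes the four arguments `(err, req, res, next)`, the result would typically be sent with `res.status(result.status).json(result.body)`.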
 ---
 ## WebSocket Architecture
 The WS server attaches to the same HTTP server as the Express app and handles upgrade requests on the `/ws` path.
 ```
 WS Connection Upgrade
     ↓
 Auth middleware — validates Better Auth session from cookie on upgrade
     ↓
 Message Router — dispatches by `type` field (Zod discriminated union)
     ↓
 Handler (lobby or game) — business logic, broadcasts state to room
     ↓
 In-memory stores (lobby game state, game session state)
 ```
 **Message protocol:** All WS messages validated against Zod schemas in `packages/shared/src/schemas/lobby.ts` and `packages/shared/src/schemas/game.ts`. Router switches on `type` field.
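The dispatch-by-`type` idea can be sketched in plain TypeScript. The message type names below appear in this document, but the payload fields (`code`, `questionId`, `optionId`) are assumptions, and the hand-written parser merely stands in for the Zod `safeParse` call the real router would use.

```typescript
// Client-to-server messages as a discriminated union on `type`.
// Payload fields are assumptions made for this sketch.
type ClientMessage =
  | { type: "lobby:join"; code: string }
  | { type: "lobby:start" }
  | { type: "game:answer"; questionId: string; optionId: number };

// Stand-in for schema validation: returns null for anything malformed.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const fields = data as Record<string, unknown>;
  const code = fields.code;
  const questionId = fields.questionId;
  const optionId = fields.optionId;
  switch (fields.type) {
    case "lobby:join":
      return typeof code === "string" ? { type: "lobby:join", code } : null;
    case "lobby:start":
      return { type: "lobby:start" };
    case "game:answer":
      return typeof questionId === "string" && typeof optionId === "number"
        ? { type: "game:answer", questionId, optionId }
        : null;
    default:
      return null;
  }
}

// The router switches on `type`; the `never` check makes a missing case a compile error.
function route(message: ClientMessage): string {
  switch (message.type) {
    case "lobby:join":
      return `join room ${message.code}`;
    case "lobby:start":
      return "start game";
    case "game:answer":
      return `answer ${message.optionId} for ${message.questionId}`;
    default: {
      const unreachable: never = message;
      return unreachable;
    }
  }
}
```

The exhaustiveness check is the main benefit of this design: adding a new message type to the union without handling it in the router fails at compile time.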
 **State storage:**
 - Lobby membership → PostgreSQL (`lobbies`, `lobby_players` tables) — durable
 - Game/room state → in-memory (`InMemoryLobbyGameStore`, `InMemoryGameSessionStore`) — ephemeral, lost on restart. Valkey migration planned.
 ---
 ## Data Flow: Singleplayer Quiz
 ```
 POST /api/v1/game/start (GameRequestSchema)
     ↓
 Controller validates → Service.createGameSession
     ↓
 termModel.getGameTerms(filters) + termModel.getDistractors(filters)
     ↓
 Service shuffles options, stores session in GameSessionStore
     ↓
 Returns GameSession { sessionId, questions[] }
     ↓
 [frontend] User selects option → confirms → POST /api/v1/game/answer
     ↓
 Service evaluates server-side (correct answer NEVER sent to frontend)
     ↓
 Returns AnswerResult { isCorrect, correctOptionId, selectedOptionId }
 ```
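 **Key design:** The correct answer is stored server-side only (in GameSessionStore). Before answering, the frontend sees just an `optionId` (0–3) and `text` for each option; `correctOptionId` is revealed only in the `AnswerResult`, after the answer has been submitted. This prevents cheating.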
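A minimal sketch of this split between stored and public question data is shown below. The `AnswerResult` fields match the flow above; the `StoredQuestion` and `PublicQuestion` shapes and both function names are assumptions for illustration.

```typescript
interface QuizOption {
  optionId: number; // 0 to 3
  text: string;
}

// What the server keeps in the session store (hypothetical shape).
interface StoredQuestion {
  questionId: string;
  prompt: string;
  options: QuizOption[];
  correctOptionId: number;
}

// What the frontend receives: no correctOptionId.
interface PublicQuestion {
  questionId: string;
  prompt: string;
  options: QuizOption[];
}

interface AnswerResult {
  isCorrect: boolean;
  correctOptionId: number;
  selectedOptionId: number;
}

// Copy fields explicitly (allow-list) so new server-only fields never leak by accident.
function toPublicQuestion(question: StoredQuestion): PublicQuestion {
  return {
    questionId: question.questionId,
    prompt: question.prompt,
    options: question.options.map((o) => ({ optionId: o.optionId, text: o.text })),
  };
}

// Evaluation happens on the server, against the stored question.
function evaluateAnswer(question: StoredQuestion, selectedOptionId: number): AnswerResult {
  return {
    isCorrect: selectedOptionId === question.correctOptionId,
    correctOptionId: question.correctOptionId,
    selectedOptionId,
  };
}
```

Building the public object field by field, instead of deleting `correctOptionId` from a copy, means a field added to `StoredQuestion` later stays server-side by default.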
 ---
 ## Data Flow: Multiplayer Game
 ```
 Host creates lobby → POST /api/v1/lobbies → returns room code (e.g. WOLF-42)
     ↓
 Players join via code → POST /api/v1/lobbies/:code/join
     ↓
 All players WS connect → send lobby:join with room code
     ↓
 Server broadcasts lobby:state (player list) to all in room
     ↓
 Host clicks "Start" → WS lobby:start
     ↓
 MultiplayerGameService generates questions, broadcasts game:question
     ↓
 Players submit answers via WS game:answer within 15s server timer
     ↓
 On all-answered or timeout → evaluate, broadcast game:answer_result
     ↓
 After N rounds → broadcast game:finished with final scores
 ```
 ---
 ## GameSessionStore Abstraction
 ```typescript
 // GameSession type: packages/shared/src/schemas/game.ts; the store interface itself is defined in apps/api
 interface GameSessionStore {
  createSession(session: GameSession): Promise<void>;
  getSession(sessionId: string): Promise<GameSession | null>;
  // ...
 }
 ```
 **Current:** `InMemoryGameSessionStore` — Map-based, process memory, lost on restart.
 **Planned:** `ValkeyGameSessionStore` — Redis-compatible, persists across restarts.
 Same pattern for `LobbyGameStore`.
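 A Map-based store behind this interface might look like the sketch below. It only illustrates the pattern: `MapBackedSessionStore` is a hypothetical name, `GameSession` is reduced to two fields, and the real `InMemoryGameSessionStore` may differ.
 ```typescript
 // Simplified stand-in for the shared GameSession type (assumption).
 interface GameSession { sessionId: string; questions: unknown[] }
 interface GameSessionStore {
   createSession(session: GameSession): Promise<void>;
   getSession(sessionId: string): Promise<GameSession | null>;
 }
 // Process-memory implementation: everything is lost when the process restarts.
 class MapBackedSessionStore implements GameSessionStore {
   private readonly sessions = new Map<string, GameSession>();
   async createSession(session: GameSession): Promise<void> {
     this.sessions.set(session.sessionId, session);
   }
   async getSession(sessionId: string): Promise<GameSession | null> {
     return this.sessions.get(sessionId) ?? null;
   }
 }
 ```
 The methods are async even though a Map is synchronous, so a network-backed store can implement the same interface without changing callers.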
 ---
 ## Key Files by Concern
 | Concern         | Key Files                                                                              |
 | --------------- | -------------------------------------------------------------------------------------- |
 | HTTP routing    | `apps/api/src/routes/apiRouter.ts`, `gameRouter.ts`, `lobbyRouter.ts`                  |
 | Controllers     | `apps/api/src/controllers/gameController.ts`, `lobbyController.ts`                     |
 | Services        | `apps/api/src/services/gameService.ts`, `multiplayerGameService.ts`, `lobbyService.ts` |
 | Models          | `packages/db/src/models/termModel.ts`, `lobbyModel.ts`                                 |
 | WS handlers     | `apps/api/src/ws/handlers/gameHandlers.ts`, `lobbyHandlers.ts`                         |
 | WS router       | `apps/api/src/ws/router.ts`                                                            |
 | WS auth         | `apps/api/src/ws/auth.ts`                                                              |
 | Shared schemas  | `packages/shared/src/schemas/game.ts`, `lobby.ts`, `auth.ts`                           |
 | Constants       | `packages/shared/src/constants.ts`                                                     |
 | DB schema       | `packages/db/src/db/schema.ts`                                                         |
 | Auth config     | `apps/api/src/lib/auth.ts`                                                             |
 | Auth middleware | `apps/api/src/middleware/authMiddleware.ts`                                            |
--- a/documentation/ai-context/02-data-model.md
+++ b/documentation/ai-context/02-data-model.md
@ -0,0 +1,221 @@
 # 02 — Data Model
 > **Purpose:** Database schema reference for LLMs working on features that query or modify data. Concatenate with 00-project-overview.md and 99-current-task.md.
 > **Last updated:** 2026-05-15
 > **Depends on:** 00-project-overview.md
 ---
 ## Core Tables
 ### `terms` — Language-neutral concepts
 | Column       | Type      | Constraints                                  | Notes                                                  |
 | ------------ | --------- | -------------------------------------------- | ------------------------------------------------------ |
 | `id`         | uuid      | PK                                           |                                                        |
 | `pos`        | varchar   | CHECK: `noun`, `verb`, `adjective`, `adverb` | Part of speech                                         |
 | `source`     | varchar   |                                              | Pipeline that created this term (e.g. `kaikki`, `omw`) |
 | `source_id`  | varchar   | UNIQUE(`source`, `source_id`)                | Idempotency key for imports                            |
 | `synset_id`  | varchar   | nullable                                     | WordNet synset ID. Nullable for non-WordNet terms.     |
 | `created_at` | timestamp | default now()                                |                                                        |
 **Rule:** One row per concept. The word "cat" (animal) and "cat" (nautical) are separate rows because they have different `source_id` values.
 ---
 ### `translations` — Per-language words
 | Column          | Type       | Constraints                         | Notes                                    |
 | --------------- | ---------- | ----------------------------------- | ---------------------------------------- |
 | `id`            | uuid       | PK                                  |                                          |
 | `term_id`       | uuid       | FK → terms.id                       |                                          |
 | `language_code` | varchar(2) | CHECK: `en`, `it`, `de`, `es`, `fr` |                                          |
 | `text`          | varchar    |                                     | The actual word                          |
 | `cefr_level`    | varchar(2) | nullable, CHECK: `A1`–`C2`          | Difficulty of THIS word in THIS language |
 | `created_at`    | timestamp  | default now()                       |                                          |
 **Unique constraint:** (`term_id`, `language_code`, `text`) — allows synonyms (e.g. "dog" and "hound" for same term), prevents exact duplicates.
 **Key design:** `cefr_level` is on `translations`, not `terms`. "House" in English is A1; "domicile" is also English but B2 — same concept, different words, different difficulty.
 ---
 ### `term_glosses` — Definitions per language
 | Column          | Type       | Constraints                         | Notes                  |
 | --------------- | ---------- | ----------------------------------- | ---------------------- |
 | `id`            | uuid       | PK                                  |                        |
 | `term_id`       | uuid       | FK → terms.id                       |                        |
 | `language_code` | varchar(2) | CHECK: `en`, `it`, `de`, `es`, `fr` |                        |
 | `text`          | text       |                                     | Definition/explanation |
 | `created_at`    | timestamp  | default now()                       |                        |
 **Unique constraint:** (`term_id`, `language_code`) — one gloss per term per language. Prevents left joins from multiplying question rows.
 **Note:** Italian gloss coverage is sparse (~2% of terms have Italian glosses). UI falls back to English gloss when no gloss exists for the user's language.
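 The fallback rule can be expressed in a few lines. This is a sketch under the assumption that glosses for one term are available as a per-language lookup; `pickGloss` is a hypothetical helper, not a function from the codebase.
 ```typescript
 type LanguageCode = "en" | "it" | "de" | "es" | "fr";
 // Returns the gloss in the preferred language, otherwise the English gloss, otherwise null.
 function pickGloss(
   glosses: Partial<Record<LanguageCode, string>>,
   preferred: LanguageCode,
 ): string | null {
   return glosses[preferred] ?? glosses.en ?? null;
 }
 ```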
 ---
 ### `decks` — Curated wordlists
 | Column                | Type         | Constraints                                       | Notes                                                   |
 | --------------------- | ------------ | ------------------------------------------------- | ------------------------------------------------------- |
 | `id`                  | uuid         | PK                                                |                                                         |
 | `name`                | varchar      |                                                   | e.g. `en-core-1000`                                     |
 | `source_language`     | varchar(2)   | CHECK                                             | Language the wordlist was built from                    |
 | `validated_languages` | varchar(2)[] | CHECK: source_language NOT IN validated_languages | Languages with complete translations for all deck terms |
 | `description`         | text         | nullable                                          |                                                         |
 | `created_at`          | timestamp    | default now()                                     |                                                         |
 **Design:** One deck per frequency tier per source language. POS, difficulty, and category are query filters, not separate decks. Decks must not overlap — each term appears in exactly one tier.
 **Source:** SUBTLEX frequency lists (per-language editions, same methodology).
 ---
 ### `deck_terms` — Junction table
 | Column       | Type      | Constraints   | Notes |
 | ------------ | --------- | ------------- | ----- |
 | `deck_id`    | uuid      | FK → decks.id |       |
 | `term_id`    | uuid      | FK → terms.id |       |
 | `created_at` | timestamp | default now() |       |
 **PK:** (`deck_id`, `term_id`)
 ---
 ## Auth Tables (managed by Better Auth)
 Better Auth creates and owns these tables. Do not modify directly.
 ### `user`
 | Column           | Type      | Notes                |
 | ---------------- | --------- | -------------------- |
 | `id`             | varchar   | PK                   |
 | `name`           | varchar   | Display name         |
 | `email`          | varchar   |                      |
 | `email_verified` | boolean   |                      |
 | `image`          | varchar   | nullable, avatar URL |
 | `created_at`     | timestamp |                      |
 | `updated_at`     | timestamp |                      |
 ### `session`
 | Column       | Type      | Notes         |
 | ------------ | --------- | ------------- |
 | `id`         | varchar   | PK            |
 | `user_id`    | varchar   | FK → user.id  |
 | `token`      | varchar   | Session token |
 | `expires_at` | timestamp |               |
 | `ip_address` | varchar   | nullable      |
 | `user_agent` | text      | nullable      |
 | `created_at` | timestamp |               |
 ### `account` — Social provider links
 | Column          | Type      | Notes                |
 | --------------- | --------- | -------------------- |
 | `id`            | varchar   | PK                   |
 | `user_id`       | varchar   | FK → user.id         |
 | `account_id`    | varchar   | Provider's user ID   |
 | `provider_id`   | varchar   | `google` or `github` |
 | `access_token`  | text      | nullable             |
 | `refresh_token` | text      | nullable             |
 | `id_token`      | text      | nullable             |
 | `expires_at`    | timestamp | nullable             |
 **Note:** One user can have multiple accounts (Google + GitHub linked to same user).
 ### `verification`
 Email verification tokens. Unused for social-only auth but managed by Better Auth.
 ---
 ## Lobby Tables (Multiplayer)
 ### `lobbies`
 | Column        | Type      | Constraints                                 | Notes                                        |
 | ------------- | --------- | ------------------------------------------- | -------------------------------------------- |
 | `id`          | uuid      | PK                                          |                                              |
 | `code`        | varchar   | UNIQUE                                      | Human-readable room code (e.g. `WOLF-42`)    |
 | `host_id`     | varchar   | FK → user.id                                |                                              |
 | `status`      | varchar   | CHECK: `waiting`, `in_progress`, `finished` |                                              |
 | `max_players` | integer   | default 4                                   |                                              |
 | `settings`    | jsonb     | nullable                                    | Game mode, round count, timer duration, etc. |
 | `created_at`  | timestamp | default now()                               |                                              |
 | `updated_at`  | timestamp | default now()                               | Used for stale recovery                      |
 ### `lobby_players`
 | Column         | Type      | Constraints     | Notes                        |
 | -------------- | --------- | --------------- | ---------------------------- |
 | `id`           | uuid      | PK              |                              |
 | `lobby_id`     | uuid      | FK → lobbies.id |                              |
 | `user_id`      | varchar   | FK → user.id    |                              |
 | `display_name` | varchar   |                 | Player's shown name in lobby |
 | `is_host`      | boolean   | default false   |                              |
 | `joined_at`    | timestamp | default now()   |                              |
 **Unique constraint:** (`lobby_id`, `user_id`) — one entry per player per lobby.
 ---
 ## Key Relationships
 ```
 terms (1) ←──→ (N) translations
 terms (1) ←──→ (N) term_glosses
 terms (N) ←──→ (N) decks via deck_terms
 user (1) ←──→ (N) session
 user (1) ←──→ (N) account
 user (1) ←──→ (N) lobbies (as host)
 user (1) ←──→ (N) lobby_players
 lobbies (1) ←──→ (N) lobby_players
 ```
 ---
 ## Query Patterns
 ### Get quiz terms (singleplayer)
 ```sql
 SELECT t.id, t.pos, src.text AS source_text, tgt.text AS target_text, g.text AS gloss
 FROM terms t
 JOIN translations src ON src.term_id = t.id AND src.language_code = ?
 JOIN translations tgt ON tgt.term_id = t.id AND tgt.language_code = ?
 LEFT JOIN term_glosses g ON g.term_id = t.id AND g.language_code = ?
 WHERE t.pos = ? AND tgt.cefr_level IN (?)
 LIMIT ?
 ```
 ### Get distractors
 ```sql
 SELECT tr.text
 FROM translations tr
 JOIN terms t ON t.id = tr.term_id
 WHERE tr.language_code = ? AND t.pos = ? AND tr.cefr_level IN (?)
 AND tr.term_id != ? AND tr.text != ?
 ORDER BY RANDOM()
 LIMIT 3
 ```
 **Note:** This is the N+1 query mentioned in BACKLOG.md. Each question fetches 3 distractors separately. Batching is planned.
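 One possible way to batch, sketched under assumptions: fetch a single candidate pool for the language, POS, and CEFR filter, then pick distractors per question in memory. `Candidate`, `QuestionKey`, and `pickDistractors` are hypothetical names, and this is not necessarily the approach planned in BACKLOG.md.
 ```typescript
 interface Candidate { termId: string; text: string }
 interface QuestionKey { termId: string; correctText: string }
 // Picks up to `count` distinct distractor texts for one question from a shared pool.
 function pickDistractors(
   pool: Candidate[],
   question: QuestionKey,
   count: number = 3,
   random: () => number = Math.random,
 ): string[] {
   // Same exclusions as the SQL: a different term and a different surface text.
   const eligible = Array.from(
     new Set(
       pool
         .filter((c) => c.termId !== question.termId && c.text !== question.correctText)
         .map((c) => c.text),
     ),
   );
   // Fisher-Yates shuffle, standing in for ORDER BY RANDOM().
   for (let i = eligible.length - 1; i > 0; i--) {
     const j = Math.floor(random() * (i + 1));
     const tmp = eligible[i]!;
     eligible[i] = eligible[j]!;
     eligible[j] = tmp;
   }
   return eligible.slice(0, count);
 }
 ```
 With this shape, a 10-round game needs one pool query instead of ten distractor queries, at the cost of holding the pool in memory.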
 ---
 ## Deferred Schema Extensions (Not Yet Implemented)
 These tables are planned but do not exist yet. All are additive — they reference existing `terms` rows via FK.
 | Table                 | Purpose                                         | Trigger                 |
 | --------------------- | ----------------------------------------------- | ----------------------- |
 | `noun_forms`          | Gender, singular, plural, articles per language | Grammar quiz mode       |
 | `verb_forms`          | Conjugation tables per language                 | Grammar quiz mode       |
 | `term_pronunciations` | IPA + audio URLs per language                   | Pronunciation quiz mode |
 | `user_decks`          | Which decks a user studies                      | User customization      |
 | `user_term_progress`  | Spaced repetition state per user/term/language  | SRS review queue        |
 | `quiz_answers`        | Answer history for stats/analytics              | User stats dashboard    |
--- a/documentation/ai-context/03-api-contract.md
+++ b/documentation/ai-context/03-api-contract.md
@ -0,0 +1,367 @@
 # 03 — API Contract
 > **Purpose:** REST and WebSocket endpoint reference with exact Zod schemas. Concatenate with 00-project-overview.md and 99-current-task.md.
 > **Last updated:** 2026-05-15
 > **Depends on:** 00-project-overview.md, 02-data-model.md
 ---
 ## REST Endpoints
 ### Health
 ```
 GET /api/v1/health
 ```
 **Response:** `{ "status": "ok" }`
 **Auth:** None (public)
 ---
 ### Game — Start Session
 ```
 POST /api/v1/game/start
 ```
 **Request body** (GameRequestSchema):
 ```typescript
 {
  source_language: SupportedLanguageCode,  // "en" | "it" | "de" | "es" | "fr"
  target_language: SupportedLanguageCode,
  pos: SupportedPos,                        // "noun" | "verb" | "adjective" | "adverb"
  difficulty: DifficultyLevel,              // "easy" | "intermediate" | "hard"
  rounds: GameRounds                        // "3" | "10" (string enum, converted to number in service)
 }
 ```
 **Validation rules:**
 - `source_language` !== `target_language`
 - Both languages in `SUPPORTED_LANGUAGE_CODES`
 - `pos` in `SUPPORTED_POS`
 - `difficulty` in `DIFFICULTY_LEVELS`
 - `rounds` in `GAME_ROUNDS`
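 For illustration, the same rules written as a plain TypeScript check. The real validation is done by `GameRequestSchema` (Zod); the constant values below are copied from the comments in the request body above, and `validateGameRequest` is a hypothetical helper.
 ```typescript
 const SUPPORTED_LANGUAGE_CODES = ["en", "it", "de", "es", "fr"] as const;
 const SUPPORTED_POS = ["noun", "verb", "adjective", "adverb"] as const;
 const DIFFICULTY_LEVELS = ["easy", "intermediate", "hard"] as const;
 const GAME_ROUNDS = ["3", "10"] as const;
 // Returns a list of human-readable problems; an empty list means the body is valid.
 function validateGameRequest(body: Record<string, unknown>): string[] {
   const errors: string[] = [];
   const check = (field: string, allowed: readonly string[]): void => {
     const value = body[field];
     if (typeof value !== "string" || allowed.indexOf(value) === -1) {
       errors.push(`${field} must be one of: ${allowed.join(", ")}`);
     }
   };
   check("source_language", SUPPORTED_LANGUAGE_CODES);
   check("target_language", SUPPORTED_LANGUAGE_CODES);
   check("pos", SUPPORTED_POS);
   check("difficulty", DIFFICULTY_LEVELS);
   check("rounds", GAME_ROUNDS);
   const source = body["source_language"];
   if (typeof source === "string" && source === body["target_language"]) {
     errors.push("source_language and target_language must differ");
   }
   return errors;
 }
 ```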
 **Response** (GameSessionSchema):
 ```typescript
 {
  sessionId: string,        // UUID
  questions: GameQuestion[]
 }
 ```
 **GameQuestionSchema:**
 ```typescript
 {
  questionId: string,       // UUID
  prompt: string,           // Word in source language
  gloss: string | null,     // Definition (falls back to English if target lang gloss missing)
  options: AnswerOption[]   // 4 items, shuffled
 }
 ```
 **AnswerOptionSchema:**
 ```typescript
 {
  optionId: number,         // 0–3
  text: string              // Translation in target language
 }
 ```
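 **Note:** The correct answer is NOT included in this response. The frontend only sees `optionId` and `text`; `correctOptionId` is returned per question by `POST /api/v1/game/answer` after the player submits. The server stores the `questionId → correctOptionId` mapping in the GameSessionStore.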
 **Auth:** Required (session middleware)
 ---
 ### Game — Submit Answer
 ```
 POST /api/v1/game/answer
 ```
 **Request body** (AnswerSubmissionSchema):
 ```typescript
 {
  sessionId: string,        // UUID
  questionId: string,       // UUID
  selectedOptionId: number  // 0–3
 }
 ```
 **Response** (AnswerResultSchema):
 ```typescript
 {
  questionId: string,
  isCorrect: boolean,
  correctOptionId: number,   // 0–3
  selectedOptionId: number   // 0–3
 }
 ```
 **Error cases:**
 - Session not found → 404 NotFoundError
 - Question not in session → 404 NotFoundError
 - Invalid optionId → 400 ValidationError
 **Auth:** Required
 ---
 ### Lobby — Create
 ```
 POST /api/v1/lobbies
 ```
 **Request body:** None (host's auth session determines host_id)
 **Response:**
 ```typescript
 {
  id: string,           // UUID
  code: string,         // Human-readable room code (e.g. "WOLF-42")
  host_id: string,
  status: "waiting",
  max_players: number,
  settings: object | null,
  created_at: string
 }
 ```
 **Auth:** Required
 ---
 ### Lobby — Join
 ```
 POST /api/v1/lobbies/:code/join
 ```
 **Path param:** `code` — room code (e.g. "WOLF-42")
 **Response:** Same as create (the lobby object)
 **Error cases:**
 - Lobby not found → 404
 - Lobby full → 400
 - Already joined → 200 (idempotent)
 **Auth:** Required
 ---
 ### Auth
 ```
 ALL /api/auth/*     — Better Auth handlers (public)
 ```
 Better Auth mounts its own router at `/api/auth/*`. Handles:
 - `/api/auth/sign-in/social`: initiate social login
 - `/api/auth/callback/:provider`: OAuth callback
 - `/api/auth/sign-out`: clear session
 - `/api/auth/get-session`: get current session
 **Auth:** Mixed (some public, some require valid session)
 ---
 ## WebSocket Protocol
 All WS messages are JSON objects with a `type` field. `type` is the discriminator of a Zod discriminated union: the router uses it to select the matching schema and validates the payload against that schema.
 ### Connection
 1. Client opens WebSocket to `wss://api.lilastudy.com/ws`
 2. Server validates Better Auth session from cookie on upgrade
 3. Connection established
 ### Client → Server Messages
 #### `lobby:join`
 ```typescript
 {
  type: "lobby:join",
  payload: {
    code: string  // Room code (e.g. "WOLF-42")
  }
 }
 ```
 #### `lobby:leave`
 ```typescript
 {
  type: "lobby:leave",
  payload: {
    code: string
  }
 }
 ```
 #### `lobby:start`
 ```typescript
 {
  type: "lobby:start",
  payload: {
    code: string
  }
 }
 ```
 Only the host can send this. Triggers game start.
 #### `game:answer`
 ```typescript
 {
  type: "game:answer",
  payload: {
    code: string,
    questionId: string,
    optionId: number  // 0–3
  }
 }
 ```
 Must be sent within the 15-second server timer.
 ---
 ### Server → Client Messages
 #### `lobby:state`
 ```typescript
 {
  type: "lobby:state",
  payload: {
    code: string,
    players: {
      id: string,
      display_name: string,
      is_host: boolean
    }[],
    status: "waiting" | "in_progress" | "finished",
    settings: object | null
  }
 }
 ```
 Broadcast to all players in the lobby on any membership change.
 #### `game:question`
 ```typescript
 {
  type: "game:question",
  payload: {
    questionId: string,
    prompt: string,
    gloss: string | null,
    options: { optionId: number, text: string }[],
    timeLimit: number  // seconds (15)
  }
 }
 ```
 Broadcast when the game starts or a new round begins.
 #### `game:answer_result`
 ```typescript
 {
  type: "game:answer_result",
  payload: {
    questionId: string,
    results: {
      playerId: string,
      displayName: string,
      isCorrect: boolean,
      selectedOptionId: number,
      score: number
    }[]
  }
 }
 ```
 Broadcast after all players answer or timer expires.
 #### `game:finished`
 ```typescript
 {
  type: "game:finished",
  payload: {
    finalScores: {
      playerId: string,
      displayName: string,
      score: number
    }[],
    winner: {
      playerId: string,
      displayName: string
    } | null  // null for ties
  }
 }
 ```
 Broadcast after all rounds complete.
 ---
 ## Zod Schema Locations
 All schemas live in `packages/shared/src/schemas/`:
 | Schema                 | File       | Used by                                 |
 | ---------------------- | ---------- | --------------------------------------- |
 | GameRequestSchema      | `game.ts`  | API controller, frontend GameSetup      |
 | GameSessionSchema      | `game.ts`  | API service, frontend quiz flow         |
 | GameQuestionSchema     | `game.ts`  | API service, frontend QuestionCard      |
 | AnswerOptionSchema     | `game.ts`  | API service, frontend OptionButton      |
 | AnswerSubmissionSchema | `game.ts`  | API controller, frontend submit handler |
 | AnswerResultSchema     | `game.ts`  | API controller, frontend ScoreScreen    |
 | LobbyCreateSchema      | `lobby.ts` | API controller                          |
 | LobbyJoinSchema        | `lobby.ts` | API controller                          |
 | LobbyStateSchema       | `lobby.ts` | WS handler, frontend lobby UI           |
 | WebSocketMessageSchema | `lobby.ts` | WS router (discriminated union)         |
 **Rule:** Never duplicate these schemas. Import from `packages/shared` in both API and frontend.
 ---
 ## Error Responses
 All errors follow this shape:
 ```typescript
 {
  error: string,      // Human-readable message
  statusCode: number  // HTTP status
 }
 ```
 **Common status codes:**
 - 400 — ValidationError (bad input, schema mismatch)
 - 401 — Unauthorized (no valid session)
 - 404 — NotFoundError (session, question, or lobby not found)
 - 500 — Unknown error (logged, generic message to client)
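 A minimal sketch of how typed errors could map onto this response shape. The class names `ValidationError` and `NotFoundError` appear in the error cases above; `AppError` and `toErrorResponse` are hypothetical, and the real error handler may be structured differently.
 ```typescript
 class AppError extends Error {
   readonly statusCode: number;
   constructor(message: string, statusCode: number) {
     super(message);
     this.statusCode = statusCode;
     // Keeps instanceof working if the code is compiled to an ES5 target.
     Object.setPrototypeOf(this, new.target.prototype);
   }
 }
 class ValidationError extends AppError {
   constructor(message: string) { super(message, 400); }
 }
 class NotFoundError extends AppError {
   constructor(message: string) { super(message, 404); }
 }
 // Known errors keep their message; anything else becomes a generic 500.
 function toErrorResponse(err: unknown): { error: string; statusCode: number } {
   if (err instanceof AppError) {
     return { error: err.message, statusCode: err.statusCode };
   }
   return { error: "Internal server error", statusCode: 500 };
 }
 ```
 Unknown errors deliberately lose their message here, matching the "generic message to client" rule for 500s.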
--- a/documentation/ai-context/04-websocket-protocol.md
+++ b/documentation/ai-context/04-websocket-protocol.md
@ -0,0 +1,237 @@
 # 04 — WebSocket Protocol
 > **Purpose:** Deep dive into WebSocket lifecycle, state management, and edge cases for LLMs working on multiplayer features. Concatenate with 00-project-overview.md and 99-current-task.md.
 > **Last updated:** 2026-05-15
 > **Depends on:** 00-project-overview.md, 03-api-contract.md
 ---
 ## Connection Lifecycle
 ### 1. Upgrade
 ```
 Client: GET wss://api.lilastudy.com/ws
        Headers: Cookie: better-auth.session=...
 Server: Validates session via Better Auth (reads cookie, looks up in DB)
        → Valid: 101 Switching Protocols, connection established
        → Invalid: 401 Unauthorized, connection rejected
 ```
 **Auth is mandatory.** No anonymous WebSocket connections. Guest play (if implemented) would need a different auth strategy here.
 ### 2. Message Routing
 After connection, all messages flow through:
 ```
 Raw JSON message
     ↓
 Zod safeParse against WebSocketMessageSchema (discriminated union on `type`)
     ↓
 Router switches on `type` → dispatches to handler
     ↓
 Handler executes business logic → broadcasts to room
 ```
 **Invalid messages:** Parse failures are logged and silently dropped. The client receives no error response — this is intentional to prevent error spam from malformed clients.
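 The parse-then-dispatch flow can be sketched without Zod as follows. Hand-written checks stand in for `WebSocketMessageSchema` here, and `parseClientMessage` is a hypothetical name; the message shapes are the client-to-server messages documented in 03-api-contract.md.
 ```typescript
 type ClientMessage =
   | { type: "lobby:join"; payload: { code: string } }
   | { type: "lobby:leave"; payload: { code: string } }
   | { type: "lobby:start"; payload: { code: string } }
   | { type: "game:answer"; payload: { code: string; questionId: string; optionId: number } };
 // Returns a typed message, or null for anything malformed (the caller logs and drops it).
 function parseClientMessage(raw: string): ClientMessage | null {
   let data: unknown;
   try {
     data = JSON.parse(raw);
   } catch {
     return null;
   }
   if (typeof data !== "object" || data === null) return null;
   const envelope = data as { type?: unknown; payload?: unknown };
   if (typeof envelope.payload !== "object" || envelope.payload === null) return null;
   const payload = envelope.payload as Record<string, unknown>;
   const code = payload["code"];
   if (typeof code !== "string") return null;
   // The `type` field selects which payload shape is required.
   if (envelope.type === "lobby:join") return { type: "lobby:join", payload: { code } };
   if (envelope.type === "lobby:leave") return { type: "lobby:leave", payload: { code } };
   if (envelope.type === "lobby:start") return { type: "lobby:start", payload: { code } };
   if (envelope.type === "game:answer") {
     const questionId = payload["questionId"];
     const optionId = payload["optionId"];
     if (typeof questionId !== "string" || typeof optionId !== "number") return null;
     return { type: "game:answer", payload: { code, questionId, optionId } };
   }
   return null;
 }
 ```
 Returning `null` rather than throwing mirrors the drop-silently behaviour: the socket handler can log and return without sending anything back.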
 ### 3. Disconnect
 When a client disconnects (browser close, network loss, page navigate):
 ```
 Connection close event
     ↓
 Handler removes player from lobby (if in one)
     ↓
 Broadcasts updated lobby:state to remaining players
     ↓
 If game in progress and player disconnects:
     → Player is marked as "disconnected" (not removed from game state)
     → Their answer slot is treated as "no answer" (timeout)
     → Game continues
 ```
 **No automatic reconnect.** The client must manually reconnect and re-join the lobby. Graceful reconnect with state restoration is planned (BACKLOG.md `next`).
 ---
 ## State Management
 ### Two-Tier Storage
 | State Type       | Storage                                                          | Durability | Use Case                                          |
 | ---------------- | ---------------------------------------------------------------- | ---------- | ------------------------------------------------- |
 | Lobby membership | PostgreSQL (`lobbies`, `lobby_players`)                          | Durable    | Who is in which room, who is host                 |
 | Game state       | In-memory (`InMemoryLobbyGameStore`, `InMemoryGameSessionStore`) | Ephemeral  | Current question, scores, timer, answers received |
 **Why the split?** Lobby membership must survive server restarts (players shouldn't be kicked on deploy). Game state is ephemeral by design — a game lasts minutes, and losing state on restart is acceptable for MVP.
 ### In-Memory Store Structure
 ```typescript
 // Conceptual — actual implementation in apps/api/src/gameSessionStore/
 type PlayerId = string; // alias so the Map key types below resolve
 interface InMemoryGameState {
  [lobbyCode: string]: {
    status: "waiting" | "question" | "result" | "finished";
    currentRound: number;
    totalRounds: number;
    currentQuestion: GameQuestion | null;
    answers: Map<PlayerId, { optionId: number; timestamp: number }>;
    scores: Map<PlayerId, number>;
    timer: NodeJS.Timeout | null; // 15s server timer
    questionStartTime: number; // For speed-based tiebreaking
  };
 }
 ```
 ---
 ## The 15-Second Timer
 ### Implementation
 ```
 Host sends lobby:start
     ↓
 Server generates questions, stores in game state
     ↓
 Broadcast game:question to all players
     ↓
 START 15-second timer (NodeJS setTimeout)
     ↓
 Player answers collected in Map<playerId, answer>
     ↓
 Timer expires OR all players answered
     ↓
 STOP timer, evaluate answers, broadcast game:answer_result
     ↓
 If more rounds: wait 3s → broadcast next game:question → restart timer
     ↓
 If last round: broadcast game:finished
 ```
 ### Timer Edge Cases
 | Scenario                   | Behavior                                                                  |
 | -------------------------- | ------------------------------------------------------------------------- |
 | Player answers at 14.9s    | Valid, collected before timer expiry                                      |
 | Player answers at 15.1s    | Rejected, treated as timeout. Timer already fired.                        |
 | All players answer early   | Timer is cleared early, round proceeds immediately                        |
 | No one answers             | All players get 0 points for that round, next round starts                |
 | Host disconnects mid-game  | Game continues, any player can see results. No "host transfer" logic yet. |
 | Non-host sends lobby:start | Silently ignored (or rejected — check implementation)                     |
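 The timer behaviour in the flow and table above can be sketched as a small class. This is an illustration, not the code in `multiplayerGameService.ts`: `Round` is a hypothetical name, and treating a player's first answer as final is an assumption the document does not state.
 ```typescript
 type PlayerId = string;
 type CloseHandler = (answers: Map<PlayerId, number>) => void;
 class Round {
   private readonly answers = new Map<PlayerId, number>();
   private readonly players: PlayerId[];
   private readonly onClose: CloseHandler;
   private timer: ReturnType<typeof setTimeout> | null = null;
   private closed = false;
   constructor(players: PlayerId[], onClose: CloseHandler, timeLimitMs: number = 15000) {
     this.players = players;
     this.onClose = onClose;
     // Server-side timer: the round closes even if nobody answers.
     this.timer = setTimeout(() => this.close(), timeLimitMs);
   }
   // Returns false for late answers, unknown players, and repeat answers (assumption: first answer wins).
   submit(playerId: PlayerId, optionId: number): boolean {
     if (this.closed) return false;
     if (this.players.indexOf(playerId) === -1 || this.answers.has(playerId)) return false;
     this.answers.set(playerId, optionId);
     if (this.answers.size === this.players.length) this.close(); // everyone answered early
     return true;
   }
   // Idempotent: runs once whether triggered by the timer or by the last answer.
   close(): void {
     if (this.closed) return;
     this.closed = true;
     if (this.timer !== null) clearTimeout(this.timer);
     this.timer = null;
     this.onClose(this.answers);
   }
 }
 ```
 Because `close()` is guarded by the `closed` flag, a timer firing at the same moment as the last answer cannot evaluate the round twice.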
 ---
 ## Message Broadcasting
 ### Room-Based Broadcasting
 The server maintains a mapping of `lobbyCode → Set<WebSocket connections>`. When a message needs to broadcast:
 ```typescript
 // Pseudo-code from ws/connections.ts
 function broadcastToRoom(code: string, message: WebSocketMessage) {
  const connections = roomConnections.get(code);
  if (!connections) return; // no connections tracked for this room
  for (const ws of connections) {
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(JSON.stringify(message));
    }
  }
 }
 ```
 **Self-broadcast:** The sender receives their own broadcast. The frontend must handle this (e.g., ignore their own lobby:state if they already updated optimistically).
 ### Message Ordering
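 WebSocket guarantees in-order delivery per connection, but not across connections or relative to a client's own in-flight sends. Races can still occur:
 - Player A sends `game:answer` at 14.5s
 - Player B's connection lags, so B receives `game:answer_result` while B's own `game:answer` is still in flight (the server sends no per-answer ack)
 - **Frontend must handle a result that arrives before it considers its own answer delivered**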
 ---
 ## Edge Cases & Failure Modes
 ### Mid-Game Disconnect
 ```
 Player disconnects during question phase
     ↓
 Connection close handler triggered
     ↓
 Player NOT removed from game state (they might reconnect)
     ↓
 Timer continues
     ↓
 On timer expiry: player has no answer → treated as wrong
     ↓
 Result broadcast includes "disconnected" status for that player
 ```
 **Current gap:** No reconnect-with-state-restoration. Player must re-join lobby and game state is not recovered. Planned in BACKLOG.md `next`.
 ### Double Join
 ```
 Player joins lobby ABC
     ↓
 Player joins lobby ABC again (accidental double-click, retry)
     ↓
 Server: idempotent — player already in lobby, return 200
     ↓
 No duplicate entries in lobby_players table
 ```
 ### Rapid Start/Stop
 ```
 Host clicks "Start" twice rapidly
     ↓
 First click: game starts, state changes to "in_progress"
     ↓
 Second click: server checks state, sees "in_progress", ignores
 ```
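 A sketch of the guard described above, assuming lobby status is checked and updated synchronously in the handler. `Lobby` and `tryStartGame` are hypothetical names; the host check reflects the rule that only the host can send `lobby:start`.
 ```typescript
 type LobbyStatus = "waiting" | "in_progress" | "finished";
 interface Lobby { hostId: string; status: LobbyStatus }
 // Returns true only for the first valid start request from the host.
 function tryStartGame(lobby: Lobby, requesterId: string): boolean {
   if (requesterId !== lobby.hostId) return false; // non-host requests are ignored
   if (lobby.status !== "waiting") return false; // second click sees "in_progress"
   lobby.status = "in_progress";
   return true;
 }
 ```
 For in-memory state this check-then-set is safe because the function runs to completion without yielding; if the status lived only in PostgreSQL, the check and the update would need to happen in one statement or transaction.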
 ### Client-Side Message Loss
 If a client's `game:answer` never reaches the server (network blip):
 - Server never receives the answer
 - Timer expires
 - Player gets 0 points for that round
 - **No retry mechanism** — client sends once, no ack expected
 ---
 ## Planned Improvements (Not Yet Implemented)
 From BACKLOG.md `next`:
 1. **Graceful WS reconnect** — Exponential back-off, restore game state on reconnection if game still in progress
 2. **Heartbeat/ping** — Detect stale connections faster than TCP timeout
 3. **Valkey for game state** — Replace in-memory store with Redis-compatible storage for horizontal scaling and persistence across restarts
 4. **Configurable game settings** — Host sets round count, timer duration, target score via lobby settings jsonb column
 5. **Additional game modes** — TV Quiz Show, Race to the Top, Chain Link, Elimination Round, Cooperative Challenge (see design/GAME_MODES.md)
 ---
 ## Key Files
 | File                                              | Purpose                                       |
 | ------------------------------------------------- | --------------------------------------------- |
 | `apps/api/src/ws/index.ts`                        | WebSocket server setup, attach to HTTP server |
 | `apps/api/src/ws/auth.ts`                         | Session validation on upgrade                 |
 | `apps/api/src/ws/router.ts`                       | Message routing by `type`                     |
 | `apps/api/src/ws/connections.ts`                  | Connection management, room mapping           |
 | `apps/api/src/ws/handlers/lobbyHandlers.ts`       | lobby:join, lobby:leave, lobby:start          |
 | `apps/api/src/ws/handlers/gameHandlers.ts`        | game:answer                                   |
 | `apps/api/src/services/multiplayerGameService.ts` | Game logic, timer, scoring                    |
 | `apps/api/src/lobbyGameStore/`                    | In-memory lobby state storage                 |
 | `packages/shared/src/schemas/lobby.ts`            | WS message Zod schemas                        |
 | `packages/shared/src/schemas/game.ts`             | Game state Zod schemas                        |
--- a/documentation/ai-context/05-data-pipeline.md
+++ b/documentation/ai-context/05-data-pipeline.md
@ -0,0 +1,173 @@
 # 05 — Data Pipeline
 > **Purpose:** Condensed reference for LLMs working on the Kaikki data pipeline. Covers stages, data flow, and current blockers. For full operational details (llama.cpp setup, provider configs, hardware specs), see the human-readable DATA_PIPELINE.md.
 > **Last updated:** 2026-05-15
 > **Depends on:** 00-project-overview.md
 ---
 ## Pipeline Overview
 ```
 Kaikki JSONL (Wiktionary extracts)
     ↓
 Stage 1: Extract → Parse into pipeline.db (SQLite)
     ↓
 Stage 2: Reverse Link → Insert missing reverse translations
     ↓
 Stage 3: Enrich → LLMs review glosses, examples, translations, assign CEFR
     ↓
 Stage 4: Merge → Resolve LLM votes into final values
     ↓
 Stage 4b: Tiebreak → Run unused models on flagged entries
     ↓
 Stage 5: Compare / QA → Generate COVERAGE.md quality report
     ↓
 Stage 6: Sync → Upsert resolved records into production PostgreSQL
 ```
 **Current state:** Stages 1 and 2 are complete on sample data. The Stage 3 enrich script is being rewritten for the sub-stage architecture. Stages 4–6 are not started.
 ---
 ## Stage 1: Extract
 **Input:** `data-pipeline/stage-1-extract/sources/*.jsonl` (Kaikki files, not in git)
 **Output:** `pipeline.db` — `vocabulary_entries` and `entry_translations` tables
 **What it does:**
 - Parses Kaikki JSONL for all 5 languages (en, de, es, fr, it)
 - Filters to 4 POS: noun, verb, adjective, adverb
 - Each Kaikki sense becomes one `vocabulary_entries` row
 - Translations stored in `entry_translations` with sense hints
 **Key design:** Kaikki is structured per word sense. Each headword has multiple senses, and translations are linked to a specific sense. This prevents the sense-disambiguation problems of OpenWordNet/OMW.
 ---
 ## Stage 2: Reverse Link Sync
 **Pure script, no LLMs.**
 For each translation pair (e.g., English "thrill" → German "begeistern"), checks if the reverse exists (German "begeistern" → English "thrill"). If the German entry exists but lacks the English back-link, inserts it automatically.
 **Why:** Ensures LLMs in Stage 3 only generate translations that are genuinely missing — not translations findable by simple reverse lookup.
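 The check can be sketched in a few lines of TypeScript. This is an illustration under assumptions, not the actual `reverse-link.ts`: the `Translation` shape, the `lang:word` key format, and the function names are hypothetical, and the real script works against `pipeline.db` rather than in-memory arrays.
 ```typescript
 // Hypothetical in-memory shape; the real pipeline.db columns may differ.
 interface Translation {
   sourceLang: string;
   sourceWord: string;
   targetLang: string;
   targetWord: string;
 }
 const entryKey = (lang: string, word: string): string => `${lang}:${word}`;
 const linkKey = (t: Translation): string =>
   `${entryKey(t.sourceLang, t.sourceWord)}>${entryKey(t.targetLang, t.targetWord)}`;
 // Returns back-links that are missing although the target headword has its own entry.
 function findMissingReverseLinks(
   translations: Translation[],
   existingEntries: Set<string>,
 ): Translation[] {
   const present = new Set<string>(translations.map(linkKey));
   const missing: Translation[] = [];
   for (const t of translations) {
     const reverse: Translation = {
       sourceLang: t.targetLang,
       sourceWord: t.targetWord,
       targetLang: t.sourceLang,
       targetWord: t.sourceWord,
     };
     // Skip when the target word has no entry of its own to attach the link to.
     if (!existingEntries.has(entryKey(reverse.sourceLang, reverse.sourceWord))) continue;
     // Skip when the reverse link already exists (or was already queued).
     if (present.has(linkKey(reverse))) continue;
     present.add(linkKey(reverse));
     missing.push(reverse);
   }
   return missing;
 }
 ```
 Running the function a second time over the original pairs plus its own output yields nothing new, which is the property that lets Stage 2 be re-run safely.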
 ---
 ## Stage 3: Enrich (In Progress — Being Rewritten)
 **Current blocker:** The original single-prompt design had problems (skipped invalid translations, triggered reasoning mode, 20% manual review). Being rewritten as four ordered sub-stages.
 ### Sub-Stage Architecture
 Each model processes every entry through four sub-stages in order:
 1. **`round1_gloss`** — Review existing gloss. Confirm if clear, generate better one if not.
 2. **`round1_example`** — Review examples. Confirm if natural, generate one better sentence.
 3. **`round1_translations`** — Validate translations with verified gloss as context. Confirm valid, reject invalid, generate missing.
 4. **`round1_cefr`** — Assign CEFR level (A1–C2) to headword and each confirmed translation.
 **Why this order:** CEFR sub-stage only sees clean, verified data. Bad translations are rejected before reaching CEFR assignment.
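 A minimal sketch of how the ordering could be enforced, assuming each entry records which sub-stages a model has already completed. Only the four sub-stage names come from this document; `ReviewedTranslation`, `nextSubStage`, and `cefrInputs` are hypothetical names used for illustration.
 ```typescript
 // Sub-stage names as listed above; the tuple order is the processing order.
 const SUB_STAGES = [
   "round1_gloss",
   "round1_example",
   "round1_translations",
   "round1_cefr",
 ] as const;
 type SubStage = (typeof SUB_STAGES)[number];
 // Hypothetical shape for a translation after the translations sub-stage.
 interface ReviewedTranslation {
   word: string;
   verdict: "confirmed" | "rejected";
 }
 // Returns the first sub-stage not yet completed, or null when all four are done.
 function nextSubStage(completed: readonly SubStage[]): SubStage | null {
   for (const stage of SUB_STAGES) {
     if (completed.indexOf(stage) === -1) return stage;
   }
   return null;
 }
 // Only confirmed translations are passed on to CEFR assignment.
 function cefrInputs(translations: ReviewedTranslation[]): string[] {
   return translations.filter((t) => t.verdict === "confirmed").map((t) => t.word);
 }
 ```
 Deriving the next step from the completed list, rather than from a counter, also fits a resumable run: an interrupted entry picks up at the first sub-stage it has not finished.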
 **Voter strategy:** Multiple models vote independently. Each model = one vote per sub-stage. Current plan:
 - Primary: Local Qwen3.5-9B (overnight runs, unlimited)
 - Secondary: Groq Llama 3.3 70B (cloud, batched)
 - Tertiary: Gemini AI Studio (cloud, batched)
 **Context enrichment:** Before calling models for gloss/example, the pipeline queries the Wiktionary API for the headword and adds the full entry (all senses, usage notes) to the prompt. This addresses glosses that are only category headers and glosses too short to be unambiguous.
 ---
 ## Stage 4: Merge
 Resolves LLM votes into final values per entry.
 **Rules:**
 - Kaikki source data wins automatically (never overridden)
 - CEFR: level with most votes wins
 - Text fields (gloss, example, translation): candidate with most votes wins
 - No majority → flag for tiebreaker
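 The rules above can be sketched as a single resolver. This is a sketch under assumptions: "no majority" is read here as a tie for the highest vote count, and the `MergeResult` shape and `resolveField` name are hypothetical rather than taken from the merge script.
 ```typescript
 type MergeResult =
   | { status: "final"; value: string; source: "kaikki" | "llm" }
   | { status: "flagged"; candidates: string[] };
 // Resolves one field (CEFR level, gloss, example, or translation) from model votes.
 function resolveField(kaikkiValue: string | null, votes: string[]): MergeResult {
   // Rule 1: Kaikki source data is never overridden.
   if (kaikkiValue !== null) {
     return { status: "final", value: kaikkiValue, source: "kaikki" };
   }
   // Rules 2 and 3: count identical candidates.
   const counts = new Map<string, number>();
   for (const vote of votes) counts.set(vote, (counts.get(vote) ?? 0) + 1);
   let best = 0;
   let leaders: string[] = [];
   counts.forEach((count, value) => {
     if (count > best) {
       best = count;
       leaders = [value];
     } else if (count === best) {
       leaders.push(value);
     }
   });
   // Rule 4 (assumed reading): a tie, or no votes at all, goes to the tiebreaker.
   if (leaders.length !== 1) return { status: "flagged", candidates: leaders };
   return { status: "final", value: leaders[0] as string, source: "llm" };
 }
 ```
 For text fields this compares candidates by exact string equality, so two models that produce the same gloss with different punctuation would count as separate candidates; whether the real merge normalizes text first is not stated here.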
 **Difficulty mapping:**
 | CEFR | Difficulty |
 |------|-----------|
 | A1, A2 | easy |
 | B1, B2 | intermediate |
 | C1, C2 | hard |
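 As a lookup, the mapping could look like the sketch below. The constant and function names are hypothetical; the project's real values live in `packages/shared/src/constants.ts` and may be organized differently.
 ```typescript
 const CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"] as const;
 type CefrLevel = (typeof CEFR_LEVELS)[number];
 type Difficulty = "easy" | "intermediate" | "hard";
 // Typing the table as Record<CefrLevel, Difficulty> makes a missing level a compile error.
 const DIFFICULTY_BY_CEFR: Record<CefrLevel, Difficulty> = {
   A1: "easy",
   A2: "easy",
   B1: "intermediate",
   B2: "intermediate",
   C1: "hard",
   C2: "hard",
 };
 function toDifficulty(level: CefrLevel): Difficulty {
   return DIFFICULTY_BY_CEFR[level];
 }
 ```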
 ---
 ## Stage 4b: Tiebreak
 Runs automatically after merge if flagged entries remain. Queries models that have not yet voted on the flagged entries, then re-runs merge. Repeats until every entry is resolved or no unused models remain.
 **If still unresolved:** Sync is blocked. Add more models to config and re-run.
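 The loop can be sketched as follows for a single flagged field. This is an assumption-laden illustration: models are queried one at a time, a tie for the top count means "unresolved", and `askModel` is a hypothetical synchronous stand-in for what would be an async provider call in practice.
 ```typescript
 interface Vote {
   model: string;
   value: string;
 }
 // Returns the single leading value, or null when the top count is tied or there are no votes.
 function leadingValue(votes: Vote[]): string | null {
   const counts = new Map<string, number>();
   for (const v of votes) counts.set(v.value, (counts.get(v.value) ?? 0) + 1);
   let best = 0;
   let leaders: string[] = [];
   counts.forEach((count, value) => {
     if (count > best) {
       best = count;
       leaders = [value];
     } else if (count === best) {
       leaders.push(value);
     }
   });
   return leaders.length === 1 ? (leaders[0] as string) : null;
 }
 // Adds one unused model's vote at a time and re-checks until resolved or out of models.
 function tiebreak(
   votes: Vote[],
   unusedModels: string[],
   askModel: (model: string) => string,
 ): { resolved: string | null; votes: Vote[] } {
   const all = votes.slice();
   const remaining = unusedModels.slice();
   let winner = leadingValue(all);
   while (winner === null && remaining.length > 0) {
     const model = remaining.shift() as string;
     all.push({ model, value: askModel(model) });
     winner = leadingValue(all);
   }
   // resolved === null means sync stays blocked for this entry.
   return { resolved: winner, votes: all };
 }
 ```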
 ---
 ## Stage 5: Compare / QA
 Read-only. Generates `COVERAGE.md` with per-language breakdown:
 - Total entries, POS distribution
 - Translation coverage per language pair
 - CEFR coverage and difficulty breakdown
 - Gloss/example coverage by source (Kaikki vs LLM)
 - Per-model contribution stats
 Run this before syncing to production.
 ---
 ## Stage 6: Sync
 Upserts all `status = "final"` entries from `pipeline.db` to production PostgreSQL.
 **Behavior:**
 - Missing → insert
 - Present but changed → update
 - Present and unchanged → skip
 **Idempotent.** Safe to re-run.
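 The three-way decision and the idempotency claim can be illustrated with an in-memory stand-in for the production table. The `FinalRecord` fields and function names are hypothetical; the real sync upserts into PostgreSQL and compares whatever columns the schema defines.
 ```typescript
 // Hypothetical record shape used only for this sketch.
 interface FinalRecord {
   id: string;
   gloss: string;
   cefr: string;
 }
 type SyncAction = "insert" | "update" | "skip";
 // Decides what to do with one record given the current remote row, if any.
 function planSync(local: FinalRecord, remote: FinalRecord | undefined): SyncAction {
   if (remote === undefined) return "insert";
   const changed = local.gloss !== remote.gloss || local.cefr !== remote.cefr;
   return changed ? "update" : "skip";
 }
 // Applies the plan to an in-memory map and reports how many rows took each path.
 function syncAll(
   locals: FinalRecord[],
   remote: Map<string, FinalRecord>,
 ): Record<SyncAction, number> {
   const summary: Record<SyncAction, number> = { insert: 0, update: 0, skip: 0 };
   for (const record of locals) {
     const action = planSync(record, remote.get(record.id));
     if (action !== "skip") remote.set(record.id, { ...record });
     summary[action] += 1;
   }
   return summary;
 }
 ```
 Calling `syncAll` twice with the same input reports only skips on the second call, which is what "safe to re-run" means in practice.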
 ---
 ## Key Constraints
 | Constant   | Values                                |
 | ---------- | ------------------------------------- |
 | Languages  | `en`, `it`, `de`, `es`, `fr`          |
 | POS        | `noun`, `verb`, `adjective`, `adverb` |
 | CEFR       | `A1`, `A2`, `B1`, `B2`, `C1`, `C2`    |
 | Difficulty | `easy`, `intermediate`, `hard`        |
 Adding a new value requires updating `packages/shared/src/constants.ts` AND a database migration before re-running the pipeline.
 ---
 ## Current Blockers
 1. **Enrich sub-stage rewrite** — Stage 3 script needs redesign and testing
 2. **Cloud provider integration** — Groq and Gemini not yet wired into pipeline
 3. **Batching prompt design** — 5–10 entries per API call for efficiency; not yet designed
 4. **Full dataset scale unknown** — Currently running on 500-entry samples. Full Kaikki English file has ~1.3M entries. Exact filtered count and runtime estimate not yet known.
 ---
 ## Key Files
 | File                                                         | Purpose                                                   |
 | ------------------------------------------------------------ | --------------------------------------------------------- |
 | `data-pipeline/pipeline.ts`                                  | Orchestrator — runs stages in order, handles resumability |
 | `data-pipeline/stage-1-extract/scripts/extract.ts`           | Parse Kaikki JSONL                                        |
 | `data-pipeline/stage-2-reverse-link/scripts/reverse-link.ts` | Insert reverse translations                               |
 | `data-pipeline/stage-3-enrich/scripts/enrich.ts`             | LLM enrichment (being rewritten)                          |
 | `data-pipeline/stage-3-enrich/config.ts`                     | Provider configs (local, OpenRouter, etc.)                |
 | `data-pipeline/db/schema.sql`                                | pipeline.db schema                                        |
 | `data-pipeline/db/import.ts`                                 | Import stage 1 output into pipeline.db                    |
 | `packages/shared/src/constants.ts`                           | Language codes, POS, CEFR, difficulty constants           |
--- a/documentation/ai-context/06-deployment.md
+++ b/documentation/ai-context/06-deployment.md
@ -0,0 +1,144 @@
 # 06 — Deployment
 > **Purpose:** Condensed infrastructure reference for LLMs working on deployment, CI/CD, or ops tasks. For full setup details (VPS provisioning, Forgejo configuration, backup scripts), see the human-readable DEPLOYMENT.md.
 > **Last updated:** 2026-05-15
 > **Depends on:** 00-project-overview.md
 ---
 ## Infrastructure Overview
 ```
 Internet
     ↓
 Caddy (Docker container, ports 80/443)
     ├── lilastudy.com       → web container (nginx:alpine, static files)
     ├── api.lilastudy.com   → api container (Express, port 3000)
     └── git.lilastudy.com   → forgejo container (git + registry, port 3000)
 SSH (port 2222) → forgejo container (git push/pull)
 ```
 **VPS:** Hetzner, Debian 13, ARM64 (aarch64), 4GB RAM  
 **Domain:** lilastudy.com, wildcard `*.lilastudy.com` configured  
 **Only Caddy serves HTTP(S) traffic from the internet.** The one other public port is Forgejo SSH (2222). All other services communicate over the internal Docker network.
 ---
 ## Docker Compose Stack
 Services on shared `lila-network`:
 | Service  | Image                                            | Ports (internal) | Notes                                              |
 | -------- | ------------------------------------------------ | ---------------- | -------------------------------------------------- |
 | caddy    | caddy:alpine                                     | 80, 443          | Publishes 80/443; sole HTTP(S) entry point         |
 | api      | `git.lilastudy.com/forgejo-lila/lila-api:latest` | 3000             | Multi-stage Dockerfile, runs migrations on startup |
 | web      | `git.lilastudy.com/forgejo-lila/lila-web:latest` | 80               | nginx:alpine, SPA fallback via try_files           |
 | database | postgres:16                                      | 5432             | Named volume `lila-db` for persistence             |
 | forgejo  | forgejo:...                                      | 3000, 2222       | Git + container registry, SSH on 2222              |
 **No ports exposed on internal services.** Only Caddy (80/443) and Forgejo SSH (2222) are public.
 ---
 ## Build & Deploy Flow
 ```
 Dev laptop: git push to main
     ↓
 Forgejo Actions triggers (runner on VPS)
     ↓
 Build API image (target: runner)
 Build Web image (target: production, VITE_API_URL baked in)
     ↓
 Push both to git.lilastudy.com registry
     ↓
 SSH into VPS, docker compose pull, restart containers
     ↓
 API container runs migrations on startup (migrate.js before server.js)
     ↓
 App updated (~2–5 min total)
 ```
 **No cross-compilation:** Images are built natively on the ARM64 VPS (no QEMU). The dev laptop was used for initial pushes before CI/CD was set up.
 ---
 ## Environment-Driven Config
 Same code runs in dev and production. Environment variables control behavior:
 | Variable          | Dev                                  | Production                                        |
 | ----------------- | ------------------------------------ | ------------------------------------------------- |
 | `DATABASE_URL`    | `postgres://...@localhost:5432/lila` | `postgres://...@database:5432/lila`               |
 | `BETTER_AUTH_URL` | `http://localhost:3000`              | `https://api.lilastudy.com`                       |
 | `CORS_ORIGIN`     | `http://localhost:5173`              | `https://lilastudy.com`                           |
 | `COOKIE_DOMAIN`   | undefined                            | `.lilastudy.com`                                  |
 | `VITE_API_URL`    | `http://localhost:3000`              | `https://api.lilastudy.com` (baked at build time) |
 **Note:** `VITE_API_URL` is baked into the frontend at Docker build time via `--build-arg`. It cannot be changed at runtime.
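 One way to make the server-side variables fail fast is to validate them once at startup. The sketch below is an assumption about how that could look, not the project's actual config module: `AppConfig` and `loadConfig` are hypothetical names, and only the variable names come from the table above. `VITE_API_URL` is left out because it is consumed by the frontend build, not by the API process.
 ```typescript
 interface AppConfig {
   databaseUrl: string;
   betterAuthUrl: string;
   corsOrigin: string;
   // Undefined in dev, ".lilastudy.com" in production.
   cookieDomain: string | undefined;
 }
 // Reads config from an env-like object and throws on missing required values.
 function loadConfig(env: Record<string, string | undefined>): AppConfig {
   const required = (name: string): string => {
     const value = env[name];
     if (value === undefined || value === "") {
       throw new Error(`Missing required environment variable: ${name}`);
     }
     return value;
   };
   return {
     databaseUrl: required("DATABASE_URL"),
     betterAuthUrl: required("BETTER_AUTH_URL"),
     corsOrigin: required("CORS_ORIGIN"),
     // An empty string is treated the same as unset.
     cookieDomain: env["COOKIE_DOMAIN"] || undefined,
   };
 }
 ```
 In the API this would typically be called once as `loadConfig(process.env)`, so that a missing variable stops the container at boot instead of surfacing later as a failed request.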
 ---
 ## Database
 ### Migrations
 Drizzle migrations run automatically on API container startup. The Dockerfile entrypoint:
 ```dockerfile
 # Exec-form CMD does not invoke a shell, so "&&" must be run through sh -c.
 CMD ["sh", "-c", "node dist/src/migrate.js && node dist/src/server.js"]
 ```
 **Deploy order enforced automatically:** migrations before server starts.
 ### Backups
 - Daily cron job at 3:00 AM: `pg_dump` → compressed SQL → `~/backups/`
 - 7-day retention on VPS
 - Dev laptop auto-syncs new backups on login via `rsync`
 - **Offsite storage:** Planned (Hetzner Object Storage or S3-compatible)
 ### Seeding
 Idempotent (`onConflictDoNothing`). Safe to re-run for adding new languages without affecting existing data or user tables.
 ---
 ## Auth & OAuth
 **Better Auth** embedded in Express API. No separate auth service.
 **Social providers:**
 - Google OAuth — consent screen in testing mode (100 user cap). Must publish before reaching 80 users.
 - GitHub OAuth — configured for both dev and production redirect URIs
 **Cross-subdomain cookies:** `COOKIE_DOMAIN=.lilastudy.com` (leading dot) makes auth cookie valid across all subdomains.
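 To see why one cookie covers both `lilastudy.com` and `api.lilastudy.com`, here is a simplified sketch of the domain-matching idea browsers apply (the leading dot is ignored, and a host matches if it equals the domain or ends with `.` plus the domain). It is an illustration only: the function name is hypothetical, and real browsers add further checks (IP addresses, public suffixes) that are omitted here.
 ```typescript
 // Simplified cookie domain matching; not a full implementation of browser rules.
 function cookieDomainMatches(host: string, cookieDomain: string): boolean {
   const domain = cookieDomain.replace(/^\./, "").toLowerCase();
   const h = host.toLowerCase();
   if (h === domain) return true;
   // The host must end with "." + domain, so "evillilastudy.com" does not match.
   const suffix = "." + domain;
   return h.length > suffix.length && h.slice(-suffix.length) === suffix;
 }
 ```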
 ---
 ## Known Issues & Limitations
 | Issue                             | Impact                                                                             | Status                       |
 | --------------------------------- | ---------------------------------------------------------------------------------- | ---------------------------- |
 | lila-web has no healthcheck       | Vite dev server has no health endpoint; `depends_on` uses API healthcheck as proxy | Acceptable for dev           |
 | Valkey memory overcommit warning  | Harmless in dev. Fix before production: `vm.overcommit_memory = 1`                 | Documented                   |
 | No centralized monitoring/logging | No uptime alerts or log aggregation on VPS                                         | Planned (BACKLOG.md)         |
 | Backups only on VPS + dev laptop  | No offsite protection against VPS failure                                          | Planned (BACKLOG.md)         |
 | Google OAuth in testing mode      | 100 user cap                                                                       | Must publish before 80 users |
 ---
 ## Key Files
 | File                            | Purpose                                                |
 | ------------------------------- | ------------------------------------------------------ |
 | `docker-compose.yml` (root)     | Local dev stack                                        |
 | `docker-compose.yml` (VPS)      | Production stack                                       |
 | `apps/api/Dockerfile`           | Multi-stage: deps → dev → builder → runner             |
 | `apps/web/Dockerfile`           | Multi-stage: deps → dev → builder → production (nginx) |
 | `apps/web/nginx.conf`           | SPA fallback routing                                   |
 | `Caddyfile`                     | Reverse proxy routing, automatic HTTPS                 |
 | `.forgejo/workflows/deploy.yml` | CI/CD pipeline                                         |
 | `apps/api/src/migrate.ts`       | Drizzle migration runner                               |
--- a/documentation/ai-context/99-current-task-blueprint.md
+++ b/documentation/ai-context/99-current-task-blueprint.md
@ -0,0 +1,108 @@
 # 99 — Current Task
 > **Purpose:** Fill out this template before giving a task to an LLM. Concatenate with 00-project-overview.md and relevant domain files (01–06). After the task is complete, ask the LLM to review this checklist and suggest doc updates.
 > **Last updated:** 2026-05-15
 > **Depends on:** 00-project-overview.md, prompts/meta.md
 ---
 ## Task Description
 **What I'm building / fixing / refactoring:**
 [Describe the task in 1–2 sentences. Be specific.]
 Example: "Implement guest play flow so users can try a 3-round quiz without creating an account."
 ---
 ## Context
 **Which parts of the codebase does this touch?**
 - [ ] Frontend (`apps/web/`)
 - [ ] Backend API (`apps/api/`)
 - [ ] Database schema (`packages/db/`)
 - [ ] Shared schemas (`packages/shared/`)
 - [ ] WebSocket protocol (`apps/api/src/ws/`)
 - [ ] Data pipeline (`data-pipeline/`)
 - [ ] Infrastructure / deployment (`docker-compose.yml`, Caddyfile, etc.)
 - [ ] Documentation
 **Relevant files I already know about:**
 [List files you've identified. The LLM may ask for additional ones.]
 Example:
 - `apps/api/src/controllers/gameController.ts` — needs guest variant
 - `apps/api/src/middleware/authMiddleware.ts` — needs optional auth path
 - `packages/shared/src/schemas/game.ts` — needs GuestGameRequestSchema
 ---
 ## Constraints & Requirements
 **Must have:**
 - [ ]
 - [ ]
 **Nice to have:**
 - [ ]
 - [ ]
 **Must NOT break:**
 - [ ] Existing auth flow (logged-in users still work normally)
 - [ ] WebSocket protocol (if applicable)
 - [ ] Database schema (additive changes only unless migration planned)
 - [ ] Zod schemas in `packages/shared` (no silent drift)
 **Known blockers or open questions:**
 - [ ]
 ---
 ## Definition of Done
 - [ ] Code implemented and tested
 - [ ] No TypeScript errors (`pnpm typecheck` passes)
 - [ ] Tests pass (`pnpm test`)
 - [ ] Manual verification in dev environment
 - [ ] Commit message follows convention (see prompts/meta.md)
 - [ ] Feature branch merged to main
 ---
 ## Post-Work Checklist
 After the task is complete, ask the LLM:
 > "Review the post-work checklist in prompts/meta.md. Which documentation files need updates based on what we just changed?"
 The LLM should check:
 | File                               | Check if...                                                             |
 | ---------------------------------- | ----------------------------------------------------------------------- |
 | `documentation/STATUS.md`          | Task changes what's working or what's blocked                           |
 | `documentation/BACKLOG.md`         | Task completes a backlog item or creates a new one                      |
 | `documentation/DECISIONS.md`       | Task involved choosing between alternatives with long-term consequences |
 | `documentation/ARCHITECTURE.md`    | Task changes monorepo structure, data flow, or layer boundaries         |
 | `documentation/ai-context/*.md`    | Task changes schemas, endpoints, protocol, or pipeline stages           |
 | `packages/shared/src/schemas/*.ts` | Task changes request/response shapes or WS message types                |
 | `README.md`                        | Task changes quickstart steps, stack, or current status                 |
 **Expected output format:**
 ```
 - FILE: [filename] — REASON: [what changed and why the doc needs updating]
 ```
 ---
 ## Notes
 [Any additional context, links, or scratch notes for this specific task.]
--- a/documentation/ai-context/99-current-task.md
+++ b/documentation/ai-context/99-current-task.md
@ -0,0 +1,78 @@
 # 99 — Current Task
 ## Task Description
 Implement guest play flow so users can try a 3-round singleplayer quiz without creating an account. After completing the quiz, optionally prompt them to sign up to save progress.
 ## Context
 **Which parts of the codebase does this touch?**
 - [x] Frontend (`apps/web/`)
 - [x] Backend API (`apps/api/`)
 - [ ] Database schema (`packages/db/`) — no schema changes needed
 - [x] Shared schemas (`packages/shared/`) — may need GuestGameRequestSchema
 - [ ] WebSocket protocol (`apps/api/src/ws/`) — not touched (guest play is singleplayer only)
 - [ ] Data pipeline (`data-pipeline/`)
 - [ ] Infrastructure / deployment (`docker-compose.yml`, Caddyfile, etc.)
 - [x] Documentation
 **Relevant files I already know about:**
 - `apps/api/src/middleware/authMiddleware.ts` — needs optional auth path
 - `apps/api/src/controllers/gameController.ts` — needs guest variant of start/answer
 - `apps/api/src/services/gameService.ts` — may need guest session logic
 - `apps/api/src/routes/gameRouter.ts` — route definitions
 - `apps/web/src/routes/play.tsx` — singleplayer route
 - `apps/web/src/components/game/GameSetup.tsx` — start quiz UI
 - `packages/shared/src/schemas/game.ts` — request/response schemas
 - `apps/web/src/components/auth/AuthModal.tsx` — post-game auth prompt
 ## Constraints & Requirements
 **Must have:**
 - [ ] Guest users can start and complete a singleplayer quiz (3 or 10 rounds)
 - [ ] No login required to reach `/play` or call `POST /api/v1/game/start`
 - [ ] Server-side answer evaluation still works (correct answer never sent to frontend)
 - [ ] Guest sessions are ephemeral (no database storage of guest progress)
 - [ ] After quiz completion, show a friendly "Save your progress?" prompt with auth options
 **Nice to have:**
 - [ ] Guest sessions stored in-memory with a TTL (e.g., 24h) so refreshing the page doesn't lose the current quiz
 - [ ] Post-game prompt includes a "Continue as guest" option to play again without signing up
 **Must NOT break:**
 - [x] Existing auth flow (logged-in users still work normally)
 - [x] WebSocket protocol (if applicable)
 - [x] Database schema (additive changes only unless migration planned)
 - [x] Zod schemas in `packages/shared` (no silent drift)
 **Known blockers or open questions:**
 - [ ] Should guest sessions use the same `GameSessionStore` interface with a guest flag, or a separate store?
 - [ ] Should the post-game auth prompt be a modal or a redirect to a dedicated page?
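 To make the first open question concrete, here is one possible shape for the "same store, nullable owner" option combined with the TTL idea from the nice-to-have list. It is a sketch for discussion, not a decision: the class name, method signatures, and the rule that a guest session is readable by anyone holding its ID are all assumptions.
 ```typescript
 interface StoredSession {
   sessionId: string;
   // null marks a guest session; a string is the owning user's ID.
   userId: string | null;
   expiresAt: number;
 }
 class InMemoryGameSessionStore {
   private readonly sessions = new Map<string, StoredSession>();
   private readonly ttlMs: number;
   constructor(ttlMs: number) {
     this.ttlMs = ttlMs;
   }
   // `now` is passed in (milliseconds) so the expiry logic is easy to test.
   create(sessionId: string, userId: string | null, now: number): StoredSession {
     const session: StoredSession = { sessionId, userId, expiresAt: now + this.ttlMs };
     this.sessions.set(sessionId, session);
     return session;
   }
   get(sessionId: string, requesterId: string | null, now: number): StoredSession | null {
     const session = this.sessions.get(sessionId);
     if (session === undefined) return null;
     if (session.expiresAt <= now) {
       // Expired sessions are removed lazily on access.
       this.sessions.delete(sessionId);
       return null;
     }
     // Sessions owned by a user are visible only to that user.
     if (session.userId !== null && session.userId !== requesterId) return null;
     return session;
   }
 }
 ```
 The trade-off against a separate guest store is that every read path must handle `userId === null`, in exchange for keeping a single code path for starting games and evaluating answers.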
 ## Definition of Done
 - [ ] Code implemented and tested
 - [ ] No TypeScript errors (`pnpm typecheck` passes)
 - [ ] Tests pass (`pnpm test`)
 - [ ] Manual verification in dev environment (both logged-in and guest flows)
 - [ ] Commit message follows convention
 - [ ] Feature branch merged to main
 ## Post-Work Checklist
 After the task is complete, ask the LLM:
 > "Review the post-work checklist in prompts/meta.md. Which documentation files need updates based on what we just changed?"
 Expected doc updates:
 - `documentation/STATUS.md` — Guest play is now live
 - `documentation/ai-context/03-api-contract.md` — New guest endpoint or schema changes
 - `packages/shared/src/schemas/game.ts` — If GuestGameRequestSchema added
 - `README.md` — Quickstart may mention guest play
--- a/documentation/ai-context/WORKFLOW.md
+++ b/documentation/ai-context/WORKFLOW.md
@ -0,0 +1,260 @@
 # Workflow — Working with LLMs on Lila
 > **Purpose:** The process for using AI assistants effectively on this codebase. Covers task scoping, context selection, conversation management, verification, and doc updates. Complements `prompts/meta.md` (which covers prompt templates and methodology).
 > **Last updated:** 2026-05-15
 ---
 ## Before Starting a Task
 ### 1. Define the Task
 Write a clear, specific description in 1–2 sentences. Avoid vague goals.
 **Bad:** "Fix multiplayer"  
 **Good:** "Handle the case where a player disconnects mid-game and reconnects within 10 seconds — restore their game state without restarting the round."
 ### 2. Fill Out `99-current-task.md`
 Copy the template `documentation/ai-context/99-current-task-blueprint.md` to `99-current-task.md` and fill in:
 - What you're building/fixing
 - Which parts of the codebase it touches (check the boxes)
 - Known files already involved
 - Constraints (must haves, nice to haves, must not break)
 - Definition of done
 This forces you to scope the task before involving the LLM.
 ### 3. Select Context Files
 Don't feed all AI-context files for every task. Pick the minimum the LLM needs.
 | Task Type                                               | Feed These Files                                                                                                   |
 | ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
 | Frontend-only (UI component, route, styling)            | `00-project-overview.md` + `03-api-contract.md` + `99-current-task.md`                                             |
 | Backend-only (new endpoint, service logic, model query) | `00-project-overview.md` + `01-architecture.md` + `02-data-model.md` + `99-current-task.md`                        |
 | Full-stack feature (touches API + frontend + schema)    | `00-project-overview.md` + `01-architecture.md` + `02-data-model.md` + `03-api-contract.md` + `99-current-task.md` |
 | WebSocket / multiplayer                                 | `00-project-overview.md` + `04-websocket-protocol.md` + `99-current-task.md`                                       |
 | Data pipeline (Kaikki, enrichment, sync)                | `00-project-overview.md` + `05-data-pipeline.md` + `99-current-task.md`                                            |
 | Deployment / infrastructure                             | `00-project-overview.md` + `06-deployment.md` + `99-current-task.md`                                               |
 | Auth / security                                         | `00-project-overview.md` + `03-api-contract.md` + `99-current-task.md`                                             |
 | Cross-cutting refactor                                  | `00-project-overview.md` + `01-architecture.md` + `02-data-model.md` + `03-api-contract.md` + `99-current-task.md` |
 **Always include:** `00-project-overview.md` (ground truth) and `99-current-task.md` (task scope).  
 **Never include:** `prompts/meta.md` (that's for you, not the LLM).
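 If you ever want to script the selection (for example, to concatenate the files into one prompt), the table reduces to a small lookup. This is a hypothetical helper, not something that exists in the repo; the task-type keys and function name are made up for the sketch, while the file names come from the table above.
 ```typescript
 type TaskType =
   | "frontend"
   | "backend"
   | "full-stack"
   | "websocket"
   | "data-pipeline"
   | "deployment"
   | "auth"
   | "refactor";
 // Domain files per task type; the overview and task file are added for every task.
 const DOMAIN_FILES: Record<TaskType, readonly string[]> = {
   frontend: ["03-api-contract.md"],
   backend: ["01-architecture.md", "02-data-model.md"],
   "full-stack": ["01-architecture.md", "02-data-model.md", "03-api-contract.md"],
   websocket: ["04-websocket-protocol.md"],
   "data-pipeline": ["05-data-pipeline.md"],
   deployment: ["06-deployment.md"],
   auth: ["03-api-contract.md"],
   refactor: ["01-architecture.md", "02-data-model.md", "03-api-contract.md"],
 };
 // Returns the files in the order they should be fed to the LLM.
 function contextFilesFor(task: TaskType): string[] {
   return ["00-project-overview.md", ...DOMAIN_FILES[task], "99-current-task.md"];
 }
 ```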
 ### 4. Check for Decision Conflicts
 If the task touches any area from the Decision Index in `00-project-overview.md`, also feed the relevant section from `documentation/DECISIONS.md`.
 Examples:
 - Changing ORM or query patterns → feed `DECISIONS.md` → ORM section
 - Adding a new WebSocket library → feed `DECISIONS.md` → WebSocket section
 - Changing auth provider → feed `DECISIONS.md` → Auth section
 ---
 ## During the Task
 ### 5. Feed Context in Order
 ```
 1. 00-project-overview.md
 2. Relevant domain file(s) (01–06)
 3. 99-current-task.md (filled out)
 4. [Optional] Relevant DECISIONS.md section
 5. [Optional] Specific code files the LLM asks for
 ```
 **Why this order:** The LLM sees the big picture first, then the domain details, then the specific task. This reduces hallucination.
 ### 6. Start with the Base Prompt
 Use the template from `prompts/meta.md`:
 ```
 I'm working on Lila, a vocabulary learning app. Here's the project context:
 [PASTE: selected context files]
 My current task: [from 99-current-task.md]
 Please follow these rules:
 [1–8 from prompts/meta.md Base Prompt Template]
 ```
 ### 7. Work File-by-File, Section-by-Section
 The LLM will suggest files to modify. Go through them one at a time:
 1. **LLM explains the change** — concept first, code second
 2. **You review** — does this make sense? Does it violate any constraints from 99-current-task.md?
 3. **LLM shows the code** — section by section, not the whole file at once
 4. **You apply** — copy-paste into your editor, don't let the LLM write files directly
 5. **You verify** — TypeScript compiles, tests pass, manual check
 **Rule:** Never let the LLM modify more than one file before you review it.
 ### 8. Verify LLM Assumptions
 LLMs hallucinate file paths, schema shapes, and API endpoints. Periodically ask:
 > "Which files from the context did you actually look at?"  
 > "What schema from packages/shared are you using here?"  
 > "Show me the exact Zod schema for this request body."
 If the LLM's answer doesn't match the context files, correct it immediately. Wrong assumptions compound.
 ### 9. When to Start a New Conversation
 Start a fresh chat when:
 | Scenario                       | Why                                                                                          |
 | ------------------------------ | -------------------------------------------------------------------------------------------- |
 | Conversation exceeds ~25 turns | LLM coherence degrades; starts contradicting earlier context                                 |
 | Task pivots significantly      | "We were fixing a bug, now we're redesigning the feature" — fresh context prevents confusion |
 | LLM seems confused             | Repeating questions, forgetting constraints, suggesting things already ruled out             |
 | You took a break > 2 hours     | Context window state is opaque; safer to restart                                             |
 | Multiple failed attempts       | The LLM is stuck in a bad pattern; reset gives it a clean slate                              |
 **How to restart:** Paste the same context files + updated 99-current-task.md (mark what's already done). Summarize progress in 2–3 sentences.
 ### 10. Handle Multi-File Changes
 For tasks touching 3+ files, establish a sequence:
 ```
 1. Shared schemas (packages/shared) — foundation everything else depends on
 2. Database models (packages/db) — if schema or queries change
 3. Backend service + controller (apps/api) — business logic
 4. Backend tests (apps/api) — verify the service
 5. Frontend API client + types (apps/web) — consume the new contract
 6. Frontend components (apps/web) — UI changes
 7. Frontend tests (apps/web) — verify the UI
 8. Integration / e2e tests — full flow
 ```
 **Exception:** If the task is frontend-only, skip steps 2–4. If backend-only, skip 5–7.
 Tell the LLM: "We'll go in this order. Start with [file]."
 ---
 ## After the Task
 ### 11. Final Verification
 Before declaring done:
 - [ ] `pnpm typecheck` passes (no TypeScript errors)
 - [ ] `pnpm test` passes (all tests green)
 - [ ] `pnpm lint` passes (no ESLint errors)
 - [ ] Manual verification in dev environment
 - [ ] No console errors in browser
 - [ ] No server errors in API logs
 ### 12. Ask for Doc Updates
 Prompt the LLM:
 > "Review the post-work checklist in prompts/meta.md. Which documentation files need updates based on what we just changed?"
 Expected output format:
 ```
 - FILE: documentation/STATUS.md — REASON: Guest play flow is now live
 - FILE: documentation/ai-context/03-api-contract.md — REASON: New endpoint added
 - FILE: packages/shared/src/schemas/game.ts — REASON: New schema added
 ```
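 Because the format is fixed, the LLM's answer can be turned into a checklist mechanically. The parser below is a hypothetical convenience, assuming each suggestion sits on its own line and uses the exact `FILE` / `REASON` labels shown above; lines that do not match are ignored.
 ```typescript
 interface DocUpdate {
   file: string;
   reason: string;
 }
 // Extracts "- FILE: <path> <dash> REASON: <text>" lines from an LLM reply.
 function parseDocUpdates(output: string): DocUpdate[] {
   // \u2014 is the long dash used as the separator in the expected format.
   const pattern = /^-\s*FILE:\s*(.+?)\s*\u2014\s*REASON:\s*(.+)$/;
   const updates: DocUpdate[] = [];
   for (const line of output.split("\n")) {
     const match = pattern.exec(line.trim());
     if (match !== null) {
       updates.push({ file: match[1] as string, reason: (match[2] as string).trim() });
     }
   }
   return updates;
 }
 ```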
 ### 13. Update Docs Yourself
 The LLM suggests; you apply. Docs are your responsibility, not the LLM's.
 Priority order:
 1. `STATUS.md` — if "what works today" changed
 2. `BACKLOG.md` — if a task was completed or discovered
 3. `packages/shared/src/schemas/*.ts` — if request/response shapes changed
 4. `ai-context/*.md` — if architecture, API, or protocol changed
 5. `DECISIONS.md` — if you made a new architectural choice
 6. `README.md` — if quickstart or stack changed
 ### 14. Generate Ticket File (If Significant)
 For completed tasks, create a ticket in `documentation/tickets/`:
 | Prefix   | Use when...                                          | Example                               |
 | -------- | ---------------------------------------------------- | ------------------------------------- |
 | `adr-`   | Decision between options with long-term consequences | `adr-websocket-reconnect-strategy.md` |
 | `feat-`  | New feature shipped                                  | `feat-guest-play.md`                  |
 | `fix-`   | Bug fixed                                            | `fix-race-condition-lobby-join.md`    |
 | `chore-` | Routine maintenance, refactoring, tooling            | `chore-batch-distractor-queries.md`   |
 **Ticket contents:**
 - What was done (summary)
 - Why it was needed (context)
 - What files changed (list)
 - Any follow-up work (notes)
 - Setup guide if applicable (how to verify it works)
 ---
 ## Common Anti-Patterns
 | Anti-Pattern                                | Why It Fails                                    | Fix                                              |
 | ------------------------------------------- | ----------------------------------------------- | ------------------------------------------------ |
 | Feeding all ai-context files for every task | Bloated context, LLM loses focus, wastes tokens | Use the file selection table (step 3)            |
 | Letting the LLM write files directly        | You don't understand the code, can't debug it   | Copy-paste into your editor, review line by line |
 | Skipping verification                       | "It compiles" ≠ "it works"                      | Run tests, manual check, no console errors       |
 | Not updating docs                           | Future You is confused, LLMs get stale context  | Post-work checklist is non-negotiable            |
 | One long conversation for everything        | LLM forgets constraints, contradicts itself     | Restart at ~25 turns or on pivot                 |
 | Accepting code you don't understand         | You can't maintain it, can't explain it         | Ask "explain this line" until you do             |
 ---
 ## Quick Reference
 ### File Selection Cheat Sheet
 ```
 Frontend only     → 00 + 03 + 99
 Backend only      → 00 + 01 + 02 + 99
 Full-stack        → 00 + 01 + 02 + 03 + 99
 Multiplayer/WS    → 00 + 04 + 99
 Data pipeline     → 00 + 05 + 99
 Deployment        → 00 + 06 + 99
 Auth              → 00 + 03 + 99
 Big refactor      → 00 + 01 + 02 + 03 + 99
 ```
 ### Conversation Restart Template
 ```
 I'm continuing work on Lila. Here's the current context:
 [PASTE: 00-project-overview.md]
 [PASTE: relevant domain file(s)]
 Previously, we [brief summary of what was done].
 Current task: [updated 99-current-task.md, marking completed items]
 Let's continue from [specific file/section].
 ```
 ### Verification Checklist
 ```
 □ pnpm typecheck
 □ pnpm test
 □ pnpm lint
 □ Manual dev verification
 □ No browser console errors
 □ No server errors
 □ Doc updates applied
 □ Ticket file created (if significant)
 ```
--- a/documentation/ai-context/prompts/meta.md
+++ b/documentation/ai-context/prompts/meta.md
@ -0,0 +1,172 @@
 # Prompts — Meta
 > **Purpose:** Reusable prompt templates and working methodology for LLM-assisted development on Lila. Use these as preambles when starting a new task with any LLM.
 > **Last updated:** 2026-05-15
 ---
 ## Working Methodology
 This project is a learning exercise. The goal is to understand the code, not just to ship it.
 ### How to use an LLM for help
 1. **Paste the relevant AI-context files as context** (00-project-overview.md + domain files + 99-current-task.md)
 2. **Describe what you're working on and what you're stuck on**
 3. **Ask for hints and explanations, not raw solutions** — understand the concept, then implement it yourself
 4. **After completing a task, ask the LLM what docs need updating**
 ### Refactoring workflow
 After completing a task: share the code, ask what to refactor and why. The LLM should explain the concept, not write the implementation.
 ---
 ## Base Prompt Template
 Use this as the opening when starting any task with an LLM:
 ```
 I'm working on Lila, a vocabulary learning app. Here's the project context:
 [PASTE: 00-project-overview.md]
 [PASTE: relevant domain file(s) from ai-context/]
 My current task: [describe what you're building or fixing]
 Please follow these rules:
 1. Tell me which files you need to see to get the full context of the problem.
   Do not assume you know the codebase — ask for files.
 2. Walk me through the problem and the solution in plain text first.
   Explain the concept before showing code.
 3. If we need to update multiple files, let's go through them one by one,
   no matter how many files there are.
 4. If we go through a file, we'll do it slowly section by section,
   no matter how many sections.
 5. Suggest a feature branch name. Tell me when it's time to git commit
   and provide a commit message.
 6. If we have multiple options, provide options that reflect current
   industry standards and best practices. Explain the trade-offs.
 7. Never assume anything. Always ask for clarification if uncertain.
 8. For every completed task, tell me which documentation files need updates.
   Use this format:
   - FILE: [filename] — REASON: [what changed and why the doc needs updating]
 Let's start.
 ```
 ---
 ## Task-Specific Prompt Templates
 ### Generate a Feature
 ```
 [Base prompt template above]
 Additional context:
 - This is a [feature/bugfix/refactor] task
 - It touches these areas: [frontend/backend/database/websocket/pipeline]
 - The user-facing behavior should be: [describe]
 - Technical constraints: [e.g., must work with existing Zod schemas, must not break WebSocket protocol]
 ```
 ### Review Code for Bugs
 ```
 [Base prompt template above]
 Additional context:
 - I'm seeing this symptom: [error message, unexpected behavior]
 - It happens when: [reproduction steps]
 - I've checked these files already: [list]
 - Focus on: [race conditions, null handling, async flow, type safety, etc.]
 ```
 ### Generate Tests
 ```
 [Base prompt template above]
 Additional context:
 - Test type: [unit/integration/e2e]
 - What to test: [function/component/endpoint]
 - Current test coverage: [none/existing but incomplete]
 - Mocking strategy: [mock DB/mock WS/mock auth]
 ```
 ### Debug an Issue
 ```
 [Base prompt template above]
 Additional context:
 - Error message: [paste full error]
 - Stack trace: [paste if available]
 - Recent changes: [what was modified before it broke]
 - Environment: [dev/production/local/CI]
 ```
 ---
 ## Post-Work Doc Update Checklist
 After completing any task, the LLM should check these files for needed updates:
 | File                               | Check if...                                                             |
 | ---------------------------------- | ----------------------------------------------------------------------- |
 | `documentation/STATUS.md`          | Task changes what's working or what's blocked                           |
 | `documentation/BACKLOG.md`         | Task completes a backlog item or creates a new one                      |
 | `documentation/DECISIONS.md`       | Task involved choosing between alternatives with long-term consequences |
 | `documentation/ARCHITECTURE.md`    | Task changes monorepo structure, data flow, or layer boundaries         |
 | `documentation/ai-context/*.md`    | Task changes schemas, endpoints, protocol, or pipeline stages           |
 | `packages/shared/src/schemas/*.ts` | Task changes request/response shapes or WS message types                |
 | `README.md`                        | Task changes quickstart steps, stack, or current status                 |
 **Format for doc updates:**
 ```
 - FILE: documentation/STATUS.md — REASON: Guest play flow is now live, update "What Works Today"
 - FILE: documentation/ai-context/03-api-contract.md — REASON: New endpoint POST /api/v1/game/guest-start added
 - FILE: packages/shared/src/schemas/game.ts — REASON: Added GuestGameRequestSchema
 ```
 ---
 ## Ticket File Convention
 For completed tasks, produce a ticket file in `documentation/tickets/`:
 | Prefix   | Use when...                                          | Example                             |
 | -------- | ---------------------------------------------------- | ----------------------------------- |
 | `adr-`   | Decision between options with long-term consequences | `adr-websocket-library.md`          |
 | `feat-`  | New feature shipped                                  | `feat-guest-play.md`                |
 | `fix-`   | Bug fixed                                            | `fix-race-condition-lobby-join.md`  |
 | `chore-` | Routine maintenance, refactoring, tooling            | `chore-batch-distractor-queries.md` |
 **Ticket contents:**
 - What was done (summary)
 - Why it was needed (context)
 - What files changed (list)
 - Any follow-up work (notes)
 - Setup guide if applicable (how to verify it works)
 ---
 ## Tips for Effective LLM Collaboration
 1. **Start small.** Give the LLM one file or one function at a time, not the whole codebase.
 2. **Verify assumptions.** If the LLM assumes something about your stack, correct it immediately — wrong assumptions compound.
 3. **Ask for alternatives.** "What's the simplest way to do this?" vs. "What's the most robust way?" often yield different answers.
 4. **Don't accept code you don't understand.** Ask the LLM to explain a line until you do.
 5. **Test everything.** The LLM can suggest tests, but you run them. Trust nothing until it passes.
 6. **Keep context fresh.** If a conversation gets long, start a new one with the base prompt + current task template.
--- a/documentation/archive/notes.md
+++ b/documentation/archive/notes.md
@ -115,45 +115,4 @@ Manage your app audience in the Audience page of the Google Auth Platform.
 - openapi
 - bruno for api testing
 - tailscale
 - husky/lint-staged
 - musicforprogramming.net
 ## openwordnet
 download libraries via
 ```bash
 python -c 'import wn; wn.download("omw-fr")'
 ```
 libraries:
 odenet:1.4
 omw-es:1.4
 omw-fr:1.4
 omw-it:1.4
 omw-en:1.4
 upgrade wn package:
 ```bash
 pip install --upgrade wn
 ```
 check if a lexicon is available, e.g. Italian:
 ```bash
 python -c "import wn; print(len(wn.words(lang='it', lexicon='omw-it:1.4')))"
 ```
 remove a library:
 ```bash
 python -c "import wn; wn.remove('oewn:2024')"
 ```
 list all libraries:
 ```bash
 python -c "import wn; print(wn.lexicons())"
 ```
--- a/documentation/archive/roadmap.md
+++ b/documentation/archive/roadmap.md
--- a/documentation/archive/spec.md
+++ b/documentation/archive/spec.md
--- a/documentation/audits/generating-decks.md
+++ b/documentation/audits/generating-decks.md
@ -1,371 +0,0 @@
 # Code Review: `build-top-english-nouns-deck` seed script
 Hey, good work getting this to a finished, working state — that's genuinely the hardest part. Below is feedback structured the way a mentor would give it: what the problem is, why it matters in a real codebase, and how to fix it. Work through these one by one when you refactor.
 ---
 ## 1. Function names should be imperative, not gerunds
 ### What you wrote
 ```ts
 const readingFromWordlist = async () => { ... }
 const checkingSourceWordsAgainstDB = async () => { ... }
 ```
 ### Why it's a problem
 Functions represent _actions_. In English, imperative verbs describe actions: `read`, `fetch`, `build`. Gerunds (`reading`, `checking`) describe ongoing processes — they read like you're narrating what's happening rather than declaring what a function does. This isn't just style preference: when you're scanning a call stack or reading `main()`, imperative names parse faster because they match the mental model of "I am calling this to do a thing."
 ### How to fix it
 ```ts
 const readWordlist = async () => { ... }
 const resolveSourceTerms = async () => { ... }  // "checking" undersells what it returns
 const writeMissingWords = async () => { ... }
 ```
 Note the rename of `checkingSourceWordsAgainstDB` → `resolveSourceTerms`. The original name describes the _mechanism_ (checking against DB). A better name describes the _result_ (resolving words into term IDs). Callers don't need to know it hits the DB.
 ### Further reading
 - [Clean Code, Chapter 2 – Meaningful Names](https://www.oreilly.com/library/view/clean-code-a/9780136083238/) — specifically the section on "Use Intention-Revealing Names"
 - [Google TypeScript Style Guide – Naming](https://google.github.io/styleguide/tsguide.html#naming-style)
 ---
 ## 2. N+1 query pattern in `validateLanguages` and `logLanguageCoverage`
 ### What you wrote
 ```ts
 for (const language of languages) {
  const rows = await db
    .selectDistinct({ termId: translations.term_id })
    .from(translations)
    .where(
      and(
        inArray(translations.term_id, termIds),
        eq(translations.language_code, language),
      ),
    );
 }
 ```
 ### Why it's a problem
 This fires one database query _per language_. If you have 15 supported languages, that's 15 round trips. Each round trip has network latency, connection overhead, and query planning cost. The database already knows how to aggregate across all languages in a single pass — you're just not asking it to.
 This pattern is called **N+1** (one query to get the list, then one more query for each of the N items in the list) and it's one of the most common performance mistakes in applications that use databases. At 15 languages it's fine. At 50 languages with 100k terms, your script will be the reason someone gets paged at 2am.
 ### How to fix it
 Ask the database to do the grouping for you in a single query:
 ```ts
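 import { and, countDistinct, inArray, ne } from "drizzle-orm";
 const coverage = await db
  .select({
    language: translations.language_code,
    // countDistinct: a term with several translations in one language counts once
    coveredCount: countDistinct(translations.term_id),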
  })
  .from(translations)
  .where(
    and(
      inArray(translations.term_id, termIds),
      ne(translations.language_code, sourceLanguage),
    ),
  )
  .groupBy(translations.language_code);
 const validatedLanguages = coverage
  .filter((row) => row.coveredCount === termIds.length)
  .map((row) => row.language);
 ```
 One query. The database returns a row per language with the count of covered terms. You filter in JS. Done.
 ### Further reading
 - [Drizzle ORM – `groupBy` and aggregations](https://orm.drizzle.team/docs/select#aggregations)
 - ["What is the N+1 query problem" — StackOverflow](https://stackoverflow.com/questions/97197/what-is-the-n1-select-query-problem-and-how-can-it-be-avoided)
 ---
 ## 3. Two functions doing the same database work
 ### What you wrote
 `validateLanguages` and `logLanguageCoverage` both loop over languages and fire the same query per language. You wrote the same logic twice.
 ### Why it's a problem
 This is a violation of **DRY** (Don't Repeat Yourself). The immediate cost is that any bug in the query exists in two places — fixing one doesn't fix the other. The deeper cost is that it doubles your database load for no reason: you fetch the coverage data, use it to compute `validatedLanguages`, throw it away, then fetch it again just to log it.
 ### How to fix it
 Once you apply the fix from point 2, you have a single `coverage` array. Use it for both purposes:
 ```ts
 const coverage = await db...  // single query from point 2
 // Use for validation
 const validatedLanguages = coverage
  .filter((row) => row.coveredCount === termIds.length)
  .map((row) => row.language);
 // Use for logging
 for (const row of coverage) {
  console.log(`  ${row.language}: ${row.coveredCount} / ${termIds.length} terms covered`);
 }
 ```
 No second trip to the database.
 ### Further reading
 - [The DRY Principle](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)
 ---
 ## 4. Unnecessary array copying inside a loop
 ### What you wrote
 ```ts
 const wordToTermIds = new Map<string, string[]>();
 for (const row of rows) {
  const word = row.text.toLowerCase();
  const existing = wordToTermIds.get(word) ?? [];
  wordToTermIds.set(word, [...existing, row.termId]); // spreads the whole array every iteration
 }
 ```
 ### Why it's a problem
 `[...existing, row.termId]` creates a _brand new array_ every time and copies all the previous elements into it. If "bank" has 3 homonyms, you allocate arrays of size 0, 1, 2, and 3, throwing the first three away. For a word with n term IDs that adds up to `O(n²)` copying, where a single mutable array needs only `O(n)`. For 1000 words it's invisible. In a tighter loop or with more data, it adds up.
 This pattern comes from functional programming habits (immutability is good there). But in a one-off script building a local data structure, there's no reason to avoid mutation.
 ### How to fix it
 ```ts
 const wordToTermIds = new Map<string, string[]>();
 for (const row of rows) {
  const word = row.text.toLowerCase();
  if (!wordToTermIds.has(word)) {
    wordToTermIds.set(word, []);
  }
  wordToTermIds.get(word)!.push(row.termId);
 }
 ```
 Get the array once, push into it. No copies.
 ### Further reading
 - [MDN – Array.prototype.push()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/push)
 - [Big O Notation primer](https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/) — worth understanding O(n²) vs O(n)
 ---
 ## 5. No database transaction — your "idempotent" script can corrupt state
 ### What you wrote
 ```ts
 deckId = await createDeck(validatedLanguages); // step 1
 const addedCount = await addTermsToDeck(deckId, termIds); // step 2
 await updateValidatedLanguages(deckId, validatedLanguages); // step 3
 ```
 ### Why it's a problem
 These three operations are separate database round trips with nothing tying them together. If step 2 throws (network blip, constraint violation, anything), you end up with a deck row that has no terms. Run the script again and it finds the existing deck, skips creation, then tries to add terms — but now your `validated_languages` from the previous partial run might be stale. The script _appears_ to recover, but you can't be sure of what state you're in.
 A **transaction** is a guarantee: either all steps succeed together, or none of them do. If anything fails mid-way, the database rolls back to the state before the transaction started. This is fundamental to writing scripts that touch multiple tables.
 ### How to fix it
 ```ts
 await db.transaction(async (tx) => {
  const existingDeck = await findExistingDeck(tx);
  let deckId: string;
  if (!existingDeck) {
    deckId = await createDeck(tx, validatedLanguages);
  } else {
    deckId = existingDeck.id;
  }
  await addTermsToDeck(tx, deckId, termIds);
  await updateValidatedLanguages(tx, deckId, validatedLanguages);
 });
 ```
 You'll need to thread the `tx` (transaction context) through your functions instead of using the global `db` — that's the key change.
 ### Further reading
 - [Drizzle ORM – Transactions](https://orm.drizzle.team/docs/transactions)
 - [PostgreSQL – What is a Transaction?](https://www.postgresql.org/docs/current/tutorial-transactions.html)
 - [ACID properties explained](https://www.databricks.com/glossary/acid-transactions) — Atomicity is what protects you here
 ---
 ## 6. The `isNewDeck` flag is unnecessary
 ### What you wrote
 ```ts
 let isNewDeck: boolean;
 if (!existingDeck) {
  deckId = await createDeck(validatedLanguages);
  isNewDeck = true;
 } else {
  deckId = existingDeck.id;
  isNewDeck = false;
 }
 // ...later...
 if (!isNewDeck) {
  await updateValidatedLanguages(deckId, validatedLanguages);
 }
 ```
 ### Why it's a problem
 You introduced `isNewDeck` to avoid calling `updateValidatedLanguages` when the deck was just created — reasoning that you already passed `validatedLanguages` to `createDeck`. But that means you're calling `updateValidatedLanguages` in _one path_ and `createDeck(..., validatedLanguages)` in the _other_ path. The intent (always keep validated languages current) is the same in both cases, but the code splits it into two branches you have to mentally reconcile.
 The cleaner model: always call `updateValidatedLanguages` after finding or creating the deck. Then `createDeck` doesn't need `validatedLanguages` at all, and `isNewDeck` disappears.
 ### How to fix it
 ```ts
 const deckId = existingDeck ? existingDeck.id : await createDeck(); // no validatedLanguages needed here
 await addTermsToDeck(deckId, termIds);
 await updateValidatedLanguages(deckId, validatedLanguages); // always runs
 ```
 Fewer variables, one clear flow.
 ---
 ## 7. Comments explain _what_, not _why_
 ### What you wrote
 ```ts
 // new Set() automatically discards duplicate values,
 // and spreading it back with ... converts it to a plain array again.
 // So if "bank" appears twice in the file,
 // the resulting array will only contain it once.
 const words = [
  ...new Set(
    raw
      .split("\n")
      .map((w) => w.trim().toLowerCase())
      .filter(Boolean),
  ),
 ];
 ```
 ### Why it's a problem
 Comments that re-explain what the code literally does are called **noise comments**. They add length without adding understanding — any developer who can read this script already knows what `Set` does. Worse, they can get out of date if the code changes but the comment doesn't.
 Good comments explain _why_ a decision was made, not _what_ the code does. The code already says what it does.
 Meanwhile, your most complex line — `const termIds = [...new Set(Array.from(wordToTermIds.values()).flat())]` — has no comment at all. That's the one that earns a note.
 ### How to fix it
 ```ts
 // Deduplicate: multiple words can map to the same term ID (e.g. via synonyms)
 const termIds = [...new Set(Array.from(wordToTermIds.values()).flat())];
 ```
 And remove the Set explanation from `readWordlist`. The code is clear.
 ### Further reading
 - [Clean Code, Chapter 4 – Comments](https://www.oreilly.com/library/view/clean-code-a/9780136083238/) — specifically "Explain Yourself in Code" and "Noise Comments"
 ---
 ## 8. The finished roadmap comment should be deleted
 ### What you wrote
 ```ts
 /*
 * roadmap
 * [x] Setup
 * [x] Read wordlist
 * ...all checked off
 */
 ```
 ### Why it's a problem
 This was useful _while you were planning_. Now that every item is checked, it communicates nothing except "this is done" — which the existence of a working script already communicates. Leaving it in adds noise to the file header and signals that you're not sure what belongs in source control vs. a task tracker.
 ### How to fix it
 Delete it. Use GitHub Issues, a Notion doc, or even a scratchpad file for planning notes. Source code is the output of planning, not the place to store it.
 ---
 ## 9. No log levels — everything goes to `console.log`
 ### What you wrote
 ```ts
 console.log("📖 Reading word list...");
 console.log(`   ${sourceWords.length} words loaded\n`);
 // ...and so on for every step
 ```
 ### Why it's a problem
 In a real environment (CI/CD pipelines, server logs, anything beyond your local terminal) progress output and errors usually end up interleaved in one log, all at the same priority. Actual errors (`console.error`) get buried in progress logs. There's no way to run the script quietly when you just need the summary, or verbosely when you're debugging.
 For a one-off seed script this is low priority, but it's a habit worth building early.
 ### How to fix it
 At minimum, use `console.error` for actual errors (not just in the catch block — also for things like "deck creation returned no ID"). For the detailed per-language breakdown, consider putting it behind a `--verbose` CLI flag so you can run the script cleanly in CI without dumping hundreds of lines of coverage data.
 ```ts
 // Basic approach
 if (process.argv.includes("--verbose")) {
  await logLanguageCoverage(termIds);
 }
 ```
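 If the `--verbose` check starts to spread across the script, a tiny leveled logger keeps it in one place. This is only a sketch: `createLogger`, the `Level` type and the injected `write` callback are hypothetical names, not part of the seed script.
 ```typescript
 type Level = "error" | "info" | "debug";
 // Builds a logger from CLI arguments. `write` is injected so the sketch can
 // be tested; a real script would route "error" to console.error and the
 // other levels to console.log.
 function createLogger(
   args: string[],
   write: (level: Level, message: string) => void,
 ) {
   const verbose = args.includes("--verbose");
   return {
     error: (message: string) => write("error", message),
     info: (message: string) => write("info", message),
     // Debug output is dropped unless --verbose was passed.
     debug: (message: string) => {
       if (verbose) write("debug", message);
     },
   };
 }
 ```
 In the script you would build it once from `process.argv.slice(2)` and send the per-language coverage breakdown through `debug`.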
 ### Further reading
 - [Node.js `process.argv`](https://nodejs.org/en/learn/command-line/nodejs-accept-arguments-from-the-command-line)
 - For a proper solution later: [pino](https://github.com/pinojs/pino) — a lightweight structured logger widely used in Node.js
 ---
 ## Summary
 | #   | Issue                          | Priority                                |
 | --- | ------------------------------ | --------------------------------------- |
 | 1   | Gerund function names          | Low — style, but builds good habits     |
 | 2   | N+1 queries                    | High — real performance impact          |
 | 3   | Duplicate query logic          | High — bugs in two places               |
 | 4   | Array spread in loop           | Medium — inefficient pattern to unlearn |
 | 5   | No transaction                 | High — can corrupt database state       |
 | 6   | `isNewDeck` flag               | Low — unnecessary complexity            |
 | 7   | Comments explain what, not why | Low — style, but important long-term    |
 | 8   | Roadmap comment left in        | Low — cleanup                           |
 | 9   | No log levels                  | Low — good habit to build               |
 Start with **2, 3, and 5** — those are the ones that would cause real problems in production. The rest are about writing code that's easier to read and maintain over time.
 Good luck with the refactor. Come back with the updated script when you're done.
--- a/documentation/data-pipeline.md
+++ b/documentation/data-pipeline.md
@ -1,468 +0,0 @@
 # lila data pipeline
 > **NOTE: BEFORE RUNNING THE PIPELINE, CONSIDER IMPROVING THE CEFR SOURCE
 > FILES IN `stage-2-annotate/sources/cefr/`. BETTER SOURCE COVERAGE MEANS
 > FEWER WORDS FOR THE LLM TO ANNOTATE FROM SCRATCH, FASTER OVERNIGHT RUNS,
 > AND HIGHER CONFIDENCE IN THE FINAL OUTPUT. SEE UNIVERSALCEFR
 > (huggingface.co/UniversalCEFR) AND CEFR-J
 > (github.com/openlanguageprofiles/olp-en-cefrj) AS STARTING POINTS.**
 This pipeline extracts vocabulary data from the Open Multilingual Wordnet (OMW), annotates it with CEFR levels from curated source files, verifies and enriches annotations using local LLMs, and produces authoritative JSON files per language. These files are consumed by the seeder in `packages/db` to populate the database with terms, translations, glosses, CEFR levels, difficulty ratings, and LLM-generated descriptions.
 ## Overview
 ```mermaid
 flowchart LR
    omw[(OMW SQLite DBs)]
    cefr[(CEFR JSON files)]
    extract[Extract]
    annotate[Annotate]
    enrich[Enrich]
    merge[Merge]
    final[(final/lang.json)]
    flagged[(flagged/lang.json)]
    seeder[packages/db seeder]
    db[(Database)]
    omw --> extract
    cefr --> annotate
    extract --> annotate
    annotate --> enrich
    enrich --> merge
    merge --> final
    merge --> flagged
    final --> seeder
    seeder --> db
 ```
 Each stage is a standalone script that reads from the previous stage's output. Extract writes a single combined file; from Annotate onwards, stages produce one JSON file per language. Stages can be re-run independently without affecting earlier or later stages.
 The enrich stage is the exception — it produces one checkpoint file per model run per language, plus a compiled votes file once all runs are complete. It is designed to run overnight, one model at a time, and is fully resumable if interrupted.
 Only fully annotated output in `stage-4-merge/output/final/` reaches the database. Words where LLMs could not reach a majority vote land in `stage-4-merge/output/flagged/` and wait for manual review before seeding.
 ## Data sources
 ### OMW / WordNet
 The Open Multilingual Wordnet (OMW) is the base vocabulary source. It provides synsets (groups of synonymous words) with translations and glosses across multiple languages. Each language's lexicon is downloaded with the `wn` Python library, which stores them all in one local SQLite database (`~/.wn_data/wn.db`). This file is not committed to git.
 All four parts of speech are extracted: noun, verb, adjective, adverb. WordNet's adjective satellites are collapsed into adjective — this is a WordNet-internal distinction that has no relevance for language learning. Alongside translations and glosses, usage examples are extracted where available and stored in the database as term_examples.
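 The satellite collapse amounts to a small lookup. The one-letter codes below are WordNet's part-of-speech tags; the function name `normalizePos` and the output labels are assumptions for illustration and are not taken from the extract script.
 ```typescript
 type Pos = "noun" | "verb" | "adjective" | "adverb";
 // WordNet part-of-speech tags. "s" marks an adjective satellite.
 const POS_BY_TAG = new Map<string, Pos>([
   ["n", "noun"],
   ["v", "verb"],
   ["a", "adjective"],
   ["s", "adjective"], // satellite collapsed into adjective
   ["r", "adverb"],
 ]);
 // Returns null for tags outside the four extracted parts of speech.
 function normalizePos(tag: string): Pos | null {
   return POS_BY_TAG.get(tag) ?? null;
 }
 ```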
 See **Setup** for download instructions.
 ### CEFR source files
 Per-language JSON files in `sources/cefr/` provide the initial CEFR level annotations. These files do not cover the full vocabulary extracted from OMW — coverage varies by language. Gaps and disagreements are handled by the enrich stage.
 | Language | File                   |
 | -------- | ---------------------- |
 | English  | `sources/cefr/en.json` |
 | Italian  | `sources/cefr/it.json` |
 | Spanish  | `sources/cefr/es.json` |
 | German   | `sources/cefr/de.json` |
 | French   | `sources/cefr/fr.json` |
 These files are committed to git. For per-language coverage detail see `COVERAGE.md`.
 ### CEFR annotation and verification
 CEFR levels are determined by a majority vote combining all available sources:
 - The CEFR source file counts as one vote (if it has an entry for the word)
 - Each LLM model run counts as one vote
 The LLMs verify existing annotations as well as filling gaps — a source file entry does not automatically win. Majority vote across all sources determines the final level.
 If no majority is reached, the word is flagged for manual review and excluded from the database until resolved.
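 As a rough sketch, the vote resolution could look like the function below. It assumes "majority" means more than half of the votes cast, which this document does not spell out; `resolveCefr` and the `Votes` type are hypothetical names.
 ```typescript
 type CefrLevel = "A1" | "A2" | "B1" | "B2" | "C1" | "C2";
 // Voter id (e.g. "cefr_source", "model_1") mapped to the level it voted for.
 type Votes = Record<string, CefrLevel>;
 // Returns the level backed by more than half of the votes, or null when
 // there is no such level (the word would then be flagged for review).
 function resolveCefr(votes: Votes): CefrLevel | null {
   const levels = Object.values(votes);
   const tally = new Map<CefrLevel, number>();
   for (const level of levels) {
     tally.set(level, (tally.get(level) ?? 0) + 1);
   }
   for (const [level, count] of tally) {
     if (count * 2 > levels.length) return level;
   }
   return null;
 }
 ```
 With votes `B2`, `B2`, `B1` the result is `B2`; a 1:1 split or an empty vote set yields `null`.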
 ## Setup
 ### OMW databases
 Download the OMW SQLite database for each language using the `wn` Python library:
 ```bash
 python -m wn download omw-en:1.4
 python -m wn download omw-it:1.4
 python -m wn download omw-de:1.4
 python -m wn download omw-es:1.4
 python -m wn download omw-fr:1.4
 ```
 The data is stored automatically at `~/.wn_data/wn.db` and is not committed to git.
 ### LLM setup
 See `LLM-SETUP.md`.
 ## Pipeline stages
 The pipeline runs in five stages. Each stage is independent and can be re-run without affecting the others.
 | Stage       | What it does                                                         |
 | ----------- | -------------------------------------------------------------------- |
 | 1. Extract  | Reads OMW SQLite database, outputs normalized JSON per language      |
 | 2. Annotate | Merges CEFR source files into extracted data, adds source file votes |
 | 3. Enrich   | Runs local LLMs in two rounds — generation then voting               |
 | 4. Merge    | Resolves votes, derives difficulty, splits into final and flagged    |
 | 5. Compare  | Generates COVERAGE.md with detailed quality report                   |
 ### 1. Extract
 Reads the OMW SQLite database (`~/.wn_data/wn.db`) and produces a single normalized JSON file containing all synsets with their translations, glosses, and usage examples across all five languages and all parts of speech. Adjective satellites are collapsed into adjective at this stage.
 **Input:** `~/.wn_data/wn.db`
 **Output:** `stage-1-extract/output/omw.json`
 ```bash
 python stage-1-extract/scripts/extract.py
 ```
 Add `--sample` to extract 100 synsets for inspection before running the full extraction.
 Each record in the output looks like this:
 ```json
 {
  "source_id": "ili:i1",
  "pos": "adjective",
  "translations": {
    "en": ["able"],
    "it": ["abile", "intelligente", "valente", "capace"],
    "es": ["capaz"],
    "fr": ["comptable"]
  },
  "glosses": {
    "en": [
      "(usually followed by 'to') having the necessary means or skill or know-how or authority to do something"
    ]
  },
  "examples": { "en": ["able to swim", "she was able to program her computer"] }
 }
 ```
 Note: glosses and examples are not available for all languages. French and Spanish have no glosses or examples in the current OMW database — these will be generated by the LLM in the enrich stage. Coverage detail is in `COVERAGE.md`.
 ### 2. Annotate
 Reads the combined OMW extract and merges CEFR source data into it. Each translation in each language is matched against the corresponding CEFR source file by word text and part of speech. Matched translations receive a `cefr_source` vote, which carries into the enrich stage. Unmatched translations proceed without a vote.
 This stage also extracts native example sentences from the CEFR source files and adds them to the record alongside OMW examples, with `source: "cefr"` to distinguish them.
 Words appearing in the CEFR source file multiple times with different CEFR levels are written to `conflicts.json` for manual review and excluded from voting until resolved.
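 A minimal sketch of the conflict check follows. It assumes a CEFR source entry has the shape `{ word, pos, cefr }` and that duplicates are grouped by lowercased word plus part of speech; both the shape and the name `splitConflicts` are hypothetical.
 ```typescript
 interface CefrEntry {
   word: string;
   pos: string;
   cefr: string;
 }
 // Groups entries by word + part of speech and splits them into usable
 // entries (exactly one level) and conflicts (several different levels).
 function splitConflicts(entries: CefrEntry[]) {
   const levelsByKey = new Map<string, Set<string>>();
   for (const { word, pos, cefr } of entries) {
     const key = `${word.toLowerCase()}|${pos}`;
     const levels = levelsByKey.get(key) ?? new Set<string>();
     levels.add(cefr);
     levelsByKey.set(key, levels);
   }
   const resolved = new Map<string, string>();
   const conflicts: { key: string; levels: string[] }[] = [];
   for (const [key, levels] of levelsByKey) {
     const sorted = [...levels].sort();
     const [first, ...rest] = sorted;
     if (first !== undefined && rest.length === 0) resolved.set(key, first);
     else conflicts.push({ key, levels: sorted });
   }
   return { resolved, conflicts };
 }
 ```
 Exact duplicates with the same level are harmless here: only entries that disagree end up in `conflicts`.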
 **Input:** `stage-1-extract/output/omw.json` + `stage-2-annotate/sources/cefr/{lang}.json`
 **Output:**
 - `stage-2-annotate/output/{lang}.json`: one per language
 - `stage-2-annotate/output/conflicts.json`: words with conflicting CEFR levels, held for manual review
 ```bash
 pnpm --filter @lila/pipeline annotate
 ```
 Each record in the output extends the OMW record with a `votes` field and any additional examples from the CEFR source file:
 ```json
 {
  "source_id": "ili:i1",
  "pos": "adjective",
  "translations": {
    "en": ["able"],
    "it": ["abile", "intelligente", "valente", "capace"],
    "es": ["capaz"],
    "fr": ["comptable"]
  },
  "glosses": { "en": ["having the necessary means or skill to do something"] },
  "examples": {
    "en": [
      { "text": "able to swim", "source": "omw" },
      { "text": "She was able to finish the task.", "source": "cefr" }
    ]
  },
  "votes": { "en": { "able": { "cefr_source": "B1" } } }
 }
 ```
 Words not present in the CEFR source file will have an empty `votes` object.
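 To make the shape of `votes` concrete, here is a sketch of how the per-language object could be built. The `lookup` map keyed by `word|pos` and the function name `buildVotes` are assumptions; the actual annotate script may structure this differently.
 ```typescript
 type VotesByWord = Record<string, { cefr_source: string }>;
 // `lookup` maps "word|pos" to a CEFR level for one language (hypothetical shape).
 function buildVotes(
   translations: string[],
   pos: string,
   lookup: Map<string, string>,
 ): VotesByWord {
   const votes: VotesByWord = {};
   for (const word of translations) {
     const level = lookup.get(`${word.toLowerCase()}|${pos}`);
     // Unmatched translations simply get no entry.
     if (level !== undefined) votes[word] = { cefr_source: level };
   }
   return votes;
 }
 ```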
 ### 3. Enrich
 The enrich stage runs in two rounds, both designed to execute overnight one model at a time. The llama.cpp server must be running locally before starting either round. See `LLM-SETUP.md` for setup instructions.
 **Round 1 — generation**
 Each model processes every word in every language one term at a time and generates:
 - A CEFR level vote for each translation
 - A description for each language
 - A translation for each language, only if OMW provides none
 - A gloss for each language, only if OMW provides none
 - Usage examples for each language, only if OMW provides none
 OMW data is never duplicated — the script checks what OMW already provides before building the prompt. For translations, glosses and examples, if OMW data exists for that language the LLM skips generation entirely. This significantly reduces compute time for languages with good OMW coverage such as English.
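 The "check what OMW already provides" step could be sketched like this. `LangData`, `GeneratedField` and `fieldsToGenerate` are illustrative names, and the sketch assumes missing OMW data shows up as empty arrays.
 ```typescript
 interface LangData {
   translations: string[];
   glosses: string[];
   examples: string[];
 }
 type GeneratedField = "cefr" | "description" | "translation" | "gloss" | "examples";
 // Lists what the model is asked to produce for one language of one term.
 function fieldsToGenerate(omw: LangData): GeneratedField[] {
   // CEFR votes and descriptions are always requested.
   const fields: GeneratedField[] = ["cefr", "description"];
   // The rest is requested only where OMW has nothing for this language.
   if (omw.translations.length === 0) fields.push("translation");
   if (omw.glosses.length === 0) fields.push("gloss");
   if (omw.examples.length === 0) fields.push("examples");
   return fields;
 }
 ```
 For an English record with a gloss and examples this returns only `cefr` and `description`, which is where the compute saving comes from.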
 All model-generated content is stored with an anonymised source (`model_1`, `model_2`, etc.), so in round 2 no model can be biased by knowing which model generated which candidate.
 **Input:** `stage-2-annotate/output/{lang}.json`
 **Output:** `stage-3-enrich/output/round1/{lang}_{model}.json` per run
 ```bash
 pnpm --filter @lila/pipeline enrich --round 1 --model {model}
 ```
 **Compiling candidates**
 Once all round 1 runs are complete, compile all generated candidates into a single structured file per language. This is the input to round 2.
 **Input:** `stage-3-enrich/output/round1/{lang}_{model}.json`
 **Output:** `stage-3-enrich/output/candidates/{lang}_candidates.json`
 ```bash
 pnpm --filter @lila/pipeline enrich --compile-candidates
 ```
 **Round 2 — voting**
 Each model receives the compiled candidate list for every word and votes on:
 - The best gloss candidate (if multiple exist)
 - The best description candidate (if multiple exist)
 - The best usage examples candidate (if multiple exist)
 - A CEFR level vote for each translation
 OMW data is not put to a vote — it automatically wins over any LLM-generated candidate. Round 2 only resolves conflicts between model-generated candidates. The prompt is kept small — one word at a time, a clean numbered candidate list — to fit within a limited context window.
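 A small round 2 prompt might be assembled as below. The prompt wording, the `Candidate` shape and `buildVotingPrompt` are all assumptions for illustration. The sketch also leaves the source label out of the prompt, which goes one step further than the anonymisation described for round 1.
 ```typescript
 interface Candidate {
   text: string;
   source: string;
 }
 // Builds a compact prompt for one word and one field. Candidates are
 // numbered from 1; their sources are not included in the prompt text.
 function buildVotingPrompt(
   word: string,
   field: string,
   candidates: Candidate[],
 ): string {
   const numbered = candidates.map((c, i) => `${i + 1}. ${c.text}`);
   return [
     `Word: ${word}`,
     `Pick the best ${field}. Answer with the number only.`,
     ...numbered,
   ].join("\n");
 }
 ```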
 **Input:** `stage-3-enrich/output/candidates/{lang}_candidates.json`
 **Output:** `stage-3-enrich/output/round2/{lang}_{model}.json` per run
 ```bash
 pnpm --filter @lila/pipeline enrich --round 2 --model {model}
 ```
 **Compiling votes**
 Once all round 2 runs are complete, compile all votes into a single file per language. This is the input to the merge stage.
 **Input:** `stage-3-enrich/output/round2/{lang}_{model}.json`
 **Output:** `stage-3-enrich/output/votes/{lang}_votes.json`
 ```bash
 pnpm --filter @lila/pipeline enrich --compile-votes
 ```
 Each record in the votes file looks like this:
 ```json
 {
  "source_id": "omw-en-12345",
  "pos": "noun",
  "translations": {
    "en": [
      {
        "text": "dog",
        "votes": { "cefr_source": "A1", "model_1": "A1", "model_2": "A1" }
      },
      {
        "text": "canine",
        "votes": { "cefr_source": "B2", "model_1": "B2", "model_2": "B1" }
      }
    ],
    "it": [
      {
        "text": "cane",
        "votes": { "cefr_source": "A1", "model_1": "A1", "model_2": "A1" }
      }
    ]
  },
  "glosses": {
    "en": { "text": "a domesticated carnivorous mammal", "source": "omw" },
    "fr": {
      "candidates": [
        { "text": "un mammifère carnivore domestiqué", "source": "model_1" },
        { "text": "un animal domestique carnivore", "source": "model_2" }
      ],
      "votes": { "model_1": 1, "model_2": 1 }
    }
  },
  "examples": {
    "en": [{ "text": "the dog barked at the stranger", "source": "omw" }],
    "fr": {
      "candidates": [
        { "text": "le chien a aboyé", "source": "model_1" },
        { "text": "le chien gardait la maison", "source": "model_2" }
      ],
      "votes": { "model_1": 2, "model_2": 1 }
    }
  },
  "descriptions": {
    "en": {
      "candidates": [
        {
          "text": "a common household pet known for loyalty",
          "source": "model_1"
        },
        {
          "text": "a domesticated animal and loyal companion",
          "source": "model_2"
        }
      ],
      "votes": { "model_1": 2, "model_2": 1 }
    }
  }
 }
 ```
 ### 4. Merge
 Reads the votes file per language and resolves the final value for every field. Produces two output files per language — fully resolved records ready for seeding, and flagged records that need manual review.
 **Merge rules:**
 - OMW data wins automatically and is never overridden
 - For CEFR levels: the level with the most votes wins. If no majority is reached, that translation is flagged
 - For LLM-generated text fields (gloss, examples, descriptions): the candidate with the most votes wins
 <!-- TODO: decide fallback strategy when no majority is reached for text fields -->
 **Difficulty mapping:**
 | CEFR   | Difficulty   |
 | ------ | ------------ |
 | A1, A2 | easy         |
 | B1, B2 | intermediate |
 | C1, C2 | hard         |
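The CEFR rule and the difficulty mapping can be sketched as two small pieces. This is a sketch under assumptions: the names `resolveCefr` and `DIFFICULTY_BY_CEFR` are hypothetical, and "majority" is read here as strictly more than half of the votes, which the merge script may define differently.

```typescript
// Sketch only: names and the strict-majority reading are assumptions.
type CefrLevel = "A1" | "A2" | "B1" | "B2" | "C1" | "C2";
type Difficulty = "easy" | "intermediate" | "hard";

const DIFFICULTY_BY_CEFR: Record<CefrLevel, Difficulty> = {
  A1: "easy", A2: "easy",
  B1: "intermediate", B2: "intermediate",
  C1: "hard", C2: "hard",
};

// Returns the level backed by more than half of the votes,
// or null when the translation should be flagged for manual review.
function resolveCefr(votes: Record<string, CefrLevel>): CefrLevel | null {
  const counts: Record<string, number> = {};
  const levels = Object.values(votes);
  for (const level of levels) {
    counts[level] = (counts[level] ?? 0) + 1;
  }
  const winner = (Object.keys(counts) as CefrLevel[]).find(
    (level) => (counts[level] ?? 0) * 2 > levels.length,
  );
  return winner ?? null;
}

// Votes for "canine" as they appear in the votes file example.
const canine = resolveCefr({ cefr_source: "B2", model_1: "B2", model_2: "B1" });
console.log(canine, canine ? DIFFICULTY_BY_CEFR[canine] : "flagged");
```

A split vote such as one `A1` and one `B1` returns `null`, which corresponds to an entry in `flagged/{lang}.json`.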
 **Input:** `stage-3-enrich/output/votes/{lang}_votes.json`
 **Output:**
 - `stage-4-merge/output/final/{lang}.json` — fully resolved, ready for seeding
 - `stage-4-merge/output/flagged/{lang}.json` — CEFR majority not reached, needs manual review before seeding
 ```bash
 pnpm --filter @lila/pipeline merge
 ```
 Each record in `final/{lang}.json` looks like this:
 ```json
 {
  "source_id": "omw-en-12345",
  "pos": "noun",
  "translations": {
    "en": [
      { "text": "dog", "cefr_level": "A1", "difficulty": "easy" },
      { "text": "canine", "cefr_level": "B2", "difficulty": "intermediate" }
    ],
    "it": [{ "text": "cane", "cefr_level": "A1", "difficulty": "easy" }]
  },
  "glosses": {
    "en": { "text": "a domesticated carnivorous mammal", "source": "omw" },
    "fr": { "text": "un mammifère carnivore domestiqué", "source": "model_1" }
  },
  "examples": {
    "en": [{ "text": "the dog barked at the stranger", "source": "omw" }],
    "fr": [{ "text": "le chien a aboyé", "source": "model_1" }]
  },
  "descriptions": {
    "en": {
      "text": "a common household pet known for loyalty and companionship",
      "source": "model_1"
    },
    "it": {
      "text": "un animale domestico comune noto per la sua fedeltà",
      "source": "model_2"
    }
  }
 }
 ```
 **Resolving flagged words:**
 Open `stage-4-merge/output/flagged/{lang}.json`, manually set the correct `cefr_level` and `difficulty` for each flagged translation, then move the resolved entries into `stage-4-merge/output/final/{lang}.json`. Re-run the seeder after resolving.
 ### 5. Compare / QA
 Read-only. Generates `COVERAGE.md` with a full breakdown of the pipeline
 output quality per language. Run this after merge to verify output before
 seeding the database.
 **Input:**
 - `stage-4-merge/output/final/{lang}.json`
 - `stage-4-merge/output/flagged/{lang}.json`
 **Output:** `COVERAGE.md`
 ```bash
 pnpm --filter @lila/pipeline compare
 ```
 `COVERAGE.md` reports the following per language:
 - Total synsets extracted
 - Total translations per language
 - POS breakdown per language — word counts for noun, verb, adjective, adverb
 - CEFR coverage per language — how many translations have a resolved CEFR level, broken down by level (A1, A2, B1, B2, C1, C2)
 - Difficulty breakdown per language — word counts for easy, intermediate, hard
 - Flagged count per language — how many translations are awaiting manual review
 - Gloss coverage per language — total glosses, broken down by source (omw vs LLM-generated) and which languages have no glosses at all
 - Example coverage per language — same breakdown as glosses
 - Description coverage per language — how many translations have a description, broken down by source
 - CEFR source file coverage per language — how many words from the source file were matched against OMW translations
 - LLM model contribution — how many CEFR votes and text candidates each anonymised model contributed
 ## Adding a new language
 1. Add the language code to `SUPPORTED_LANGUAGE_CODES` in `packages/shared/src/constants.ts`
 2. Build shared: `pnpm --filter @lila/shared build`
 3. Generate and run a DB migration: `pnpm --filter @lila/db generate` then `pnpm --filter @lila/db migrate`
 4. Download the OMW lexicon for the language using the `wn` Python library
 5. Add a CEFR source file at `stage-2-annotate/sources/cefr/{lang}.json`
 6. Run the full pipeline
 ## Constants and constraints
 These values are defined in `packages/shared/src/constants.ts` and enforced by database check constraints. The pipeline filters out any entries that violate them.
 | Constant        | Values                                |
 | --------------- | ------------------------------------- |
 | Languages       | `en`, `it`, `de`, `es`, `fr`          |
 | Parts of speech | `noun`, `verb`, `adjective`, `adverb` |
 | CEFR levels     | `A1`, `A2`, `B1`, `B2`, `C1`, `C2`    |
 | Difficulty      | `easy`, `intermediate`, `hard`        |
 Adding a new value to any of these requires a constants update and a database migration before re-running the pipeline. See **Adding a new language** for the full steps — the same process applies for new parts of speech.
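A minimal sketch of how such constants can drive both types and filtering, assuming `as const` tuples. Only `SUPPORTED_LANGUAGE_CODES` is named in this document; `SUPPORTED_POS`, `isOneOf` and the `RawEntry` shape are hypothetical names used for illustration.

```typescript
// Illustrative only: SUPPORTED_LANGUAGE_CODES is named in this document,
// the other names below are assumptions.
const SUPPORTED_LANGUAGE_CODES = ["en", "it", "de", "es", "fr"] as const;
const SUPPORTED_POS = ["noun", "verb", "adjective", "adverb"] as const;

// Union types derived from the tuples, so the list is the single source of truth.
type LanguageCode = (typeof SUPPORTED_LANGUAGE_CODES)[number];
type PartOfSpeech = (typeof SUPPORTED_POS)[number];

// Type guard: narrows an arbitrary string to one of the allowed values.
function isOneOf<T extends string>(allowed: readonly T[], value: string): value is T {
  return (allowed as readonly string[]).indexOf(value) !== -1;
}

type RawEntry = { lang: string; pos: string; text: string };

const raw: RawEntry[] = [
  { lang: "en", pos: "noun", text: "dog" },
  { lang: "pt", pos: "noun", text: "cão" },          // unsupported language
  { lang: "it", pos: "interjection", text: "ciao" }, // unsupported part of speech
];

// Drop every entry that violates one of the constants.
const valid = raw.filter(
  (e) => isOneOf(SUPPORTED_LANGUAGE_CODES, e.lang) && isOneOf(SUPPORTED_POS, e.pos),
);
console.log(valid.map((e) => e.text));
```

With this layout, adding a value to a tuple widens the derived type automatically; the database check constraint still needs its own migration.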
 ## Further extensions
 These are not part of the current pipeline but are worth considering as the
 dataset matures:
 - **Grammatical gender and articles** — Wiktionary dumps contain gender and
  article data for nouns across all supported languages. Could be extracted
  and stored as a new `translation_forms` table.
 - **Conjugations** — Wiktionary also carries verb conjugation tables. Useful
  for a future grammar-focused quiz mode.
 - **IPA pronunciations** — Wiktionary and Forvo are potential sources for
  phonetic transcriptions per language.
 - **TTS audio files** — Generate pronunciation audio for each translation
  using a local or cloud TTS engine. Stored as static files, served alongside
  the quiz UI.
 - **Images** — Associate an image with each synset to support visual
  vocabulary learning. Could be sourced from open image datasets like
   ImageNet or Wikimedia Commons.
 - **Frequency data** — Word frequency rankings per language from sources like
  the Google Ngram dataset. Useful for smarter difficulty calibration beyond
  CEFR levels alone.
 - **Improved CEFR source files** — See note at the top of this document.
  UniversalCEFR and CEFR-J are good starting points.
 - **Additional languages** — The pipeline is language-agnostic. Adding a new
  language requires an OMW lexicon, a CEFR source file, and a constants
  update. See **Adding a new language**.
--- a/documentation/design.md
+++ b/documentation/design.md
@ -1,5 +0,0 @@
 # design
 ## notes
 break points
--- a/documentation/design/GAME_MODES.md
+++ b/documentation/design/GAME_MODES.md
--- a/documentation/roasts/gameService.md
+++ b/documentation/roasts/gameService.md
@ -1,348 +0,0 @@
 # 🔥 GameService Roast: `apps/api/src/services/gameService.ts`
 > *"It works on my machine" is not a scalability strategy.*
 **Project:** lila — Vocabulary Trainer  
 **File Roasted:** `gameService.ts`  
 **Date:** $(date)  
 **Roaster:** Qwen3.6  
 ---
 ## 📋 Executive Summary
 | Metric        | Score    | Notes                                                |
 | ------------- | -------- | ---------------------------------------------------- |
 | Code Quality  | 8/10     | Clean layering, good types, consistent style         |
 | Correctness   | 6/10     | Race condition + N+1 query are critical              |
 | Test Coverage | 7/10     | Good happy-path tests, missing concurrency tests     |
 | Scalability   | 5/10     | Will choke at ~100 concurrent users without fixes    |
 | **Overall**   | **7/10** | Solid foundation, but fix the footguns before launch |
 ---
 ## 🚨 Critical Issues (Fix Before Production)
 ### 1. Race Condition: Lost Update in `evaluateAnswer`
 **Location:** `gameService.ts:45-58` + `InMemoryGameSessionStore.ts:update()`
 // Current flow (VULNERABLE):
 const session = await store.get(submission.sessionId);  // READ
 const updatedAnswers = new Map(session.answers);         // MODIFY (local copy)
 updatedAnswers.delete(submission.questionId);
 await store.update(submission.sessionId, { answers: updatedAnswers }); // WRITE
 The Attack:
 1. Client submits answer A and answer B for the same question (network retry, bug, or malice)
 2. Both requests read the same `session.answers` Map (question still present)
 3. Both delete the question from their local copy
 4. Both write back → second write overwrites first
 5. Result: One answer is silently lost, session state desyncs
 Why Tests Missed It: The existing tests await each call in sequence, so two requests never overlap. Race conditions require deliberate concurrency testing.
 Fix Options:
 // Option A: Add atomic operation to store interface
 interface GameSessionStore {
  deleteAnswer(sessionId: string, questionId: string): Promise<boolean>;
 }
 // Option B: Use Valkey Lua script for atomic read-modify-write
 // Option C: Optimistic locking with version numbers
 Priority: 🔴 CRITICAL — Data integrity issue
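Option A can be sketched for the in-memory case as follows. This is an illustration under assumptions: the class name `InMemoryStoreSketch` and the reduced `SessionData` shape are hypothetical, and only the `deleteAnswer` signature comes from the interface proposed above.

```typescript
// Sketch under assumptions: session shape and class name are hypothetical,
// reduced to what is needed to show the idea.
type SessionData = { answers: Map<string, number> }; // questionId -> correctOptionId

class InMemoryStoreSketch {
  private sessions = new Map<string, SessionData>();

  create(sessionId: string, data: SessionData): void {
    this.sessions.set(sessionId, data);
  }

  // Check-and-delete with no `await` in between: on Node's single thread nothing
  // else can run between the lookup and the delete, so only one caller gets `true`.
  deleteAnswer(sessionId: string, questionId: string): Promise<boolean> {
    const session = this.sessions.get(sessionId);
    if (!session) return Promise.resolve(false);
    return Promise.resolve(session.answers.delete(questionId));
  }
}

const store = new InMemoryStoreSketch();
store.create("s1", { answers: new Map([["q1", 2]]) });

// Two "concurrent" submissions for the same question: exactly one may claim it.
const claims = Promise.all([
  store.deleteAnswer("s1", "q1"),
  store.deleteAnswer("s1", "q1"),
]);
claims.then((results) => console.log(results));
```

The caller that receives `false` can respond with a conflict error. This guarantee holds only inside one process; a Valkey-backed store would need the same check-and-delete performed atomically on the server side, as in Option B.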
 ### 2. N+1 Query: Database Performance Bomb
 **Location:** `gameService.ts:24-26` + `termModel.ts:getDistractors()`
 // For each of N terms, we call getDistractors():
 const questions: GameQuestion[] = await Promise.all(
  terms.map(async (term) => {
    const distractorTexts = await getDistractors(term.termId, ...); // 🚩 N queries!
  })
 );
 Impact Analysis:
 | Rounds | DB Queries  | At 50 concurrent users |
 | ------ | ----------- | ---------------------- |
 | 3      | 1 + 3 = 4   | 200 queries/min        |
 | 10     | 1 + 10 = 11 | 550 queries/min        |
 | 20     | 1 + 20 = 21 | 1,050 queries/min      |
 Each getDistractors() runs:
 SELECT text FROM terms 
 JOIN translations ON ... 
 WHERE pos = $1 AND difficulty = $2 AND term_id != $3 AND text != $4 
 ORDER BY RANDOM() LIMIT 6
 Fix: Batch Fetch Distractors
 // Fetch all distractors in ONE query
 const allDistractors = await db
  .select({ termId: terms.id, text: translations.text })
  .from(terms)
  .innerJoin(translations, /* ... */)
  .where(and(
    eq(terms.pos, pos),
    eq(translations.difficulty, difficulty),
    inArray(terms.id, termIds), // Batch!
  ))
  .limit(DISTRACTOR_FETCH_COUNT * termIds.length);
 // Group by termId in JS, then slice to 3 unique distractors per term
 const distractorsByTerm = groupByTermId(allDistractors);
 Priority: 🔴 CRITICAL — Performance/scalability issue
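One possible reading of the "group in JS" step, sketched under the assumption that a single query returns a shared pool of candidate rows for the requested part of speech and difficulty, already in random order. The helper name `pickDistractors` and the `PoolRow` shape are hypothetical and do not come from the codebase.

```typescript
// Hypothetical helper: selects distractors for one term from a shared pool.
type PoolRow = { termId: string; text: string };

function pickDistractors(
  pool: PoolRow[],
  termId: string,
  targetText: string,
  count: number,
): string[] {
  const seen = new Set<string>();
  const picked: string[] = [];
  for (const row of pool) {
    // Skip the term itself, anything that reads like the correct answer, and repeats.
    if (row.termId === termId || row.text === targetText || seen.has(row.text)) continue;
    seen.add(row.text);
    picked.push(row.text);
    if (picked.length === count) break;
  }
  return picked;
}

const pool: PoolRow[] = [
  { termId: "t1", text: "dog" },
  { termId: "t2", text: "cat" },
  { termId: "t3", text: "cat" },
  { termId: "t4", text: "dog" },
  { termId: "t5", text: "bird" },
  { termId: "t6", text: "fish" },
  { termId: "t7", text: "horse" },
];

const picked = pickDistractors(pool, "t1", "dog", 3);
console.log(picked);
```

Because every question would otherwise draw the same leading rows, the pool would need to be reshuffled (or offset) per question before calling the helper.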
 ### 3. Error Handling Inconsistency
 **Location:** `gameService.ts:33-36`
 if (uniqueDistractors.length < 3) {
  throw new Error(`Not enough unique distractors for term: ${term.targetText}`); // 🚩
 }
 Problem: A raw `Error` is not one of the typed `AppError` classes, so your `errorHandler` middleware cannot classify it:
 - No HTTP status mapping (defaults to 500)
 - No structured logging
 - Inconsistent API responses
 Fix:
 import { UnprocessableEntityError } from "../errors/AppError.js";
 if (uniqueDistractors.length < 3) {
  logger.warn({ termId: term.termId, uniqueCount: uniqueDistractors.length }, 
              "insufficient_distractors");
  throw new UnprocessableEntityError(
    `Not enough unique distractors for term: ${term.targetText}`
  );
 }
 Priority: 🟡 HIGH — Observability & UX issue
 ## ⚠️ High-Severity Smells
 ### 4. Code Duplication: Singleplayer vs Multiplayer
 **Compare:** `gameService.ts` vs `multiplayerGameService.ts`
 // gameService.ts
 const optionTexts = [term.targetText, ...uniqueDistractors.slice(0, 3)];
 const shuffledTexts = shuffleArray(optionTexts);
 const correctOptionId = shuffledTexts.indexOf(term.targetText);
 // multiplayerGameService.ts (lines 35-45)
 const optionTexts = [correctAnswer.targetText, ...distractorTexts];
 const shuffledTexts = shuffle(optionTexts); // Different function, same logic!
 const correctOptionId = shuffledTexts.indexOf(correctAnswer.targetText);
 Risks:
 - Fix shuffle bias in one place, forget the other
 - Add new option type (e.g., etymology hint), update one service only
 - Harder to test core game logic in isolation
 Fix: Extract pure function to @lila/shared or new @lila/game-logic:
 // packages/shared/src/game-logic.ts
 export const buildQuestionOptions = (
  correctAnswer: string,
  distractors: string[],
  optionCount: number = 4
 ): { options: AnswerOption[]; correctOptionId: number } => {
  const uniqueDistractors = [...new Set(distractors.filter(d => d !== correctAnswer))];
  const optionTexts = [correctAnswer, ...uniqueDistractors.slice(0, optionCount - 1)];
  const shuffled = shuffleSecure(optionTexts);
  const correctOptionId = shuffled.indexOf(correctAnswer);
  return {
    options: shuffled.map((text, idx) => ({ optionId: idx, text })),
    correctOptionId
  };
 };
 Priority: 🟡 HIGH — Maintainability issue
 ### 5. Shuffle Bias: Math.random() Trap
 **Location:** `utils.ts:shuffleArray()` + `multiplayerGameService.ts:shuffle()`
 export const shuffleArray = <T>(array: T[]): T[] => {
  for (let i = result.length - 1; i > 0; i--) {
     const j = Math.floor(Math.random() * (i + 1)); // 🚩 Non-crypto RNG (no modulo is involved here)
    // ...
  }
 };
 The Math:
 - `Math.random()` is a non-cryptographic PRNG (fine for vocab)
 - `Math.floor(Math.random() * n)` involves no modulo operation; any skew comes from floating-point rounding only
 - For the n ≤ 4 used here, that skew is far too small to observe in gameplay
 When It Matters:
 - Competitive leaderboards ("option 0 is correct 26% of the time")
 - Achievement systems based on answer patterns
 - Security-sensitive features (not applicable here, but principle matters)
 Fix (if needed):
 import { randomInt } from "node:crypto";
 const shuffleSecure = <T>(array: T[]): T[] => {
   const result = [...array];
   for (let i = result.length - 1; i > 0; i--) {
     // randomInt(max) draws a uniform integer in [0, max) from the CSPRNG.
     // Avoid `randomBytes(4).readUInt32LE(0) % (i + 1)`: that modulo introduces real bias.
     const j = randomInt(i + 1);
     [result[i], result[j]] = [result[j], result[i]];
   }
   return result;
 };
 Priority: 🟢 LOW — Document tradeoff and move on for now
 ### 6. Test Coverage Gaps
 **File:** `gameService.test.ts`
 ✅ Well Tested:
 - Happy path: session creation, answer evaluation
 - Edge cases: duplicate distractors, empty results, invalid inputs
 - Error propagation from DB layer
 ❌ Missing Tests:
 // 1. Concurrency test (race condition)
 it("rejects duplicate answers for same question under concurrent load", async () => {
  const session = await createGameSession(validRequest, store, "user-1");
  const question = session.questions[0]!;
  // Submit two answers simultaneously
  const [result1, result2] = await Promise.allSettled([
     evaluateAnswer({ sessionId: session.sessionId, questionId: question.questionId, selectedOptionId: 0 }, store, "user-1"),
     evaluateAnswer({ sessionId: session.sessionId, questionId: question.questionId, selectedOptionId: 1 }, store, "user-1"),
  ]);
  // Exactly one should succeed, one should throw ConflictError
  expect([result1, result2].filter(r => r.status === "fulfilled")).toHaveLength(1);
 });
 // 2. TTL expiration test
 it("deletes session after TTL expires", async () => {
  vi.useFakeTimers();
  const session = await createGameSession(validRequest, store, "user-1");
  vi.advanceTimersByTime(31 * 60 * 1000); // 31 minutes
  await expect(store.get(session.sessionId)).resolves.toBeNull();
 });
 // 3. Distractor fallback strategy test
 it("uses fallback when <3 unique distractors available", async () => {
  mockGetDistractors.mockResolvedValue(["same", "same", "same", "same"]);
  // Should either: (a) fetch from broader pool, or (b) reduce rounds gracefully
 });
 Priority: 🟡 HIGH — Prevents regression on critical fixes
 ## 🧼 Code Quality Nitpicks
 ### 7. Magic Numbers
 // gameService.ts:52
 await store.create(sessionId, {...}, 30 * 60 * 1000); // What is this?
 // termModel.ts:65
 .limit(count); // count=6, but why?
 // shared/schemas/game.ts:15
 optionId: z.number().int().min(0).max(3), // Why 4 options?
 Fix: Centralize in @lila/shared/constants.ts:
 export const GAME_SESSION_TTL_MS = 30 * 60 * 1000;
 export const DISTRACTOR_FETCH_COUNT = 6;
 export const GAME_OPTION_COUNT = 4;
 export const MIN_UNIQUE_DISTRACTORS = 3;
 ### 8. Mutable Reference Leakage
 **Location:** `InMemoryGameSessionStore.ts:get()`
 get(sessionId: string): Promise<GameSessionData | null> {
  return Promise.resolve(entry.data); // 🚩 Returns mutable reference to internal state
 }
 Risk: Any code that does session.answers.delete(...) mutates the store's internal Map directly.
 Fix:
 // Option A: Deep clone (simple, works for this data shape)
 return Promise.resolve(structuredClone(entry.data));
 // Option B: Return readonly view (TypeScript-only protection)
 return Promise.resolve(entry.data as Readonly<GameSessionData>);
 // Option C: Use immutable data structures (overkill for now)
 ### 9. Zero Observability
 Problem: No logging, no metrics. You're flying blind in production.
 Minimal Fix (5 minutes):
 // apps/api/src/lib/logger.ts
 import pino from "pino";
 export const logger = pino({ 
  level: process.env.LOG_LEVEL || "info",
   // Pretty-print only outside production; production keeps raw JSON logs
   transport: process.env.NODE_ENV !== "production" 
     ? { target: "pino-pretty" } 
     : undefined 
 });
 // In gameService.ts:
 import { logger } from "../lib/logger.js";
 logger.info(
  { userId, sourceLang, targetLang, termCount: terms.length },
  "game_session_created"
 );
 logger.debug(
  { sessionId, questionId, isCorrect, responseTimeMs },
  "answer_evaluated"
 );
 Bonus: Export a Prometheus histogram for game_service_duration_seconds.
 ### 10. ORDER BY RANDOM() Time Bomb
 **Location:** `termModel.ts:getGameTerms()` + `getDistractors()`
 .orderBy(sql`RANDOM()`) // 🚩 Fine for 10k rows, slow for 1M
 The Comment Admits It:
 // TODO(post-mvp): ORDER BY RANDOM() sorts the entire filtered result set...
 Reality Check: "Post-MVP" never comes without a ticket.
 Fix Options:
 -- Option A: Pre-computed random_seed column (updated nightly)
 WHERE ... AND random_seed >= random() 
 ORDER BY random_seed 
 LIMIT $1
 -- Option B: TABLESAMPLE for approximate sampling (Postgres 9.5+)
 FROM terms TABLESAMPLE SYSTEM(10) 
 WHERE ... 
 LIMIT $1
 -- Option C: Random offset (simple, but still scans)
 OFFSET floor(random() * (SELECT count(*) FROM terms WHERE ...))
 Action: Add a ticket to documentation/tickets/t00009.md now.
--- a/documentation/tickets/blueprint.md
+++ b/documentation/tickets/blueprint.md
@ -1,95 +0,0 @@
 # Ticket Blueprint
 Two formats depending on task type. Choose based on whether a meaningful
 decision between options was made.
 ---
 ## Format A — ADR (architectural/infrastructural decisions)
 Use when: you chose between options with long-term consequences.
 Prefix: `adr-`
 ---
 # ADR: <title>
 ## Status
 Accepted | Superseded by | Deprecated
 ## Date
 YYYY-MM-DD
 ## Context
 What is the problem? Why does it need to be solved?
 ## Decision
 What was chosen and why in one or two sentences.
 ## Options considered
 ### Option A — <name> ✅
 Description. Why it was chosen.
 ### Option B — <name>
 Description. Why it was rejected.
 ## Consequences
 - What gets better
 - What gets worse or more complex
 - Operational implications
 - What breaks if this needs to be redone
 ## Affected files / machines
 - List files, servers, or systems touched
 ## References
 - Links to relevant docs
 ---
 ## Setup guide / implementation notes
 Step-by-step of what was actually done.
 ---
 ## Format B — Task (features, fixes, chores)
 Use when: routine task with a clear solution.
 Prefix: `feat-` / `fix-` / `chore-`
 ---
 # <prefix>: <title>
 ## Problem
 What was wrong or missing?
 ## Options considered
 ### Option A — <name> ✅
 ### Option B — <name>
 ## Solution
 What was done and why.
 ## Files changed
 - `path/to/file.ts`
 ## Commit
 `<type>: <message>`
--- a/documentation/tickets/t00001.md
+++ b/documentation/tickets/t00001.md
@ -1,107 +0,0 @@
 # ADR: Docker Credential Helper Setup
 ## Status
 Accepted
 ## Date
 2026-04-26
 ## Context
 Docker credentials for `git.lilastudy.com` and `dhi.io` were stored as base64-encoded strings in `~/.docker/config.json` on both the dev laptop and the VPS. Base64 is not encryption — anyone with read access to the file can decode the credentials instantly.
 ## Decision
 Use `pass` (GPG-backed password store) as the Docker credential helper on both machines.
 ## Options considered
 ### Option A — `pass` (GPG-backed) ✅
 Stores credentials encrypted with a GPG key. Works on headless servers and desktops without GNOME. Industry standard for Linux servers.
 ### Option B — `secretservice` (GNOME keyring)
 Uses the desktop keyring daemon. Not suitable for a headless VPS, and not suitable for an i3 desktop without running `gnome-keyring-daemon` manually.
 ### Option C — `gnome-libsecret`
 Same limitations as Option B.
 ## Consequences
 - Credentials are now GPG-encrypted at rest on both machines
 - Requires GPG passphrase entry when Docker needs to pull credentials
  in a new session
 - Must be set up manually on each machine — not reproducible via the repo
 - VPS setup must be repeated if the server is reprovisioned
 ## Affected machines
 - Dev laptop (Debian 13, i3)
 - VPS (Debian 13, ARM64, headless)
 ## References
 - [docker docs](https://docs.docker.com/reference/cli/docker/login/#credential-stores)
 - [pass docs](https://www.passwordstore.org/)
 ---
 ## Setup guide
 Repeat these steps on each machine.
 ### 1. Install dependencies
 ```bash
 sudo apt-get install -y pass gnupg2 golang-docker-credential-helpers
 ```
 ### 2. Generate a GPG key
 ```bash
 gpg --full-generate-key
 ```
 Choose RSA, 4096 bits, no expiry. Set a strong passphrase.
 ### 3. Get the key ID
 ```bash
 gpg --list-secret-keys --keyid-format LONG
 ```
 Copy the hex string after the `/` on the `sec` line.
 ### 4. Initialise pass
 ```bash
 pass init <your-key-id>
 ```
 ### 5. Update `~/.docker/config.json`
 Replace the entire file contents with:
 ```json
 { "credsStore": "pass" }
 ```
 ### 6. Re-login to registries
 ```bash
 docker login git.lilastudy.com
 # dev laptop only:
 docker login dhi.io
 ```
 ### 7. Verify
 ```bash
 cat ~/.docker/config.json
 ```
 Should show only `"credsStore": "pass"` with no `auths` block.
--- a/documentation/tickets/t00002.md
+++ b/documentation/tickets/t00002.md
@ -1,149 +0,0 @@
 # ADR: Change GAME_ROUNDS from strings to numbers
 ## Status
 Accepted
 ## Date
 2026-04-28
 ## Context
 `GAME_ROUNDS` in `packages/shared/src/constants.ts` was typed as `["3", "10"] as const`, making `GameRounds` a string union (`"3" | "10"`). This meant `gameService.ts` had to cast the value with `Number(request.rounds)` deep in business logic — a type conversion happening far from the boundary where data enters the system. The type system was lying: `rounds` was described as a string everywhere but used as a number where it mattered.
 ## Decision
 Change `GAME_ROUNDS` to `[3, 10] as const` and update the Zod schema to use `z.literal(GAME_ROUNDS)` instead of `z.enum(GAME_ROUNDS)`. The single source of truth remains `constants.ts` — adding a new round count (e.g. `20`) requires only editing that file.
 ## Options considered
 ### Option A — Numbers everywhere ✅
 Change `GAME_ROUNDS` to `[3, 10] as const`. Use `z.literal(GAME_ROUNDS)` in the schema. Update the frontend component state and `SettingGroup` props. Drop `Number()` cast in the service.
 Chosen because: JSON carries numbers natively, both ends of the wire are owned by this codebase, and type conversions belong at the boundary — not inside business logic.
 ### Option B — Keep strings, accept the cast
 Leave `GAME_ROUNDS` as `["3", "10"]`. The `Number()` cast stays in `gameService.ts`.
 Rejected because: it pushes type conversion into business logic and makes the inferred `GameRequest` type misleading. The cast has to live somewhere — the schema boundary is the right place.
 ### Option C — Coerce at the schema boundary
 Keep `GAME_ROUNDS` as numbers but use `z.coerce.number().pipe(z.literal(GAME_ROUNDS))` so the frontend can keep sending strings.
 Rejected because: coercion is for untrusted or uncontrolled inputs (form fields, query params, third-party clients). We control both ends of the wire. Coercing a self-inflicted type mismatch is treating a wound we gave ourselves.
 ## Consequences
 - `GameRounds` is now `3 | 10` instead of `"3" | "10"`
 - `Number(request.rounds)` cast removed from `gameService.ts`
 - `SettingGroup` in `GameSetup.tsx` now accepts `string | number` options
 - `useState<string>` for rounds changed to `useState<number>`
 - Adding a new round count requires only editing `GAME_ROUNDS` in `constants.ts`
 - `z.enum` cannot be used for number literals — `z.literal` must be used instead (this is a Zod constraint, not a project convention)
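The type-level effect of the change can be seen without Zod. The sketch below derives `GameRounds` from the constant and adds a hand-written boundary check; `isGameRounds` is a hypothetical helper for illustration, since the project validates with a Zod schema instead.

```typescript
const GAME_ROUNDS = [3, 10] as const;

// Derived union: 3 | 10. Adding 20 to the tuple widens the type automatically.
type GameRounds = (typeof GAME_ROUNDS)[number];

// Hypothetical boundary check; the project itself uses a Zod schema for this.
function isGameRounds(value: unknown): value is GameRounds {
  return typeof value === "number" && (GAME_ROUNDS as readonly number[]).indexOf(value) !== -1;
}

// Business logic can now take the number as-is, with no Number() cast.
function describeGame(rounds: GameRounds): string {
  return `game with ${rounds} rounds`;
}

console.log(isGameRounds(3), isGameRounds("3"), isGameRounds(20));
console.log(describeGame(10));
```

The string `"3"` is rejected at the boundary, which is the behaviour the pinning test in the implementation notes is meant to lock in.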
 ## Affected files
 - `packages/shared/src/constants.ts`
 - `packages/shared/src/schemas/game.ts`
 - `apps/api/src/services/gameService.ts`
 - `apps/api/src/services/gameService.test.ts`
 - `apps/api/src/controllers/gameController.test.ts`
 - `apps/web/src/components/game/GameSetup.tsx`
 ## References
 - [Zod literals](https://zod.dev/?id=literals)
 ---
 ## Setup guide / implementation notes
 1. In `packages/shared/src/constants.ts`, change:
   ```ts
   export const GAME_ROUNDS = ["3", "10"] as const;
   ```
   to:
   ```ts
   export const GAME_ROUNDS = [3, 10] as const;
   ```
 2. In `packages/shared/src/schemas/game.ts`, change:
   ```ts
   rounds: z.enum(GAME_ROUNDS),
   ```
   to:
   ```ts
   rounds: z.literal(GAME_ROUNDS),
   ```
 3. In `apps/api/src/services/gameService.ts`, change:
   ```ts
   Number(request.rounds),
   ```
   to:
   ```ts
   request.rounds,
   ```
 4. In `apps/api/src/services/gameService.test.ts`, change:
   ```ts
   rounds: "3",
   ```
   to:
   ```ts
   rounds: 3,
   ```
 5. In `apps/api/src/controllers/gameController.test.ts`, change:
   ```ts
   rounds: "3",
   ```
   to:
   ```ts
   rounds: 3,
   ```
   Also add a pinning test before the refactor:
   ```ts
   it("returns 400 when rounds has an invalid value", async () => {
     const res = await request(app)
       .post("/api/v1/game/start")
       .send({ ...validBody, rounds: "invalid" });
     expect(res.status).toBe(400);
     expect(res.body.success).toBe(false);
   });
   ```
 6. In `apps/web/src/components/game/GameSetup.tsx`:
   - Update `SettingGroup` props to accept `string | number`:
     ```ts
     type SettingGroupProps = {
       options: readonly (string | number)[];
       selected: string | number;
       onSelect: (value: string | number) => void;
     };
     ```
   - Update `LABELS` lookup to `LABELS[String(option)]`
   - Change rounds state from `useState<string>` to `useState<number>`
--- a/documentation/tickets/t00003.md
+++ b/documentation/tickets/t00003.md
@ -1,37 +0,0 @@
 # refactor: extract shuffleArray to lib/utils, rename correctAnswers to terms
 ## Problem
 Two readability issues in `gameService.ts`:
 1. `shuffle` was defined as a private function at the bottom of `gameService.ts`, after the function that calls it. It is a pure generic utility with no dependency on game domain logic, so it had no business living there.
 2. The variable holding terms fetched from the database was named `correctAnswers`. These are word pairs — they only become "correct answers" once options are built around them. The name was premature and misleading.
 ## Options considered
 ### Option A — Move `shuffle` up in the same file
 Simple, no new files. Fixes the ordering issue but keeps a generic utility buried in domain code.
 ### Option B — Extract to `lib/utils.ts` ✅
 Move `shuffle` (renamed `shuffleArray`) to `apps/api/src/lib/utils.ts` and import it. Cleaner separation: domain logic stays in services, generic utilities live in `lib/`.
 Chosen because `lib/` already exists, the function is reusable, and it gives future utilities a home.
 ## Solution
 - Created `apps/api/src/lib/utils.ts` with `shuffleArray`
 - Renamed `shuffle` → `shuffleArray` for clarity at the call site
 - Removed the inline `shuffle` from `gameService.ts` and imported from `lib/utils.ts`
 - Renamed `correctAnswers` → `terms` and `correctAnswer` → `term` throughout `gameService.ts`
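For reference, a generic shuffle of this kind is typically a Fisher-Yates loop over a copy of the input. The version below is a minimal sketch consistent with the ticket's description, not a copy of `apps/api/src/lib/utils.ts`.

```typescript
// Sketch of a pure, generic shuffle: returns a new array and leaves the input untouched.
const shuffleArray = <T>(array: readonly T[]): T[] => {
  const result = [...array];
  // Fisher-Yates: walk backwards, swapping each slot with a random earlier (or same) slot.
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    const tmp = result[i]!;
    result[i] = result[j]!;
    result[j] = tmp;
  }
  return result;
};

const input = [1, 2, 3, 4, 5];
const shuffled = shuffleArray(input);
console.log(shuffled.length, input);
```

Because the function has no dependency on game types, it can be unit-tested on plain numbers or strings.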
 ## Files changed
 - `apps/api/src/lib/utils.ts` — created
 - `apps/api/src/services/gameService.ts` — removed `shuffle`, updated import, renamed variables
 ## Commit
 `refactor: extract shuffleArray to lib/utils, rename correctAnswers to terms`
--- a/documentation/tickets/t00004.md
+++ b/documentation/tickets/t00004.md
@ -1,110 +0,0 @@
 # ADR: Dependency injection for GameSessionStore via composition root
 ## Status
 Accepted
 ## Date
 2026-04-28
 ## Context
 `gameService.ts` had a module-level singleton:
 ```ts
 const gameSessionStore = new InMemoryGameSessionStore();
 ```
 This made the store invisible to anything outside the file. The `GameSessionStore` interface existed to make the store swappable — but the singleton made that impossible without editing the service itself. Tests shared the same instance across every test run, creating the potential for ghost sessions leaking between tests. The controller also briefly owned the singleton during an intermediate step, which violated the principle that controllers should only handle HTTP concerns.
 ## Decision
 Adopt a composition root pattern. The store is created once in `createApp()` and passed down through factory functions: `createApiRouter(store)` → `createGameRouter(store)` → `createGameController(store)` → service calls. Neither the controller nor the service knows which implementation they're working with — they both see `GameSessionStore`.
 ## Options considered
 ### Option A — Composition root ✅
 Convert routers and controllers to factory functions. Create the store in `createApp()` and pass it down. The store is created once, at the top, and injected through the call chain.
 Chosen because: clean separation of concerns, no layer below `createApp()` needs to know the concrete implementation, swapping to `ValKeyGameSessionStore` is a one-line change in `app.ts`, and tests get fresh isolated store instances.
 ### Option B — Keep singleton in controller
 Leave the store as a module-level singleton in `gameController.ts`. Controllers own the store lifetime.
 Rejected because: controllers should only handle HTTP concerns. Owning infrastructure lifetime is not an HTTP concern.
 ### Option C — DI framework (tsyringe, inversify)
 Use a proper dependency injection container.
 Rejected because: overkill for the current scale. The composition root pattern achieves the same result with zero dependencies and no magic.
 ## Consequences
 - Swapping `InMemoryGameSessionStore` for `ValKeyGameSessionStore` requires editing one line in `app.ts`
 - Tests create fresh `InMemoryGameSessionStore` instances per test — no shared state, no ghost sessions
 - Routers and controllers are now factory functions instead of module-level singletons — slightly more verbose but explicitly testable
 - `gameController.test.ts` uses `createApp()` which owns the store — controller tests remain integration-style and unaffected
 - All layers below `createApp()` depend only on the `GameSessionStore` interface, never the concrete implementation
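The isolation benefit can be shown in a stripped-down sketch without Express. The store interface here is simplified to synchronous methods and a string payload, and `createAppSketch` and `hasGame` are hypothetical names; only `GameSessionStore`, `InMemoryGameSessionStore` and `createGameController` come from this ticket.

```typescript
// Sketch only: simplified, synchronous shapes and no Express.
interface GameSessionStore {
  create(sessionId: string, data: string): void;
  get(sessionId: string): string | null;
}

class InMemoryGameSessionStore implements GameSessionStore {
  private sessions = new Map<string, string>();
  create(sessionId: string, data: string): void {
    this.sessions.set(sessionId, data);
  }
  get(sessionId: string): string | null {
    return this.sessions.get(sessionId) ?? null;
  }
}

// The controller sees only the interface, never the concrete class.
const createGameController = (store: GameSessionStore) => ({
  createGame: (sessionId: string): void => store.create(sessionId, "new"),
  hasGame: (sessionId: string): boolean => store.get(sessionId) !== null,
});

// Composition root: the only place that names a concrete store.
const createAppSketch = () => createGameController(new InMemoryGameSessionStore());

// Two apps, two stores: a session created in one is invisible to the other.
const appA = createAppSketch();
const appB = createAppSketch();
appA.createGame("s1");
console.log(appA.hasGame("s1"), appB.hasGame("s1"));
```

Swapping the implementation means changing the single `new InMemoryGameSessionStore()` expression in the composition root; nothing below it has to change.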
 ## Affected files
 - `apps/api/src/app.ts` — creates the store, passes to `createApiRouter`
 - `apps/api/src/routes/apiRouter.ts` — converted to `createApiRouter(store)` factory
 - `apps/api/src/routes/gameRouter.ts` — converted to `createGameRouter(store)` factory
 - `apps/api/src/controllers/gameController.ts` — converted to `createGameController(store)` factory
 - `apps/api/src/services/gameService.ts` — `store` parameter added to both functions, singleton removed
 - `apps/api/src/services/gameService.test.ts` — fresh store per describe block via `beforeEach`
 ## References
 - [Composition root pattern](https://blog.ploeh.dk/2011/07/28/CompositionRoot/)
 ---
 ## Setup guide / implementation notes
 1. `gameService.ts` — remove module-level singleton, add `store: GameSessionStore` parameter to `createGameSession` and `evaluateAnswer`
 2. `gameController.ts` — convert exported functions to a factory:
   ```ts
   export const createGameController = (store: GameSessionStore) => ({
     createGame: async (req, res, next) => { ... },
     submitAnswer: async (req, res, next) => { ... },
   });
   ```
 3. `gameRouter.ts` — convert to factory:
   ```ts
   export const createGameRouter = (store: GameSessionStore): Router => {
     const router = express.Router();
     const controller = createGameController(store);
     router.post("/start", controller.createGame);
     router.post("/answer", controller.submitAnswer);
     return router;
   };
   ```
 4. `apiRouter.ts` — convert to factory:
   ```ts
   export const createApiRouter = (store: GameSessionStore): Router => {
     const router = express.Router();
     router.use("/game", createGameRouter(store));
     return router;
   };
   ```
 5. `app.ts` — create the store at the composition root:
   ```ts
   const store = new InMemoryGameSessionStore();
   app.use("/api/v1", createApiRouter(store));
   ```
 6. `gameService.test.ts` — add `let store: InMemoryGameSessionStore` to each `describe` block, reset in `beforeEach`, pass to every service call
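
The wiring in steps 1 to 5 can be sketched without Express to show the shape of the pattern. This is a minimal illustration under assumptions, not the project's code: the store is reduced to `create`/`get`, the controller takes a plain `sessionId` instead of `req`/`res`, and `hasSession` is a hypothetical helper added only so the isolation is observable.

```ts
// Simplified stand-ins for the real types (assumed shapes, for illustration only).
type GameSessionData = { answers: Map<string, number> };

interface GameSessionStore {
  create(sessionId: string, data: GameSessionData): Promise<void>;
  get(sessionId: string): Promise<GameSessionData | undefined>;
}

class InMemoryGameSessionStore implements GameSessionStore {
  private readonly sessions = new Map<string, GameSessionData>();

  async create(sessionId: string, data: GameSessionData): Promise<void> {
    this.sessions.set(sessionId, data);
  }

  async get(sessionId: string): Promise<GameSessionData | undefined> {
    return this.sessions.get(sessionId);
  }
}

// Service: receives the store as a parameter instead of importing a singleton.
const createGameSession = async (
  sessionId: string,
  store: GameSessionStore,
): Promise<void> => {
  await store.create(sessionId, { answers: new Map([["q1", 0]]) });
};

// Controller factory: closes over the interface, never the concrete class.
const createGameController = (store: GameSessionStore) => ({
  createGame: (sessionId: string) => createGameSession(sessionId, store),
  // Hypothetical helper so callers can observe which store a controller uses.
  hasSession: async (sessionId: string) =>
    (await store.get(sessionId)) !== undefined,
});

// Composition root: the only place that names a concrete implementation.
const createApp = () => createGameController(new InMemoryGameSessionStore());
```

Each `createApp()` call builds its own store, so two apps created in separate tests never see each other's sessions.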
--- a/documentation/tickets/t00005.md
+++ b/documentation/tickets/t00005.md
@ -1,93 +0,0 @@
 # ADR: Session lifecycle — TTL and replay protection
 ## Status
 Accepted
 ## Date
 2026-04-28
 ## Context
 `InMemoryGameSessionStore` had no TTL and no cleanup mechanism. Every session created stayed in memory until the process restarted. Additionally, `evaluateAnswer` never removed a question from the answer key after evaluating it, meaning the same question could be submitted multiple times and receive a valid result each time — a potential exploit in multiplayer and a correctness bug in singleplayer.
 ## Decision
 Add a `ttlMs` parameter to `GameSessionStore.create()` so both the in-memory and future Valkey implementations handle expiry consistently. Delete questions from the answer key after evaluation. Delete the session when the last question is answered.
 ## Options considered
 ### Option A — Delete on last answer only
 Simple. Covers replay protection and normal session completion. Abandoned sessions (player starts game, never finishes) still leak memory.
 ### Option B — Delete on last answer + TTL on the interface ✅
 Delete on answer covers normal flow. TTL covers abandoned sessions. TTL on the interface means `ValKeyGameSessionStore` can use Redis-native `EXPIRE` without any interface changes during migration.
 Chosen because it closes the memory leak entirely and makes the Valkey migration a zero-interface-change operation.
 ### Option C — TTL hardcoded inside InMemoryGameSessionStore only
 Simpler short-term. But the interface wouldn't carry the TTL parameter, so `ValKeyGameSessionStore` would need a different mechanism — inconsistency between implementations.
 ## Consequences
 - Sessions expire 30 minutes after creation regardless of completion state (the TTL is passed once to `store.create()`)
 - Submitting the same question twice throws `NotFoundError` on the second attempt
 - Sessions are deleted automatically when the last question is answered
 - `GameSessionStore.create()` now requires a `ttlMs` argument — any future implementation must honour it
 - `ValKeyGameSessionStore` can implement TTL via Redis `EXPIRE` with no interface changes
 - `InMemoryGameSessionStore` stores `{ data, expiresAt }` entries instead of raw `GameSessionData` — expiry is checked lazily on `get()`
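
The `{ data, expiresAt }` layout and the lazy check on `get()` could look roughly like the sketch below. It is an assumption-laden illustration rather than the actual implementation: the `now` constructor argument is a hypothetical injectable clock added so expiry can be tested without waiting, and `GameSessionData` is reduced to the `answers` map.

```ts
type GameSessionData = { answers: Map<string, number> };
type SessionEntry = { data: GameSessionData; expiresAt: number };

class InMemoryGameSessionStore {
  private readonly entries = new Map<string, SessionEntry>();
  // Hypothetical injectable clock; defaults to wall-clock time.
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  async create(
    sessionId: string,
    data: GameSessionData,
    ttlMs: number,
  ): Promise<void> {
    // Store the absolute expiry time alongside the data.
    this.entries.set(sessionId, { data, expiresAt: this.now() + ttlMs });
  }

  async get(sessionId: string): Promise<GameSessionData | undefined> {
    const entry = this.entries.get(sessionId);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Lazy cleanup: the expired entry is removed when it is next read.
      this.entries.delete(sessionId);
      return undefined;
    }
    return entry.data;
  }

  async delete(sessionId: string): Promise<void> {
    this.entries.delete(sessionId);
  }
}
```

With a purely lazy check, an expired entry is only freed when something reads it again, so an entry that is never read stays in the map.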
 ## Affected files
 - `apps/api/src/gameSessionStore/GameSessionStore.ts` — `ttlMs` added to `create`
 - `apps/api/src/gameSessionStore/InMemoryGameSessionStore.ts` — TTL implementation
 - `apps/api/src/gameSessionStore/InMemoryGameSessionStore.test.ts` — new test file
 - `apps/api/src/services/gameService.ts` — passes TTL to `store.create`, deletes question after evaluation, deletes session when empty
 - `apps/api/src/services/gameService.test.ts` — replay protection and session cleanup tests added
 ## References
 - [Redis EXPIRE command](https://redis.io/commands/expire/)
 ---
 ## Setup guide / implementation notes
 1. `GameSessionStore.ts` — add `ttlMs` to `create`:
   ```ts
   create(sessionId: string, data: GameSessionData, ttlMs: number): Promise<void>;
   ```
 2. `InMemoryGameSessionStore.ts` — wrap stored data with expiry:
   ```ts
   type SessionEntry = { data: GameSessionData; expiresAt: number };
   ```
   Check expiry on `get()`, delete expired entries lazily.
 3. `gameService.ts` — pass TTL when creating session:
   ```ts
   await store.create(sessionId, { answers: answerKey }, 30 * 60 * 1000);
   ```
   After evaluating an answer:
   ```ts
   session.answers.delete(submission.questionId);
   if (session.answers.size === 0) {
     await store.delete(submission.sessionId);
   }
   ```
 4. When implementing `ValKeyGameSessionStore`, convert `ttlMs` to seconds and set the expiry atomically with the `EX` option of `SET` (equivalent to `SET` followed by `EXPIRE`):
   ```ts
   await valkey.set(sessionId, serialize(data), "EX", Math.ceil(ttlMs / 1000));
   ```
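The replay protection from step 3 can be sketched end to end as follows. This is an illustration under assumptions: the `AnswerSubmission` shape (including `selectedOptionId`), the return type, the message of the second `NotFoundError`, and the `createMapStore` helper are all hypothetical and exist only to make the example self-contained.

```ts
type GameSessionData = { answers: Map<string, number> };

interface GameSessionStore {
  get(sessionId: string): Promise<GameSessionData | undefined>;
  delete(sessionId: string): Promise<void>;
}

class NotFoundError extends Error {}

// Hypothetical submission shape; the real type is not shown in this ticket.
type AnswerSubmission = {
  sessionId: string;
  questionId: string;
  selectedOptionId: number;
};

const evaluateAnswer = async (
  submission: AnswerSubmission,
  store: GameSessionStore,
): Promise<{ isCorrect: boolean }> => {
  const session = await store.get(submission.sessionId);
  if (!session) {
    throw new NotFoundError(`Game session not found: ${submission.sessionId}`);
  }

  const correctOptionId = session.answers.get(submission.questionId);
  if (correctOptionId === undefined) {
    // Already answered, or never part of this session: reject the replay.
    throw new NotFoundError(`Question not found: ${submission.questionId}`);
  }

  // Remove the question so it cannot be evaluated a second time.
  session.answers.delete(submission.questionId);
  if (session.answers.size === 0) {
    // Last question answered: the session is complete, so drop it.
    await store.delete(submission.sessionId);
  }

  return { isCorrect: submission.selectedOptionId === correctOptionId };
};

// Tiny Map-backed store used only to exercise the function above.
const createMapStore = (
  sessions: Map<string, GameSessionData>,
): GameSessionStore => ({
  get: async (id) => sessions.get(id),
  delete: async (id) => {
    sessions.delete(id);
  },
});
```

Mutating `session.answers` in place works here because the in-memory store hands back the stored object itself; a store that serialises its data would presumably need to write the updated answer key back.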
--- a/documentation/tickets/t00006.md
+++ b/documentation/tickets/t00006.md
@ -1,125 +0,0 @@
 # ADR: Session ownership check and AuthenticatedRequest type
 ## Status
 Accepted
 ## Date
 2026-04-28
 ## Context
 `evaluateAnswer` accepted any `sessionId` without verifying it belonged to the requesting user. The only protection was the unguessability of a UUID — security through obscurity. If a user intercepted or guessed another user's `sessionId`, they could submit answers on their behalf.
 Additionally, protected controller handlers typed their `req` parameter as `Request`, making `session` optional even though `requireAuth` middleware guarantees it is present. This required non-null assertions (`req.session!`) in business logic — a type assertion that could cause a runtime crash if middleware ordering ever changed.
 ## Decision
 Store `userId` in `GameSessionData`. Pass `userId` from the controller into both `createGameSession` and `evaluateAnswer`. Assert ownership on evaluation — if the session's `userId` doesn't match the requesting user's ID, throw `NotFoundError`. Introduce `AuthenticatedRequest` to eliminate non-null assertions in protected handlers.
 ## Options considered
 ### Option A — AuthenticatedRequest type ✅
 Define `AuthenticatedRequest = Request & { session: { session: Session; user: User } }` in `types/express.d.ts`. Use it in protected controller handlers instead of `Request`. Requires a single `as express.RequestHandler` cast at route registration due to Express's type limitations.
 Chosen because: eliminates dangerous non-null assertions in business logic. The cast at route registration works around a limitation in Express's types; it does not paper over uncertain logic.
 ### Option B — Non-null assertion (`req.session!`)
 Keep `Request` on all handlers. Assert `req.session!` at every usage.
 Rejected because: non-null assertions in business logic are dangerous. If middleware ordering ever changes, the assertion still passes type-checking and the handler crashes at runtime when it dereferences the missing session.
 ---
 ### Option C — NotFoundError (404) on ownership failure ✅
 When a session exists but belongs to a different user, throw `NotFoundError` with the same message as a missing session.
 Chosen because: session IDs are opaque secrets. Returning 403 would confirm to the caller that the session ID is valid and belongs to someone else, which is information they shouldn't have. GitHub's REST API follows the same pattern, returning 404 for private resources the caller cannot access.
 ### Option D — ForbiddenError (403) on ownership failure
 Explicit error that distinguishes "not found" from "not allowed".
 Rejected because: for user-owned resources identified by opaque IDs, confirming existence to an unauthorised caller is an information leak. Returning 404 is the common convention for this case.
 ## Consequences
 - Alice cannot submit answers for Bob's session — ownership is verified at the service layer
 - `req.session.user.id` is accessible without non-null assertions in protected handlers
 - `GameSessionData` now carries `userId` — any future `GameSessionStore` implementation must store and return it
 - Route registration requires `as express.RequestHandler` cast for protected handlers — one cast per route, in wiring code only
 - `ValKeyGameSessionStore` must serialise and deserialise `userId` alongside `answers`
 ## Affected files
 - `apps/api/src/types/express.d.ts` — `AuthenticatedRequest` type added
 - `apps/api/src/gameSessionStore/GameSessionStore.ts` — `userId` added to `GameSessionData`
 - `apps/api/src/gameSessionStore/InMemoryGameSessionStore.test.ts` — updated data fixtures
 - `apps/api/src/services/gameService.ts` — `userId` parameter added to both functions, ownership assertion in `evaluateAnswer`
 - `apps/api/src/services/gameService.test.ts` — updated all calls, ownership test added
 - `apps/api/src/controllers/gameController.ts` — extracts `userId` from `req.session.user.id`, passes to service calls
 - `apps/api/src/routes/gameRouter.ts` — `as express.RequestHandler` cast at route registration
 ## References
 - [OWASP: Insecure Direct Object Reference](https://owasp.org/www-community/attacks/Insecure_Direct_Object_Reference)
 - [403 Forbidden vs 401 Unauthorized HTTP responses](https://stackoverflow.com/questions/3297048/403-forbidden-vs-401-unauthorized-http-responses)
 ---
 ## Setup guide / implementation notes
 1. `types/express.d.ts` — add:
   ```ts
   export type AuthenticatedRequest = Request & {
     session: { session: Session; user: User };
   };
   ```
 2. `GameSessionStore.ts` — add `userId` to `GameSessionData`:
   ```ts
   export type GameSessionData = { answers: Map<string, number>; userId: string };
   ```
 3. `gameService.ts` — add `userId` to both function signatures:
   ```ts
   export const createGameSession = async (
     request: GameRequest,
     store: GameSessionStore,
     userId: string,
   ): Promise<GameSession>
   ```
   Store it on create:
   ```ts
   await store.create(sessionId, { answers: answerKey, userId }, 30 * 60 * 1000);
   ```
   Assert on evaluate:
   ```ts
   if (!session || session.userId !== userId) {
     throw new NotFoundError(`Game session not found: ${submission.sessionId}`);
   }
   ```
 4. `gameController.ts` — extract from authenticated request:
   ```ts
   req.session.user.id
   ```
 5. `gameRouter.ts` — cast at registration:
   ```ts
   router.post("/start", controller.createGame as express.RequestHandler);
   router.post("/answer", controller.submitAnswer as express.RequestHandler);
   ```
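The ownership assertion from step 3 can be isolated into a small function to show why a missing session and a foreign session are indistinguishable to the caller. This is a sketch only: `assertSessionOwner` is a hypothetical helper name, since the ticket places the check inline in `evaluateAnswer`.

```ts
class NotFoundError extends Error {}

type GameSessionData = { answers: Map<string, number>; userId: string };

// Hypothetical helper: returns the session only if it exists and belongs to
// the requesting user; otherwise throws the same error for both cases.
const assertSessionOwner = (
  session: GameSessionData | undefined,
  userId: string,
  sessionId: string,
): GameSessionData => {
  if (!session || session.userId !== userId) {
    // Same message whether the session is missing or owned by someone else,
    // so the response does not reveal that the ID is valid.
    throw new NotFoundError(`Game session not found: ${sessionId}`);
  }
  return session;
};
```

Because both failure paths produce an identical error, a caller probing session IDs learns nothing about which ones exist.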
--- a/documentation/tickets/t00007.md
+++ b/documentation/tickets/t00007.md
@ -1,41 +0,0 @@
 # feat: guard against empty terms in createGameSession
 ## Problem
 If `getGameTerms` returned an empty array — no vocabulary data matched the requested language, difficulty, and part of speech combination — `createGameSession` would create a session with zero questions and return it. The frontend would receive an empty `questions` array, attempt to render the first question, find nothing, and crash with no useful error message shown to the user.
 ## Options considered
 ### Option A — `NotFoundError` (404) ✅
 Throw when `terms.length === 0` before any session is created. The combination of filters yielded no data — that's a "not found" situation.
 Chosen because: the request is technically valid (all filter values are recognised), but the combination has no matching data. 404 is the correct semantic response.
 ### Option B — `ValidationError` (400)
 Treat empty results as a bad request.
 Rejected because: the client sent valid input. The problem is missing data, not invalid input. 400 would be misleading.
 ## Solution
 Added a guard in `createGameSession` immediately after `getGameTerms`:
 ```ts
 if (terms.length === 0) {
  throw new NotFoundError("No terms found for the given filters");
 }
 ```
 The error propagates through the controller's `try/catch` to the error handler, which returns a clean 404 response. No session is created.
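
The path from the guard to the 404 response can be sketched as below. The guard matches the snippet above; `assertTermsFound`, the `Term` shape, and `toErrorResponse` are hypothetical names standing in for the service guard and the project's error handler, whose actual code is not shown in this ticket.

```ts
class NotFoundError extends Error {}

// Assumed minimal term shape, for illustration only.
type Term = { sourceText: string; targetText: string };

// The guard: fail before any session is created.
const assertTermsFound = (terms: Term[]): Term[] => {
  if (terms.length === 0) {
    throw new NotFoundError("No terms found for the given filters");
  }
  return terms;
};

type ErrorResponse = { status: number; body: { error: string } };

// Hypothetical error-handler logic: map known error classes to status codes
// and fall back to a generic 500 for anything unexpected.
const toErrorResponse = (err: unknown): ErrorResponse => {
  if (err instanceof NotFoundError) {
    return { status: 404, body: { error: err.message } };
  }
  return { status: 500, body: { error: "Internal server error" } };
};
```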
 ## Files changed
 - `apps/api/src/services/gameService.ts` — empty terms guard added
 - `apps/api/src/services/gameService.test.ts` — pinning test added
 - `apps/api/src/controllers/gameController.test.ts` — pinning test added at HTTP layer
 ## Commit
 `feat: guard against empty terms in createGameSession`
--- a/documentation/tickets/t00008.md
+++ b/documentation/tickets/t00008.md
@ -1,54 +0,0 @@
 # fix: deduplicate distractors, replace tautological test
 ## Problem
 Two issues in `createGameSession` and its test suite:
 1. If `getDistractors` returned the correct answer as one of the distractors, `createGameSession` would include it in the options array without filtering it out. `indexOf` would then find the first occurrence, which might not be the one intended as the correct answer — producing a question where the correct answer appears twice and the stored `correctOptionId` is wrong.
 2. The test `"distractors are never the correct answer"` was tautological — it filtered the correct answer out of the options array, then asserted the remaining items were not the correct answer. It was testing that `Array.filter()` works. It could never fail.
 ## Options considered
 ### Option A — Filter duplicates after fetching, request extra distractors as buffer ✅
 Filter out any distractor that matches the correct answer after fetching. Request 6 distractors instead of 3 to ensure enough remain after deduplication. Take the first 3 valid ones with `slice(0, 3)`.
 Chosen because: deduplication at the service layer is the right place — `getDistractors` shouldn't need to know what the correct answer is. Requesting extra provides a buffer against collisions.
 ### Option B — Fix `getDistractors` to never return the correct answer
 Add a NOT filter in the database query.
 Not chosen for this ticket — the database query is in `@lila/db` and is a separate concern. The service layer should be defensive regardless of what the model layer returns.
 ## Solution
 - Filter distractors against the correct answer before building options:
  ```ts
  const uniqueDistractors = distractorTexts.filter((t) => t !== term.targetText);
  const optionTexts = [term.targetText, ...uniqueDistractors.slice(0, 3)];
  ```
 - Request 6 distractors instead of 3 to account for potential duplicates
 - Replaced tautological test with a test that actually exercises the duplicate case:
  ```ts
  it("correct answer appears exactly once even if getDistractors returns a duplicate", ...)
  ```
 - Added distractor failure propagation test:
  ```ts
  it("propagates getDistractors failure", ...)
  ```
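Pulled out into a standalone function, the deduplication step might look like the sketch below. `buildOptions`, the `QuizOptions` type, and the injectable `shuffle` parameter are assumptions made so the behaviour can be checked deterministically; the ticket itself only shows the two lines of filtering and slicing.

```ts
type QuizOptions = { options: string[]; correctOptionId: number };

// Hypothetical helper wrapping the deduplication shown above.
// `shuffle` defaults to a plain copy so results are deterministic in tests.
const buildOptions = (
  targetText: string,
  distractorTexts: string[],
  shuffle: (items: string[]) => string[] = (items) => [...items],
): QuizOptions => {
  // Drop any distractor that collides with the correct answer.
  const uniqueDistractors = distractorTexts.filter((t) => t !== targetText);
  // Keep the first three survivors of the six requested.
  const optionTexts = shuffle([targetText, ...uniqueDistractors.slice(0, 3)]);
  // The correct answer now appears exactly once, so indexOf is unambiguous.
  return {
    options: optionTexts,
    correctOptionId: optionTexts.indexOf(targetText),
  };
};
```

Because the filter runs before the options are assembled, `indexOf` can only ever find one occurrence of the correct answer, whatever order the shuffle produces.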
 ## Files changed
 - `apps/api/src/services/gameService.ts` — deduplication logic, distractor count increased to 6
 - `apps/api/src/services/gameService.test.ts` — tautological test replaced, failure test added
 ## Commit
 `fix: deduplicate distractors, replace tautological test, add distractor failure test`
--- a/eslint.config.mjs
+++ b/eslint.config.mjs
@ -12,7 +12,6 @@ export default defineConfig([
    "node_modules/",
    "routeTree.gen.ts",
    "scripts/**",
    "data-pipeline/**/*",
  ]),
  eslint.configs.recommended,
--- a/package.json
+++ b/package.json
@ -23,7 +23,7 @@
      "prettier --write"
    ]
  },
-  "packageManager": "pnpm@10.33.1",
+  "packageManager": "pnpm@10.33.2",
  "devDependencies": {
    "@eslint/js": "^10.0.1",
    "@tanstack/eslint-plugin-router": "^1.161.6",
--- a/packages/db/drizzle/0011_nice_spyke.sql
+++ b/packages/db/drizzle/0011_nice_spyke.sql
@ -0,0 +1,46 @@
 CREATE TABLE "entry_translations" (
 	"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
 	"entry_id" uuid NOT NULL,
 	"target_language_code" varchar(10) NOT NULL,
 	"translation" text NOT NULL,
 	"sense_hint" text,
 	"cefr_level" varchar(2),
 	"difficulty" varchar(20),
 	"source" varchar(50) DEFAULT 'kaikki' NOT NULL,
 	"created_at" timestamp with time zone DEFAULT now() NOT NULL,
 	CONSTRAINT "unique_translation" UNIQUE("entry_id","target_language_code","translation"),
 	CONSTRAINT "target_language_code_check" CHECK ("entry_translations"."target_language_code" IN ('en', 'it', 'de', 'fr', 'es')),
 	CONSTRAINT "cefr_check" CHECK ("entry_translations"."cefr_level" IS NULL OR "entry_translations"."cefr_level" IN ('A1', 'A2', 'B1', 'B2', 'C1', 'C2')),
 	CONSTRAINT "difficulty_check" CHECK ("entry_translations"."difficulty" IS NULL OR "entry_translations"."difficulty" IN ('easy', 'intermediate', 'hard'))
 );
 --> statement-breakpoint
 CREATE TABLE "vocabulary_entries" (
 	"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
 	"headword" text NOT NULL,
 	"language_code" varchar(10) NOT NULL,
 	"pos" varchar(20) NOT NULL,
 	"sense_index" smallint DEFAULT 0 NOT NULL,
 	"gloss" text,
 	"examples" text[] DEFAULT '{}' NOT NULL,
 	"cefr_level" varchar(2),
 	"difficulty" varchar(20),
 	"source" varchar(50) DEFAULT 'kaikki' NOT NULL,
 	"created_at" timestamp with time zone DEFAULT now() NOT NULL,
 	CONSTRAINT "unique_entry" UNIQUE("headword","language_code","pos","sense_index"),
 	CONSTRAINT "language_code_check" CHECK ("vocabulary_entries"."language_code" IN ('en', 'it', 'de', 'fr', 'es')),
 	CONSTRAINT "pos_check" CHECK ("vocabulary_entries"."pos" IN ('noun', 'verb', 'adjective', 'adverb')),
 	CONSTRAINT "cefr_check" CHECK ("vocabulary_entries"."cefr_level" IS NULL OR "vocabulary_entries"."cefr_level" IN ('A1', 'A2', 'B1', 'B2', 'C1', 'C2')),
 	CONSTRAINT "difficulty_check" CHECK ("vocabulary_entries"."difficulty" IS NULL OR "vocabulary_entries"."difficulty" IN ('easy', 'intermediate', 'hard'))
 );
 --> statement-breakpoint
 DROP TABLE "deck_terms" CASCADE;--> statement-breakpoint
 DROP TABLE "decks" CASCADE;--> statement-breakpoint
 DROP TABLE "term_examples" CASCADE;--> statement-breakpoint
 DROP TABLE "term_glosses" CASCADE;--> statement-breakpoint
 DROP TABLE "term_topics" CASCADE;--> statement-breakpoint
 DROP TABLE "terms" CASCADE;--> statement-breakpoint
 DROP TABLE "topics" CASCADE;--> statement-breakpoint
 DROP TABLE "translations" CASCADE;--> statement-breakpoint
 ALTER TABLE "entry_translations" ADD CONSTRAINT "entry_translations_entry_id_vocabulary_entries_id_fk" FOREIGN KEY ("entry_id") REFERENCES "public"."vocabulary_entries"("id") ON DELETE cascade ON UPDATE no action;--> statement-breakpoint
 CREATE INDEX "idx_translations_target_lang" ON "entry_translations" USING btree ("target_language_code","difficulty","entry_id");--> statement-breakpoint
 CREATE INDEX "idx_entries_lang_pos" ON "vocabulary_entries" USING btree ("language_code","pos","difficulty");
--- a/packages/db/drizzle/meta/0011_snapshot.json
+++ b/packages/db/drizzle/meta/0011_snapshot.json
@ -0,0 +1,750 @@
 {
  "id": "6f1811a6-8573-4d43-912a-ceb5191341cc",
  "prevId": "6c1cb049-807d-43d0-b83e-d3575b80de33",
  "version": "7",
  "dialect": "postgresql",
  "tables": {
    "public.account": {
      "name": "account",
      "schema": "",
      "columns": {
        "id": {
          "name": "id",
          "type": "text",
          "primaryKey": true,
          "notNull": true
        },
        "account_id": {
          "name": "account_id",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "provider_id": {
          "name": "provider_id",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "user_id": {
          "name": "user_id",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "access_token": {
          "name": "access_token",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "refresh_token": {
          "name": "refresh_token",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "id_token": {
          "name": "id_token",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "access_token_expires_at": {
          "name": "access_token_expires_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": false
        },
        "refresh_token_expires_at": {
          "name": "refresh_token_expires_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": false
        },
        "scope": {
          "name": "scope",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "password": {
          "name": "password",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "created_at": {
          "name": "created_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        },
        "updated_at": {
          "name": "updated_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true
        }
      },
      "indexes": {
        "account_userId_idx": {
          "name": "account_userId_idx",
          "columns": [
            {
              "expression": "user_id",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            }
          ],
          "isUnique": false,
          "concurrently": false,
          "method": "btree",
          "with": {}
        }
      },
      "foreignKeys": {
        "account_user_id_user_id_fk": {
          "name": "account_user_id_user_id_fk",
          "tableFrom": "account",
          "tableTo": "user",
          "columnsFrom": ["user_id"],
          "columnsTo": ["id"],
          "onDelete": "cascade",
          "onUpdate": "no action"
        }
      },
      "compositePrimaryKeys": {},
      "uniqueConstraints": {},
      "policies": {},
      "checkConstraints": {},
      "isRLSEnabled": false
    },
    "public.entry_translations": {
      "name": "entry_translations",
      "schema": "",
      "columns": {
        "id": {
          "name": "id",
          "type": "uuid",
          "primaryKey": true,
          "notNull": true,
          "default": "gen_random_uuid()"
        },
        "entry_id": {
          "name": "entry_id",
          "type": "uuid",
          "primaryKey": false,
          "notNull": true
        },
        "target_language_code": {
          "name": "target_language_code",
          "type": "varchar(10)",
          "primaryKey": false,
          "notNull": true
        },
        "translation": {
          "name": "translation",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "sense_hint": {
          "name": "sense_hint",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "cefr_level": {
          "name": "cefr_level",
          "type": "varchar(2)",
          "primaryKey": false,
          "notNull": false
        },
        "difficulty": {
          "name": "difficulty",
          "type": "varchar(20)",
          "primaryKey": false,
          "notNull": false
        },
        "source": {
          "name": "source",
          "type": "varchar(50)",
          "primaryKey": false,
          "notNull": true,
          "default": "'kaikki'"
        },
        "created_at": {
          "name": "created_at",
          "type": "timestamp with time zone",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        }
      },
      "indexes": {
        "idx_translations_target_lang": {
          "name": "idx_translations_target_lang",
          "columns": [
            {
              "expression": "target_language_code",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            },
            {
              "expression": "difficulty",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            },
            {
              "expression": "entry_id",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            }
          ],
          "isUnique": false,
          "concurrently": false,
          "method": "btree",
          "with": {}
        }
      },
      "foreignKeys": {
        "entry_translations_entry_id_vocabulary_entries_id_fk": {
          "name": "entry_translations_entry_id_vocabulary_entries_id_fk",
          "tableFrom": "entry_translations",
          "tableTo": "vocabulary_entries",
          "columnsFrom": ["entry_id"],
          "columnsTo": ["id"],
          "onDelete": "cascade",
          "onUpdate": "no action"
        }
      },
      "compositePrimaryKeys": {},
      "uniqueConstraints": {
        "unique_translation": {
          "name": "unique_translation",
          "nullsNotDistinct": false,
          "columns": ["entry_id", "target_language_code", "translation"]
        }
      },
      "policies": {},
      "checkConstraints": {
        "target_language_code_check": {
          "name": "target_language_code_check",
          "value": "\"entry_translations\".\"target_language_code\" IN ('en', 'it', 'de', 'fr', 'es')"
        },
        "cefr_check": {
          "name": "cefr_check",
          "value": "\"entry_translations\".\"cefr_level\" IS NULL OR \"entry_translations\".\"cefr_level\" IN ('A1', 'A2', 'B1', 'B2', 'C1', 'C2')"
        },
        "difficulty_check": {
          "name": "difficulty_check",
          "value": "\"entry_translations\".\"difficulty\" IS NULL OR \"entry_translations\".\"difficulty\" IN ('easy', 'intermediate', 'hard')"
        }
      },
      "isRLSEnabled": false
    },
    "public.lobbies": {
      "name": "lobbies",
      "schema": "",
      "columns": {
        "id": {
          "name": "id",
          "type": "uuid",
          "primaryKey": true,
          "notNull": true,
          "default": "gen_random_uuid()"
        },
        "code": {
          "name": "code",
          "type": "varchar(10)",
          "primaryKey": false,
          "notNull": true
        },
        "host_user_id": {
          "name": "host_user_id",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "status": {
          "name": "status",
          "type": "varchar(20)",
          "primaryKey": false,
          "notNull": true,
          "default": "'waiting'"
        },
        "created_at": {
          "name": "created_at",
          "type": "timestamp with time zone",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        }
      },
      "indexes": {},
      "foreignKeys": {
        "lobbies_host_user_id_user_id_fk": {
          "name": "lobbies_host_user_id_user_id_fk",
          "tableFrom": "lobbies",
          "tableTo": "user",
          "columnsFrom": ["host_user_id"],
          "columnsTo": ["id"],
          "onDelete": "cascade",
          "onUpdate": "no action"
        }
      },
      "compositePrimaryKeys": {},
      "uniqueConstraints": {
        "lobbies_code_unique": {
          "name": "lobbies_code_unique",
          "nullsNotDistinct": false,
          "columns": ["code"]
        }
      },
      "policies": {},
      "checkConstraints": {
        "lobby_status_check": {
          "name": "lobby_status_check",
          "value": "\"lobbies\".\"status\" IN ('waiting', 'in_progress', 'finished')"
        }
      },
      "isRLSEnabled": false
    },
    "public.lobby_players": {
      "name": "lobby_players",
      "schema": "",
      "columns": {
        "lobby_id": {
          "name": "lobby_id",
          "type": "uuid",
          "primaryKey": false,
          "notNull": true
        },
        "user_id": {
          "name": "user_id",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "score": {
          "name": "score",
          "type": "integer",
          "primaryKey": false,
          "notNull": true,
          "default": 0
        },
        "joined_at": {
          "name": "joined_at",
          "type": "timestamp with time zone",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        }
      },
      "indexes": {},
      "foreignKeys": {
        "lobby_players_lobby_id_lobbies_id_fk": {
          "name": "lobby_players_lobby_id_lobbies_id_fk",
          "tableFrom": "lobby_players",
          "tableTo": "lobbies",
          "columnsFrom": ["lobby_id"],
          "columnsTo": ["id"],
          "onDelete": "cascade",
          "onUpdate": "no action"
        },
        "lobby_players_user_id_user_id_fk": {
          "name": "lobby_players_user_id_user_id_fk",
          "tableFrom": "lobby_players",
          "tableTo": "user",
          "columnsFrom": ["user_id"],
          "columnsTo": ["id"],
          "onDelete": "cascade",
          "onUpdate": "no action"
        }
      },
      "compositePrimaryKeys": {
        "lobby_players_lobby_id_user_id_pk": {
          "name": "lobby_players_lobby_id_user_id_pk",
          "columns": ["lobby_id", "user_id"]
        }
      },
      "uniqueConstraints": {},
      "policies": {},
      "checkConstraints": {},
      "isRLSEnabled": false
    },
    "public.session": {
      "name": "session",
      "schema": "",
      "columns": {
        "id": {
          "name": "id",
          "type": "text",
          "primaryKey": true,
          "notNull": true
        },
        "expires_at": {
          "name": "expires_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true
        },
        "token": {
          "name": "token",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "created_at": {
          "name": "created_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        },
        "updated_at": {
          "name": "updated_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true
        },
        "ip_address": {
          "name": "ip_address",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "user_agent": {
          "name": "user_agent",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "user_id": {
          "name": "user_id",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        }
      },
      "indexes": {
        "session_userId_idx": {
          "name": "session_userId_idx",
          "columns": [
            {
              "expression": "user_id",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            }
          ],
          "isUnique": false,
          "concurrently": false,
          "method": "btree",
          "with": {}
        }
      },
      "foreignKeys": {
        "session_user_id_user_id_fk": {
          "name": "session_user_id_user_id_fk",
          "tableFrom": "session",
          "tableTo": "user",
          "columnsFrom": ["user_id"],
          "columnsTo": ["id"],
          "onDelete": "cascade",
          "onUpdate": "no action"
        }
      },
      "compositePrimaryKeys": {},
      "uniqueConstraints": {
        "session_token_unique": {
          "name": "session_token_unique",
          "nullsNotDistinct": false,
          "columns": ["token"]
        }
      },
      "policies": {},
      "checkConstraints": {},
      "isRLSEnabled": false
    },
    "public.user": {
      "name": "user",
      "schema": "",
      "columns": {
        "id": {
          "name": "id",
          "type": "text",
          "primaryKey": true,
          "notNull": true
        },
        "name": {
          "name": "name",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "email": {
          "name": "email",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "email_verified": {
          "name": "email_verified",
          "type": "boolean",
          "primaryKey": false,
          "notNull": true,
          "default": false
        },
        "image": {
          "name": "image",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "created_at": {
          "name": "created_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        },
        "updated_at": {
          "name": "updated_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        }
      },
      "indexes": {},
      "foreignKeys": {},
      "compositePrimaryKeys": {},
      "uniqueConstraints": {
        "user_email_unique": {
          "name": "user_email_unique",
          "nullsNotDistinct": false,
          "columns": ["email"]
        }
      },
      "policies": {},
      "checkConstraints": {},
      "isRLSEnabled": false
    },
    "public.verification": {
      "name": "verification",
      "schema": "",
      "columns": {
        "id": {
          "name": "id",
          "type": "text",
          "primaryKey": true,
          "notNull": true
        },
        "identifier": {
          "name": "identifier",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "value": {
          "name": "value",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "expires_at": {
          "name": "expires_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true
        },
        "created_at": {
          "name": "created_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        },
        "updated_at": {
          "name": "updated_at",
          "type": "timestamp",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        }
      },
      "indexes": {
        "verification_identifier_idx": {
          "name": "verification_identifier_idx",
          "columns": [
            {
              "expression": "identifier",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            }
          ],
          "isUnique": false,
          "concurrently": false,
          "method": "btree",
          "with": {}
        }
      },
      "foreignKeys": {},
      "compositePrimaryKeys": {},
      "uniqueConstraints": {},
      "policies": {},
      "checkConstraints": {},
      "isRLSEnabled": false
    },
    "public.vocabulary_entries": {
      "name": "vocabulary_entries",
      "schema": "",
      "columns": {
        "id": {
          "name": "id",
          "type": "uuid",
          "primaryKey": true,
          "notNull": true,
          "default": "gen_random_uuid()"
        },
        "headword": {
          "name": "headword",
          "type": "text",
          "primaryKey": false,
          "notNull": true
        },
        "language_code": {
          "name": "language_code",
          "type": "varchar(10)",
          "primaryKey": false,
          "notNull": true
        },
        "pos": {
          "name": "pos",
          "type": "varchar(20)",
          "primaryKey": false,
          "notNull": true
        },
        "sense_index": {
          "name": "sense_index",
          "type": "smallint",
          "primaryKey": false,
          "notNull": true,
          "default": 0
        },
        "gloss": {
          "name": "gloss",
          "type": "text",
          "primaryKey": false,
          "notNull": false
        },
        "examples": {
          "name": "examples",
          "type": "text[]",
          "primaryKey": false,
          "notNull": true,
          "default": "'{}'"
        },
        "cefr_level": {
          "name": "cefr_level",
          "type": "varchar(2)",
          "primaryKey": false,
          "notNull": false
        },
        "difficulty": {
          "name": "difficulty",
          "type": "varchar(20)",
          "primaryKey": false,
          "notNull": false
        },
        "source": {
          "name": "source",
          "type": "varchar(50)",
          "primaryKey": false,
          "notNull": true,
          "default": "'kaikki'"
        },
        "created_at": {
          "name": "created_at",
          "type": "timestamp with time zone",
          "primaryKey": false,
          "notNull": true,
          "default": "now()"
        }
      },
      "indexes": {
        "idx_entries_lang_pos": {
          "name": "idx_entries_lang_pos",
          "columns": [
            {
              "expression": "language_code",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            },
            {
              "expression": "pos",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            },
            {
              "expression": "difficulty",
              "isExpression": false,
              "asc": true,
              "nulls": "last"
            }
          ],
          "isUnique": false,
          "concurrently": false,
          "method": "btree",
          "with": {}
        }
      },
      "foreignKeys": {},
      "compositePrimaryKeys": {},
      "uniqueConstraints": {
        "unique_entry": {
          "name": "unique_entry",
          "nullsNotDistinct": false,
          "columns": ["headword", "language_code", "pos", "sense_index"]
        }
      },
      "policies": {},
      "checkConstraints": {
        "language_code_check": {
          "name": "language_code_check",
          "value": "\"vocabulary_entries\".\"language_code\" IN ('en', 'it', 'de', 'fr', 'es')"
        },
        "pos_check": {
          "name": "pos_check",
          "value": "\"vocabulary_entries\".\"pos\" IN ('noun', 'verb', 'adjective', 'adverb')"
        },
        "cefr_check": {
          "name": "cefr_check",
          "value": "\"vocabulary_entries\".\"cefr_level\" IS NULL OR \"vocabulary_entries\".\"cefr_level\" IN ('A1', 'A2', 'B1', 'B2', 'C1', 'C2')"
        },
        "difficulty_check": {
          "name": "difficulty_check",
          "value": "\"vocabulary_entries\".\"difficulty\" IS NULL OR \"vocabulary_entries\".\"difficulty\" IN ('easy', 'intermediate', 'hard')"
        }
      },
      "isRLSEnabled": false
    }
  },
  "enums": {},
  "schemas": {},
  "sequences": {},
  "roles": {},
  "policies": {},
  "views": {},
  "_meta": { "columns": {}, "schemas": {}, "tables": {} }
 }
--- a/packages/db/drizzle/meta/_journal.json
+++ b/packages/db/drizzle/meta/_journal.json
@ -78,6 +78,13 @@
      "when": 1776929932845,
      "tag": "0010_thankful_reaper",
      "breakpoints": true
+    },
+    {
+      "idx": 11,
+      "version": "7",
+      "when": 1777994750330,
+      "tag": "0011_nice_spyke",
+      "breakpoints": true
    }
  ]
 }
--- a/packages/db/src/db/schema.ts
+++ b/packages/db/src/db/schema.ts
@ -10,6 +10,7 @@ import {
  index,
  boolean,
  integer,
+  smallint,
 } from "drizzle-orm/pg-core";
 import { sql, relations } from "drizzle-orm";
@ -18,182 +19,100 @@ import {
  SUPPORTED_POS,
  SUPPORTED_LANGUAGE_CODES,
  CEFR_LEVELS,
  SUPPORTED_DECK_TYPES,
  DIFFICULTY_LEVELS,
  LOBBY_STATUSES,
 } from "@lila/shared";
-export const terms = pgTable(
+// ── Vocabulary ────────────────────────────────────────────────────────────────
-  "terms",
+
 export const vocabulary_entries = pgTable(
  "vocabulary_entries",
  {
    id: uuid().primaryKey().defaultRandom(),
-    source: varchar({ length: 50 }), // 'omw', 'wiktionary', null for manual
+    headword: text().notNull(),
-    source_id: text(), // synset_id value for omw, wiktionary QID, etc.
+    language_code: varchar({ length: 10 }).notNull(),
    pos: varchar({ length: 20 }).notNull(),
    sense_index: smallint().notNull().default(0),
    gloss: text(),
    examples: text().array().notNull().default([]),
    cefr_level: varchar({ length: 2 }),
    difficulty: varchar({ length: 20 }),
    source: varchar({ length: 50 }).notNull().default("kaikki"),
    created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [
    unique("unique_entry").on(
      table.headword,
      table.language_code,
      table.pos,
      table.sense_index,
    ),
    check(
      "language_code_check",
      sql`${table.language_code} IN (${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))})`,
    ),
    check(
      "pos_check",
      sql`${table.pos} IN (${sql.raw(SUPPORTED_POS.map((p) => `'${p}'`).join(", "))})`,
    ),
    unique("unique_source_id").on(table.source, table.source_id),
    index("idx_terms_source_pos").on(table.source, table.pos),
  ],
 );
 export const term_glosses = pgTable(
  "term_glosses",
  {
    id: uuid().primaryKey().defaultRandom(),
    term_id: uuid()
      .notNull()
      .references(() => terms.id, { onDelete: "cascade" }),
    language_code: varchar({ length: 10 }).notNull(),
    text: text().notNull(),
    description: text(),
    created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [
    unique("unique_term_gloss").on(table.term_id, table.language_code),
    check(
      "language_code_check",
      sql`${table.language_code} IN (${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))})`,
    ),
  ],
 );
 export const term_examples = pgTable(
  "term_examples",
  {
    id: uuid().primaryKey().defaultRandom(),
    term_id: uuid()
      .notNull()
      .references(() => terms.id, { onDelete: "cascade" }),
    language_code: varchar({ length: 10 }).notNull(),
    text: text().notNull(),
    created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [
    unique("unique_term_example").on(
      table.term_id,
      table.language_code,
      table.text,
    ),
    check(
      "language_code_check",
      sql`${table.language_code} IN (${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))})`,
    ),
    index("idx_term_examples_term_id").on(table.term_id, table.language_code),
  ],
 );
 export const translations = pgTable(
  "translations",
  {
    id: uuid().primaryKey().defaultRandom(),
    term_id: uuid()
      .notNull()
      .references(() => terms.id, { onDelete: "cascade" }),
    language_code: varchar({ length: 10 }).notNull(),
    text: text().notNull(),
    cefr_level: varchar({ length: 2 }),
    difficulty: varchar({ length: 20 }),
    created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [
    unique("unique_translations").on(
      table.term_id,
      table.language_code,
      table.text,
    ),
    check(
      "language_code_check",
      sql`${table.language_code} IN (${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))})`,
    ),
    check(
      "cefr_check",
-      sql`${table.cefr_level} IN (${sql.raw(CEFR_LEVELS.map((l) => `'${l}'`).join(", "))})`,
+      sql`${table.cefr_level} IS NULL OR ${table.cefr_level} IN (${sql.raw(CEFR_LEVELS.map((l) => `'${l}'`).join(", "))})`,
    ),
    check(
      "difficulty_check",
-      sql`${table.difficulty} IN (${sql.raw(DIFFICULTY_LEVELS.map((d) => `'${d}'`).join(", "))})`,
+      sql`${table.difficulty} IS NULL OR ${table.difficulty} IN (${sql.raw(DIFFICULTY_LEVELS.map((d) => `'${d}'`).join(", "))})`,
    ),
-    index("idx_translations_lang").on(
+    index("idx_entries_lang_pos").on(
      table.language_code,
+      table.pos,
      table.difficulty,
-      table.cefr_level,
-      table.term_id,
    ),
  ],
 );
-export const decks = pgTable(
+export const entry_translations = pgTable(
-  "decks",
+  "entry_translations",
  {
    id: uuid().primaryKey().defaultRandom(),
-    name: text().notNull(),
+    entry_id: uuid()
-    description: text(),
+      .notNull()
-    source_language: varchar({ length: 10 }).notNull(),
+      .references(() => vocabulary_entries.id, { onDelete: "cascade" }),
-    validated_languages: varchar({ length: 10 }).array().notNull().default([]),
+    target_language_code: varchar({ length: 10 }).notNull(),
-    type: varchar({ length: 20 }).notNull(),
+    translation: text().notNull(),
+    sense_hint: text(),
+    cefr_level: varchar({ length: 2 }),
+    difficulty: varchar({ length: 20 }),
+    source: varchar({ length: 50 }).notNull().default("kaikki"),
    created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [
-    check(
+    unique("unique_translation").on(
-      "source_language_check",
+      table.entry_id,
-      sql`${table.source_language} IN (${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))})`,
+      table.target_language_code,
+      table.translation,
    ),
    check(
-      "validated_languages_check",
+      "target_language_code_check",
-      sql`validated_languages <@ ARRAY[${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))}]::varchar[]`,
+      sql`${table.target_language_code} IN (${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))})`,
    ),
    check(
-      "validated_languages_excludes_source",
+      "cefr_check",
-      sql`NOT (${table.source_language} = ANY(${table.validated_languages}))`,
+      sql`${table.cefr_level} IS NULL OR ${table.cefr_level} IN (${sql.raw(CEFR_LEVELS.map((l) => `'${l}'`).join(", "))})`,
    ),
    check(
-      "deck_type_check",
+      "difficulty_check",
-      sql`${table.type} IN (${sql.raw(SUPPORTED_DECK_TYPES.map((t) => `'${t}'`).join(", "))})`,
+      sql`${table.difficulty} IS NULL OR ${table.difficulty} IN (${sql.raw(DIFFICULTY_LEVELS.map((d) => `'${d}'`).join(", "))})`,
    ),
+    index("idx_translations_target_lang").on(
+      table.target_language_code,
+      table.difficulty,
+      table.entry_id,
+    ),
-    unique("unique_deck_name").on(table.name, table.source_language),
-    index("idx_decks_type").on(table.type, table.source_language),
  ],
 );
-export const deck_terms = pgTable(
+// ── Auth (managed by Better Auth) ─────────────────────────────────────────────
-  "deck_terms",
-  {
-    deck_id: uuid()
-      .notNull()
-      .references(() => decks.id, { onDelete: "cascade" }),
-    term_id: uuid()
-      .notNull()
-      .references(() => terms.id, { onDelete: "cascade" }),
-  },
-  (table) => [primaryKey({ columns: [table.deck_id, table.term_id] })],
-);
-export const topics = pgTable("topics", {
-  id: uuid().primaryKey().defaultRandom(),
-  slug: varchar({ length: 50 }).notNull().unique(),
-  label: text().notNull(),
-  description: text(),
-  created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
-});
-export const term_topics = pgTable(
-  "term_topics",
-  {
-    term_id: uuid()
-      .notNull()
-      .references(() => terms.id, { onDelete: "cascade" }),
-    topic_id: uuid()
-      .notNull()
-      .references(() => topics.id, { onDelete: "cascade" }),
-  },
-  (table) => [primaryKey({ columns: [table.term_id, table.topic_id] })],
-);
 export const user = pgTable("user", {
  id: text("id").primaryKey(),
@ -204,7 +123,7 @@ export const user = pgTable("user", {
  createdAt: timestamp("created_at").defaultNow().notNull(),
  updatedAt: timestamp("updated_at")
    .defaultNow()
-    .$onUpdate(() => /* @__PURE__ */ new Date())
+    .$onUpdate(() => new Date())
    .notNull(),
 });
@ -216,7 +135,7 @@ export const session = pgTable(
    token: text("token").notNull().unique(),
    createdAt: timestamp("created_at").defaultNow().notNull(),
    updatedAt: timestamp("updated_at")
-      .$onUpdate(() => /* @__PURE__ */ new Date())
+      .$onUpdate(() => new Date())
      .notNull(),
    ipAddress: text("ip_address"),
    userAgent: text("user_agent"),
@ -245,7 +164,7 @@ export const account = pgTable(
    password: text("password"),
    createdAt: timestamp("created_at").defaultNow().notNull(),
    updatedAt: timestamp("updated_at")
-      .$onUpdate(() => /* @__PURE__ */ new Date())
+      .$onUpdate(() => new Date())
      .notNull(),
  },
  (table) => [index("account_userId_idx").on(table.userId)],
@ -261,24 +180,13 @@ export const verification = pgTable(
    createdAt: timestamp("created_at").defaultNow().notNull(),
    updatedAt: timestamp("updated_at")
      .defaultNow()
-      .$onUpdate(() => /* @__PURE__ */ new Date())
+      .$onUpdate(() => new Date())
      .notNull(),
  },
  (table) => [index("verification_identifier_idx").on(table.identifier)],
 );
-export const userRelations = relations(user, ({ many }) => ({
+// ── Lobbies ───────────────────────────────────────────────────────────────────
-  sessions: many(session),
-  accounts: many(account),
-}));
-export const sessionRelations = relations(session, ({ one }) => ({
-  user: one(user, { fields: [session.userId], references: [user.id] }),
-}));
-export const accountRelations = relations(account, ({ one }) => ({
-  user: one(user, { fields: [account.userId], references: [user.id] }),
-}));
 export const lobbies = pgTable(
  "lobbies",
@ -318,6 +226,36 @@ export const lobby_players = pgTable(
  (table) => [primaryKey({ columns: [table.lobbyId, table.userId] })],
 );
+// ── Relations ─────────────────────────────────────────────────────────────────
+export const vocabularyEntryRelations = relations(
+  vocabulary_entries,
+  ({ many }) => ({ translations: many(entry_translations) }),
+);
+export const entryTranslationRelations = relations(
+  entry_translations,
+  ({ one }) => ({
+    entry: one(vocabulary_entries, {
+      fields: [entry_translations.entry_id],
+      references: [vocabulary_entries.id],
+    }),
+  }),
+);
+export const userRelations = relations(user, ({ many }) => ({
+  sessions: many(session),
+  accounts: many(account),
+}));
+export const sessionRelations = relations(session, ({ one }) => ({
+  user: one(user, { fields: [session.userId], references: [user.id] }),
+}));
+export const accountRelations = relations(account, ({ one }) => ({
+  user: one(user, { fields: [account.userId], references: [user.id] }),
+}));
 export const lobbyRelations = relations(lobbies, ({ one, many }) => ({
  host: one(user, { fields: [lobbies.hostUserId], references: [user.id] }),
  players: many(lobby_players),
--- a/packages/db/src/models/termModel.ts
+++ b/packages/db/src/models/termModel.ts
@ -1,25 +1,27 @@
 import { db } from "@lila/db";
-import { eq, and, isNotNull, sql, ne } from "drizzle-orm";
+import { eq, and, ne, sql, isNotNull } from "drizzle-orm";
-import { terms, translations, term_glosses } from "@lila/db/schema";
+import { vocabulary_entries, entry_translations } from "@lila/db/schema";
 import { alias } from "drizzle-orm/pg-core";
 import type {
  SupportedLanguageCode,
  SupportedPos,
  DifficultyLevel,
 } from "@lila/shared";
 // ── Types ─────────────────────────────────────────────────────────────────────
 export type TranslationPairRow = {
-  termId: string;
+  entryId: string;
  sourceText: string;
  targetText: string;
  sourceGloss: string | null;
 };
-// Note: difficulty filter is intentionally asymmetric. We filter on the target
+// ── Queries ───────────────────────────────────────────────────────────────────
-// (answer) side only — a word can be A2 in Italian but B1 in English, and what
-// matters for the learner is the difficulty of the word they're being taught.
+// Note: difficulty filter is intentionally on the target (translation) side.
+// A word can be A2 in one language but B1 in another — what matters for the
+// learner is the difficulty of the word they are being tested on.
 export const getGameTerms = async (
  sourceLanguage: SupportedLanguageCode,
  targetLanguage: SupportedLanguageCode,
@ -27,53 +29,36 @@ export const getGameTerms = async (
  difficulty: DifficultyLevel,
  rounds: number,
 ): Promise<TranslationPairRow[]> => {
-  const sourceTranslations = alias(translations, "source_translations");
+  const sourceEntries = alias(vocabulary_entries, "source_entries");
-  const targetTranslations = alias(translations, "target_translations");
+  const targetTranslations = alias(entry_translations, "target_translations");
  const rows = await db
    .select({
-      termId: terms.id,
+      entryId: sourceEntries.id,
-      sourceText: sourceTranslations.text,
+      sourceText: sourceEntries.headword,
-      targetText: targetTranslations.text,
+      targetText: targetTranslations.translation,
-      sourceGloss: term_glosses.text,
+      sourceGloss: sourceEntries.gloss,
    })
-    .from(terms)
+    .from(sourceEntries)
-    .innerJoin(
-      sourceTranslations,
-      and(
-        eq(sourceTranslations.term_id, terms.id),
-        eq(sourceTranslations.language_code, sourceLanguage), // Filter here!
-      ),
-    )
    .innerJoin(
      targetTranslations,
      and(
-        eq(targetTranslations.term_id, terms.id),
+        eq(targetTranslations.entry_id, sourceEntries.id),
-        eq(targetTranslations.language_code, targetLanguage), // Filter here!
+        eq(targetTranslations.target_language_code, targetLanguage),
-      ),
+        eq(targetTranslations.difficulty, difficulty),
-    )
+        isNotNull(targetTranslations.translation),
-    .leftJoin(
-      term_glosses,
-      and(
-        eq(term_glosses.term_id, terms.id),
-        eq(term_glosses.language_code, sourceLanguage),
      ),
    )
    .where(
      and(
-        eq(terms.pos, pos),
+        eq(sourceEntries.language_code, sourceLanguage),
-        eq(targetTranslations.difficulty, difficulty),
+        eq(sourceEntries.pos, pos),
-        isNotNull(sourceTranslations.difficulty), // Good data quality check!
+        isNotNull(sourceEntries.difficulty),
      ),
    )
-    // TODO(post-mvp): ORDER BY RANDOM() sorts the entire filtered result set before
+    // TODO(post-mvp): ORDER BY RANDOM() sorts the entire filtered result set
-    // applying LIMIT, which is fine at current data volumes (low thousands of rows
+    // before applying LIMIT, which is fine at current data volumes but degrades
-    // after POS + difficulty filters) but degrades as the terms table grows. Once
+    // as the table grows. See original termModel.ts for optimisation options.
-    // the database is fully populated and tagged, replace with one of:
-    //   - TABLESAMPLE BERNOULLI(n) for approximate sampling on large tables
-    //   - Random offset: SELECT ... OFFSET floor(random() * (SELECT count(*) ...))
-    //   - Pre-computed random column with a btree index, reshuffled periodically
-    // Benchmark first — don't optimise until it actually hurts.
    .orderBy(sql`RANDOM()`)
    .limit(rounds);
@ -81,32 +66,33 @@ export const getGameTerms = async (
 };
 export const getDistractors = async (
-  excludeTermId: string,
+  excludeEntryId: string,
  excludeText: string,
  sourceLanguage: SupportedLanguageCode,
  targetLanguage: SupportedLanguageCode,
  pos: SupportedPos,
  difficulty: DifficultyLevel,
  count: number,
 ): Promise<string[]> => {
  const rows = await db
-    .select({ text: translations.text })
+    .select({ text: entry_translations.translation })
-    .from(terms)
+    .from(vocabulary_entries)
    .innerJoin(
-      translations,
+      entry_translations,
      and(
-        eq(translations.term_id, terms.id),
+        eq(entry_translations.entry_id, vocabulary_entries.id),
-        eq(translations.language_code, targetLanguage),
+        eq(entry_translations.target_language_code, targetLanguage),
+        eq(entry_translations.difficulty, difficulty),
      ),
    )
    .where(
      and(
-        eq(terms.pos, pos),
+        eq(vocabulary_entries.language_code, sourceLanguage),
-        eq(translations.difficulty, difficulty),
+        eq(vocabulary_entries.pos, pos),
-        ne(terms.id, excludeTermId),
+        ne(vocabulary_entries.id, excludeEntryId),
-        ne(translations.text, excludeText),
+        ne(entry_translations.translation, excludeText),
      ),
    )
    // TODO(post-mvp): same ORDER BY RANDOM() concern as getGameTerms — see comment there.
    .orderBy(sql`RANDOM()`)
    .limit(count);
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@ -173,6 +173,9 @@ importers:
      typescript:
        specifier: ^5.9.3
        version: 5.9.3
+      vitest:
+        specifier: ^4.1.0
+        version: 4.1.0(@opentelemetry/api@1.9.1)(@types/node@24.12.0)(jsdom@29.0.1(@noble/hashes@2.2.0))(vite@8.0.1(@types/node@24.12.0)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3))
  packages/db:
    dependencies:
@ -4391,7 +4394,6 @@ snapshots:
      magic-string: 0.30.21
    optionalDependencies:
      vite: 8.0.1(@types/node@24.12.0)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3)
-    optional: true
  '@vitest/mocker@4.1.0(vite@8.0.1(@types/node@25.5.0)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3))':
    dependencies:
@ -6136,7 +6138,6 @@ snapshots:
      jsdom: 29.0.1(@noble/hashes@2.2.0)
    transitivePeerDependencies:
      - msw
-    optional: true
  vitest@4.1.0(@opentelemetry/api@1.9.1)(@types/node@25.5.0)(jsdom@29.0.1(@noble/hashes@2.2.0))(vite@8.0.1(@types/node@25.5.0)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3)):
    dependencies:
Author	SHA1	Message	Date
lila	0118798e36	feat: guest play — allow singleplayer quiz without auth - Add optionalAuth middleware: attaches session when present, never blocks (guests pass through) - Make game endpoints (start/answer) accept optional auth - GameSessionStore.userId: string → string | null - Rate limiter falls back to IP for unauthenticated users - Frontend: remove /play route guard, show 'Create account' CTA on score screen for guests - Add tests for guest session creation, answer submission, and cross-user session isolation	2026-05-31 21:28:08 +02:00
lila	d55a1ed648	wip	2026-05-30 03:47:59 +02:00
lila	37f6a55798	updating docs	2026-05-30 03:47:52 +02:00
lila	caa2f7d395	updating docs	2026-05-25 01:04:49 +02:00
lila	7e0311683f	updating documentation	2026-05-16 01:59:43 +02:00
lila	1ba57c7e9d	adding tasks	2026-05-15 23:12:55 +02:00
lila	97b0f302d0	refactoring documentation	2026-05-15 23:09:54 +02:00
lila	04a581efe1	WIP: checkpoint before stage-3 sub-stage rewrite	2026-05-12 22:13:14 +02:00
lila	73fb12ac35	feat: enrich script working, redesigning to sub-stage architecture - Enrich script functional with timeout, progress tracking, rejection mechanism - Identified ordering issue: CEFR voting needs validated translations first - Redesign: round1_gloss → round1_example → round1_translations → round1_cefr - Update data-pipeline.md with new sub-stage design and roadmap - Qwen3.5-4B confirmed working with thinking disabled	2026-05-07 13:09:43 +02:00
lila	7f10c35e03	docs: update roadmap — stage 3 enrich script written, llama.cpp next	2026-05-05 19:30:18 +02:00
lila	9642daf6dd	feat: add stage 3 round 1 enrich script and wire into orchestrator	2026-05-05 19:28:38 +02:00
lila	76af2ab093	fix: update db import validation tests to account for reverse links - Translation count test now adds reverse link count to expected total - Non-English translations test now filters to kaikki source only - Target language test now filters to kaikki source only — reverse links to English are valid and expected	2026-05-05 19:10:19 +02:00
lila	1c44ef989b	feat: update pipeline orchestrator for Kaikki — wire up stages 1 and 2 - Replace checkOmwExists with checkExtractedFilesExist - Wire up importKaikki and reverseLink as real stage implementations - Track reverse link completion via sentinel row in run_status - Update report to use resolved_entry_cefr and entry counts - Stages 3 onwards remain as stubs	2026-05-05 19:04:28 +02:00
lila	6f9a42c707	feat: add stage 2 reverse link sync script	2026-05-05 18:57:55 +02:00
lila	b5a76ee178	docs: update roadmap — stage 1 in progress, sample extraction complete	2026-05-05 18:52:10 +02:00
lila	ba2635e3f7	feat: add stage 1 and db import validation tests for Kaikki schema	2026-05-05 18:51:11 +02:00
lila	0cc643e308	feat: update extractor for all 5 languages, update import for multi-language - Extract.ts now processes all 5 language files, filters non-English entries by lang_code, skips translation extraction for non-English (no translations in source files) - Import.ts now imports all 5 language output files, uses language field from ExtractedSense instead of hardcoding en - Sample limit hardcoded to 500 entries per language for development	2026-05-05 18:46:32 +02:00
lila	209d52f54b	feat: add Kaikki extraction and import scripts for stage 1 - Add stage-1-extract/scripts/extract.ts — streams Kaikki JSONL, filters to supported POS and languages, skips abbreviations and senses with no translations in supported languages - Rewrite db/import.ts for Kaikki flat model — tracks sense_index offsets per headword+pos to handle duplicate JSONL entries - Rewrite db/schema.sql for Kaikki model — entries, translations, LLM vote tables, resolved tables - Add extract and db:import scripts to package.json - Sample mode hardcoded to 500 entries for development	2026-05-05 18:11:53 +02:00
lila	963bff4eb8	feat: migrate production schema from OMW to Kaikki flat vocabulary model - Replace terms/translations/term_glosses/term_examples with vocabulary_entries and entry_translations - Remove decks, topics and related tables (deferred) - Add cefr_level and difficulty to entry_translations for game query filtering - Update termModel.ts for new schema — getDistractors now takes sourceLanguage - Update gameService.ts and multiplayerGameService.ts for entryId rename - Update all test fixtures from termId to entryId - Generate and apply migration 0011	2026-05-05 17:39:25 +02:00
lila	38d8b85228	docs: rewrite data-pipeline.md for Kaikki migration	2026-05-05 17:14:48 +02:00
lila	87aeb072c5	feat: add pipeline orchestrator skeleton with startup checks, stage runners, shutdown handler, and report generation	2026-05-03 23:01:29 +02:00
lila	080fad1998	feat: enrich stage foundation — provider config, env setup, schema fix - Remove foreign key on run_status.source_id to support sentinel rows for tracking one-time pipeline steps (compile_candidates, compile_votes, merge, compare) - Add stage-3-enrich/config.ts with all provider configurations, ALL_PROVIDERS ordered local-first, and validateProviderKey() for startup key checks - Add .env.example with required API keys for OpenRouter and Anthropic - Add pipeline:run script to package.json using --env-file .env - Add .env to root .gitignore coverage for data-pipeline/.env	2026-05-03 22:44:14 +02:00
lila	4d42fe4397	removing db from git tracking, adding it to gitignore, add db import validation tests	2026-05-03 22:16:43 +02:00
lila	f59399be02	feat: add db import script, fix duplicate translations in extract, add annotate script	2026-05-03 22:05:10 +02:00
lila	4a842140b9	feat: add stage 1 and 2 validation tests	2026-05-03 21:36:56 +02:00
lila	4fa3073412	feat: add db schema, init, and vitest config	2026-05-03 17:56:29 +02:00
lila	74cfc82bdd	docs: finalise data-pipeline.md with tiebreak, pipeline.db, reports, sync	2026-05-03 17:21:02 +02:00
lila	6007fe1e38	docs: update data-pipeline.md and llm-setup.md to reflect sqlite architecture	2026-05-02 20:13:05 +02:00