updating documentation

2026-05-16 01:59:43 +02:00 · 7e0311683f
commit 7e0311683f
parent 1ba57c7e9d
25 changed files with 2660 additions and 226 deletions
--- a/documentation/archive/spec.md
+++ b/documentation/archive/spec.md
@@ -0,0 +1,394 @@
+# lila — Project Specification
+
+> **This document is the single source of truth for the project.**
+> It is written to be handed to any LLM as context. It contains the project vision, the current MVP scope, the tech stack, the architecture, and the roadmap.
+
+---
+
+## 1. Project Overview
+
+A vocabulary trainer for English–Italian words. The quiz format is Duolingo-style: one word is shown as a prompt, and the user picks the correct translation from four choices (1 correct + 3 distractors with the same part of speech). The app supports both singleplayer and real-time multiplayer game modes.
+
+**The core learning loop:**
+Show word → pick answer → see result → next word → final score
+
+The vocabulary data comes from WordNet + the Open Multilingual Wordnet (OMW). A one-time Python script extracts English–Italian noun pairs and seeds the database. The data model is language-pair agnostic by design — adding a new language later requires no schema changes.
+
+### Core Principles
+
+- **Minimal but extendable**: working product fast, clean architecture for future growth
+- **Mobile-first**: touch-friendly Duolingo-like UX
+- **Type safety end-to-end**: TypeScript + Zod schemas shared between frontend and backend
+
+---
+
+## 2. Full Product Vision (Long-Term)
+
+- Users log in via Google or GitHub (Better Auth)
+- Singleplayer mode: 10-round quiz, score screen
+- Multiplayer mode: create a room, share a code, 2–4 players answer simultaneously in real time, live scores, winner screen
+- 1000+ English–Italian nouns seeded from WordNet
+
+This is the full vision. The current implementation already covers most of it; remaining items are captured in the roadmap and the Post-MVP ladder below.
+
+---
+
+## 3. MVP Scope
+
+**Goal:** A working, presentable vocabulary trainer that can be shown to real people (singleplayer and multiplayer), with a production deployment.
+
+### What is IN the MVP
+
+- Vocabulary data in a PostgreSQL database (already seeded)
+- REST API that returns quiz terms with distractors
+- Singleplayer quiz UI: configurable rounds (3 or 10), answer feedback, score screen
+- Clean, mobile-friendly UI (Tailwind + shadcn/ui)
+- Global error handler with typed error classes
+- Unit + integration tests for the API
+- Authentication via Better Auth (Google + GitHub)
+- Multiplayer lobby + game over WebSockets
+- Production deployment (Docker Compose + Caddy + Hetzner) and CI/CD (Forgejo Actions)
+
+### What is CUT from the MVP
+
+| Feature               | Why cut    |
+| --------------------- | ---------- |
+| User stats / profiles | Needs auth |
+
+Cut features are not deleted from the plan; they are deferred. The architecture is already designed to support them. See Section 11 (Post-MVP Ladder).
+
+---
+
+## 4. Technology Stack
+
+The monorepo structure and tooling are already set up. This is the full stack.
+
+| Layer        | Technology                     | Status                                                 |
+| ------------ | ------------------------------ | ------------------------------------------------------ |
+| Monorepo     | pnpm workspaces                | ✅                                                     |
+| Frontend     | React 18, Vite, TypeScript     | ✅                                                     |
+| Routing      | TanStack Router                | ✅                                                     |
+| Server state | TanStack Query                 | ✅                                                     |
+| Client state | Zustand                        | ✅                                                     |
+| Styling      | Tailwind CSS + shadcn/ui       | ✅                                                     |
+| Backend      | Node.js, Express, TypeScript   | ✅                                                     |
+| Database     | PostgreSQL + Drizzle ORM       | ✅                                                     |
+| Validation   | Zod (shared schemas)           | ✅                                                     |
+| Testing      | Vitest, supertest              | ✅                                                     |
+| Auth         | Better Auth (Google + GitHub)  | ✅                                                     |
+| Deployment   | Docker Compose, Caddy, Hetzner | ✅                                                     |
+| CI/CD        | Forgejo Actions                | ✅                                                     |
+| Realtime     | WebSockets (`ws` library)      | ✅                                                     |
+| Cache        | Valkey                         | ⚠️ optional (used locally; production use is deferred) |
+
+---
+
+## 5. Repository Structure
+
+```text
+lila/
+├── .forgejo/
+│   └── workflows/
+│       └── deploy.yml              — CI/CD pipeline (build, push, deploy)
+├── apps/
+│   ├── api/
+│   │   └── src/
+│   │       ├── app.ts                  — createApp() factory, CORS, auth handler, error middleware
+│   │       ├── server.ts               — starts server on PORT
+│   │       ├── errors/
+│   │       │   └── AppError.ts         — AppError, ValidationError, NotFoundError
+│   │       ├── lib/
+│   │       │   └── auth.ts             — Better Auth config (Google + GitHub providers)
+│   │       ├── middleware/
+│   │       │   ├── authMiddleware.ts    — session validation for protected routes
+│   │       │   └── errorHandler.ts     — central error middleware
+│   │       ├── routes/
+│   │       │   ├── apiRouter.ts        — mounts /health and /game routers
+│   │       │   ├── gameRouter.ts       — POST /start, POST /answer
+│   │       │   └── healthRouter.ts
+│   │       ├── controllers/
+│   │       │   └── gameController.ts   — validates input, calls service, sends response
+│   │       ├── services/
+│   │       │   ├── gameService.ts      — builds quiz sessions, evaluates answers
+│   │       │   └── gameService.test.ts — unit tests (mocked DB)
+│   │       └── gameSessionStore/
+│   │           ├── GameSessionStore.ts — interface (async, Valkey-ready)
+│   │           ├── InMemoryGameSessionStore.ts
+│   │           └── index.ts
+│   └── web/
+│       ├── Dockerfile                  — multi-stage: dev + production (nginx:alpine)
+│       ├── nginx.conf                  — SPA fallback routing
+│       └── src/
+│           ├── routes/
+│           │   ├── index.tsx           — landing page
+│           │   ├── play.tsx            — the quiz
+│           │   ├── login.tsx           — Google + GitHub login buttons
+│           │   ├── about.tsx
+│           │   └── __root.tsx
+│           ├── lib/
+│           │   └── auth-client.ts      — Better Auth React client
+│           ├── components/
+│           │   └── game/
+│           │       ├── GameSetup.tsx    — settings UI
+│           │       ├── QuestionCard.tsx — prompt + 4 options
+│           │       ├── OptionButton.tsx — idle / correct / wrong states
+│           │       └── ScoreScreen.tsx  — final score + play again
+│           └── main.tsx
+├── packages/
+│   ├── shared/
+│   │   └── src/
+│   │       ├── constants.ts            — SUPPORTED_POS, DIFFICULTY_LEVELS, etc.
+│   │       ├── schemas/game.ts         — Zod schemas for all game types
+│   │       └── index.ts
+│   └── db/
+│       ├── drizzle/                    — migration SQL files
+│       └── src/
+│           ├── db/schema.ts            — Drizzle schema (terms, translations, auth tables)
+│           ├── models/termModel.ts     — getGameTerms(), getDistractors()
+│           ├── seeding-datafiles.ts    — seeds terms + translations from JSON
+│           ├── seeding-cefr-levels.ts  — enriches translations with CEFR data
+│           ├── generating-deck.ts      — builds curated decks
+│           └── index.ts
+├── scripts/                            — Python extraction/comparison/merge scripts
+├── documentation/                      — project docs
+├── docker-compose.yml                  — local dev stack
+├── Caddyfile                           — reverse proxy routing
+└── pnpm-workspace.yaml
+```
+
+`packages/shared` is the contract between frontend and backend. All request/response shapes are defined there as Zod schemas — never duplicated.
+
+---
+
+## 6. Architecture
+
+### The Layered Architecture
+
+```text
+HTTP Request
+     ↓
+  Router        — maps URL + HTTP method to a controller
+     ↓
+ Controller     — handles HTTP only: validates input, calls service, sends response
+     ↓
+  Service       — business logic only: no HTTP, no direct DB access
+     ↓
+  Model         — database queries only: no business logic
+     ↓
+  Database
+```
+
+**The rule:** each layer only talks to the layer directly below it. A controller never touches the database. A service never reads `req.body`. A model never knows what a quiz is.
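+
The layering rule above can be sketched without Express or Drizzle. This is a minimal illustration, not the project's code: `findTermById`, `checkAnswer`, `answerController`, and the in-memory `termTable` are hypothetical stand-ins for the model, service, and controller layers.

```typescript
// Hypothetical, simplified row shape; the real schema lives in packages/db.
type TermRow = { id: string; text: string; translation: string };

// Model layer: data access only. An in-memory array stands in for PostgreSQL.
const termTable: TermRow[] = [
  { id: "t1", text: "dog", translation: "cane" },
  { id: "t2", text: "cat", translation: "gatto" },
];

function findTermById(id: string): TermRow | undefined {
  return termTable.find((row) => row.id === id);
}

// Service layer: business logic only. It never sees a request or a response.
// Returns null when the term does not exist.
function checkAnswer(termId: string, answer: string): boolean | null {
  const term = findTermById(termId);
  return term ? term.translation === answer : null;
}

// Controller layer: validates HTTP-shaped input, calls the service, and
// shapes the response. It never touches termTable directly.
type HttpResult = { statusCode: number; body: unknown };

function answerController(requestBody: unknown): HttpResult {
  const { termId, answer } = (requestBody ?? {}) as {
    termId?: unknown;
    answer?: unknown;
  };
  if (typeof termId !== "string" || typeof answer !== "string") {
    return { statusCode: 400, body: { error: "termId and answer must be strings" } };
  }
  const isCorrect = checkAnswer(termId, answer);
  if (isCorrect === null) {
    return { statusCode: 404, body: { error: "unknown term" } };
  }
  return { statusCode: 200, body: { isCorrect } };
}
```

Because each function only calls the one directly below it, the service can be unit-tested against a fake model and the controller against a fake service.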
+
+### Monorepo Package Responsibilities
+
+| Package           | Owns                                                     |
+| ----------------- | -------------------------------------------------------- |
+| `packages/shared` | Zod schemas, constants, derived TypeScript types         |
+| `packages/db`     | Drizzle schema, DB connection, all model/query functions |
+| `apps/api`        | Router, controllers, services, error handling            |
+| `apps/web`        | React frontend, consumes types from shared               |
+
+**Key principle:** all database code lives in `packages/db`. `apps/api` never imports `drizzle-orm` for queries — it only calls functions exported from `packages/db`.
+
+### Production Infrastructure
+
+```text
+Internet → Caddy (HTTPS termination)
+             ├── lilastudy.com       → web container (nginx, static files)
+             ├── api.lilastudy.com   → api container (Express, port 3000)
+             └── git.lilastudy.com   → forgejo container (git + registry, port 3000)
+
+SSH (port 2222) → forgejo container (git push/pull)
+```
+
+All containers communicate over an internal Docker network. Only Caddy (80/443) and Forgejo SSH (2222) are exposed to the internet.
+
+---
+
+## 7. Data Model (Current State)
+
+Words are modelled as language-neutral concepts (terms) separate from learning curricula (decks). Adding a new language pair requires no schema changes, only new rows in `translations` and `decks`.
+
+**Core tables:** `terms`, `translations`, `term_glosses`, `decks`, `deck_terms`, `topics`, `term_topics`
+
+**Auth tables (managed by Better Auth):** `user`, `session`, `account`, `verification`
+
+Key columns on `terms`: `id` (uuid), `pos` (CHECK-constrained), `source`, `source_id` (unique pair for idempotent imports)
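+
To illustrate why a unique (`source`, `source_id`) pair makes imports idempotent, here is a small sketch. It assumes nothing about the real seeding scripts: `upsertTerm` is a hypothetical helper, a `Map` stands in for the unique constraint, and a numeric counter stands in for the uuid `id`.

```typescript
// Hypothetical seed shape based on the columns listed above.
type TermSeed = { source: string; sourceId: string; pos: string };
type StoredTerm = TermSeed & { id: number };

// The Map key plays the role of the unique (source, source_id) constraint.
const termsByNaturalKey = new Map<string, StoredTerm>();
let nextTermId = 1;

function upsertTerm(seed: TermSeed): StoredTerm {
  // JSON.stringify on a tuple avoids collisions such as "a:b" + "c" vs "a" + "b:c".
  const key = JSON.stringify([seed.source, seed.sourceId]);
  const existing = termsByNaturalKey.get(key);
  if (existing) {
    // Re-importing the same source record updates it instead of duplicating it.
    existing.pos = seed.pos;
    return existing;
  }
  const created: StoredTerm = { id: nextTermId++, ...seed };
  termsByNaturalKey.set(key, created);
  return created;
}
```

Running the import twice over the same input leaves exactly one row per source record, which is the property the unique pair is meant to guarantee.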
+
+Key columns on `translations`: `id`, `term_id` (FK), `language_code` (CHECK-constrained), `text`, `cefr_level` (nullable varchar(2), CHECK A1–C2)
+
+The deck model uses `source_language` plus a `validated_languages` array, so one deck serves multiple target languages. Decks are frequency tiers (e.g. `en-core-1000`), not POS splits.
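+
As a rough sketch of how one deck can serve several target languages, the check below treats `validated_languages` as the list of targets a deck may be played in. `DeckInfo` and `deckSupportsPair` are hypothetical names, and the interpretation of the array is an assumption rather than something the schema file confirms.

```typescript
// Hypothetical in-memory view of a deck row (camelCase for readability).
type DeckInfo = {
  name: string;
  sourceLanguage: string;
  validatedLanguages: string[];
};

// A deck supports a language pair when the source matches and the target
// appears in its validated languages.
function deckSupportsPair(deck: DeckInfo, source: string, target: string): boolean {
  return deck.sourceLanguage === source && deck.validatedLanguages.includes(target);
}
```

Under this reading, supporting a further target language means appending to the array on an existing deck row, with no new deck and no schema change.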
+
+Full schema is in `packages/db/src/db/schema.ts`.
+
+---
+
+## 8. API
+
+### Endpoints
+
+```text
+POST /api/v1/game/start     GameRequest → GameSession      (requires auth)
+POST /api/v1/game/answer    AnswerSubmission → AnswerResult  (requires auth)
+GET  /api/v1/health          Health check                    (public)
+ALL  /api/auth/*             Better Auth handlers            (public)
+```
+
+### Schemas (packages/shared)
+
+**GameRequest:** `{ source_language, target_language, pos, difficulty, rounds }`
+**GameSession:** `{ sessionId: uuid, questions: GameQuestion[] }`
+**GameQuestion:** `{ questionId: uuid, prompt: string, gloss: string | null, options: AnswerOption[4] }`
+**AnswerOption:** `{ optionId: number (0-3), text: string }`
+**AnswerSubmission:** `{ sessionId: uuid, questionId: uuid, selectedOptionId: number (0-3) }`
+**AnswerResult:** `{ questionId: uuid, isCorrect: boolean, correctOptionId: number (0-3), selectedOptionId: number (0-3) }`
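+
The shapes above translate directly into TypeScript types. The sketch below mirrors three of them in plain TypeScript with a hand-written guard; the project itself defines them as Zod schemas in `packages/shared`, so `isAnswerSubmission` is only a dependency-free stand-in for a `safeParse` call, and it does not check uuid formatting.

```typescript
type AnswerOption = { optionId: number; text: string };

type GameQuestion = {
  questionId: string;
  prompt: string;
  gloss: string | null;
  options: AnswerOption[]; // always four entries in the spec
};

type AnswerSubmission = {
  sessionId: string;
  questionId: string;
  selectedOptionId: number; // 0-3
};

// Runtime guard for untrusted input such as a parsed request body.
function isAnswerSubmission(value: unknown): value is AnswerSubmission {
  if (typeof value !== "object" || value === null) return false;
  const { sessionId, questionId, selectedOptionId } = value as Record<string, unknown>;
  return (
    typeof sessionId === "string" &&
    typeof questionId === "string" &&
    typeof selectedOptionId === "number" &&
    Number.isInteger(selectedOptionId) &&
    selectedOptionId >= 0 &&
    selectedOptionId <= 3
  );
}
```

Note that `GameQuestion` carries no field identifying the correct option, which is what keeps answer evaluation on the server.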
+
+### Error Handling
+
+Typed error classes (`AppError` base, `ValidationError` 400, `NotFoundError` 404) with central error middleware. Controllers validate with `safeParse`, throw on failure, and call `next(error)` in the catch. The middleware maps `AppError` instances to HTTP status codes; unknown errors return 500.
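+
A minimal sketch of the error hierarchy and the status mapping described above, written without Express so it runs on its own. The class names come from this spec; the constructor signatures and the `toHttpError` helper are assumptions for illustration, not the contents of `AppError.ts` or `errorHandler.ts`.

```typescript
class AppError extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    this.statusCode = statusCode;
    this.name = new.target.name;
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

// What a central error middleware would compute before writing the response:
// known errors keep their status, anything else becomes a generic 500.
function toHttpError(error: unknown): { status: number; message: string } {
  if (error instanceof AppError) {
    return { status: error.statusCode, message: error.message };
  }
  return { status: 500, message: "Internal server error" };
}
```

In an Express app, a four-argument error middleware would call something like `toHttpError(err)` and pass the result to `res.status(...).json(...)`; unknown errors deliberately do not leak their message.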
+
+### Key Design Rules
+
+- Server-side answer evaluation: the correct answer is never sent with the questions; the client learns it only from the `AnswerResult` returned after submitting
+- `POST` not `GET` for game start (configuration in request body)
+- `safeParse` over `parse` (clean 400s, not raw Zod 500s)
+- Session state stored in `GameSessionStore` (in-memory now, Valkey later)
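+
The last two rules can be sketched together: an async store interface that an in-memory class satisfies today and a Valkey-backed class could satisfy later, plus answer evaluation that reads the correct option from the store rather than from the client. `SessionStore`, `StoredSession`, and `evaluateAnswer` are hypothetical names, not the actual `GameSessionStore` API.

```typescript
// What the server remembers per session: the correct option for each question.
type StoredSession = {
  sessionId: string;
  correctOptionByQuestion: Record<string, number>;
};

// Async on purpose, so a network-backed implementation can replace the
// in-memory one without changing callers.
interface SessionStore {
  save(session: StoredSession): Promise<void>;
  load(sessionId: string): Promise<StoredSession | undefined>;
}

class InMemorySessionStore implements SessionStore {
  private readonly sessions = new Map<string, StoredSession>();

  async save(session: StoredSession): Promise<void> {
    this.sessions.set(session.sessionId, session);
  }

  async load(sessionId: string): Promise<StoredSession | undefined> {
    return this.sessions.get(sessionId);
  }
}

type AnswerOutcome = {
  questionId: string;
  isCorrect: boolean;
  correctOptionId: number;
  selectedOptionId: number;
};

// The client only sends its selection; correctness is decided here.
async function evaluateAnswer(
  store: SessionStore,
  sessionId: string,
  questionId: string,
  selectedOptionId: number,
): Promise<AnswerOutcome> {
  const session = await store.load(sessionId);
  const correctOptionId = session?.correctOptionByQuestion[questionId];
  if (correctOptionId === undefined) {
    throw new Error("unknown session or question");
  }
  return {
    questionId,
    isCorrect: selectedOptionId === correctOptionId,
    correctOptionId,
    selectedOptionId,
  };
}
```

Swapping the store only requires another class implementing `SessionStore`; `evaluateAnswer` stays unchanged.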
+
+---
+
+## 9. Game Mechanics
+
+- **Format**: source-language word prompt + 4 target-language choices
+- **Distractors**: same POS, same difficulty, server-side, never the correct answer, never repeated within a session
+- **Session length**: 3 or 10 questions (configurable)
+- **Scoring**: +1 per correct answer (no speed bonus for MVP)
+- **Timer**: none in singleplayer MVP
+- **Auth required**: users must log in via Google or GitHub
+- **Submit-before-send**: user selects, then confirms (prevents misclicks)
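+
The distractor rules in this list can be sketched as a single function. This is an illustration under assumptions, not the logic in `gameService.ts`: `Candidate`, `shuffled`, and `buildQuestion` are hypothetical, candidate texts are assumed unique within the pool, and "never repeated within a session" is modelled with a `usedTexts` set that the caller keeps for the whole session.

```typescript
type Candidate = { text: string; pos: string; difficulty: string };

type BuiltQuestion = {
  options: { optionId: number; text: string }[];
  correctOptionId: number; // stays on the server
};

// Fisher-Yates shuffle on a copy; `random` is injectable for deterministic tests.
function shuffled<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const copy = [...items];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = copy[i] as T;
    copy[i] = copy[j] as T;
    copy[j] = tmp;
  }
  return copy;
}

function buildQuestion(
  correct: Candidate,
  pool: readonly Candidate[],
  usedTexts: Set<string>,
  random: () => number = Math.random,
): BuiltQuestion {
  // Same POS, same difficulty, not the correct answer, not used earlier.
  const eligible = pool.filter(
    (c) =>
      c.pos === correct.pos &&
      c.difficulty === correct.difficulty &&
      c.text !== correct.text &&
      !usedTexts.has(c.text),
  );
  if (eligible.length < 3) {
    throw new Error("not enough distractors");
  }
  const distractors = shuffled(eligible, random).slice(0, 3);
  distractors.forEach((d) => usedTexts.add(d.text));

  // Shuffle once more so the correct answer lands in a random slot.
  const texts = shuffled([correct.text, ...distractors.map((d) => d.text)], random);
  return {
    options: texts.map((text, optionId) => ({ optionId, text })),
    correctOptionId: texts.indexOf(correct.text),
  };
}
```

Only `options` would be sent to the client; `correctOptionId` belongs in the session store so the answer can be checked on submission.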
+
+---
+
+## 10. Working Methodology
+
+This project is a learning exercise. The goal is to understand the code, not just to ship it.
+
+### How to use an LLM for help
+
+1. Paste this document as context
+2. Describe what you're working on and what you're stuck on
+3. Ask for hints, not solutions
+
+### Refactoring workflow
+
+After completing a task: share the code, ask what to refactor and why. The LLM should explain the concept, not write the implementation.
+
+---
+
+## 11. Post-MVP Ladder
+
+| Phase               | What it adds                                                            | Status |
+| ------------------- | ----------------------------------------------------------------------- | ------ |
+| Auth                | Better Auth (Google + GitHub), embedded in Express API, user rows in DB | ✅     |
+| Deployment          | Docker Compose, Caddy, Forgejo, CI/CD, Hetzner VPS                      | ✅     |
+| Hardening (partial) | CI/CD pipeline, DB backups                                              | ✅     |
+| User Stats          | Games played, score history, profile page                               | ❌     |
+| Multiplayer Lobby   | Room creation, join by code, WebSocket connection                       | ✅     |
+| Multiplayer Game    | Simultaneous answers, server timer, live scores, winner screen          | ✅     |
+| Hardening (rest)    | Rate limiting, error boundaries, monitoring, accessibility              | ❌     |
+
+### Future Data Model Extensions (deferred, additive)
+
+- `noun_forms` — gender, singular, plural, articles per language
+- `verb_forms` — conjugation tables per language
+- `term_pronunciations` — IPA and audio URLs per language
+- `user_decks` — which decks a user is studying
+- `user_term_progress` — spaced repetition state per user/term/language
+- `quiz_answers` — history log for stats
+
+All are new tables referencing existing `terms` rows via FK. No existing schema changes required.
+
+### Multiplayer Architecture (current + deferred)
+
+**Implemented now:**
+
+- WebSocket protocol uses the `ws` library with a Zod discriminated union for message types (defined in `packages/shared`)
+- Room model uses human-readable codes (no matchmaking queue)
+- Lobby flow (create/join/leave) is real-time over WS, backed by PostgreSQL for durable membership/state
+- Multiplayer game flow is real-time: host starts, all players see the same question, answers are collected simultaneously, with a server-enforced 15s timer and live scoring
+- WebSocket connections are authenticated (Better Auth session validation on upgrade)
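+
The discriminated-union idea behind the message protocol can be sketched in plain TypeScript. The message names and fields below are invented for illustration; the real union is defined with Zod in `packages/shared`, and `parseClientMessage` is a hand-written stand-in for schema parsing.

```typescript
// Hypothetical client-to-server messages, discriminated by the `type` field.
type ClientMessage =
  | { type: "lobby:join"; roomCode: string }
  | { type: "lobby:leave" }
  | { type: "game:answer"; questionId: string; selectedOptionId: number };

// Turns a raw WebSocket text frame into a typed message, or null if invalid.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { type: kind, roomCode, questionId, selectedOptionId } = data as Record<string, unknown>;

  switch (kind) {
    case "lobby:join":
      return typeof roomCode === "string" ? { type: "lobby:join", roomCode } : null;
    case "lobby:leave":
      return { type: "lobby:leave" };
    case "game:answer":
      return typeof questionId === "string" && typeof selectedOptionId === "number"
        ? { type: "game:answer", questionId, selectedOptionId }
        : null;
    default:
      return null;
  }
}

// Switching on `type` narrows the message, so each branch sees only its own fields.
function describeMessage(message: ClientMessage): string {
  switch (message.type) {
    case "lobby:join":
      return `join room ${message.roomCode}`;
    case "lobby:leave":
      return "leave room";
    case "game:answer":
      return `answer ${message.selectedOptionId} for ${message.questionId}`;
  }
}
```

Adding a new message type to the union makes the compiler flag every `switch` that does not handle it yet, which is the main benefit of this pattern for a shared protocol.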
+
+**Deferred / hardening:**
+
+- Valkey-backed ephemeral state (room/game/session store) where in-memory state becomes a bottleneck
+- Graceful reconnect/resume flows and more robust failure handling (tracked in Phase 7)
+
+### Infrastructure (current)
+
+- `lilastudy.com` → React frontend (nginx serving static files)
+- `api.lilastudy.com` → Express API + Better Auth
+- `git.lilastudy.com` → Forgejo (git server + container registry)
+- Docker Compose with Caddy for automatic HTTPS via Let's Encrypt
+- CI/CD via Forgejo Actions (build on push to main, deploy via SSH)
+- Daily DB backups with cron, synced to dev laptop
+
+See `deployment.md` for full infrastructure documentation.
+
+---
+
+## 12. Definition of Done (Current Baseline)
+
+- [x] API returns quiz terms with correct distractors
+- [x] User can complete a quiz without errors
+- [x] Score screen shows final result and a play-again option
+- [x] App is usable on a mobile screen
+- [x] No hardcoded data — everything comes from the database
+- [x] Global error handler with typed error classes
+- [x] Unit + integration tests for API
+- [x] Auth works end-to-end (Google + GitHub via Better Auth)
+- [x] Multiplayer works end-to-end (lobby + real-time game over WebSockets)
+- [x] Production deployment is live behind HTTPS (Caddy) with CI/CD deploys via Forgejo Actions
+
+---
+
+## 13. Roadmap
+
+See `roadmap.md` for the full roadmap with task-level checkboxes.
+
+### Dependency Graph
+
+```text
+Phase 0 (Foundation) ✅
+└── Phase 1 (Vocabulary Data + API) ✅
+    └── Phase 2 (Singleplayer UI) ✅
+        ├── Phase 3 (Auth) ✅
+        │   └── Phase 6 (Deployment + CI/CD) ✅
+        └── Phase 4 (Multiplayer Lobby) ✅
+            └── Phase 5 (Multiplayer Game) ✅
+                └── Phase 7 (Hardening)
+```
+
+---
+
+## 14. Game Flow (Future)
+
+Singleplayer: choose direction (en→it or it→en) → top-level category → part of speech → difficulty (A1–C2) → round count → game starts.
+
+**Top-level categories (post-MVP):**
+
+- **Grammar** — practice nouns, verb conjugations, etc.
+- **Media** — practice vocabulary from specific books, films, songs, etc.
+- **Thematic** — animals, kitchen, etc. (requires category metadata research)