From 94f02b99049bad6e7d3f2e54cf975c67fb432b29 Mon Sep 17 00:00:00 2001 From: lila Date: Fri, 20 Mar 2026 09:21:06 +0100 Subject: [PATCH] adding documentation --- documentation/roadmap.md | 149 +++++++++++++ documentation/spec.md | 436 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 585 insertions(+) create mode 100644 documentation/roadmap.md create mode 100644 documentation/spec.md diff --git a/documentation/roadmap.md b/documentation/roadmap.md new file mode 100644 index 0000000..602121e --- /dev/null +++ b/documentation/roadmap.md @@ -0,0 +1,149 @@ +# Vocabulary Trainer — Roadmap + +Each phase produces a working, deployable increment. Nothing is built speculatively. + +--- + +## Phase 0 — Foundation +**Goal**: Empty repo that builds, lints, and runs end-to-end. +**Done when**: `pnpm dev` starts both apps; `GET /api/health` returns 200; React renders a hello page. + +- [ ] Initialise pnpm workspace monorepo: `apps/web`, `apps/api`, `packages/shared`, `packages/db` +- [ ] Configure TypeScript project references across packages +- [ ] Set up ESLint + Prettier with shared configs in root +- [ ] Set up Vitest in `api` and `web` +- [ ] Scaffold Express app with `GET /api/health` +- [ ] Scaffold Vite + React app with TanStack Router (single root route) +- [ ] Configure Drizzle ORM + connection to local PostgreSQL +- [ ] Write first migration (empty — just validates the pipeline works) +- [ ] `docker-compose.yml` for local dev: `api`, `web`, `postgres`, `valkey` +- [ ] `.env.example` files for `apps/api` and `apps/web` + +--- + +## Phase 1 — Vocabulary Data +**Goal**: Word data lives in the DB and can be queried via the API. +**Done when**: `GET /api/terms?pair=en-it&limit=10` returns 10 terms, each with 3 distractors attached. 
+ +- [ ] Run `scripts/extract_omw.py` locally → generates `packages/db/src/seed.json` +- [ ] Write Drizzle schema: `terms`, `translations`, `language_pairs` +- [ ] Write and run migration +- [ ] Write `packages/db/src/seed.ts` (reads `seed.json`, populates tables) +- [ ] Implement `TermRepository.getRandom(pairId, limit)` +- [ ] Implement `QuizService.attachDistractors(terms)` — same POS, server-side, no duplicates +- [ ] Implement `GET /language-pairs` and `GET /terms` endpoints +- [ ] Define Zod response schemas in `packages/shared` +- [ ] Unit tests for `QuizService` (correct POS filtering, never includes the answer) + +--- + +## Phase 2 — Auth +**Goal**: Users can log in via Google or GitHub and stay logged in. +**Done when**: JWT from OpenAuth is validated by the API; protected routes redirect unauthenticated users; user row is created on first login. + +- [ ] Add OpenAuth service to `docker-compose.yml` +- [ ] Write Drizzle schema: `users` +- [ ] Write and run migration +- [ ] Implement JWT validation middleware in `apps/api` +- [ ] Implement `GET /api/auth/me` (validate token, upsert user row, return user) +- [ ] Define auth Zod schemas in `packages/shared` +- [ ] Frontend: login page with "Continue with Google" + "Continue with GitHub" buttons +- [ ] Frontend: redirect to `auth.yourdomain.com` → receive JWT → store in memory + HttpOnly cookie +- [ ] Frontend: TanStack Router auth guard (redirects unauthenticated users) +- [ ] Frontend: TanStack Query `api.ts` attaches token to every request +- [ ] Unit tests for JWT middleware + +--- + +## Phase 3 — Single-player Mode +**Goal**: A logged-in user can complete a full solo quiz session. +**Done when**: User sees 10 questions, picks answers, sees their final score. 
+ +- [ ] Frontend: `/singleplayer` route +- [ ] `useQuizSession` hook: fetch terms, manage question index + score state +- [ ] `QuestionCard` component: prompt word + 4 answer buttons +- [ ] `OptionButton` component: idle / correct / wrong states +- [ ] `ScoreScreen` component: final score + play-again button +- [ ] TanStack Query integration for `GET /terms` +- [ ] RTL tests for `QuestionCard` and `OptionButton` + +--- + +## Phase 4 — Multiplayer Rooms (Lobby) +**Goal**: Players can create and join rooms; the host sees all joined players in real time. +**Done when**: Two browser tabs can join the same room and see each other's display names update live via WebSocket. + +- [ ] Write Drizzle schema: `rooms`, `room_players` +- [ ] Write and run migration +- [ ] `POST /rooms` and `POST /rooms/:code/join` REST endpoints +- [ ] `RoomService`: create room with short code, join room, enforce max player limit +- [ ] WebSocket server: attach `ws` upgrade handler to the Express HTTP server +- [ ] WS auth middleware: validate OpenAuth JWT on upgrade +- [ ] WS message router: dispatch incoming messages by `type` +- [ ] `room:join` / `room:leave` handlers → broadcast `room:state` to all room members +- [ ] Room membership tracked in Valkey (ephemeral) + `room_players` in PostgreSQL (durable) +- [ ] Define all WS event Zod schemas in `packages/shared` +- [ ] Frontend: `/multiplayer/lobby` — create room form + join-by-code form +- [ ] Frontend: `/multiplayer/room/:code` — player list, room code display, "Start Game" (host only) +- [ ] Frontend: `ws.ts` singleton WS client with reconnect on drop +- [ ] Frontend: Zustand `gameStore` handles incoming `room:state` events + +--- + +## Phase 5 — Multiplayer Game +**Goal**: Host starts a game; all players answer simultaneously in real time; a winner is declared. +**Done when**: 2–4 players complete a 10-round game with correct live scores and a winner screen. 
+ +- [ ] `GameService`: generate question sequence for a room, enforce server-side 15 s timer +- [ ] `room:start` WS handler → begin question loop, broadcast first `game:question` +- [ ] `game:answer` WS handler → collect per-player answers +- [ ] On all-answered or timeout → evaluate, broadcast `game:answer_result` +- [ ] After N rounds → broadcast `game:finished`, update `rooms.status` + `room_players.score` in DB +- [ ] Frontend: `/multiplayer/game/:code` route +- [ ] Frontend: extend Zustand store with `currentQuestion`, `roundAnswers`, `scores` +- [ ] Frontend: reuse `QuestionCard` + `OptionButton`; add countdown timer ring +- [ ] Frontend: `ScoreBoard` component — live per-player scores after each round +- [ ] Frontend: `GameFinished` screen — winner highlight, final scores, "Play Again" button +- [ ] Unit tests for `GameService` (round evaluation, tie-breaking, timeout auto-advance) + +--- + +## Phase 6 — Production Deployment +**Goal**: App is live on Hetzner, accessible via HTTPS on all subdomains. +**Done when**: `https://app.yourdomain.com` loads; `wss://api.yourdomain.com` connects; auth flow works end-to-end. + +- [ ] `docker-compose.prod.yml`: all services + `nginx-proxy` + `acme-companion` +- [ ] Nginx config per container: `VIRTUAL_HOST` + `LETSENCRYPT_HOST` env vars +- [ ] Production `.env` files on VPS (OpenAuth secrets, DB credentials, Valkey URL) +- [ ] Drizzle migration runs on `api` container start +- [ ] Seed production DB (run `seed.ts` once) +- [ ] Smoke test: login → solo game → create room → multiplayer game end-to-end + +--- + +## Phase 7 — Polish & Hardening *(post-MVP)* + +Not required to ship, but address before real users arrive. 
+ +- [ ] Rate limiting on API endpoints (`express-rate-limit`) +- [ ] Graceful WS reconnect with exponential back-off +- [ ] React error boundaries +- [ ] `GET /users/me/stats` endpoint + profile page +- [ ] Accessibility pass (keyboard nav, ARIA on quiz buttons) +- [ ] Favicon, page titles, Open Graph meta +- [ ] CI/CD pipeline (GitHub Actions → SSH deploy on push to `main`) +- [ ] Database backups (cron → Hetzner Object Storage) + +--- + +## Dependency Graph + +``` +Phase 0 (Foundation) + └── Phase 1 (Vocabulary Data) + └── Phase 2 (Auth) + ├── Phase 3 (Singleplayer) ← parallel with Phase 4 + └── Phase 4 (Room Lobby) + └── Phase 5 (Multiplayer Game) + └── Phase 6 (Deployment) +``` diff --git a/documentation/spec.md b/documentation/spec.md new file mode 100644 index 0000000..5960a63 --- /dev/null +++ b/documentation/spec.md @@ -0,0 +1,436 @@ +# Vocabulary Trainer — Project Specification + +## 1. Overview + +A multiplayer English–Italian vocabulary trainer with a Duolingo-style quiz interface (one word prompt, four answer choices). Supports both single-player practice and real-time competitive multiplayer rooms of 2–4 players. Designed from the ground up to be language-pair agnostic. + +### Core Principles + +- **Minimal but extendable**: Working product fast, clean architecture for future growth +- **Mobile-first**: Touch-friendly Duolingo-like UX +- **Type safety end-to-end**: TypeScript + Zod schemas shared between frontend and backend + +--- + +## 2. 
Technology Stack + +| Layer | Technology | +|---|---| +| Monorepo | pnpm workspaces | +| Frontend | React 18, Vite, TypeScript | +| Routing | TanStack Router | +| Server state | TanStack Query | +| Client state | Zustand | +| Styling | Tailwind CSS + shadcn/ui | +| Backend | Node.js, Express, TypeScript | +| Realtime | WebSockets (`ws` library) | +| Database | PostgreSQL 16 | +| ORM | Drizzle ORM | +| Cache / Queue | Valkey 8 | +| Auth | OpenAuth (Google + GitHub) | +| Validation | Zod (shared schemas) | +| Testing | Vitest, React Testing Library | +| Linting / Formatting | ESLint, Prettier | +| Containerisation | Docker, Docker Compose | +| Hosting | Hetzner VPS | + +### Why `ws` over Socket.io +`ws` is the raw WebSocket library. For rooms of 2–4 players there is no need for Socket.io's transport fallbacks or room-management abstractions. The protocol is defined explicitly in `packages/shared`, which gives the same guarantees without the overhead. + +### Why Valkey +Valkey stores ephemeral room state that does not need to survive a server restart. It keeps the PostgreSQL schema clean and makes room lookups O(1). + +### Why pnpm workspaces without Turborepo +Turborepo adds parallel task running and build caching on top of pnpm workspaces. For a two-app monorepo of this size, the plain pnpm workspace commands (`pnpm -r run build`, `pnpm --filter`) are sufficient and there is one less tool to configure and maintain. + +--- + +## 3. 
Repository Structure + +``` +vocab-trainer/ +├── apps/ +│ ├── web/ # React SPA (Vite + TanStack Router) +│ │ ├── src/ +│ │ │ ├── routes/ +│ │ │ ├── components/ +│ │ │ ├── stores/ # Zustand stores +│ │ │ └── lib/ +│ │ └── Dockerfile +│ └── api/ # Express REST + WebSocket server +│ ├── src/ +│ │ ├── routes/ +│ │ ├── services/ +│ │ ├── repositories/ +│ │ └── websocket/ +│ └── Dockerfile +├── packages/ +│ ├── shared/ # Zod schemas, TypeScript types, constants +│ └── db/ # Drizzle schema, migrations, seed script +├── scripts/ +│ └── extract_omw.py # One-time WordNet + OMW extraction → seed.json +├── docker-compose.yml +├── docker-compose.prod.yml +├── pnpm-workspace.yaml +└── package.json +``` + +`packages/shared` is the contract between frontend and backend. All request/response shapes and WebSocket event payloads are defined there as Zod schemas and inferred TypeScript types — never duplicated. + +### pnpm workspace config + +`pnpm-workspace.yaml` declares: +``` +packages: + - 'apps/*' + - 'packages/*' +``` + +### Root scripts + +The root `package.json` defines convenience scripts that delegate to workspaces: +- `dev` — starts `api` and `web` in parallel +- `build` — builds all packages in dependency order +- `test` — runs Vitest across all workspaces +- `lint` — runs ESLint across all workspaces + +For parallel dev, use `concurrently` or just two terminal tabs for MVP. + +--- + +## 4. 
Architecture — N-Tier / Layered + +``` +┌────────────────────────────────────┐ +│ Presentation (React SPA) │ apps/web +├────────────────────────────────────┤ +│ API / Transport │ HTTP REST + WebSocket +├────────────────────────────────────┤ +│ Application (Controllers) │ apps/api/src/routes +│ Domain (Business logic) │ apps/api/src/services +│ Data Access (Repositories) │ apps/api/src/repositories +├────────────────────────────────────┤ +│ Database (PostgreSQL via Drizzle) │ packages/db +│ Cache (Valkey) │ apps/api/src/lib/valkey.ts +└────────────────────────────────────┘ +``` + +Each layer only communicates with the layer directly below it. Business logic lives in services, not in route handlers or repositories. + +--- + +## 5. Infrastructure + +### Domain structure + +| Subdomain | Service | +|---|---| +| `app.yourdomain.com` | React frontend | +| `api.yourdomain.com` | Express API + WebSocket | +| `auth.yourdomain.com` | OpenAuth service | + +### Docker Compose services (production) + +| Container | Role | +|---|---| +| `postgres` | PostgreSQL 16, named volume | +| `valkey` | Valkey 8, ephemeral (no persistence needed) | +| `openauth` | OpenAuth service | +| `api` | Express + WS server | +| `web` | Nginx serving the Vite build | +| `nginx-proxy` | Automatic reverse proxy | +| `acme-companion` | Let's Encrypt certificate automation | + +``` +nginx-proxy (:80/:443) + app.domain → web:80 + api.domain → api:3000 (HTTP + WS upgrade) + auth.domain → openauth:3001 +``` + +SSL is fully automatic via `nginx-proxy` + `acme-companion`. No manual Certbot needed. + +--- + +## 6. Data Model + +### Design principle +Words are modelled as language-neutral **terms** with one or more **translations** per language. Adding a new language pair (e.g. English–French) requires **no schema changes** — only new rows in `translations` and `language_pairs`. The flat `english/italian` column pattern is explicitly avoided. 
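To illustrate the design principle, here is a minimal in-memory sketch (not the Drizzle schema) of how quiz pairs could be derived from per-language translation rows. The `Translation` and `QuizPair` shapes, the `buildPairs` helper, and the sample words are assumptions made for this example only.

```typescript
// Hypothetical row shape mirroring the idea of one translation per term and language.
interface Translation {
  termId: string;
  languageCode: string;
  text: string;
}

// Hypothetical result shape: a prompt in the source language and its answer in the target language.
interface QuizPair {
  termId: string;
  prompt: string;
  answer: string;
}

// Group rows by term, then keep only terms that have both the source and the target language.
function buildPairs(rows: Translation[], source: string, target: string): QuizPair[] {
  const byTerm = new Map<string, Map<string, string>>();
  for (const row of rows) {
    const langs = byTerm.get(row.termId) ?? new Map<string, string>();
    langs.set(row.languageCode, row.text);
    byTerm.set(row.termId, langs);
  }
  const pairs: QuizPair[] = [];
  byTerm.forEach((langs, termId) => {
    const prompt = langs.get(source);
    const answer = langs.get(target);
    if (prompt !== undefined && answer !== undefined) {
      pairs.push({ termId, prompt, answer });
    }
  });
  return pairs;
}

// Sample data (illustrative only).
const rows: Translation[] = [
  { termId: "t1", languageCode: "en", text: "dog" },
  { termId: "t1", languageCode: "it", text: "cane" },
  { termId: "t2", languageCode: "en", text: "house" },
  { termId: "t2", languageCode: "it", text: "casa" },
];

// Supporting English to French only means appending rows; no shape changes.
rows.push({ termId: "t1", languageCode: "fr", text: "chien" });

const enIt = buildPairs(rows, "en", "it");
const enFr = buildPairs(rows, "en", "fr");
```

In this sketch the French pair becomes available as soon as a French row exists for a term, which is the property the term/translation split is meant to provide.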
+ +### Core tables + +``` +terms + id uuid PK + synset_id text UNIQUE -- WordNet synset offset e.g. "wn:01234567n" + pos varchar(20) -- "noun" | "verb" | "adjective" + frequency_rank integer -- 1–1000, reserved for difficulty filtering + created_at timestamptz + +translations + id uuid PK + term_id uuid FK → terms.id + language_code varchar(10) -- BCP 47: "en", "it", "de", ... + text text + UNIQUE (term_id, language_code) + +language_pairs + id uuid PK + source varchar(10) -- "en" + target varchar(10) -- "it" + label text -- "English → Italian" + active boolean DEFAULT true + UNIQUE (source, target) + +users + id uuid PK -- OpenAuth sub claim + email varchar(255) UNIQUE + display_name varchar(100) + games_played integer DEFAULT 0 + games_won integer DEFAULT 0 + created_at timestamptz + last_login_at timestamptz + +rooms + id uuid PK + code varchar(8) UNIQUE -- human-readable e.g. "WOLF-42" + host_id uuid FK → users.id + pair_id uuid FK → language_pairs.id + status text -- "waiting" | "in_progress" | "finished" + max_players smallint DEFAULT 4 + round_count smallint DEFAULT 10 + created_at timestamptz + +room_players + room_id uuid FK → rooms.id + user_id uuid FK → users.id + score integer DEFAULT 0 + joined_at timestamptz + PRIMARY KEY (room_id, user_id) +``` + +### Indexes + +```sql +CREATE INDEX ON terms (pos, frequency_rank); +CREATE INDEX ON rooms (status); +CREATE INDEX ON room_players (user_id); +``` + +--- + +## 7. Vocabulary Data — WordNet + OMW + +### Source +- **Princeton WordNet** — English words + synset IDs +- **Open Multilingual Wordnet (OMW)** — Italian translations keyed by synset ID + +### Extraction process +1. Run `scripts/extract_omw.py` once locally using NLTK +2. Filter to the 1 000 most common nouns (by WordNet frequency data) +3. Output: `packages/db/src/seed.json` — committed to the repo +4. `packages/db/src/seed.ts` reads the JSON and populates `terms` + `translations` + +`terms.synset_id` stores the WordNet offset (e.g. 
`wn:01234567n`) for traceability and future re-imports with additional languages. + +--- + +## 8. Authentication — OpenAuth + +All auth is delegated to the OpenAuth service at `auth.yourdomain.com`. Providers: Google, GitHub. + +The API validates the JWT from OpenAuth on every protected request. User rows are created or updated on first login via the `sub` claim as the primary key. + +**Auth endpoint on the API:** + +| Method | Path | Description | +|---|---|---| +| GET | `/api/auth/me` | Validate token, return user | + +All other auth flows (login, callback, token refresh) are handled entirely by OpenAuth — the frontend redirects to `auth.yourdomain.com` and receives a JWT back. + +--- + +## 9. REST API + +All endpoints prefixed `/api`. Request and response bodies validated with Zod on both sides using schemas from `packages/shared`. + +### Vocabulary + +| Method | Path | Description | +|---|---|---| +| GET | `/language-pairs` | List active language pairs | +| GET | `/terms?pair=en-it&limit=10` | Fetch quiz terms with distractors | + +### Rooms + +| Method | Path | Description | +|---|---|---| +| POST | `/rooms` | Create a room → returns room + code | +| GET | `/rooms/:code` | Get current room state | +| POST | `/rooms/:code/join` | Join a room | + +### Users + +| Method | Path | Description | +|---|---|---| +| GET | `/users/me` | Current user profile | +| GET | `/users/me/stats` | Games played, win rate | + +--- + +## 10. WebSocket Protocol + +One WS connection per client. Authenticated by passing the OpenAuth JWT as a query param on the upgrade request: `wss://api.yourdomain.com?token=...`. + +All messages are JSON: `{ type: string, payload: unknown }`. The full set of types is a Zod discriminated union in `packages/shared` — both sides validate every message they receive. 
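A dependency-free sketch of what validating incoming client messages against a discriminated union could look like. The spec places the real schemas in `packages/shared` as Zod types; a hand-written parser stands in for Zod here so the example runs on its own. The names `parseClientMessage` and `ParseResult`, and the assumption that `code`, `questionId`, and `answerId` are strings, are illustrative and not taken from the project.

```typescript
// Client-to-server messages, discriminated by the `type` field.
type ClientMessage =
  | { type: "room:join"; payload: { code: string } }
  | { type: "room:leave" }
  | { type: "room:start" }
  | { type: "game:answer"; payload: { questionId: string; answerId: string } };

// Hypothetical result wrapper: either a typed message or a reason for rejection.
type ParseResult =
  | { ok: true; message: ClientMessage }
  | { ok: false; error: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parse a raw WebSocket text frame and check it against the union above.
function parseClientMessage(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "invalid JSON" };
  }
  if (!isRecord(data)) {
    return { ok: false, error: "message must be an object" };
  }
  const type = data["type"];
  const payload = data["payload"];
  if (typeof type !== "string") {
    return { ok: false, error: "missing type" };
  }
  switch (type) {
    case "room:join": {
      if (!isRecord(payload)) break;
      const code = payload["code"];
      if (typeof code !== "string") break;
      return { ok: true, message: { type: "room:join", payload: { code } } };
    }
    case "room:leave":
      return { ok: true, message: { type: "room:leave" } };
    case "room:start":
      return { ok: true, message: { type: "room:start" } };
    case "game:answer": {
      if (!isRecord(payload)) break;
      const questionId = payload["questionId"];
      const answerId = payload["answerId"];
      if (typeof questionId !== "string" || typeof answerId !== "string") break;
      return { ok: true, message: { type: "game:answer", payload: { questionId, answerId } } };
    }
    default:
      return { ok: false, error: "unknown type: " + type };
  }
  // Reached only when a known type carried a malformed payload.
  return { ok: false, error: "invalid payload for " + type };
}
```

A rejected message would map naturally onto the `error` server event, so a handler could reply with the `error` string instead of dispatching.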
+ +### Client → Server + +| type | payload | Description | +|---|---|---| +| `room:join` | `{ code }` | Subscribe to a room's WS channel | +| `room:leave` | — | Unsubscribe | +| `room:start` | — | Host starts the game | +| `game:answer` | `{ questionId, answerId }` | Player submits an answer | + +### Server → Client + +| type | payload | Description | +|---|---|---| +| `room:state` | Full room snapshot | Sent on join and on any player join/leave | +| `game:question` | `{ id, prompt, options[], timeLimit }` | New question broadcast to all players | +| `game:answer_result` | `{ questionId, correct, correctAnswerId, scores }` | Broadcast after all players have answered or the timeout fires | +| `game:finished` | `{ scores[], winner }` | End of game summary | +| `error` | `{ message }` | Protocol or validation error | + +### Multiplayer game mechanic — simultaneous answers + +All players see the same question at the same time. Everyone submits independently. The server waits until all players have answered **or** the 15-second timeout fires — then broadcasts `game:answer_result` with updated scores. There is no buzz-first mechanic. This keeps the experience Duolingo-like and symmetric. + +### Game flow + +``` +host creates room (REST) → +players join via room code (REST + WS room:join) → +room:state broadcasts player list → +host sends room:start → +server broadcasts game:question → +players send game:answer → +server collects all answers or waits for timeout → +server broadcasts game:answer_result → +repeat for N rounds → +server broadcasts game:finished +``` + +### Room state in Valkey + +Active room state (connected players, current question, answers received this round) is stored in Valkey with a TTL. PostgreSQL holds the durable record (`rooms`, `room_players`). On server restart, in-progress games are considered abandoned — acceptable for MVP. + +--- + +## 11.
Game Mechanics + +- **Question format**: source-language word prompt + 4 target-language choices (1 correct + 3 distractors of the same POS) +- **Distractors**: generated server-side, never include the correct answer, never repeat within a session +- **Scoring**: +1 point per correct answer. Speed bonus is out of scope for MVP. +- **Timer**: 15 seconds per question, server-authoritative +- **Single-player**: uses `GET /terms` and runs entirely client-side. No WebSocket. + +--- + +## 12. Frontend Structure + +``` +apps/web/src/ +├── routes/ +│ ├── index.tsx # Landing / mode select +│ ├── auth/ +│ ├── singleplayer/ +│ └── multiplayer/ +│ ├── lobby.tsx # Create or join by code +│ ├── room.$code.tsx # Waiting room +│ └── game.$code.tsx # Active game +├── components/ +│ ├── quiz/ # QuestionCard, OptionButton, ScoreBoard +│ ├── room/ # PlayerList, RoomCode, ReadyState +│ └── ui/ # shadcn/ui wrappers: Button, Card, Dialog ... +├── stores/ +│ └── gameStore.ts # Zustand: game session, scores, WS state +├── lib/ +│ ├── api.ts # TanStack Query wrappers +│ └── ws.ts # WS client singleton +└── main.tsx +``` + +### Zustand store (single store for MVP) + +```typescript +interface AppStore { + user: User | null; + gameSession: GameSession | null; + currentQuestion: Question | null; + scores: Record<string, number>; + isLoading: boolean; + error: string | null; +} +``` + +TanStack Query handles all server data fetching. Zustand handles ephemeral UI and WebSocket-driven state. + +--- + +## 13. Testing Strategy + +| Type | Tool | Scope | +|---|---|---| +| Unit | Vitest | Services, QuizService distractor logic, Zod schemas | +| Component | Vitest + RTL | QuestionCard, OptionButton, auth forms | +| Integration | Vitest | API route handlers against a test DB | +| E2E | Out of scope for MVP | — | + +Tests are co-located with source files (`*.test.ts` / `*.test.tsx`).
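As an example of the kind of unit these tests would exercise, here is a rough sketch of distractor selection following the rules in the Game Mechanics section: same POS, never the correct answer, no duplicates within a question. The `Term` shape, the `pickDistractors` name, and the injectable `random` parameter are assumptions for illustration and do not describe the actual `QuizService`.

```typescript
// Hypothetical flattened term: one target-language text plus its part of speech.
interface Term {
  id: string;
  pos: string;
  text: string;
}

// Pick `count` distractors for `answer` from `pool`.
// `random` is injectable so a test can make the selection deterministic.
function pickDistractors(
  answer: Term,
  pool: Term[],
  count: number = 3,
  random: () => number = Math.random,
): Term[] {
  // Keep candidates with the same POS, excluding the answer and any repeated text.
  const seen = new Set<string>([answer.text]);
  const candidates: Term[] = [];
  for (const term of pool) {
    if (term.id === answer.id || term.pos !== answer.pos || seen.has(term.text)) {
      continue;
    }
    seen.add(term.text);
    candidates.push(term);
  }
  if (candidates.length < count) {
    throw new Error("not enough distractor candidates for POS " + answer.pos);
  }
  // Partial Fisher-Yates shuffle: only the first `count` positions need to be randomised.
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(random() * (candidates.length - i));
    const tmp = candidates[i];
    candidates[i] = candidates[j];
    candidates[j] = tmp;
  }
  return candidates.slice(0, count);
}

// Sample pool (illustrative words only).
const answer: Term = { id: "1", pos: "noun", text: "cane" };
const pool: Term[] = [
  answer,
  { id: "2", pos: "noun", text: "casa" },
  { id: "3", pos: "verb", text: "correre" },
  { id: "4", pos: "noun", text: "libro" },
  { id: "5", pos: "noun", text: "casa" },
  { id: "6", pos: "noun", text: "mare" },
];
const distractors = pickDistractors(answer, pool);
```

Avoiding repeats across a whole session, as the spec also requires, would need state beyond a single call and is left out of this sketch.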
+ +**Critical paths to cover:** +- Distractor generation (correct POS, no duplicates, never includes answer) +- Answer validation (server-side, correct scoring) +- Game session lifecycle (create → play → complete) +- JWT validation middleware + +--- + +## 14. Definition of Done + +### Functional +- [ ] User can log in via Google or GitHub (OpenAuth) +- [ ] User can play singleplayer: 10 rounds, score, result screen +- [ ] User can create a room and share a code +- [ ] User can join a room via code +- [ ] Multiplayer: 10 rounds, simultaneous answers, real-time score sync +- [ ] 1 000 English–Italian words seeded from WordNet + OMW + +### Technical +- [ ] Deployed to Hetzner with HTTPS on all three subdomains +- [ ] Docker Compose running all services +- [ ] Drizzle migrations applied on container start +- [ ] 10–20 passing tests covering critical paths +- [ ] pnpm workspace build pipeline green + +### Documentation +- [ ] `SPEC.md` complete +- [ ] `.env.example` files for all apps +- [ ] `README.md` with local dev setup instructions + +--- + +## 15. Out of Scope (MVP) + +- Difficulty levels *(`frequency_rank` column exists, ready to use)* +- Additional language pairs *(schema already supports it — just add rows)* +- Leaderboards *(`games_played`, `games_won` columns exist)* +- Streaks / daily challenges +- Friends / private invites +- Audio pronunciation +- CI/CD pipeline (manual deploy for now) +- Rate limiting *(add before going public)* +- Admin panel for vocabulary management