adding documentation

This commit is contained in:
lila 2026-03-20 09:21:06 +01:00
parent 25bb43ee4b
commit 94f02b9904
2 changed files with 585 additions and 0 deletions

149
documentation/roadmap.md Normal file

@ -0,0 +1,149 @@
# Vocabulary Trainer — Roadmap
Each phase produces a working, deployable increment. Nothing is built speculatively.
---
## Phase 0 — Foundation
**Goal**: Empty repo that builds, lints, and runs end-to-end.
**Done when**: `pnpm dev` starts both apps; `GET /api/health` returns 200; React renders a hello page.
- [ ] Initialise pnpm workspace monorepo: `apps/web`, `apps/api`, `packages/shared`, `packages/db`
- [ ] Configure TypeScript project references across packages
- [ ] Set up ESLint + Prettier with shared configs in root
- [ ] Set up Vitest in `api` and `web`
- [ ] Scaffold Express app with `GET /api/health`
- [ ] Scaffold Vite + React app with TanStack Router (single root route)
- [ ] Configure Drizzle ORM + connection to local PostgreSQL
- [ ] Write first migration (empty — just validates the pipeline works)
- [ ] `docker-compose.yml` for local dev: `api`, `web`, `postgres`, `valkey`
- [ ] `.env.example` files for `apps/api` and `apps/web`
---
## Phase 1 — Vocabulary Data
**Goal**: Word data lives in the DB and can be queried via the API.
**Done when**: `GET /api/terms?pair=en-it&limit=10` returns 10 terms, each with 3 distractors attached.
- [ ] Run `scripts/extract_omw.py` locally → generates `packages/db/src/seed.json`
- [ ] Write Drizzle schema: `terms`, `translations`, `language_pairs`
- [ ] Write and run migration
- [ ] Write `packages/db/src/seed.ts` (reads `seed.json`, populates tables)
- [ ] Implement `TermRepository.getRandom(pairId, limit)`
- [ ] Implement `QuizService.attachDistractors(terms)` — same POS, server-side, no duplicates
- [ ] Implement `GET /language-pairs` and `GET /terms` endpoints
- [ ] Define Zod response schemas in `packages/shared`
- [ ] Unit tests for `QuizService` (correct POS filtering, never includes the answer)
---
## Phase 2 — Auth
**Goal**: Users can log in via Google or GitHub and stay logged in.
**Done when**: JWT from OpenAuth is validated by the API; protected routes redirect unauthenticated users; user row is created on first login.
- [ ] Add OpenAuth service to `docker-compose.yml`
- [ ] Write Drizzle schema: `users`
- [ ] Write and run migration
- [ ] Implement JWT validation middleware in `apps/api`
- [ ] Implement `GET /api/auth/me` (validate token, upsert user row, return user)
- [ ] Define auth Zod schemas in `packages/shared`
- [ ] Frontend: login page with "Continue with Google" + "Continue with GitHub" buttons
- [ ] Frontend: redirect to `auth.yourdomain.com` → receive JWT → store in memory + HttpOnly cookie
- [ ] Frontend: TanStack Router auth guard (redirects unauthenticated users)
- [ ] Frontend: TanStack Query `api.ts` attaches token to every request
- [ ] Unit tests for JWT middleware
---
## Phase 3 — Single-player Mode
**Goal**: A logged-in user can complete a full solo quiz session.
**Done when**: User sees 10 questions, picks answers, sees their final score.
- [ ] Frontend: `/singleplayer` route
- [ ] `useQuizSession` hook: fetch terms, manage question index + score state
- [ ] `QuestionCard` component: prompt word + 4 answer buttons
- [ ] `OptionButton` component: idle / correct / wrong states
- [ ] `ScoreScreen` component: final score + play-again button
- [ ] TanStack Query integration for `GET /terms`
- [ ] RTL tests for `QuestionCard` and `OptionButton`
---
## Phase 4 — Multiplayer Rooms (Lobby)
**Goal**: Players can create and join rooms; the host sees all joined players in real time.
**Done when**: Two browser tabs can join the same room and see each other's display names update live via WebSocket.
- [ ] Write Drizzle schema: `rooms`, `room_players`
- [ ] Write and run migration
- [ ] `POST /rooms` and `POST /rooms/:code/join` REST endpoints
- [ ] `RoomService`: create room with short code, join room, enforce max player limit
- [ ] WebSocket server: attach `ws` upgrade handler to the Express HTTP server
- [ ] WS auth middleware: validate OpenAuth JWT on upgrade
- [ ] WS message router: dispatch incoming messages by `type`
- [ ] `room:join` / `room:leave` handlers → broadcast `room:state` to all room members
- [ ] Room membership tracked in Valkey (ephemeral) + `room_players` in PostgreSQL (durable)
- [ ] Define all WS event Zod schemas in `packages/shared`
- [ ] Frontend: `/multiplayer/lobby` — create room form + join-by-code form
- [ ] Frontend: `/multiplayer/room/:code` — player list, room code display, "Start Game" (host only)
- [ ] Frontend: `ws.ts` singleton WS client with reconnect on drop
- [ ] Frontend: Zustand `gameStore` handles incoming `room:state` events
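
The "create room with short code" task for `RoomService` could be approached along these lines. This is a minimal sketch, not the project's implementation: the word list, the two-digit suffix, and the `isTaken` callback (standing in for a lookup against the unique `rooms.code` column) are all assumptions made for illustration. The random source is injectable so the behaviour can be tested deterministically.

```typescript
// Hypothetical word list; the spec only gives "WOLF-42" as an example code.
const CODE_WORDS = ["WOLF", "BEAR", "LYNX", "HAWK", "DEER"];

// Returns a code such as "WOLF-42": a four-letter word, a hyphen, two digits.
// The result is 7 characters long, which fits the varchar(8) column.
function generateRoomCode(random: () => number = Math.random): string {
  const word = CODE_WORDS[Math.floor(random() * CODE_WORDS.length)] as string;
  const suffix = Math.floor(random() * 90) + 10; // 10..99, always two digits
  return `${word}-${suffix}`;
}

// Retries until an unused code is found or the attempt budget runs out.
// `isTaken` is a stand-in for a database lookup on `rooms.code`.
function createUniqueRoomCode(
  isTaken: (code: string) => boolean,
  random: () => number = Math.random,
  maxAttempts: number = 20,
): string {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const code = generateRoomCode(random);
    if (!isTaken(code)) return code;
  }
  throw new Error("Could not allocate a free room code");
}
```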
---
## Phase 5 — Multiplayer Game
**Goal**: Host starts a game; all players answer simultaneously in real time; a winner is declared.
**Done when**: 2–4 players complete a 10-round game with correct live scores and a winner screen.
- [ ] `GameService`: generate question sequence for a room, enforce server-side 15 s timer
- [ ] `room:start` WS handler → begin question loop, broadcast first `game:question`
- [ ] `game:answer` WS handler → collect per-player answers
- [ ] On all-answered or timeout → evaluate, broadcast `game:answer_result`
- [ ] After N rounds → broadcast `game:finished`, update `rooms.status` + `room_players.score` in DB
- [ ] Frontend: `/multiplayer/game/:code` route
- [ ] Frontend: extend Zustand store with `currentQuestion`, `roundAnswers`, `scores`
- [ ] Frontend: reuse `QuestionCard` + `OptionButton`; add countdown timer ring
- [ ] Frontend: `ScoreBoard` component — live per-player scores after each round
- [ ] Frontend: `GameFinished` screen — winner highlight, final scores, "Play Again" button
- [ ] Unit tests for `GameService` (round evaluation, tie-breaking, timeout auto-advance)
---
## Phase 6 — Production Deployment
**Goal**: App is live on Hetzner, accessible via HTTPS on all subdomains.
**Done when**: `https://app.yourdomain.com` loads; `wss://api.yourdomain.com` connects; auth flow works end-to-end.
- [ ] `docker-compose.prod.yml`: all services + `nginx-proxy` + `acme-companion`
- [ ] Nginx config per container: `VIRTUAL_HOST` + `LETSENCRYPT_HOST` env vars
- [ ] Production `.env` files on VPS (OpenAuth secrets, DB credentials, Valkey URL)
- [ ] Drizzle migration runs on `api` container start
- [ ] Seed production DB (run `seed.ts` once)
- [ ] Smoke test: login → solo game → create room → multiplayer game end-to-end
---
## Phase 7 — Polish & Hardening *(post-MVP)*
Not required to ship, but address before real users arrive.
- [ ] Rate limiting on API endpoints (`express-rate-limit`)
- [ ] Graceful WS reconnect with exponential back-off
- [ ] React error boundaries
- [ ] `GET /users/me/stats` endpoint + profile page
- [ ] Accessibility pass (keyboard nav, ARIA on quiz buttons)
- [ ] Favicon, page titles, Open Graph meta
- [ ] CI/CD pipeline (GitHub Actions → SSH deploy on push to `main`)
- [ ] Database backups (cron → Hetzner Object Storage)
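
For the "graceful WS reconnect with exponential back-off" item, the delay calculation might look like the sketch below. The base delay, the cap, and the use of full jitter are assumptions chosen for illustration; the roadmap does not specify them.

```typescript
// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt, capped.
// The 500 ms base and 30 s cap are illustrative values, not taken from the spec.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 30000,
): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// "Full jitter" variant: a random delay in [0, cappedDelay), so that clients
// dropped at the same moment do not all reconnect in lockstep.
function reconnectDelayWithJitterMs(
  attempt: number,
  random: () => number = Math.random,
): number {
  return Math.floor(random() * reconnectDelayMs(attempt));
}
```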
---
## Dependency Graph
```
Phase 0 (Foundation)
└── Phase 1 (Vocabulary Data)
    └── Phase 2 (Auth)
        ├── Phase 3 (Singleplayer)  ← parallel with Phase 4
        └── Phase 4 (Room Lobby)
            └── Phase 5 (Multiplayer Game)
                └── Phase 6 (Deployment)
```

436
documentation/spec.md Normal file

@ -0,0 +1,436 @@
# Vocabulary Trainer — Project Specification
## 1. Overview
A multiplayer English–Italian vocabulary trainer with a Duolingo-style quiz interface (one word prompt, four answer choices). Supports both single-player practice and real-time competitive multiplayer rooms of 2–4 players. Designed from the ground up to be language-pair agnostic.
### Core Principles
- **Minimal but extendable**: Working product fast, clean architecture for future growth
- **Mobile-first**: Touch-friendly Duolingo-like UX
- **Type safety end-to-end**: TypeScript + Zod schemas shared between frontend and backend
---
## 2. Technology Stack
| Layer | Technology |
|---|---|
| Monorepo | pnpm workspaces |
| Frontend | React 18, Vite, TypeScript |
| Routing | TanStack Router |
| Server state | TanStack Query |
| Client state | Zustand |
| Styling | Tailwind CSS + shadcn/ui |
| Backend | Node.js, Express, TypeScript |
| Realtime | WebSockets (`ws` library) |
| Database | PostgreSQL 16 |
| ORM | Drizzle ORM |
| Cache / Queue | Valkey 8 |
| Auth | OpenAuth (Google + GitHub) |
| Validation | Zod (shared schemas) |
| Testing | Vitest, React Testing Library |
| Linting / Formatting | ESLint, Prettier |
| Containerisation | Docker, Docker Compose |
| Hosting | Hetzner VPS |
### Why `ws` over Socket.io
`ws` is the raw WebSocket library. For rooms of 2–4 players there is no need for Socket.io's transport fallbacks or room-management abstractions. The protocol is defined explicitly in `packages/shared`, which gives the same guarantees without the overhead.
### Why Valkey
Valkey stores ephemeral room state that does not need to survive a server restart. It keeps the PostgreSQL schema clean and makes room lookups O(1).
### Why pnpm workspaces without Turborepo
Turborepo adds parallel task running and build caching on top of pnpm workspaces. For a two-app monorepo of this size, the plain pnpm workspace commands (`pnpm -r run build`, `pnpm --filter`) are sufficient and there is one less tool to configure and maintain.
---
## 3. Repository Structure
```
vocab-trainer/
├── apps/
│   ├── web/                    # React SPA (Vite + TanStack Router)
│   │   ├── src/
│   │   │   ├── routes/
│   │   │   ├── components/
│   │   │   ├── stores/         # Zustand stores
│   │   │   └── lib/
│   │   └── Dockerfile
│   └── api/                    # Express REST + WebSocket server
│       ├── src/
│       │   ├── routes/
│       │   ├── services/
│       │   ├── repositories/
│       │   └── websocket/
│       └── Dockerfile
├── packages/
│   ├── shared/                 # Zod schemas, TypeScript types, constants
│   └── db/                     # Drizzle schema, migrations, seed script
├── scripts/
│   └── extract_omw.py          # One-time WordNet + OMW extraction → seed.json
├── docker-compose.yml
├── docker-compose.prod.yml
├── pnpm-workspace.yaml
└── package.json
```
`packages/shared` is the contract between frontend and backend. All request/response shapes and WebSocket event payloads are defined there as Zod schemas and inferred TypeScript types — never duplicated.
### pnpm workspace config
`pnpm-workspace.yaml` declares:
```yaml
packages:
  - 'apps/*'
  - 'packages/*'
```
### Root scripts
The root `package.json` defines convenience scripts that delegate to workspaces:
- `dev` — starts `api` and `web` in parallel
- `build` — builds all packages in dependency order
- `test` — runs Vitest across all workspaces
- `lint` — runs ESLint across all workspaces
For parallel dev, use `concurrently` or just two terminal tabs for MVP.
---
## 4. Architecture — N-Tier / Layered
```
┌────────────────────────────────────┐
│  Presentation (React SPA)          │  apps/web
├────────────────────────────────────┤
│  API / Transport                   │  HTTP REST + WebSocket
├────────────────────────────────────┤
│  Application (Controllers)         │  apps/api/src/routes
│  Domain (Business logic)           │  apps/api/src/services
│  Data Access (Repositories)        │  apps/api/src/repositories
├────────────────────────────────────┤
│  Database (PostgreSQL via Drizzle) │  packages/db
│  Cache (Valkey)                    │  apps/api/src/lib/valkey.ts
└────────────────────────────────────┘
```
Each layer only communicates with the layer directly below it. Business logic lives in services, not in route handlers or repositories.
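
The layering rule can be illustrated with a small sketch. `TermRepository.getRandom` is named in the roadmap; everything else here (the `getQuizTerms` method, the `handleGetTerms` function, the `HttpResult` shape, and the 1..50 limit rule) is hypothetical and only shows where each kind of code would live. The repository is synchronous for brevity; a real one would return promises, and the handler would be an Express route.

```typescript
interface Term {
  id: string;
  pos: string;
}

// Data access layer: the only code that would talk to PostgreSQL.
interface TermRepository {
  getRandom(pairId: string, limit: number): Term[];
}

// Domain layer: business rules, depending only on the repository interface.
class QuizService {
  private readonly terms: TermRepository;

  constructor(terms: TermRepository) {
    this.terms = terms;
  }

  getQuizTerms(pairId: string, limit: number): Term[] {
    // The 1..50 bound is an illustrative rule, not taken from the spec.
    if (!Number.isInteger(limit) || limit < 1 || limit > 50) {
      throw new RangeError("limit must be an integer between 1 and 50");
    }
    return this.terms.getRandom(pairId, limit);
  }
}

interface HttpResult {
  status: number;
  body: unknown;
}

// Application layer: turns a transport-level request into a service call and
// maps domain errors to status codes. No business logic lives here.
function handleGetTerms(
  service: QuizService,
  query: { pair: string; limit: string },
): HttpResult {
  try {
    const terms = service.getQuizTerms(query.pair, Number(query.limit));
    return { status: 200, body: terms };
  } catch (err) {
    if (err instanceof RangeError) {
      return { status: 400, body: { message: err.message } };
    }
    throw err;
  }
}
```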
---
## 5. Infrastructure
### Domain structure
| Subdomain | Service |
|---|---|
| `app.yourdomain.com` | React frontend |
| `api.yourdomain.com` | Express API + WebSocket |
| `auth.yourdomain.com` | OpenAuth service |
### Docker Compose services (production)
| Container | Role |
|---|---|
| `postgres` | PostgreSQL 16, named volume |
| `valkey` | Valkey 8, ephemeral (no persistence needed) |
| `openauth` | OpenAuth service |
| `api` | Express + WS server |
| `web` | Nginx serving the Vite build |
| `nginx-proxy` | Automatic reverse proxy |
| `acme-companion` | Let's Encrypt certificate automation |
```
nginx-proxy (:80/:443)
  app.domain  → web:80
  api.domain  → api:3000       (HTTP + WS upgrade)
  auth.domain → openauth:3001
```
SSL is fully automatic via `nginx-proxy` + `acme-companion`. No manual Certbot needed.
---
## 6. Data Model
### Design principle
Words are modelled as language-neutral **terms** with one or more **translations** per language. Adding a new language pair (e.g. English–French) requires **no schema changes**, only new rows in `translations` and `language_pairs`. The flat `english/italian` column pattern is explicitly avoided.
### Core tables
```
terms
  id              uuid PK
  synset_id       text UNIQUE           -- WordNet synset offset e.g. "wn:01234567n"
  pos             varchar(20)           -- "noun" | "verb" | "adjective"
  frequency_rank  integer               -- 1–1000, reserved for difficulty filtering
  created_at      timestamptz

translations
  id              uuid PK
  term_id         uuid FK → terms.id
  language_code   varchar(10)           -- BCP 47: "en", "it", "de", ...
  text            text
  UNIQUE (term_id, language_code)

language_pairs
  id              uuid PK
  source          varchar(10)           -- "en"
  target          varchar(10)           -- "it"
  label           text                  -- "English → Italian"
  active          boolean DEFAULT true
  UNIQUE (source, target)

users
  id              uuid PK               -- OpenAuth sub claim
  email           varchar(255) UNIQUE
  display_name    varchar(100)
  games_played    integer DEFAULT 0
  games_won       integer DEFAULT 0
  created_at      timestamptz
  last_login_at   timestamptz

rooms
  id              uuid PK
  code            varchar(8) UNIQUE     -- human-readable e.g. "WOLF-42"
  host_id         uuid FK → users.id
  pair_id         uuid FK → language_pairs.id
  status          text                  -- "waiting" | "in_progress" | "finished"
  max_players     smallint DEFAULT 4
  round_count     smallint DEFAULT 10
  created_at      timestamptz

room_players
  room_id         uuid FK → rooms.id
  user_id         uuid FK → users.id
  score           integer DEFAULT 0
  joined_at       timestamptz
  PRIMARY KEY (room_id, user_id)
```
### Indexes
```sql
CREATE INDEX ON terms (pos, frequency_rank);
CREATE INDEX ON rooms (status);
CREATE INDEX ON room_players (user_id);
```
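
To show why the `terms` / `translations` split makes new language pairs a data-only change, the sketch below resolves prompt and answer text for an arbitrary pair from plain translation rows. The row and result types are simplified, hypothetical stand-ins for the tables above, and the function works on in-memory arrays rather than a real query.

```typescript
interface TranslationRow {
  termId: string;
  languageCode: string;
  text: string;
}

interface LanguagePair {
  source: string;
  target: string;
}

interface PromptAnswer {
  termId: string;
  prompt: string; // text in the source language
  answer: string; // text in the target language
}

// Groups rows by term, then keeps only terms that have a translation in both
// languages of the requested pair. No language is hard-coded anywhere.
function resolvePair(rows: TranslationRow[], pair: LanguagePair): PromptAnswer[] {
  const byTerm = new Map<string, Map<string, string>>();
  for (const row of rows) {
    let languages = byTerm.get(row.termId);
    if (languages === undefined) {
      languages = new Map<string, string>();
      byTerm.set(row.termId, languages);
    }
    languages.set(row.languageCode, row.text);
  }

  const result: PromptAnswer[] = [];
  byTerm.forEach((languages, termId) => {
    const prompt = languages.get(pair.source);
    const answer = languages.get(pair.target);
    if (prompt !== undefined && answer !== undefined) {
      result.push({ termId, prompt, answer });
    }
  });
  return result;
}
```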
---
## 7. Vocabulary Data — WordNet + OMW
### Source
- **Princeton WordNet** — English words + synset IDs
- **Open Multilingual Wordnet (OMW)** — Italian translations keyed by synset ID
### Extraction process
1. Run `scripts/extract_omw.py` once locally using NLTK
2. Filter to the 1 000 most common nouns (by WordNet frequency data)
3. Output: `packages/db/src/seed.json` — committed to the repo
4. `packages/db/src/seed.ts` reads the JSON and populates `terms` + `translations`
`terms.synset_id` stores the WordNet offset (e.g. `wn:01234567n`) for traceability and future re-imports with additional languages.
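
Step 4 could be structured as a pure transformation followed by inserts. The sketch below covers only the transformation. The `SeedEntry` shape is an assumption about what `extract_omw.py` emits (the actual `seed.json` format is not specified here), and the output row types are simplified stand-ins for the `terms` and `translations` tables.

```typescript
// Assumed shape of one seed.json entry; the real file may differ.
interface SeedEntry {
  synsetId: string; // e.g. "wn:01234567n"
  pos: string;
  frequencyRank: number;
  translations: Record<string, string>; // language code -> text
}

interface TermInsert {
  synsetId: string;
  pos: string;
  frequencyRank: number;
}

interface TranslationInsert {
  synsetId: string; // resolved to term_id once the term row exists
  languageCode: string;
  text: string;
}

// Splits seed entries into rows for the two tables, skipping repeated synsets
// because terms.synset_id is UNIQUE.
function toRows(entries: SeedEntry[]): {
  terms: TermInsert[];
  translations: TranslationInsert[];
} {
  const terms: TermInsert[] = [];
  const translations: TranslationInsert[] = [];
  const seen = new Set<string>();

  for (const entry of entries) {
    if (seen.has(entry.synsetId)) continue;
    seen.add(entry.synsetId);
    terms.push({
      synsetId: entry.synsetId,
      pos: entry.pos,
      frequencyRank: entry.frequencyRank,
    });
    for (const languageCode of Object.keys(entry.translations)) {
      const text = entry.translations[languageCode];
      if (text === undefined) continue;
      translations.push({ synsetId: entry.synsetId, languageCode, text });
    }
  }
  return { terms, translations };
}
```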
---
## 8. Authentication — OpenAuth
All auth is delegated to the OpenAuth service at `auth.yourdomain.com`. Providers: Google, GitHub.
The API validates the JWT issued by OpenAuth on every protected request. A user row is created on first login and updated on later logins (an upsert), with the token's `sub` claim used as the primary key.
**Auth endpoint on the API:**
| Method | Path | Description |
|---|---|---|
| GET | `/api/auth/me` | Validate token, return user |
All other auth flows (login, callback, token refresh) are handled entirely by OpenAuth — the frontend redirects to `auth.yourdomain.com` and receives a JWT back.
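
The upsert behind `GET /api/auth/me` can be sketched as follows, assuming the token has already been verified. The claim names other than `sub` are assumptions about what the OpenAuth token carries, and a `Map` stands in for the `users` table so the example stays self-contained.

```typescript
// Claims assumed to be present on an already-verified token.
interface TokenClaims {
  sub: string;
  email: string;
  name: string;
}

interface UserRow {
  id: string; // equals the `sub` claim
  email: string;
  displayName: string;
  createdAt: Date;
  lastLoginAt: Date;
}

// Creates the row on first login; on later logins refreshes profile fields
// and last_login_at while leaving created_at untouched.
function upsertUser(
  users: Map<string, UserRow>,
  claims: TokenClaims,
  now: Date,
): UserRow {
  const existing = users.get(claims.sub);
  const row: UserRow =
    existing !== undefined
      ? { ...existing, email: claims.email, displayName: claims.name, lastLoginAt: now }
      : {
          id: claims.sub,
          email: claims.email,
          displayName: claims.name,
          createdAt: now,
          lastLoginAt: now,
        };
  users.set(row.id, row);
  return row;
}
```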
---
## 9. REST API
All endpoints are prefixed with `/api`; the paths in the tables below omit the prefix. Request and response bodies are validated with Zod on both sides using schemas from `packages/shared`.
### Vocabulary
| Method | Path | Description |
|---|---|---|
| GET | `/language-pairs` | List active language pairs |
| GET | `/terms?pair=en-it&limit=10` | Fetch quiz terms with distractors |
### Rooms
| Method | Path | Description |
|---|---|---|
| POST | `/rooms` | Create a room → returns room + code |
| GET | `/rooms/:code` | Get current room state |
| POST | `/rooms/:code/join` | Join a room |
### Users
| Method | Path | Description |
|---|---|---|
| GET | `/users/me` | Current user profile |
| GET | `/users/me/stats` | Games played, win rate |
---
## 10. WebSocket Protocol
One WS connection per client. Authenticated by passing the OpenAuth JWT as a query param on the upgrade request: `wss://api.yourdomain.com?token=...`.
All messages are JSON: `{ type: string, payload: unknown }`. The full set of types is a Zod discriminated union in `packages/shared` — both sides validate every message they receive.
### Client → Server
| type | payload | Description |
|---|---|---|
| `room:join` | `{ code }` | Subscribe to a room's WS channel |
| `room:leave` | — | Unsubscribe |
| `room:start` | — | Host starts the game |
| `game:answer` | `{ questionId, answerId }` | Player submits an answer |
### Server → Client
| type | payload | Description |
|---|---|---|
| `room:state` | Full room snapshot | Sent on join and on any player join/leave |
| `game:question` | `{ id, prompt, options[], timeLimit }` | New question broadcast to all players |
| `game:answer_result` | `{ questionId, correct, correctAnswerId, scores }` | Broadcast after all players have answered or the timeout fires |
| `game:finished` | `{ scores[], winner }` | End of game summary |
| `error` | `{ message }` | Protocol or validation error |
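
A sketch of how the client → server messages could be typed and validated. The spec calls for a Zod discriminated union in `packages/shared`; to stay dependency-free, this example hand-writes the equivalent checks instead, so treat it as an illustration of the shape rather than the actual schema. Treating `code`, `questionId` and `answerId` as strings is an assumption.

```typescript
type ClientMessage =
  | { type: "room:join"; payload: { code: string } }
  | { type: "room:leave" }
  | { type: "room:start" }
  | { type: "game:answer"; payload: { questionId: string; answerId: string } };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns a typed message, or null when the input is not valid JSON, has an
// unknown `type`, or carries a malformed payload.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;

  const type = data["type"];
  const payload = data["payload"];
  switch (type) {
    case "room:join": {
      if (!isRecord(payload)) return null;
      const code = payload["code"];
      return typeof code === "string" ? { type: "room:join", payload: { code } } : null;
    }
    case "room:leave":
      return { type: "room:leave" };
    case "room:start":
      return { type: "room:start" };
    case "game:answer": {
      if (!isRecord(payload)) return null;
      const questionId = payload["questionId"];
      const answerId = payload["answerId"];
      if (typeof questionId !== "string" || typeof answerId !== "string") return null;
      return { type: "game:answer", payload: { questionId, answerId } };
    }
    default:
      return null;
  }
}
```

A server-side message router could call `parseClientMessage` on every incoming frame and reply with an `error` message whenever it returns `null`.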
### Multiplayer game mechanic — simultaneous answers
All players see the same question at the same time. Everyone submits independently. The server waits until all players have answered **or** the 15-second timeout fires — then broadcasts `game:answer_result` with updated scores. There is no buzz-first mechanic. This keeps the experience Duolingo-like and symmetric.
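
The "all answered or timeout" rule can be expressed as a few pure functions over a round's state. This is a sketch under assumptions: the `RoundState` shape and function names are hypothetical, time is passed in explicitly as server-clock milliseconds so the logic can be tested without timers, and scoring follows the +1 per correct answer rule from the game mechanics section.

```typescript
interface RoundState {
  playerIds: string[];
  correctAnswerId: string;
  deadlineMs: number; // server clock time at which the round times out
  answers: Map<string, string>; // playerId -> answerId
}

// Records an answer. Rejects late answers, unknown players, and second
// submissions from the same player.
function submitAnswer(
  round: RoundState,
  playerId: string,
  answerId: string,
  nowMs: number,
): boolean {
  if (nowMs >= round.deadlineMs) return false;
  if (round.playerIds.indexOf(playerId) === -1) return false;
  if (round.answers.has(playerId)) return false;
  round.answers.set(playerId, answerId);
  return true;
}

// The round ends when every player has answered or the deadline has passed.
function isRoundOver(round: RoundState, nowMs: number): boolean {
  return round.answers.size >= round.playerIds.length || nowMs >= round.deadlineMs;
}

// Returns a new score table with +1 for each correct answer.
function applyRoundScores(
  round: RoundState,
  scores: Record<string, number>,
): Record<string, number> {
  const next: Record<string, number> = { ...scores };
  for (const playerId of round.playerIds) {
    const gained = round.answers.get(playerId) === round.correctAnswerId ? 1 : 0;
    next[playerId] = (next[playerId] ?? 0) + gained;
  }
  return next;
}
```

In a real `GameService`, a server-side timer would call `isRoundOver` at the deadline, and each `game:answer` handler would call it after `submitAnswer`; whichever sees `true` first triggers the `game:answer_result` broadcast.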
### Game flow
```
host creates room (REST) →
players join via room code (REST + WS room:join) →
room:state broadcasts player list →
host sends room:start →
server broadcasts game:question →
players send game:answer →
server collects all answers or waits for timeout →
server broadcasts game:answer_result →
repeat for N rounds →
server broadcasts game:finished
```
### Room state in Valkey
Active room state (connected players, current question, answers received this round) is stored in Valkey with a TTL. PostgreSQL holds the durable record (`rooms`, `room_players`). On server restart, in-progress games are considered abandoned — acceptable for MVP.
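
The TTL-based storage described above might be wrapped as shown below. To keep the example runnable without a Valkey server, an in-memory class stands in for the Valkey client behind a small hypothetical `EphemeralStore` interface; the key format, the one-hour TTL, and the `LiveRoomState` shape are assumptions as well.

```typescript
// Minimal interface a Valkey-backed adapter would also implement.
interface EphemeralStore {
  set(key: string, value: string, ttlSeconds: number): void;
  get(key: string): string | null;
}

// In-memory stand-in that mimics key expiry, with an injectable clock.
class InMemoryStore implements EphemeralStore {
  private entries = new Map<string, { value: string; expiresAtMs: number }>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: string, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAtMs: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): string | null {
    const entry = this.entries.get(key);
    if (entry === undefined) return null;
    if (this.now() >= entry.expiresAtMs) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }
}

interface LiveRoomState {
  code: string;
  connectedPlayerIds: string[];
  currentQuestionId: string | null;
  answersThisRound: Record<string, string>; // playerId -> answerId
}

const ROOM_TTL_SECONDS = 60 * 60; // illustrative value, not from the spec

function roomKey(code: string): string {
  return `room:${code}`;
}

// Every save refreshes the TTL, so only idle rooms expire.
function saveRoomState(store: EphemeralStore, state: LiveRoomState): void {
  store.set(roomKey(state.code), JSON.stringify(state), ROOM_TTL_SECONDS);
}

function loadRoomState(store: EphemeralStore, code: string): LiveRoomState | null {
  const raw = store.get(roomKey(code));
  return raw === null ? null : (JSON.parse(raw) as LiveRoomState);
}
```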
---
## 11. Game Mechanics
- **Question format**: source-language word prompt + 4 target-language choices (1 correct + 3 distractors of the same POS)
- **Distractors**: generated server-side, never include the correct answer, never repeat within a session
- **Scoring**: +1 point per correct answer. Speed bonus is out of scope for MVP.
- **Timer**: 15 seconds per question, server-authoritative
- **Single-player**: uses `GET /terms` and runs entirely client-side. No WebSocket.
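
A possible shape for the distractor rule (same POS, never the correct answer, no duplicates within a question). The `QuizTerm` type and `pickDistractors` name are hypothetical, the candidate pool is passed in as an array rather than queried, and the "never repeat within a session" requirement is not covered; that would need extra state tracked across questions.

```typescript
interface QuizTerm {
  id: string;
  pos: string;
  answer: string; // target-language text
}

// Picks `count` distinct distractor texts sharing the term's part of speech.
// `random` is injectable so tests can be deterministic.
function pickDistractors(
  term: QuizTerm,
  pool: QuizTerm[],
  count: number = 3,
  random: () => number = Math.random,
): string[] {
  // Seeding `seen` with the correct answer guarantees it is never picked.
  const seen = new Set<string>([term.answer]);
  const candidates: string[] = [];
  for (const other of pool) {
    if (other.pos !== term.pos || seen.has(other.answer)) continue;
    seen.add(other.answer);
    candidates.push(other.answer);
  }
  if (candidates.length < count) {
    throw new Error(`Not enough ${term.pos} distractors: need ${count}`);
  }

  // Partial Fisher-Yates shuffle: only the first `count` slots are randomised.
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(random() * (candidates.length - i));
    const tmp = candidates[i] as string;
    candidates[i] = candidates[j] as string;
    candidates[j] = tmp;
  }
  return candidates.slice(0, count);
}
```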
---
## 12. Frontend Structure
```
apps/web/src/
├── routes/
│   ├── index.tsx               # Landing / mode select
│   ├── auth/
│   ├── singleplayer/
│   └── multiplayer/
│       ├── lobby.tsx           # Create or join by code
│       ├── room.$code.tsx      # Waiting room
│       └── game.$code.tsx      # Active game
├── components/
│   ├── quiz/                   # QuestionCard, OptionButton, ScoreBoard
│   ├── room/                   # PlayerList, RoomCode, ReadyState
│   └── ui/                     # shadcn/ui wrappers: Button, Card, Dialog ...
├── stores/
│   └── gameStore.ts            # Zustand: game session, scores, WS state
├── lib/
│   ├── api.ts                  # TanStack Query wrappers
│   └── ws.ts                   # WS client singleton
└── main.tsx
```
### Zustand store (single store for MVP)
```typescript
interface AppStore {
  user: User | null;
  gameSession: GameSession | null;
  currentQuestion: Question | null;
  scores: Record<string, number>;
  isLoading: boolean;
  error: string | null;
}
```
TanStack Query handles all server data fetching. Zustand handles ephemeral UI and WebSocket-driven state.
---
## 13. Testing Strategy
| Type | Tool | Scope |
|---|---|---|
| Unit | Vitest | Services, QuizService distractor logic, Zod schemas |
| Component | Vitest + RTL | QuestionCard, OptionButton, auth forms |
| Integration | Vitest | API route handlers against a test DB |
| E2E | Out of scope for MVP | — |
Tests are co-located with source files (`*.test.ts` / `*.test.tsx`).
**Critical paths to cover:**
- Distractor generation (correct POS, no duplicates, never includes answer)
- Answer validation (server-side, correct scoring)
- Game session lifecycle (create → play → complete)
- JWT validation middleware
---
## 14. Definition of Done
### Functional
- [ ] User can log in via Google or GitHub (OpenAuth)
- [ ] User can play singleplayer: 10 rounds, score, result screen
- [ ] User can create a room and share a code
- [ ] User can join a room via code
- [ ] Multiplayer: 10 rounds, simultaneous answers, real-time score sync
- [ ] 1 000 English–Italian words seeded from WordNet + OMW
### Technical
- [ ] Deployed to Hetzner with HTTPS on all three subdomains
- [ ] Docker Compose running all services
- [ ] Drizzle migrations applied on container start
- [ ] 10–20 passing tests covering critical paths
- [ ] pnpm workspace build pipeline green
### Documentation
- [ ] `documentation/spec.md` complete
- [ ] `.env.example` files for all apps
- [ ] `README.md` with local dev setup instructions
---
## 15. Out of Scope (MVP)
- Difficulty levels *(`frequency_rank` column exists, ready to use)*
- Additional language pairs *(schema already supports it — just add rows)*
- Leaderboards *(`games_played`, `games_won` columns exist)*
- Streaks / daily challenges
- Friends / private invites
- Audio pronunciation
- CI/CD pipeline (manual deploy for now)
- Rate limiting *(add before going public)*
- Admin panel for vocabulary management