updating documentation

This commit is contained in:
lila 2026-05-16 01:59:43 +02:00
parent 1ba57c7e9d
commit 7e0311683f
25 changed files with 2660 additions and 226 deletions

View file

@ -0,0 +1,394 @@
# lila — Project Specification
> **This document is the single source of truth for the project.**
> It is written to be handed to any LLM as context. It contains the project vision, the current MVP scope, the tech stack, the architecture, and the roadmap.
---
## 1. Project Overview
A vocabulary trainer for EnglishItalian words. The quiz format is Duolingo-style: one word is shown as a prompt, and the user picks the correct translation from four choices (1 correct + 3 distractors of the same part-of-speech). The app supports both singleplayer and real-time multiplayer game modes.
**The core learning loop:**
Show word → pick answer → see result → next word → final score
The vocabulary data comes from WordNet + the Open Multilingual Wordnet (OMW). A one-time Python script extracts EnglishItalian noun pairs and seeds the database. The data model is language-pair agnostic by design — adding a new language later requires no schema changes.
### Core Principles
- **Minimal but extendable**: working product fast, clean architecture for future growth
- **Mobile-first**: touch-friendly Duolingo-like UX
- **Type safety end-to-end**: TypeScript + Zod schemas shared between frontend and backend
---
## 2. Full Product Vision (Long-Term)
- Users log in via Google or GitHub (Better Auth)
- Singleplayer mode: 10-round quiz, score screen
- Multiplayer mode: create a room, share a code, 24 players answer simultaneously in real time, live scores, winner screen
- 1000+ EnglishItalian nouns seeded from WordNet
This is the full vision. The current implementation already covers most of it; remaining items are captured in the roadmap and the Post-MVP ladder below.
---
## 3. MVP Scope
**Goal:** A working, presentable vocabulary trainer that can be shown to real people (singleplayer and multiplayer), with a production deployment.
### What is IN the MVP
- Vocabulary data in a PostgreSQL database (already seeded)
- REST API that returns quiz terms with distractors
- Singleplayer quiz UI: configurable rounds (3 or 10), answer feedback, score screen
- Clean, mobile-friendly UI (Tailwind + shadcn/ui)
- Global error handler with typed error classes
- Unit + integration tests for the API
- Authentication via Better Auth (Google + GitHub)
- Multiplayer lobby + game over WebSockets
- Production deployment (Docker Compose + Caddy + Hetzner) and CI/CD (Forgejo Actions)
### What is CUT from the MVP
| Feature | Why cut |
| --------------------- | ---------- |
| User stats / profiles | Needs auth |
These are not deleted from the plan — they are deferred. The architecture is already designed to support them. See Section 11 (Post-MVP Ladder).
---
## 4. Technology Stack
The monorepo structure and tooling are already set up. This is the full stack.
| Layer | Technology | Status |
| ------------ | ------------------------------ | ------------------------------------------------------ |
| Monorepo | pnpm workspaces | ✅ |
| Frontend | React 18, Vite, TypeScript | ✅ |
| Routing | TanStack Router | ✅ |
| Server state | TanStack Query | ✅ |
| Client state | Zustand | ✅ |
| Styling | Tailwind CSS + shadcn/ui | ✅ |
| Backend | Node.js, Express, TypeScript | ✅ |
| Database | PostgreSQL + Drizzle ORM | ✅ |
| Validation | Zod (shared schemas) | ✅ |
| Testing | Vitest, supertest | ✅ |
| Auth | Better Auth (Google + GitHub) | ✅ |
| Deployment | Docker Compose, Caddy, Hetzner | ✅ |
| CI/CD | Forgejo Actions | ✅ |
| Realtime | WebSockets (`ws` library) | ✅ |
| Cache | Valkey | ⚠️ optional (used locally; production/state hardening) |
---
## 5. Repository Structure
```text
lila/
├── .forgejo/
│ └── workflows/
│ └── deploy.yml — CI/CD pipeline (build, push, deploy)
├── apps/
│ ├── api/
│ │ └── src/
│ │ ├── app.ts — createApp() factory, CORS, auth handler, error middleware
│ │ ├── server.ts — starts server on PORT
│ │ ├── errors/
│ │ │ └── AppError.ts — AppError, ValidationError, NotFoundError
│ │ ├── lib/
│ │ │ └── auth.ts — Better Auth config (Google + GitHub providers)
│ │ ├── middleware/
│ │ │ ├── authMiddleware.ts — session validation for protected routes
│ │ │ └── errorHandler.ts — central error middleware
│ │ ├── routes/
│ │ │ ├── apiRouter.ts — mounts /health and /game routers
│ │ │ ├── gameRouter.ts — POST /start, POST /answer
│ │ │ └── healthRouter.ts
│ │ ├── controllers/
│ │ │ └── gameController.ts — validates input, calls service, sends response
│ │ ├── services/
│ │ │ ├── gameService.ts — builds quiz sessions, evaluates answers
│ │ │ └── gameService.test.ts — unit tests (mocked DB)
│ │ └── gameSessionStore/
│ │ ├── GameSessionStore.ts — interface (async, Valkey-ready)
│ │ ├── InMemoryGameSessionStore.ts
│ │ └── index.ts
│ └── web/
│ ├── Dockerfile — multi-stage: dev + production (nginx:alpine)
│ ├── nginx.conf — SPA fallback routing
│ └── src/
│ ├── routes/
│ │ ├── index.tsx — landing page
│ │ ├── play.tsx — the quiz
│ │ ├── login.tsx — Google + GitHub login buttons
│ │ ├── about.tsx
│ │ └── __root.tsx
│ ├── lib/
│ │ └── auth-client.ts — Better Auth React client
│ ├── components/
│ │ └── game/
│ │ ├── GameSetup.tsx — settings UI
│ │ ├── QuestionCard.tsx — prompt + 4 options
│ │ ├── OptionButton.tsx — idle / correct / wrong states
│ │ └── ScoreScreen.tsx — final score + play again
│ └── main.tsx
├── packages/
│ ├── shared/
│ │ └── src/
│ │ ├── constants.ts — SUPPORTED_POS, DIFFICULTY_LEVELS, etc.
│ │ ├── schemas/game.ts — Zod schemas for all game types
│ │ └── index.ts
│ └── db/
│ ├── drizzle/ — migration SQL files
│ └── src/
│ ├── db/schema.ts — Drizzle schema (terms, translations, auth tables)
│ ├── models/termModel.ts — getGameTerms(), getDistractors()
│ ├── seeding-datafiles.ts — seeds terms + translations from JSON
│ ├── seeding-cefr-levels.ts — enriches translations with CEFR data
│ ├── generating-deck.ts — builds curated decks
│ └── index.ts
├── scripts/ — Python extraction/comparison/merge scripts
├── documentation/ — project docs
├── docker-compose.yml — local dev stack
├── Caddyfile — reverse proxy routing
└── pnpm-workspace.yaml
```
`packages/shared` is the contract between frontend and backend. All request/response shapes are defined there as Zod schemas — never duplicated.
---
## 6. Architecture
### The Layered Architecture
```text
HTTP Request
Router — maps URL + HTTP method to a controller
Controller — handles HTTP only: validates input, calls service, sends response
Service — business logic only: no HTTP, no direct DB access
Model — database queries only: no business logic
Database
```
**The rule:** each layer only talks to the layer directly below it. A controller never touches the database. A service never reads `req.body`. A model never knows what a quiz is.
### Monorepo Package Responsibilities
| Package | Owns |
| ----------------- | -------------------------------------------------------- |
| `packages/shared` | Zod schemas, constants, derived TypeScript types |
| `packages/db` | Drizzle schema, DB connection, all model/query functions |
| `apps/api` | Router, controllers, services, error handling |
| `apps/web` | React frontend, consumes types from shared |
**Key principle:** all database code lives in `packages/db`. `apps/api` never imports `drizzle-orm` for queries — it only calls functions exported from `packages/db`.
### Production Infrastructure
```text
Internet → Caddy (HTTPS termination)
├── lilastudy.com → web container (nginx, static files)
├── api.lilastudy.com → api container (Express, port 3000)
└── git.lilastudy.com → forgejo container (git + registry, port 3000)
SSH (port 2222) → forgejo container (git push/pull)
```
All containers communicate over an internal Docker network. Only Caddy (80/443) and Forgejo SSH (2222) are exposed to the internet.
---
## 7. Data Model (Current State)
Words are modelled as language-neutral concepts (terms) separate from learning curricula (decks). Adding a new language pair requires no schema changes — only new rows in `translations`, `decks`.
**Core tables:** `terms`, `translations`, `term_glosses`, `decks`, `deck_terms`, `topics`, `term_topics`
**Auth tables (managed by Better Auth):** `user`, `session`, `account`, `verification`
Key columns on `terms`: `id` (uuid), `pos` (CHECK-constrained), `source`, `source_id` (unique pair for idempotent imports)
Key columns on `translations`: `id`, `term_id` (FK), `language_code` (CHECK-constrained), `text`, `cefr_level` (nullable varchar(2), CHECK A1C2)
Deck model uses `source_language` + `validated_languages` array — one deck serves multiple target languages. Decks are frequency tiers (e.g. `en-core-1000`), not POS splits.
Full schema is in `packages/db/src/db/schema.ts`.
---
## 8. API
### Endpoints
```text
POST /api/v1/game/start GameRequest → GameSession (requires auth)
POST /api/v1/game/answer AnswerSubmission → AnswerResult (requires auth)
GET /api/v1/health Health check (public)
ALL /api/auth/* Better Auth handlers (public)
```
### Schemas (packages/shared)
**GameRequest:** `{ source_language, target_language, pos, difficulty, rounds }`
**GameSession:** `{ sessionId: uuid, questions: GameQuestion[] }`
**GameQuestion:** `{ questionId: uuid, prompt: string, gloss: string | null, options: AnswerOption[4] }`
**AnswerOption:** `{ optionId: number (0-3), text: string }`
**AnswerSubmission:** `{ sessionId: uuid, questionId: uuid, selectedOptionId: number (0-3) }`
**AnswerResult:** `{ questionId: uuid, isCorrect: boolean, correctOptionId: number (0-3), selectedOptionId: number (0-3) }`
### Error Handling
Typed error classes (`AppError` base, `ValidationError` 400, `NotFoundError` 404) with central error middleware. Controllers validate with `safeParse`, throw on failure, and call `next(error)` in the catch. The middleware maps `AppError` instances to HTTP status codes; unknown errors return 500.
### Key Design Rules
- Server-side answer evaluation: the correct answer is never sent to the frontend
- `POST` not `GET` for game start (configuration in request body)
- `safeParse` over `parse` (clean 400s, not raw Zod 500s)
- Session state stored in `GameSessionStore` (in-memory now, Valkey later)
---
## 9. Game Mechanics
- **Format**: source-language word prompt + 4 target-language choices
- **Distractors**: same POS, same difficulty, server-side, never the correct answer, never repeated within a session
- **Session length**: 3 or 10 questions (configurable)
- **Scoring**: +1 per correct answer (no speed bonus for MVP)
- **Timer**: none in singleplayer MVP
- **Auth required**: users must log in via Google or GitHub
- **Submit-before-send**: user selects, then confirms (prevents misclicks)
---
## 10. Working Methodology
This project is a learning exercise. The goal is to understand the code, not just to ship it.
### How to use an LLM for help
1. Paste this document as context
2. Describe what you're working on and what you're stuck on
3. Ask for hints, not solutions
### Refactoring workflow
After completing a task: share the code, ask what to refactor and why. The LLM should explain the concept, not write the implementation.
---
## 11. Post-MVP Ladder
<<<<<<< HEAD
| Phase | What it adds | Status |
| ----------------- | ------------------------------------------------------------------------------- | ------ |
| Auth | Better Auth (Google + GitHub), embedded in Express API, user rows in DB | ✅ |
| Deployment | Docker Compose, Caddy, Forgejo, CI/CD, Hetzner VPS | ✅ |
| Hardening (partial) | CI/CD pipeline, DB backups | ✅ |
| User Stats | Games played, score history, profile page | ❌ |
| Multiplayer Lobby | Room creation, join by code, WebSocket connection | ❌ |
| Multiplayer Game | Simultaneous answers, server timer, live scores, winner screen | ❌ |
| Hardening (rest) | Rate limiting, error boundaries, monitoring, accessibility | ❌ |
=======
| Phase | What it adds | Status |
| ------------------- | ----------------------------------------------------------------------- | ------ |
| Auth | Better Auth (Google + GitHub), embedded in Express API, user rows in DB | ✅ |
| Deployment | Docker Compose, Caddy, Forgejo, CI/CD, Hetzner VPS | ✅ |
| Hardening (partial) | CI/CD pipeline, DB backups | ✅ |
| User Stats | Games played, score history, profile page | ❌ |
| Multiplayer Lobby | Room creation, join by code, WebSocket connection | ✅ |
| Multiplayer Game | Simultaneous answers, server timer, live scores, winner screen | ✅ |
| Hardening (rest) | Rate limiting, error boundaries, monitoring, accessibility | ❌ |
> > > > > > > dev
### Future Data Model Extensions (deferred, additive)
- `noun_forms` — gender, singular, plural, articles per language
- `verb_forms` — conjugation tables per language
- `term_pronunciations` — IPA and audio URLs per language
- `user_decks` — which decks a user is studying
- `user_term_progress` — spaced repetition state per user/term/language
- `quiz_answers` — history log for stats
All are new tables referencing existing `terms` rows via FK. No existing schema changes required.
### Multiplayer Architecture (current + deferred)
**Implemented now:**
- WebSocket protocol uses the `ws` library with a Zod discriminated union for message types (defined in `packages/shared`)
- Room model uses human-readable codes (no matchmaking queue)
- Lobby flow (create/join/leave) is real-time over WS, backed by PostgreSQL for durable membership/state
- Multiplayer game flow is real-time: host starts, all players see the same question, answers are collected simultaneously, with a server-enforced 15s timer and live scoring
- WebSocket connections are authenticated (Better Auth session validation on upgrade)
**Deferred / hardening:**
- Valkey-backed ephemeral state (room/game/session store) where in-memory state becomes a bottleneck
- Graceful reconnect/resume flows and more robust failure handling (tracked in Phase 7)
### Infrastructure (current)
- `lilastudy.com` → React frontend (nginx serving static files)
- `api.lilastudy.com` → Express API + Better Auth
- `git.lilastudy.com` → Forgejo (git server + container registry)
- Docker Compose with Caddy for automatic HTTPS via Let's Encrypt
- CI/CD via Forgejo Actions (build on push to main, deploy via SSH)
- Daily DB backups with cron, synced to dev laptop
See `deployment.md` for full infrastructure documentation.
---
## 12. Definition of Done (Current Baseline)
- [x] API returns quiz terms with correct distractors
- [x] User can complete a quiz without errors
- [x] Score screen shows final result and a play-again option
- [x] App is usable on a mobile screen
- [x] No hardcoded data — everything comes from the database
- [x] Global error handler with typed error classes
- [x] Unit + integration tests for API
- [x] Auth works end-to-end (Google + GitHub via Better Auth)
- [x] Multiplayer works end-to-end (lobby + real-time game over WebSockets)
- [x] Production deployment is live behind HTTPS (Caddy) with CI/CD deploys via Forgejo Actions
---
## 13. Roadmap
See `roadmap.md` for the full roadmap with task-level checkboxes.
### Dependency Graph
```text
Phase 0 (Foundation) ✅
└── Phase 1 (Vocabulary Data + API) ✅
└── Phase 2 (Singleplayer UI) ✅
├── Phase 3 (Auth) ✅
│ └── Phase 6 (Deployment + CI/CD) ✅
└── Phase 4 (Multiplayer Lobby) ✅
└── Phase 5 (Multiplayer Game) ✅
└── Phase 7 (Hardening)
```
---
## 14. Game Flow (Future)
Singleplayer: choose direction (en→it or it→en) → top-level category → part of speech → difficulty (A1C2) → round count → game starts.
**Top-level categories (post-MVP):**
- **Grammar** — practice nouns, verb conjugations, etc.
- **Media** — practice vocabulary from specific books, films, songs, etc.
- **Thematic** — animals, kitchen, etc. (requires category metadata research)