updating documentation

lila 2026-05-16 01:59:43 +02:00
parent 1ba57c7e9d
commit 7e0311683f
25 changed files with 2660 additions and 226 deletions

@ -0,0 +1,229 @@
# Architecture
> How Lila is structured, how data flows, and why the boundaries are where they are.
---
## Monorepo Layout
```
lila/
├── apps/
│ ├── api/ — Express backend (HTTP + WebSocket)
│ └── web/ — React frontend (Vite, TanStack Router)
├── packages/
│ ├── shared/ — Zod schemas + constants (API/web contract)
│ └── db/ — Drizzle schema, migrations, models, seeding
├── data-pipeline/ — Kaikki extraction → enrichment → PostgreSQL sync
├── documentation/ — Project docs
├── Caddyfile — Reverse proxy routing
├── docker-compose.yml — Local dev stack
└── pnpm-workspace.yaml — Workspace definition
```
**Package boundaries:**
| Package | Owns | Consumed by |
| ----------------- | ----------------------------------------------------------------- | ------------------------------------- |
| `packages/shared` | Zod schemas, constants, derived TypeScript types | `apps/api`, `apps/web`, `packages/db` |
| `packages/db` | Drizzle schema, DB connection, all model/query functions | `apps/api` |
| `apps/api` | Router, controllers, services, error handling, WebSocket handlers | — |
| `apps/web` | React components, routes, client-side state | — |
**Rule:** `apps/api` never imports `drizzle-orm` for queries. It only calls functions exported from `packages/db`.
---
## Layered Architecture (HTTP)
```
HTTP Request
Router — maps URL + HTTP method to a controller
Controller — handles HTTP only: validates input (Zod safeParse),
calls service, sends response or next(error)
Service — business logic only: no HTTP, no direct DB access
Model — database queries only: no business logic
Database — PostgreSQL via Drizzle ORM
```
**The rule:** each layer only talks to the layer directly below it.
- **Controller** never touches the database.
- **Service** never reads `req.body`.
- **Model** never knows what a quiz is.
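The layering above can be sketched without Express or a database. Everything in this example is an assumption made for illustration: the function names (`findTranslationsByLanguage`, `buildWordList`, `handleWordListRequest`), the `FakeRequest`/`FakeResponse` types, and the in-memory array standing in for PostgreSQL. The functions are synchronous for brevity, whereas real model calls would be asynchronous.

```typescript
// Hypothetical names throughout; plain objects stand in for Express and the database.

// Model: data access only. A fixed array stands in for PostgreSQL here.
type TranslationRow = { termId: number; languageCode: string; text: string };

const rows: TranslationRow[] = [
  { termId: 1, languageCode: "en", text: "house" },
  { termId: 1, languageCode: "it", text: "casa" },
  { termId: 2, languageCode: "en", text: "dog" },
];

function findTranslationsByLanguage(languageCode: string): TranslationRow[] {
  return rows.filter((row) => row.languageCode === languageCode);
}

// Service: business logic only. It receives plain values, never a request object.
function buildWordList(languageCode: string): string[] {
  return findTranslationsByLanguage(languageCode)
    .map((row) => row.text)
    .sort();
}

// Controller: HTTP concerns only. It validates input, calls the service,
// and shapes the response. It never touches `rows` directly.
type FakeRequest = { body: unknown };
type FakeResponse = { status: number; json: unknown };

function handleWordListRequest(req: FakeRequest): FakeResponse {
  const body = req.body as { languageCode?: unknown } | null;
  if (!body || typeof body.languageCode !== "string") {
    return { status: 400, json: { error: "languageCode must be a string" } };
  }
  return { status: 200, json: { words: buildWordList(body.languageCode) } };
}
```

Because each function only calls the one directly beneath it, the service can be tested with plain values and the model can be swapped without touching the controller.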
### Error Flow
```
Controller throws ValidationError (400) or calls next(error)
Central errorHandler middleware in app.ts
Maps AppError subclasses to HTTP status codes
Unknown errors → 500
```
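A rough sketch of the mapping step, kept framework-free so it runs on its own. Only `AppError` and `ValidationError` are named in the flow above; `NotFoundError`, the `statusCode` field, and the `toHttpError` helper are assumptions for illustration. In the real app this logic would presumably sit inside the Express error-handling middleware.

```typescript
// Base class for errors that carry their own HTTP status (field name is an assumption).
class AppError extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    // Keep `instanceof` working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
    this.statusCode = statusCode;
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

// Hypothetical subclass, added only to show a second mapping.
class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

// Known errors keep their status and message; anything else becomes a generic 500.
function toHttpError(error: unknown): { status: number; body: { error: string } } {
  if (error instanceof AppError) {
    return { status: error.statusCode, body: { error: error.message } };
  }
  return { status: 500, body: { error: "Internal server error" } };
}
```

Returning a generic message for unknown errors avoids leaking internal details to the client.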
---
## WebSocket Architecture
The WebSocket server is attached to the same Express HTTP server. It upgrades connections on the `/ws` path.
```
WS Connection Upgrade
Auth middleware — validates Better Auth session from cookie
Message Router — dispatches by `type` field (Zod discriminated union)
Handler (lobby or game) — business logic, broadcasts state
In-memory stores (lobby game state, game session state)
```
**Message protocol:** All WebSocket messages are validated against Zod schemas defined in `packages/shared/src/schemas/lobby.ts` and `packages/shared/src/schemas/game.ts`. The `type` field is the discriminant of a discriminated union: the router switches on it and validates the payload against the corresponding schema.
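The dispatch-by-`type` idea can be sketched in plain TypeScript. To keep the example dependency-free, hand-written checks stand in for the Zod schemas; the payload fields (`code`, `optionId`) and the function names `parseMessage` and `route` are assumptions, not the project's actual definitions.

```typescript
// Hand-written checks stand in for the Zod schemas; payload fields are assumptions.
type LobbyJoin = { type: "lobby:join"; code: string };
type LobbyStart = { type: "lobby:start" };
type GameAnswer = { type: "game:answer"; optionId: number };
type ClientMessage = LobbyJoin | LobbyStart | GameAnswer;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Parse raw text into a typed message, or return null if it matches no known shape.
function parseMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;
  switch (data.type) {
    case "lobby:join":
      return typeof data.code === "string" ? { type: "lobby:join", code: data.code } : null;
    case "lobby:start":
      return { type: "lobby:start" };
    case "game:answer":
      return typeof data.optionId === "number"
        ? { type: "game:answer", optionId: data.optionId }
        : null;
    default:
      return null;
  }
}

// Dispatch on the discriminant; TypeScript narrows the payload in each branch.
function route(message: ClientMessage): string {
  switch (message.type) {
    case "lobby:join":
      return `lobby handler: join ${message.code}`;
    case "lobby:start":
      return "lobby handler: start";
    case "game:answer":
      return `game handler: answer ${message.optionId}`;
  }
}
```

Rejecting anything that fails to parse before it reaches a handler means handlers only ever see well-typed messages.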
**State storage:**
- **Lobby membership** — stored in PostgreSQL (`lobbies`, `lobby_players` tables) for durability
- **Game/room state** — stored in-memory (`InMemoryLobbyGameStore`, `InMemoryGameSessionStore`). Valkey migration is planned.
---
## Database Schema (Core)
**Concept:** Words are language-neutral concepts (`terms`) with per-language `translations`. Adding a new language requires no schema changes — only new rows.
### Core Tables
| Table | Purpose |
| -------------- | -------------------------------------------------------------------------------- |
| `terms` | Language-neutral concept: `id`, `pos` (noun/verb/adjective/adverb), `source`, `source_id` |
| `translations` | Per-language word: `term_id` (FK), `language_code`, `text`, `cefr_level` (A1–C2) |
| `term_glosses` | Per-language definition: `term_id` (FK), `language_code`, `text` |
| `decks` | Curated wordlists: `source_language`, `validated_languages`, frequency tier |
| `deck_terms` | Junction: which terms belong to which deck |
### Auth Tables (managed by Better Auth)
| Table | Purpose |
| -------------- | --------------------------------------------------------------------------------- |
| `user` | Account: `id`, `name`, `email`, `image` |
| `session` | Active sessions: `id`, `user_id`, `token`, `expires_at` |
| `account` | Social provider links: `user_id`, `provider` (google/github), `providerAccountId` |
| `verification` | Email verification tokens (unused for social-only auth) |
**Key constraints:**
- `language_code` is CHECK-constrained against `SUPPORTED_LANGUAGE_CODES` (`en`, `it`, `de`, `es`, `fr`)
- `pos` is CHECK-constrained against `SUPPORTED_POS` (`noun`, `verb`, `adjective`, `adverb`)
- `cefr_level` is a nullable `varchar(2)` with a CHECK constraint limiting it to `A1`–`C2`
- `translations` has UNIQUE `(term_id, language_code, text)` — allows synonyms, prevents exact duplicates
---
## Data Flow: Quiz Session
### Singleplayer
```
User clicks "Start Quiz"
POST /api/v1/game/start (GameRequestSchema: source_lang, target_lang, pos, difficulty, rounds)
gameController.validate → gameService.createGameSession
termModel.getGameTerms(filters) + termModel.getDistractors(filters)
Service shuffles options, stores session in GameSessionStore
Returns GameSession { sessionId, questions[] } — correct answer NEVER sent to frontend
User answers → POST /api/v1/game/answer (AnswerSubmissionSchema)
Service evaluates server-side, returns AnswerResult { isCorrect, correctOptionId, selectedOptionId }
```
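The key property of this flow is that the correct answer stays on the server until the user has answered. A minimal sketch of that split, assuming hypothetical type and function names (`StoredQuestion`, `PublicQuestion`, `toPublicQuestion`, `evaluateAnswer`) and assumed field names apart from `isCorrect`, `correctOptionId`, and `selectedOptionId`:

```typescript
// Field names other than isCorrect, correctOptionId, selectedOptionId are assumptions.
type AnswerOption = { optionId: number; text: string };

// What the server keeps in its session store.
type StoredQuestion = {
  questionId: string;
  prompt: string;
  options: AnswerOption[];
  correctOptionId: number;
};

// What the client receives: the same question without the answer.
type PublicQuestion = Omit<StoredQuestion, "correctOptionId">;

type AnswerResult = { isCorrect: boolean; correctOptionId: number; selectedOptionId: number };

// Copy only the fields that are safe to send to the frontend.
function toPublicQuestion(question: StoredQuestion): PublicQuestion {
  return {
    questionId: question.questionId,
    prompt: question.prompt,
    options: question.options,
  };
}

// Evaluation happens against the stored question, never against client data.
function evaluateAnswer(question: StoredQuestion, selectedOptionId: number): AnswerResult {
  return {
    isCorrect: selectedOptionId === question.correctOptionId,
    correctOptionId: question.correctOptionId,
    selectedOptionId,
  };
}
```

The correct option is revealed only in the `AnswerResult`, after the submission has been evaluated.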
### Multiplayer
```
Host creates lobby → POST /api/v1/lobbies → returns room code
Players join via code → POST /api/v1/lobbies/:code/join
All players connect WebSocket → send lobby:join with room code
Server broadcasts lobby:state (player list) to all connections in room
Host clicks "Start" → WS lobby:start
Server generates questions via MultiplayerGameService, broadcasts game:question
Players submit answers via WS game:answer within 15s server timer
On all-answered or timeout → evaluate, broadcast game:answer_result
After N rounds → broadcast game:finished with final scores
```
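The "all-answered or timeout" step could be modelled as a small state tracker. This is a sketch under stated assumptions: `RoundTracker`, its method names, and `ROUND_TIME_LIMIT_MS` are hypothetical, and timestamps are passed in explicitly instead of using a real timer so the logic is easy to test. Only the 15-second limit and the two closing conditions come from the flow above.

```typescript
// Hypothetical helper; only the 15 s limit and the two closing conditions come from the flow above.
const ROUND_TIME_LIMIT_MS = 15_000;

class RoundTracker {
  private readonly answers = new Map<string, number>();
  private readonly playerIds: string[];
  private readonly startedAtMs: number;

  constructor(playerIds: string[], startedAtMs: number) {
    this.playerIds = playerIds;
    this.startedAtMs = startedAtMs;
  }

  // Records an answer; rejects late answers, unknown players, and repeat answers.
  submit(playerId: string, optionId: number, nowMs: number): boolean {
    const late = nowMs - this.startedAtMs >= ROUND_TIME_LIMIT_MS;
    if (late || !this.playerIds.includes(playerId) || this.answers.has(playerId)) {
      return false;
    }
    this.answers.set(playerId, optionId);
    return true;
  }

  // The round closes when everyone has answered or the time limit has passed.
  isClosed(nowMs: number): boolean {
    const allAnswered = this.answers.size === this.playerIds.length;
    const timedOut = nowMs - this.startedAtMs >= ROUND_TIME_LIMIT_MS;
    return allAnswered || timedOut;
  }
}
```

Keeping the clock on the server means a client cannot extend its own answer window by delaying messages.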
---
## The `packages/shared` Contract
`packages/shared` is the **single source of truth** for all data shapes crossing the API boundary.
**What lives here:**
- `constants.ts`: `SUPPORTED_LANGUAGE_CODES`, `SUPPORTED_POS`, `DIFFICULTY_LEVELS`, `CEFR_LEVELS`, `GAME_ROUNDS`
- `schemas/game.ts`: `GameRequestSchema`, `GameSessionSchema`, `GameQuestionSchema`, `AnswerOptionSchema`, `AnswerSubmissionSchema`, `AnswerResultSchema`
- `schemas/lobby.ts`: `LobbyCreateSchema`, `LobbyJoinSchema`, `LobbyStateSchema`, `WebSocketMessageSchema` (discriminated union)
- `schemas/auth.ts`: Auth-related shared types
**Why this matters:** Both `apps/api` and `apps/web` take their types from these schemas, so a shape change surfaces as a TypeScript compile error in every app that uses it, instead of drifting silently until runtime.
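One common way to get a type and a runtime check from the same constant is shown below. The language codes match the list given under the key constraints; the `LanguageCode` type and `isLanguageCode` helper are assumed names, and the actual `constants.ts` may be organised differently.

```typescript
// Values mirror the constraint list in this document; the type and helper names are assumptions.
const SUPPORTED_LANGUAGE_CODES = ["en", "it", "de", "es", "fr"] as const;

// Derived union type: "en" | "it" | "de" | "es" | "fr".
type LanguageCode = (typeof SUPPORTED_LANGUAGE_CODES)[number];

// Runtime guard built from the same array, so the type and the check cannot disagree.
function isLanguageCode(value: string): value is LanguageCode {
  return (SUPPORTED_LANGUAGE_CODES as readonly string[]).includes(value);
}
```

Adding a language then means editing one array; the union type and the guard follow automatically.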
---
## GameSessionStore Abstraction
The service layer stores session state through an interface, not a concrete implementation:
```typescript
interface GameSessionStore {
  createSession(session: GameSession): Promise<void>;
  getSession(sessionId: string): Promise<GameSession | null>;
  // ...
}
```
**Current:** `InMemoryGameSessionStore` — Map-based, lives in `apps/api` process memory. Lost on restart.
**Planned:** `ValkeyGameSessionStore` — Redis-compatible, persists across restarts, enables horizontal scaling.
The same pattern applies to `LobbyGameStore` (lobby state).
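A Map-based implementation of the two methods shown in the interface might look like the sketch below. It is not the project's actual class: `GameSession` is reduced to two assumed fields so the example is self-contained, and the interface's elided methods are left out.

```typescript
// Sketch only: GameSession is reduced to two assumed fields,
// and the interface's elided methods are left out.
type GameSession = { sessionId: string; questions: string[] };

interface GameSessionStore {
  createSession(session: GameSession): Promise<void>;
  getSession(sessionId: string): Promise<GameSession | null>;
}

class InMemoryGameSessionStore implements GameSessionStore {
  // Lives in process memory, so every session is lost on restart.
  private readonly sessions = new Map<string, GameSession>();

  async createSession(session: GameSession): Promise<void> {
    this.sessions.set(session.sessionId, session);
  }

  async getSession(sessionId: string): Promise<GameSession | null> {
    return this.sessions.get(sessionId) ?? null;
  }
}
```

The methods return promises even though a `Map` is synchronous, so a network-backed store can later implement the same interface without changing any caller.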
---
## Key Design Decisions (Quick Reference)
| Decision | Where it's explained |
| --------------------------------- | ----------------------------- |
| Why Drizzle over Prisma | `DECISIONS.md` → ORM |
| Why `ws` over Socket.io | `DECISIONS.md` → WebSocket |
| Why server-side answer evaluation | `DECISIONS.md` → Architecture |
| Why Better Auth over Keycloak | `DECISIONS.md` → Auth |
| Why terms/translations schema | `DECISIONS.md` → Data Model |
| Why Caddy over Nginx/Traefik | `DECISIONS.md` → Deployment |
---
## Further Reading
- [DATA_PIPELINE.md](DATA_PIPELINE.md) — How vocabulary data gets from Kaikki into PostgreSQL
- [DEPLOYMENT.md](DEPLOYMENT.md) — Production infrastructure and ops
- [MODEL_STRATEGY.md](MODEL_STRATEGY.md) — LLM voter architecture for CEFR assignment
- [design/GAME_MODES.md](design/GAME_MODES.md) — Planned multiplayer modes