# Glossa — Project Specification

> **This document is the single source of truth for the project.**
> It is written to be handed to any LLM as context. It contains the project vision, the current MVP scope, the tech stack, the architecture, and the roadmap.

---

## 1. Project Overview

Glossa is a vocabulary trainer for English–Italian word pairs. The quiz format is Duolingo-style: one word is shown as a prompt, and the user picks the correct translation from four choices (1 correct + 3 distractors of the same part of speech). The long-term vision is a multiplayer competitive game, but the MVP is a polished singleplayer experience.

**The core learning loop:**
Show word → pick answer → see result → next word → final score

The vocabulary data comes from WordNet + the Open Multilingual Wordnet (OMW). A one-time Python script extracts English–Italian noun pairs and seeds the database. The data model is language-pair agnostic by design — adding a new language later requires no schema changes.

### Core Principles

- **Minimal but extendable**: working product fast, clean architecture for future growth
- **Mobile-first**: touch-friendly Duolingo-like UX
- **Type safety end-to-end**: TypeScript + Zod schemas shared between frontend and backend

---

## 2. Full Product Vision (Long-Term)

- Users log in via Google or GitHub (OpenAuth)
- Singleplayer mode: 10-round quiz, score screen
- Multiplayer mode: create a room, share a code, 2–4 players answer simultaneously in real time, live scores, winner screen
- 1000+ English–Italian nouns seeded from WordNet

This is the full vision. The MVP deliberately ignores most of it.

---

## 3. MVP Scope

**Goal:** A working, presentable singleplayer quiz that can be shown to real people.

### What is IN the MVP

- Vocabulary data in a PostgreSQL database (already seeded)
- REST API that returns quiz terms with distractors
- Singleplayer quiz UI: configurable rounds (3 or 10), answer feedback, score screen
- Clean, mobile-friendly UI (Tailwind + shadcn/ui)
- Global error handler with typed error classes
- Unit + integration tests for the API
- Local dev only (no deployment for MVP)

### What is CUT from the MVP

| Feature                         | Why cut                                |
| ------------------------------- | -------------------------------------- |
| Authentication (OpenAuth)       | No user accounts needed for a demo     |
| Multiplayer (WebSockets, rooms) | Core quiz works without it             |
| Valkey / Redis cache            | Only needed for multiplayer room state |
| Deployment to Hetzner           | Ship to people locally first           |
| User stats / profiles           | Needs auth                             |

These are not deleted from the plan — they are deferred. The architecture is already designed to support them. See Section 11 (Post-MVP Ladder).

---

## 4. Technology Stack

The monorepo structure and tooling are already set up. This is the full stack — the MVP uses a subset of it.

| Layer        | Technology                     | MVP?        |
| ------------ | ------------------------------ | ----------- |
| Monorepo     | pnpm workspaces                | ✅          |
| Frontend     | React 18, Vite, TypeScript     | ✅          |
| Routing      | TanStack Router                | ✅          |
| Server state | TanStack Query                 | ✅          |
| Client state | Zustand                        | ✅          |
| Styling      | Tailwind CSS + shadcn/ui       | ✅          |
| Backend      | Node.js, Express, TypeScript   | ✅          |
| Database     | PostgreSQL + Drizzle ORM       | ✅          |
| Validation   | Zod (shared schemas)           | ✅          |
| Testing      | Vitest, supertest              | ✅          |
| Auth         | OpenAuth (Google + GitHub)     | ❌ post-MVP |
| Realtime     | WebSockets (`ws` library)      | ❌ post-MVP |
| Cache        | Valkey                         | ❌ post-MVP |
| Deployment   | Docker Compose, Hetzner, Nginx | ❌ post-MVP |

---

## 5. Repository Structure

```text
vocab-trainer/
├── apps/
│   ├── api/
│   │   └── src/
│   │       ├── app.ts — createApp() factory, express.json(), error middleware
│   │       ├── server.ts — starts server on PORT
│   │       ├── errors/
│   │       │   └── AppError.ts — AppError, ValidationError, NotFoundError
│   │       ├── middleware/
│   │       │   └── errorHandler.ts — central error middleware
│   │       ├── routes/
│   │       │   ├── apiRouter.ts — mounts /health and /game routers
│   │       │   ├── gameRouter.ts — POST /start, POST /answer
│   │       │   └── healthRouter.ts
│   │       ├── controllers/
│   │       │   └── gameController.ts — validates input, calls service, sends response
│   │       ├── services/
│   │       │   ├── gameService.ts — builds quiz sessions, evaluates answers
│   │       │   └── gameService.test.ts — unit tests (mocked DB)
│   │       └── gameSessionStore/
│   │           ├── GameSessionStore.ts — interface (async, Valkey-ready)
│   │           ├── InMemoryGameSessionStore.ts
│   │           └── index.ts
│   └── web/
│       └── src/
│           ├── routes/
│           │   ├── index.tsx — landing page
│           │   └── play.tsx — the quiz
│           ├── components/
│           │   └── game/
│           │       ├── GameSetup.tsx — settings UI
│           │       ├── QuestionCard.tsx — prompt + 4 options
│           │       ├── OptionButton.tsx — idle / correct / wrong states
│           │       └── ScoreScreen.tsx — final score + play again
│           └── main.tsx
├── packages/
│   ├── shared/
│   │   └── src/
│   │       ├── constants.ts — SUPPORTED_POS, DIFFICULTY_LEVELS, etc.
│   │       ├── schemas/game.ts — Zod schemas for all game types
│   │       └── index.ts
│   └── db/
│       ├── drizzle/ — migration SQL files
│       └── src/
│           ├── db/schema.ts — Drizzle schema
│           ├── models/termModel.ts — getGameTerms(), getDistractors()
│           ├── seeding-datafiles.ts — seeds terms + translations from JSON
│           ├── seeding-cefr-levels.ts — enriches translations with CEFR data
│           ├── generating-deck.ts — builds curated decks
│           └── index.ts
├── scripts/ — Python extraction/comparison/merge scripts
├── documentation/ — project docs
├── docker-compose.yml
└── pnpm-workspace.yaml
```

`packages/shared` is the contract between frontend and backend. All request/response shapes are defined there as Zod schemas — never duplicated.

---

## 6. Architecture

### The Layered Architecture

```text
HTTP Request
    ↓
Router — maps URL + HTTP method to a controller
    ↓
Controller — handles HTTP only: validates input, calls service, sends response
    ↓
Service — business logic only: no HTTP, no direct DB access
    ↓
Model — database queries only: no business logic
    ↓
Database
```

**The rule:** each layer only talks to the layer directly below it. A controller never touches the database. A service never reads `req.body`. A model never knows what a quiz is.

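A minimal sketch of how the layer rule might look in code, with no Express or Drizzle involved. All names here (`TermRow`, `FetchTerms`, `buildPrompts`, `handleStart`, `fakeFetchTerms`) are hypothetical and not taken from the repository; the model is passed in as a function so that each layer only sees the one directly below it.

```typescript
// Hypothetical row shape returned by the model layer.
interface TermRow {
  id: string;
  sourceText: string;
  targetText: string;
}

// Model layer: database queries only. Here it is just a function type,
// so a real database-backed implementation or a test fake can be plugged in.
type FetchTerms = (limit: number) => TermRow[];

// Service layer: business logic only. It knows what a quiz is,
// but nothing about HTTP and nothing about SQL.
function buildPrompts(fetchTerms: FetchTerms, rounds: number): string[] {
  return fetchTerms(rounds).map((term) => term.sourceText);
}

// Controller layer: HTTP concerns only. It checks the raw input,
// calls the service, and shapes the response.
interface HttpResult {
  status: number;
  body: unknown;
}

function handleStart(rawBody: unknown, fetchTerms: FetchTerms): HttpResult {
  const rounds =
    typeof rawBody === "object" && rawBody !== null
      ? (rawBody as { rounds?: unknown }).rounds
      : undefined;
  if (typeof rounds === "number" && (rounds === 3 || rounds === 10)) {
    return { status: 200, body: { prompts: buildPrompts(fetchTerms, rounds) } };
  }
  return { status: 400, body: { error: "rounds must be 3 or 10" } };
}

// A fake model standing in for the database (sample data, not real rows).
const fakeFetchTerms: FetchTerms = (limit) =>
  [
    { id: "t1", sourceText: "dog", targetText: "cane" },
    { id: "t2", sourceText: "house", targetText: "casa" },
    { id: "t3", sourceText: "book", targetText: "libro" },
  ].slice(0, limit);
```

Because the service receives the model as an argument, it can be unit-tested with a fake like `fakeFetchTerms`, which is the same idea as the mocked-DB tests in `gameService.test.ts`.
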
### Monorepo Package Responsibilities

| Package           | Owns                                                     |
| ----------------- | -------------------------------------------------------- |
| `packages/shared` | Zod schemas, constants, derived TypeScript types         |
| `packages/db`     | Drizzle schema, DB connection, all model/query functions |
| `apps/api`        | Router, controllers, services, error handling            |
| `apps/web`        | React frontend, consumes types from shared               |

**Key principle:** all database code lives in `packages/db`. `apps/api` never imports `drizzle-orm` for queries — it only calls functions exported from `packages/db`.

---

## 7. Data Model (Current State)

Words are modelled as language-neutral concepts (terms) separate from learning curricula (decks). Adding a new language pair requires no schema changes — only new rows in `translations` and `decks`.

**Core tables:** `terms`, `translations`, `term_glosses`, `decks`, `deck_terms`, `categories`, `term_categories`

Key columns on `terms`: `id` (uuid), `pos` (CHECK-constrained), `source`, `source_id` (unique pair for idempotent imports)

Key columns on `translations`: `id`, `term_id` (FK), `language_code` (CHECK-constrained), `text`, `cefr_level` (nullable varchar(2), CHECK A1–C2)

Deck model uses `source_language` + `validated_languages` array — one deck serves multiple target languages. Decks are frequency tiers (e.g. `en-core-1000`), not POS splits.

Full schema is in `packages/db/src/db/schema.ts`.

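To illustrate why a new language needs only new rows, here is a minimal in-memory sketch of `terms` and `translations`. The field names follow the column lists above (camel-cased); the sample rows, the `translate` helper and the German entry are hypothetical, and the real schema in `packages/db/src/db/schema.ts` may differ. The sketch also assumes the new language code is already permitted by the `language_code` CHECK constraint.

```typescript
type CefrLevel = "A1" | "A2" | "B1" | "B2" | "C1" | "C2";

// A language-neutral concept.
interface Term {
  id: string;
  pos: string;
  source: string;
  sourceId: string;
}

// One row per (term, language).
interface Translation {
  id: string;
  termId: string;
  languageCode: string;
  text: string;
  cefrLevel: CefrLevel | null;
}

// Made-up sample data, not real seeded rows.
const dogTerm: Term = { id: "term-1", pos: "noun", source: "omw", sourceId: "sample-0001" };

const translations: Translation[] = [
  { id: "tr-1", termId: dogTerm.id, languageCode: "en", text: "dog", cefrLevel: "A1" },
  { id: "tr-2", termId: dogTerm.id, languageCode: "it", text: "cane", cefrLevel: "A1" },
];

// Look up the text of a term in a given language, or null if it is missing.
function translate(rows: readonly Translation[], termId: string, languageCode: string): string | null {
  const row = rows.find((r) => r.termId === termId && r.languageCode === languageCode);
  return row ? row.text : null;
}

// Adding German: one more row in the same table, with the same columns.
const withGerman: Translation[] = [
  ...translations,
  { id: "tr-3", termId: dogTerm.id, languageCode: "de", text: "Hund", cefrLevel: null },
];
```

The `Term` row itself is untouched when a language is added, which is the point of keeping concepts and translations in separate tables.
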

---

## 8. API

### Endpoints

```text
POST /api/v1/game/start    GameRequest → GameSession
POST /api/v1/game/answer   AnswerSubmission → AnswerResult
GET  /api/v1/health        Health check
```

### Schemas (packages/shared)

- **GameRequest:** `{ source_language, target_language, pos, difficulty, rounds }`
- **GameSession:** `{ sessionId: uuid, questions: GameQuestion[] }`
- **GameQuestion:** `{ questionId: uuid, prompt: string, gloss: string | null, options: AnswerOption[4] }`
- **AnswerOption:** `{ optionId: number (0-3), text: string }`
- **AnswerSubmission:** `{ sessionId: uuid, questionId: uuid, selectedOptionId: number (0-3) }`
- **AnswerResult:** `{ questionId: uuid, isCorrect: boolean, correctOptionId: number (0-3), selectedOptionId: number (0-3) }`

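The project defines these shapes as Zod schemas. The sketch below mirrors only `AnswerSubmission` in plain TypeScript so that it runs without dependencies. The `parseAnswerSubmission` function, the `ParseResult` type and the error strings are hypothetical stand-ins that imitate the success/failure style of Zod's `safeParse`; they are not the project's actual code.

```typescript
interface AnswerSubmission {
  sessionId: string;
  questionId: string;
  selectedOptionId: number;
}

// Same idea as a safeParse result: never throws, always reports success or failure.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isUuid(value: unknown): value is string {
  return typeof value === "string" && UUID_PATTERN.test(value);
}

function parseAnswerSubmission(input: unknown): ParseResult<AnswerSubmission> {
  if (typeof input !== "object" || input === null) {
    return { success: false, error: "body must be an object" };
  }
  const { sessionId, questionId, selectedOptionId } = input as Record<string, unknown>;
  if (!isUuid(sessionId)) {
    return { success: false, error: "sessionId must be a uuid" };
  }
  if (!isUuid(questionId)) {
    return { success: false, error: "questionId must be a uuid" };
  }
  if (
    typeof selectedOptionId !== "number" ||
    !Number.isInteger(selectedOptionId) ||
    selectedOptionId < 0 ||
    selectedOptionId > 3
  ) {
    return { success: false, error: "selectedOptionId must be an integer from 0 to 3" };
  }
  return { success: true, data: { sessionId, questionId, selectedOptionId } };
}
```

A controller would check `success` and, on failure, throw a `ValidationError` so the client gets a clean 400.
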
### Error Handling

Typed error classes (`AppError` base, `ValidationError` 400, `NotFoundError` 404) with central error middleware. Controllers validate with `safeParse`, throw on failure, and call `next(error)` in the catch. The middleware maps `AppError` instances to HTTP status codes; unknown errors return 500.

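The error flow can be sketched without Express. The class names and status codes come from the description above; the constructor signatures, the `toHttpResponse` helper and the response body shape are assumptions made for illustration.

```typescript
class AppError extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    this.statusCode = statusCode;
    // Keeps `instanceof` checks working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

interface ErrorResponse {
  status: number;
  body: { error: string };
}

// Roughly what a central error middleware would do with whatever
// a controller passed to next(error).
function toHttpResponse(error: unknown): ErrorResponse {
  if (error instanceof AppError) {
    return { status: error.statusCode, body: { error: error.message } };
  }
  // Unknown errors: return 500 and do not leak internal details.
  return { status: 500, body: { error: "Internal server error" } };
}
```

Keeping the mapping in one function means controllers only decide which error to throw, never which status code to send.
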
### Key Design Rules

- Server-side answer evaluation: the correct answer is never sent to the frontend with the question; it is revealed only in the `AnswerResult` after the user submits
- `POST` not `GET` for game start (configuration in request body)
- `safeParse` over `parse` (clean 400s, not raw Zod 500s)
- Session state stored in `GameSessionStore` (in-memory now, Valkey later)

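A minimal sketch of the first and last rules together: the correct option lives only in the session store and reaches the client only inside an `AnswerResult`. The store methods (`set`, `get`), the stored shape (`SessionAnswers`) and `evaluateAnswer` are hypothetical; the real `GameSessionStore` interface may look different. The methods are async so that the in-memory `Map` could later be swapped for Valkey without changing callers.

```typescript
// Server-only session state: questionId -> id of the correct option.
type SessionAnswers = Map<string, number>;

interface GameSessionStore {
  set(sessionId: string, answers: SessionAnswers): Promise<void>;
  get(sessionId: string): Promise<SessionAnswers | undefined>;
}

class InMemoryGameSessionStore implements GameSessionStore {
  private readonly sessions = new Map<string, SessionAnswers>();

  async set(sessionId: string, answers: SessionAnswers): Promise<void> {
    this.sessions.set(sessionId, answers);
  }

  async get(sessionId: string): Promise<SessionAnswers | undefined> {
    return this.sessions.get(sessionId);
  }
}

interface AnswerResult {
  questionId: string;
  isCorrect: boolean;
  correctOptionId: number;
  selectedOptionId: number;
}

// Returns null when the session or question is unknown; a controller
// could translate that into a NotFoundError.
async function evaluateAnswer(
  store: GameSessionStore,
  sessionId: string,
  questionId: string,
  selectedOptionId: number,
): Promise<AnswerResult | null> {
  const answers = await store.get(sessionId);
  if (answers === undefined) return null;
  const correctOptionId = answers.get(questionId);
  if (correctOptionId === undefined) return null;
  return {
    questionId,
    isCorrect: selectedOptionId === correctOptionId,
    correctOptionId,
    selectedOptionId,
  };
}
```
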

---

## 9. Game Mechanics

- **Format**: source-language word prompt + 4 target-language choices
- **Distractors**: same POS, same difficulty, server-side, never the correct answer, never repeated within a session
- **Session length**: 3 or 10 questions (configurable)
- **Scoring**: +1 per correct answer (no speed bonus for MVP)
- **Timer**: none in singleplayer MVP
- **No auth required**: anonymous users
- **Submit-before-send**: user selects, then confirms (prevents misclicks)

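The distractor rules above can be sketched as a pure function. Everything named here (`Candidate`, `shuffle`, `pickDistractors`, `buildOptions`, the injected `random` function) is hypothetical; the project's real logic sits behind `getDistractors()` and may work differently, for example by filtering in SQL. "Never repeated within a session" is interpreted here as: a distractor text is not reused across questions of the same session.

```typescript
interface Candidate {
  termId: string;
  text: string; // target-language translation
  pos: string;
  difficulty: string;
}

// Fisher-Yates shuffle on a copy; `random` is injectable so tests can be deterministic.
function shuffle<T>(items: readonly T[], random: () => number): T[] {
  const copy = [...items];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = copy[i] as T;
    copy[i] = copy[j] as T;
    copy[j] = tmp;
  }
  return copy;
}

// Picks 3 distractors: same POS, same difficulty, never the correct answer,
// never a text already used in this session, no duplicate texts.
function pickDistractors(
  correct: Candidate,
  pool: readonly Candidate[],
  usedTexts: Set<string>,
  random: () => number = Math.random,
): Candidate[] {
  const seen = new Set<string>();
  const eligible = pool.filter((c) => {
    const ok =
      c.pos === correct.pos &&
      c.difficulty === correct.difficulty &&
      c.termId !== correct.termId &&
      c.text !== correct.text &&
      !usedTexts.has(c.text) &&
      !seen.has(c.text);
    if (ok) seen.add(c.text);
    return ok;
  });
  if (eligible.length < 3) {
    throw new Error("not enough distractors for this question");
  }
  const chosen = shuffle(eligible, random).slice(0, 3);
  chosen.forEach((c) => usedTexts.add(c.text));
  return chosen;
}

// Mixes the correct answer in with the distractors and assigns option ids 0-3.
function buildOptions(
  correct: Candidate,
  distractors: readonly Candidate[],
  random: () => number = Math.random,
): { optionId: number; text: string }[] {
  return shuffle([correct, ...distractors], random).map((c, index) => ({
    optionId: index,
    text: c.text,
  }));
}
```

The caller keeps one `usedTexts` set per session and passes it to every `pickDistractors` call, which is what prevents repeats across questions.
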

---

## 10. Working Methodology

This project is a learning exercise. The goal is to understand the code, not just to ship it.

### How to use an LLM for help

1. Paste this document as context
2. Describe what you're working on and what you're stuck on
3. Ask for hints, not solutions

### Refactoring workflow

After completing a task: share the code, ask what to refactor and why. The LLM should explain the concept, not write the implementation.

---

## 11. Post-MVP Ladder

| Phase             | What it adds                                                   |
| ----------------- | -------------------------------------------------------------- |
| Auth              | OpenAuth (Google + GitHub), JWT middleware, user rows in DB    |
| User Stats        | Games played, score history, profile page                      |
| Multiplayer Lobby | Room creation, join by code, WebSocket connection              |
| Multiplayer Game  | Simultaneous answers, server timer, live scores, winner screen |
| Deployment        | Docker Compose prod config, Nginx, Let's Encrypt, Hetzner VPS  |
| Hardening         | Rate limiting, error boundaries, CI/CD, DB backups             |

### Future Data Model Extensions (deferred, additive)

- `noun_forms` — gender, singular, plural, articles per language
- `verb_forms` — conjugation tables per language
- `term_pronunciations` — IPA and audio URLs per language
- `user_decks` — which decks a user is studying
- `user_term_progress` — spaced repetition state per user/term/language
- `quiz_answers` — history log for stats

All are new tables referencing existing `terms` rows via FK. No existing schema changes required.

### Multiplayer Architecture (deferred)

- WebSocket protocol: `ws` library, Zod discriminated union for message types
- Room model: human-readable codes (e.g. `WOLF-42`), not matchmaking queue
- Game mechanic: simultaneous answers, 15-second server timer, all players see same question
- Valkey for ephemeral room state, PostgreSQL for durable records

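As a rough sketch of the message-typing idea, here is a plain TypeScript discriminated union with a hand-written parser. The project plans to use Zod for this; the message names (`join_room`, `submit_answer`), their fields and the `parseClientMessage` and `summarize` helpers are all invented for illustration, since the protocol is not designed yet.

```typescript
// Each message carries a literal `type` field that tells the variants apart.
type ClientMessage =
  | { type: "join_room"; roomCode: string }
  | { type: "submit_answer"; questionId: string; selectedOptionId: number };

// Turns a raw WebSocket text frame into a typed message, or null if it is invalid.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { type, roomCode, questionId, selectedOptionId } = data as Record<string, unknown>;
  switch (type) {
    case "join_room":
      return typeof roomCode === "string" ? { type: "join_room", roomCode } : null;
    case "submit_answer":
      return typeof questionId === "string" && typeof selectedOptionId === "number"
        ? { type: "submit_answer", questionId, selectedOptionId }
        : null;
    default:
      return null;
  }
}

// Switching on `type` narrows the message, so each branch sees only its own fields.
function summarize(message: ClientMessage): string {
  switch (message.type) {
    case "join_room":
      return `join ${message.roomCode}`;
    case "submit_answer":
      return `answer ${message.selectedOptionId} for ${message.questionId}`;
  }
}
```

With a union like this, adding a new message type makes the compiler flag every `switch` that does not handle it yet.
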
### Infrastructure (deferred)

- `app.yourdomain.com` → React frontend
- `api.yourdomain.com` → Express API + WebSocket
- `auth.yourdomain.com` → OpenAuth service
- Docker Compose with `nginx-proxy` + `acme-companion` for automatic SSL

---

## 12. Definition of Done (MVP)

- [x] API returns quiz terms with correct distractors
- [x] User can complete a quiz without errors
- [x] Score screen shows final result and a play-again option
- [x] App is usable on a mobile screen
- [x] No hardcoded data — everything comes from the database
- [x] Global error handler with typed error classes
- [x] Unit + integration tests for API

---

## 13. Roadmap

### Phase 0 — Foundation ✅

Empty repo that builds, lints, and runs end-to-end. `pnpm dev` starts both apps; `GET /api/v1/health` returns 200; React renders a hello page.

### Phase 1 — Vocabulary Data + API ✅

Word data lives in the DB. API returns quiz sessions with distractors. CEFR enrichment pipeline complete. Global error handler and tests implemented.

### Phase 2 — Singleplayer Quiz UI ✅

User can complete a full quiz in the browser. Settings UI, question cards, answer feedback, score screen.

### Phase 3 — Auth

Users can log in via Google or GitHub and stay logged in. JWT validated by API. User row created on first login.

### Phase 4 — Multiplayer Lobby

Players can create and join rooms. Two browser tabs can join the same room and see each other via WebSocket.

### Phase 5 — Multiplayer Game

Host starts a game. All players answer simultaneously in real time. Winner declared.

### Phase 6 — Production Deployment

App is live on Hetzner with HTTPS. Auth flow works end-to-end.

### Phase 7 — Polish & Hardening

Rate limiting, reconnect logic, error boundaries, CI/CD, DB backups.

### Dependency Graph

```text
Phase 0 (Foundation)
└── Phase 1 (Vocabulary Data + API)
    └── Phase 2 (Singleplayer UI)
        └── Phase 3 (Auth)
            ├── Phase 4 (Room Lobby)
            │   └── Phase 5 (Multiplayer Game)
            │       └── Phase 6 (Deployment)
            └── Phase 7 (Hardening)
```

---

## 14. Game Flow (Future)

Singleplayer: choose direction (en→it or it→en) → top-level category → part of speech → difficulty (A1–C2) → round count → game starts.

**Top-level categories (post-MVP):**

- **Grammar** — practice nouns, verb conjugations, etc.
- **Media** — practice vocabulary from specific books, films, songs, etc.
- **Thematic** — animals, kitchen, etc. (requires category metadata research)