2026-04-19 18:48:20 +02:00


lila — Project Specification

This document is the single source of truth for the project. It is written to be handed to any LLM as context. It contains the project vision, the current MVP scope, the tech stack, the architecture, and the roadmap.

1. Project Overview

lila is a vocabulary trainer for English–Italian words. The quiz format is Duolingo-style: one word is shown as a prompt, and the user picks the correct translation from four choices (1 correct + 3 distractors of the same part of speech). The app supports both singleplayer and real-time multiplayer game modes.

The core learning loop: Show word → pick answer → see result → next word → final score
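
The loop above can be read as a small state machine. The following is a minimal sketch, not the project's actual frontend code: the names `QuizState`, `QuizEvent`, and `step` are hypothetical, and the real app keeps client state in Zustand.

```typescript
// Hypothetical model of the learning loop: question -> result -> ... -> finished.
type QuizState =
  | { phase: "question"; index: number; score: number }
  | { phase: "result"; index: number; score: number; wasCorrect: boolean }
  | { phase: "finished"; score: number };

type QuizEvent = { type: "answer"; isCorrect: boolean } | { type: "next" };

// Pure transition function: returns the next state for a given event.
function step(state: QuizState, event: QuizEvent, totalRounds: number): QuizState {
  if (state.phase === "question" && event.type === "answer") {
    // "pick answer" -> "see result": +1 per correct answer.
    return {
      phase: "result",
      index: state.index,
      score: state.score + (event.isCorrect ? 1 : 0),
      wasCorrect: event.isCorrect,
    };
  }
  if (state.phase === "result" && event.type === "next") {
    // "next word", or "final score" once all rounds are played.
    const nextIndex = state.index + 1;
    return nextIndex < totalRounds
      ? { phase: "question", index: nextIndex, score: state.score }
      : { phase: "finished", score: state.score };
  }
  // Events that do not apply to the current phase are ignored.
  return state;
}
```

Keeping the transitions in one pure function makes the loop easy to unit-test without rendering any UI.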

The vocabulary data comes from WordNet + the Open Multilingual Wordnet (OMW). A one-time Python script extracts English–Italian noun pairs and seeds the database. The data model is language-pair agnostic by design — adding a new language later requires no schema changes.

Core Principles

Minimal but extendable: working product fast, clean architecture for future growth
Mobile-first: touch-friendly Duolingo-like UX
Type safety end-to-end: TypeScript + Zod schemas shared between frontend and backend

2. Full Product Vision (Long-Term)

Users log in via Google or GitHub (Better Auth)
Singleplayer mode: 10-round quiz, score screen
Multiplayer mode: create a room, share a code, 2–4 players answer simultaneously in real time, live scores, winner screen
1000+ English–Italian nouns seeded from WordNet

This is the full vision. The current implementation already covers most of it; remaining items are captured in the roadmap and the Post-MVP ladder below.

3. MVP Scope

Goal: A working, presentable vocabulary trainer that can be shown to real people (singleplayer and multiplayer), with a production deployment.

What is IN the MVP

Vocabulary data in a PostgreSQL database (already seeded)
REST API that returns quiz terms with distractors
Singleplayer quiz UI: configurable rounds (3 or 10), answer feedback, score screen
Clean, mobile-friendly UI (Tailwind + shadcn/ui)
Global error handler with typed error classes
Unit + integration tests for the API
Authentication via Better Auth (Google + GitHub)
Multiplayer lobby + game over WebSockets
Production deployment (Docker Compose + Caddy + Hetzner) and CI/CD (Forgejo Actions)

What is CUT from the MVP

Feature	Why cut
User stats / profiles	Needs auth

These are not deleted from the plan — they are deferred. The architecture is already designed to support them. See Section 11 (Post-MVP Ladder).

4. Technology Stack

The monorepo structure and tooling are already set up. This is the full stack.

Layer	Technology	Status
Monorepo	pnpm workspaces	✅
Frontend	React 18, Vite, TypeScript	✅
Routing	TanStack Router	✅
Server state	TanStack Query	✅
Client state	Zustand	✅
Styling	Tailwind CSS + shadcn/ui	✅
Backend	Node.js, Express, TypeScript	✅
Database	PostgreSQL + Drizzle ORM	✅
Validation	Zod (shared schemas)	✅
Testing	Vitest, supertest	✅
Auth	Better Auth (Google + GitHub)	✅
Deployment	Docker Compose, Caddy, Hetzner	✅
CI/CD	Forgejo Actions	✅
Realtime	WebSockets (`ws` library)	✅
Cache	Valkey	⚠️ optional (used locally; production use deferred to state hardening)

5. Repository Structure

lila/
├── .forgejo/
│   └── workflows/
│       └── deploy.yml              — CI/CD pipeline (build, push, deploy)
├── apps/
│   ├── api/
│   │   └── src/
│   │       ├── app.ts                  — createApp() factory, CORS, auth handler, error middleware
│   │       ├── server.ts               — starts server on PORT
│   │       ├── errors/
│   │       │   └── AppError.ts         — AppError, ValidationError, NotFoundError
│   │       ├── lib/
│   │       │   └── auth.ts             — Better Auth config (Google + GitHub providers)
│   │       ├── middleware/
│   │       │   ├── authMiddleware.ts    — session validation for protected routes
│   │       │   └── errorHandler.ts     — central error middleware
│   │       ├── routes/
│   │       │   ├── apiRouter.ts        — mounts /health and /game routers
│   │       │   ├── gameRouter.ts       — POST /start, POST /answer
│   │       │   └── healthRouter.ts
│   │       ├── controllers/
│   │       │   └── gameController.ts   — validates input, calls service, sends response
│   │       ├── services/
│   │       │   ├── gameService.ts      — builds quiz sessions, evaluates answers
│   │       │   └── gameService.test.ts — unit tests (mocked DB)
│   │       └── gameSessionStore/
│   │           ├── GameSessionStore.ts — interface (async, Valkey-ready)
│   │           ├── InMemoryGameSessionStore.ts
│   │           └── index.ts
│   └── web/
│       ├── Dockerfile                  — multi-stage: dev + production (nginx:alpine)
│       ├── nginx.conf                  — SPA fallback routing
│       └── src/
│           ├── routes/
│           │   ├── index.tsx           — landing page
│           │   ├── play.tsx            — the quiz
│           │   ├── login.tsx           — Google + GitHub login buttons
│           │   ├── about.tsx
│           │   └── __root.tsx
│           ├── lib/
│           │   └── auth-client.ts      — Better Auth React client
│           ├── components/
│           │   └── game/
│           │       ├── GameSetup.tsx    — settings UI
│           │       ├── QuestionCard.tsx — prompt + 4 options
│           │       ├── OptionButton.tsx — idle / correct / wrong states
│           │       └── ScoreScreen.tsx  — final score + play again
│           └── main.tsx
├── packages/
│   ├── shared/
│   │   └── src/
│   │       ├── constants.ts            — SUPPORTED_POS, DIFFICULTY_LEVELS, etc.
│   │       ├── schemas/game.ts         — Zod schemas for all game types
│   │       └── index.ts
│   └── db/
│       ├── drizzle/                    — migration SQL files
│       └── src/
│           ├── db/schema.ts            — Drizzle schema (terms, translations, auth tables)
│           ├── models/termModel.ts     — getGameTerms(), getDistractors()
│           ├── seeding-datafiles.ts    — seeds terms + translations from JSON
│           ├── seeding-cefr-levels.ts  — enriches translations with CEFR data
│           ├── generating-deck.ts      — builds curated decks
│           └── index.ts
├── scripts/                            — Python extraction/comparison/merge scripts
├── documentation/                      — project docs
├── docker-compose.yml                  — local dev stack
├── Caddyfile                           — reverse proxy routing
└── pnpm-workspace.yaml

packages/shared is the contract between frontend and backend. All request/response shapes are defined there as Zod schemas — never duplicated.

6. Architecture

The Layered Architecture

HTTP Request
     ↓
  Router        — maps URL + HTTP method to a controller
     ↓
 Controller     — handles HTTP only: validates input, calls service, sends response
     ↓
  Service       — business logic only: no HTTP, no direct DB access
     ↓
  Model         — database queries only: no business logic
     ↓
  Database

The rule: each layer only talks to the layer directly below it. A controller never touches the database. A service never reads req.body. A model never knows what a quiz is.
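
A minimal sketch of the layering rule, assuming simplified request and response shapes instead of Express types and an in-memory array instead of PostgreSQL. All names here (`findTermsByPos`, `buildPrompts`, `startGameController`) are hypothetical and only illustrate which concern belongs to which layer.

```typescript
// Model layer: data access only. An in-memory array stands in for the database.
type TermRow = { id: string; text: string; pos: string };

const fakeTable: TermRow[] = [
  { id: "t1", text: "dog", pos: "noun" },
  { id: "t2", text: "run", pos: "verb" },
  { id: "t3", text: "house", pos: "noun" },
];

function findTermsByPos(pos: string): TermRow[] {
  return fakeTable.filter((row) => row.pos === pos);
}

// Service layer: business logic only. It calls the model and never sees HTTP objects.
function buildPrompts(pos: string, rounds: number): string[] {
  return findTermsByPos(pos)
    .slice(0, rounds)
    .map((row) => row.text);
}

// Controller layer: HTTP concerns only (simplified stand-ins for req/res).
type FakeRequest = { body: { pos?: unknown; rounds?: unknown } };
type FakeResponse = { status: number; json: unknown };

function startGameController(req: FakeRequest): FakeResponse {
  const { pos, rounds } = req.body;
  if (typeof pos !== "string" || typeof rounds !== "number") {
    return { status: 400, json: { error: "invalid request body" } };
  }
  // The controller delegates to the service and never queries data itself.
  return { status: 200, json: { prompts: buildPrompts(pos, rounds) } };
}
```

Because each function only calls the layer directly below it, the service can be tested with a mocked model, which matches the mocked-DB unit tests listed in the repository structure.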

Monorepo Package Responsibilities

Package	Owns
`packages/shared`	Zod schemas, constants, derived TypeScript types
`packages/db`	Drizzle schema, DB connection, all model/query functions
`apps/api`	Router, controllers, services, error handling
`apps/web`	React frontend, consumes types from shared

Key principle: all database code lives in packages/db. apps/api never imports drizzle-orm for queries — it only calls functions exported from packages/db.

Production Infrastructure

Internet → Caddy (HTTPS termination)
             ├── lilastudy.com       → web container (nginx, static files)
             ├── api.lilastudy.com   → api container (Express, port 3000)
             └── git.lilastudy.com   → forgejo container (git + registry, port 3000)

SSH (port 2222) → forgejo container (git push/pull)

All containers communicate over an internal Docker network. Only Caddy (80/443) and Forgejo SSH (2222) are exposed to the internet.

7. Data Model (Current State)

Words are modelled as language-neutral concepts (terms) separate from learning curricula (decks). Adding a new language pair requires no schema changes, only new rows in translations and decks.

Core tables: terms, translations, term_glosses, decks, deck_terms, topics, term_topics

Auth tables (managed by Better Auth): user, session, account, verification

Key columns on terms: id (uuid), pos (CHECK-constrained), source, source_id (unique pair for idempotent imports)

Key columns on translations: id, term_id (FK), language_code (CHECK-constrained), text, cefr_level (nullable varchar(2), CHECK A1–C2)

Deck model uses source_language + validated_languages array — one deck serves multiple target languages. Decks are frequency tiers (e.g. en-core-1000), not POS splits.

Full schema is in packages/db/src/db/schema.ts.
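
To illustrate why the unique (source, source_id) pair makes imports idempotent, here is a small in-memory sketch. It is not the Drizzle schema: `TermTable` and its methods are hypothetical, ids are simple counters rather than uuids, and the source values are placeholders.

```typescript
type Term = { id: string; pos: string; source: string; sourceId: string };
type TermInput = { pos: string; source: string; sourceId: string };

// In-memory stand-in for the terms table, keyed by the (source, source_id) pair.
class TermTable {
  private rows = new Map<string, Term>();
  private nextId = 1;

  // Inserts a term, or returns the existing row if the pair was already imported.
  upsert(input: TermInput): Term {
    const key = `${input.source}:${input.sourceId}`;
    const existing = this.rows.get(key);
    if (existing) {
      return existing;
    }
    const term: Term = { id: `term-${this.nextId++}`, ...input };
    this.rows.set(key, term);
    return term;
  }

  count(): number {
    return this.rows.size;
  }
}
```

Running the same import twice leaves the row count unchanged, which is the property the unique constraint provides in the real database.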

8. API

Endpoints

POST /api/v1/game/start     GameRequest → GameSession      (requires auth)
POST /api/v1/game/answer    AnswerSubmission → AnswerResult  (requires auth)
GET  /api/v1/health          Health check                    (public)
ALL  /api/auth/*             Better Auth handlers            (public)

Schemas (packages/shared)

GameRequest: { source_language, target_language, pos, difficulty, rounds }
GameSession: { sessionId: uuid, questions: GameQuestion[] }
GameQuestion: { questionId: uuid, prompt: string, gloss: string | null, options: AnswerOption[4] }
AnswerOption: { optionId: number (0-3), text: string }
AnswerSubmission: { sessionId: uuid, questionId: uuid, selectedOptionId: number (0-3) }
AnswerResult: { questionId: uuid, isCorrect: boolean, correctOptionId: number (0-3), selectedOptionId: number (0-3) }
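
The project defines these shapes as Zod schemas. As a dependency-free illustration of what such a schema enforces, the sketch below hand-writes a validator for AnswerSubmission that returns the same success/failure split as Zod's safeParse. The helper names (`parseAnswerSubmission`, `isUuid`) and the error messages are assumptions, not the project's code.

```typescript
// Plain-TypeScript mirror of the AnswerSubmission shape described above.
type AnswerSubmission = {
  sessionId: string; // uuid
  questionId: string; // uuid
  selectedOptionId: number; // integer 0-3
};

// Same discriminated result shape that Zod's safeParse returns.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isUuid(value: unknown): value is string {
  return typeof value === "string" && UUID_RE.test(value);
}

// Hypothetical helper: validates unknown input without throwing.
function parseAnswerSubmission(input: unknown): ParseResult<AnswerSubmission> {
  if (typeof input !== "object" || input === null) {
    return { success: false, error: "body must be an object" };
  }
  const { sessionId, questionId, selectedOptionId } = input as Record<string, unknown>;
  if (!isUuid(sessionId)) {
    return { success: false, error: "sessionId must be a uuid" };
  }
  if (!isUuid(questionId)) {
    return { success: false, error: "questionId must be a uuid" };
  }
  if (
    typeof selectedOptionId !== "number" ||
    !Number.isInteger(selectedOptionId) ||
    selectedOptionId < 0 ||
    selectedOptionId > 3
  ) {
    return { success: false, error: "selectedOptionId must be an integer from 0 to 3" };
  }
  return { success: true, data: { sessionId, questionId, selectedOptionId } };
}
```

A caller checks `result.success` first; only the success branch exposes `data`, so invalid input cannot reach the service layer by accident.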

Error Handling

Typed error classes (AppError base, ValidationError 400, NotFoundError 404) with central error middleware. Controllers validate with safeParse, throw on failure, and call next(error) in the catch. The middleware maps AppError instances to HTTP status codes; unknown errors return 500.
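
A minimal sketch of this error mapping, independent of Express. The class names follow the text above; the `statusCode` field, the `toHttpResponse` helper, and the response body shape are assumptions made for illustration.

```typescript
// Base class: every known application error carries its HTTP status code.
class AppError extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    this.statusCode = statusCode;
    this.name = new.target.name;
    // Keeps instanceof working even if the code is transpiled to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

// Hypothetical core of the central error middleware:
// known errors map to their status code, anything else becomes a 500.
function toHttpResponse(error: unknown): { status: number; body: { error: string } } {
  if (error instanceof AppError) {
    return { status: error.statusCode, body: { error: error.message } };
  }
  // Unknown errors: do not leak internal details to the client.
  return { status: 500, body: { error: "Internal server error" } };
}
```

In an Express app, a four-argument error middleware would call a function like this and write the result to the response.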

Key Design Rules

Server-side answer evaluation: the correct answer is never sent with the questions; the frontend learns it only from AnswerResult, after the user has submitted
POST not GET for game start (configuration in request body)
safeParse over parse (clean 400s, not raw Zod 500s)
Session state stored in GameSessionStore (in-memory now, Valkey later)
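
The first and last rules work together: the session store keeps the correct option for each question on the server, and answers are evaluated against it. The sketch below assumes a two-method async interface and a `StoredSession` shape; the real GameSessionStore interface may look different.

```typescript
// Assumed shape: for each questionId, the optionId that is correct.
type StoredSession = { answers: Record<string, number> };

// Async interface so an in-memory store can later be swapped for Valkey.
interface GameSessionStore {
  create(sessionId: string, session: StoredSession): Promise<void>;
  get(sessionId: string): Promise<StoredSession | undefined>;
}

class InMemoryGameSessionStore implements GameSessionStore {
  private sessions = new Map<string, StoredSession>();

  async create(sessionId: string, session: StoredSession): Promise<void> {
    this.sessions.set(sessionId, session);
  }

  async get(sessionId: string): Promise<StoredSession | undefined> {
    return this.sessions.get(sessionId);
  }
}

type AnswerResult = {
  questionId: string;
  isCorrect: boolean;
  correctOptionId: number;
  selectedOptionId: number;
};

// Returns null when the session or question is unknown
// (a real service would likely throw NotFoundError instead).
async function evaluateAnswer(
  store: GameSessionStore,
  sessionId: string,
  questionId: string,
  selectedOptionId: number,
): Promise<AnswerResult | null> {
  const session = await store.get(sessionId);
  const correctOptionId = session?.answers[questionId];
  if (correctOptionId === undefined) {
    return null;
  }
  return {
    questionId,
    isCorrect: selectedOptionId === correctOptionId,
    correctOptionId,
    selectedOptionId,
  };
}
```

Because callers depend only on the interface, moving from the in-memory store to a Valkey-backed one should not require changes in the service.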

9. Game Mechanics

Format: source-language word prompt + 4 target-language choices
Distractors: same POS, same difficulty, server-side, never the correct answer, never repeated within a session
Session length: 3 or 10 questions (configurable)
Scoring: +1 per correct answer (no speed bonus for MVP)
Timer: none in singleplayer MVP
Auth required: users must log in via Google or GitHub
Submit-before-send: user selects, then confirms (prevents misclicks)
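
The distractor rules can be sketched as a filter followed by a shuffle. This is an in-memory illustration under assumptions: the project selects distractors through getDistractors() in packages/db, whose query is not shown here, and the `Candidate` type, the `pickDistractors` name, and the `alreadyUsed` set are hypothetical.

```typescript
type Candidate = { text: string; pos: string; difficulty: string };

// Picks up to 3 distractors that share POS and difficulty with the correct
// answer, are not the correct answer, and were not used earlier in the session.
// Picked texts are added to alreadyUsed so later questions skip them.
function pickDistractors(
  correct: Candidate,
  pool: Candidate[],
  alreadyUsed: Set<string>,
  random: () => number = Math.random,
): string[] {
  const eligible = pool.filter(
    (c) =>
      c.pos === correct.pos &&
      c.difficulty === correct.difficulty &&
      c.text !== correct.text &&
      !alreadyUsed.has(c.text),
  );

  // Fisher-Yates shuffle on a copy, so the caller's pool is not reordered.
  const shuffled = [...eligible];
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = shuffled[i] as Candidate;
    shuffled[i] = shuffled[j] as Candidate;
    shuffled[j] = tmp;
  }

  const picked = shuffled.slice(0, 3).map((c) => c.text);
  for (const text of picked) {
    alreadyUsed.add(text);
  }
  return picked;
}
```

If fewer than three eligible candidates remain, this sketch returns a shorter list; how the real service handles that case is not specified in this document.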

10. Working Methodology

This project is a learning exercise. The goal is to understand the code, not just to ship it.

How to use an LLM for help

Paste this document as context
Describe what you're working on and what you're stuck on
Ask for hints, not solutions

Refactoring workflow

After completing a task: share the code, ask what to refactor and why. The LLM should explain the concept, not write the implementation.

11. Post-MVP Ladder

Phase	What it adds	Status
Auth	Better Auth (Google + GitHub), embedded in Express API, user rows in DB	✅
Deployment	Docker Compose, Caddy, Forgejo, CI/CD, Hetzner VPS	✅
Hardening (partial)	CI/CD pipeline, DB backups	✅
User Stats	Games played, score history, profile page	❌
Multiplayer Lobby	Room creation, join by code, WebSocket connection	✅
Multiplayer Game	Simultaneous answers, server timer, live scores, winner screen	✅
Hardening (rest)	Rate limiting, error boundaries, monitoring, accessibility	❌

Future Data Model Extensions (deferred, additive)

noun_forms — gender, singular, plural, articles per language
verb_forms — conjugation tables per language
term_pronunciations — IPA and audio URLs per language
user_decks — which decks a user is studying
user_term_progress — spaced repetition state per user/term/language
quiz_answers — history log for stats

All are new tables referencing existing terms rows via FK. No existing schema changes required.

Multiplayer Architecture (current + deferred)

Implemented now:

WebSocket protocol uses the ws library with a Zod discriminated union for message types (defined in packages/shared)
Room model uses human-readable codes (no matchmaking queue)
Lobby flow (create/join/leave) is real-time over WS, backed by PostgreSQL for durable membership/state
Multiplayer game flow is real-time: host starts, all players see the same question, answers are collected simultaneously, with a server-enforced 15s timer and live scoring
WebSocket connections are authenticated (Better Auth session validation on upgrade)
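
To show what a discriminated union of message types buys, here is a dependency-free sketch. The project defines its protocol with Zod in packages/shared; the message names below (`lobby:join`, `lobby:leave`, `game:answer`) and both helper functions are invented for illustration and are not the actual protocol.

```typescript
// Hypothetical client-to-server messages, discriminated by the "type" field.
type ClientMessage =
  | { type: "lobby:join"; code: string }
  | { type: "lobby:leave" }
  | { type: "game:answer"; questionId: string; selectedOptionId: number };

// Parses a raw WebSocket text frame; returns null for anything malformed.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const { type, code, questionId, selectedOptionId } = data as Record<string, unknown>;
  switch (type) {
    case "lobby:join":
      return typeof code === "string" ? { type: "lobby:join", code } : null;
    case "lobby:leave":
      return { type: "lobby:leave" };
    case "game:answer":
      if (typeof questionId !== "string") return null;
      if (typeof selectedOptionId !== "number" || !Number.isInteger(selectedOptionId)) return null;
      if (selectedOptionId < 0 || selectedOptionId > 3) return null;
      return { type: "game:answer", questionId, selectedOptionId };
    default:
      return null;
  }
}

// The compiler narrows the union in each case, so only valid fields are accessible.
function describeMessage(message: ClientMessage): string {
  switch (message.type) {
    case "lobby:join":
      return `join room ${message.code}`;
    case "lobby:leave":
      return "leave room";
    case "game:answer":
      return `answer ${message.selectedOptionId} for ${message.questionId}`;
  }
}
```

Adding a new message variant to the union makes every non-exhaustive switch a compile error, which is the main reason to model the protocol this way.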

Deferred / hardening:

Valkey-backed ephemeral state (room/game/session store) where in-memory state becomes a bottleneck
Graceful reconnect/resume flows and more robust failure handling (tracked in Phase 7)

Infrastructure (current)

lilastudy.com → React frontend (nginx serving static files)
api.lilastudy.com → Express API + Better Auth
git.lilastudy.com → Forgejo (git server + container registry)
Docker Compose with Caddy for automatic HTTPS via Let's Encrypt
CI/CD via Forgejo Actions (build on push to main, deploy via SSH)
Daily DB backups with cron, synced to dev laptop

See deployment.md for full infrastructure documentation.

12. Definition of Done (Current Baseline)

API returns quiz terms with correct distractors
User can complete a quiz without errors
Score screen shows final result and a play-again option
App is usable on a mobile screen
No hardcoded data — everything comes from the database
Global error handler with typed error classes
Unit + integration tests for API
Auth works end-to-end (Google + GitHub via Better Auth)
Multiplayer works end-to-end (lobby + real-time game over WebSockets)
Production deployment is live behind HTTPS (Caddy) with CI/CD deploys via Forgejo Actions

13. Roadmap

See roadmap.md for the full roadmap with task-level checkboxes.

Dependency Graph

Phase 0 (Foundation) ✅
└── Phase 1 (Vocabulary Data + API) ✅
    └── Phase 2 (Singleplayer UI) ✅
        ├── Phase 3 (Auth) ✅
        │   └── Phase 6 (Deployment + CI/CD) ✅
        └── Phase 4 (Multiplayer Lobby) ✅
            └── Phase 5 (Multiplayer Game) ✅
                └── Phase 7 (Hardening)

14. Game Flow (Future)

Singleplayer: choose direction (en→it or it→en) → top-level category → part of speech → difficulty (A1–C2) → round count → game starts.

Top-level categories (post-MVP):

Grammar — practice nouns, verb conjugations, etc.
Media — practice vocabulary from specific books, films, songs, etc.
Thematic — animals, kitchen, etc. (requires category metadata research)
