lila/documentation/decisions.md
lila 3f7bc4111e chore: rename project from glossa to lila
- Update all package names from @glossa/* to @lila/*
- Update all imports, container names, volume names
- Update documentation references
- Recreate database with new credentials
2026-04-13 10:00:52 +02:00


Decisions Log

A record of non-obvious technical decisions made during development, with reasoning. Intended to preserve context across sessions. Grouped by topic area.


Tooling

Monorepo: pnpm workspaces (not Turborepo)

Turborepo adds parallel task running and build caching on top of pnpm workspaces. For a two-app monorepo of this size, plain pnpm workspace commands are sufficient and there is one less tool to configure and maintain.

TypeScript runner: tsx (not ts-node)

tsx is faster, requires no configuration, and uses esbuild under the hood. ts-node is older and more complex to configure. tsx does not do type checking — that is handled separately by tsc and the editor. Installed as a dev dependency in apps/api only.

ORM: Drizzle (not Prisma)

Drizzle is lighter — no binary, no engine. Queries map closely to SQL. Migrations are plain SQL files. Works naturally with Zod for type inference. Prisma would add Docker complexity (engine binary in containers) and abstraction that is not needed for this schema.

WebSocket: ws library (not Socket.io)

For rooms of 2-4 players, Socket.io's room management, transport fallbacks, and reconnection abstractions are unnecessary overhead. The WS protocol is defined explicitly as a Zod discriminated union in packages/shared, giving the same type safety guarantees. Reconnection logic is deferred to Phase 7.

Auth: Better Auth (not OpenAuth or Keycloak)

Better Auth embeds as middleware in the Express API — no separate auth service or Docker container. It connects to the existing PostgreSQL via the Drizzle adapter and manages its own tables (user, session, account, verification). Social providers (Google, GitHub) are configured in a single config object. Session validation is a function call within the same process, not a network request. OpenAuth was considered but requires a standalone service and leaves user management to you. Keycloak is too heavy for a single-app project.


Docker

Multi-stage builds for monorepo context

Both apps/web and apps/api use multi-stage Dockerfiles (deps, dev, builder, runner) because the monorepo structure requires copying pnpm-workspace.yaml, root package.json, and cross-dependencies before installing. Stages allow caching pnpm install separately from source code changes.

Vite as dev server (not Nginx)

In development, apps/web uses vite dev directly, not Nginx. HMR requires Vite's WebSocket dev server. Production will use Nginx to serve static Vite build output.


Architecture

Express app structure: factory function pattern

app.ts exports a createApp() factory function. server.ts imports it and calls .listen(). This allows tests to import the app directly without starting a server (used by supertest).

Zod schemas belong in packages/shared

Both the API and frontend import from the same schemas. If the shape changes, TypeScript compilation fails in both places simultaneously — silent drift is impossible.

Server-side answer evaluation

The correct answer is never sent to the frontend in GameQuestion. It is only revealed in AnswerResult after the client submits. Prevents cheating and keeps game logic authoritative on the server.
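A minimal sketch of this split, assuming hypothetical field names (the real GameQuestion and AnswerResult schemas live in packages/shared and may differ). ServerQuestion, toGameQuestion, evaluateAnswer, and isCorrect are illustrative names, not taken from the codebase.

```typescript
// Hypothetical shapes for illustration; the real schemas may differ.
interface QuestionOption {
  optionId: number;
  text: string;
}

// Full question as held on the server, including the answer key.
interface ServerQuestion {
  questionId: string;
  prompt: string;
  options: QuestionOption[];
  correctOptionId: number; // never sent before the client submits
}

// Client-facing question: same data minus the answer key.
interface GameQuestion {
  questionId: string;
  prompt: string;
  options: QuestionOption[];
}

interface AnswerResult {
  questionId: string;
  selectedOptionId: number;
  correctOptionId: number;
  isCorrect: boolean;
}

// Build the payload sent to the client by copying only the safe fields.
function toGameQuestion(q: ServerQuestion): GameQuestion {
  return {
    questionId: q.questionId,
    prompt: q.prompt,
    options: q.options.map((o) => ({ optionId: o.optionId, text: o.text })),
  };
}

// Evaluate on the server; the correct option is revealed only in the result.
function evaluateAnswer(q: ServerQuestion, selectedOptionId: number): AnswerResult {
  return {
    questionId: q.questionId,
    selectedOptionId,
    correctOptionId: q.correctOptionId,
    isCorrect: selectedOptionId === q.correctOptionId,
  };
}
```

Copying fields explicitly, rather than deleting the answer from a spread copy, means a field added to ServerQuestion later is not sent to the client by accident.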

safeParse over parse in controllers

parse throws a raw Zod error → ugly 500 response. safeParse returns a result object → clean 400 with early return via the error handler.

POST not GET for game start

GET request bodies have no defined semantics in HTTP, and many clients and proxies drop them. Game configuration is submitted as a JSON body → POST is semantically correct.

Model parameters use shared types, not GameRequestType

The model layer should not know about GameRequestType — that's an HTTP boundary concern. Parameters are typed using the derived constant types (SupportedLanguageCode, SupportedPos, DifficultyLevel) exported from packages/shared.
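As an illustration of the derived constant types, here is a sketch that assumes made-up constant values and a hypothetical GameTermsParams shape. Only the three type names come from this document; the real constants in packages/shared may differ.

```typescript
// Assumed values for illustration only.
const SUPPORTED_LANGUAGE_CODES = ["en", "it"] as const;
const SUPPORTED_POS = ["noun", "verb"] as const;
const DIFFICULTY_LEVELS = ["easy", "intermediate", "hard"] as const;

// Types are derived from the constants, so adding a value updates both.
type SupportedLanguageCode = (typeof SUPPORTED_LANGUAGE_CODES)[number];
type SupportedPos = (typeof SUPPORTED_POS)[number];
type DifficultyLevel = (typeof DIFFICULTY_LEVELS)[number];

// Hypothetical model-layer parameters: no reference to the HTTP request type.
interface GameTermsParams {
  sourceLanguage: SupportedLanguageCode;
  targetLanguage: SupportedLanguageCode;
  pos: SupportedPos;
  difficulty: DifficultyLevel;
  limit: number;
}

// Runtime guard built from the same constant, so type and check cannot drift.
function isSupportedLanguageCode(value: string): value is SupportedLanguageCode {
  return (SUPPORTED_LANGUAGE_CODES as readonly string[]).includes(value);
}

const exampleParams: GameTermsParams = {
  sourceLanguage: "en",
  targetLanguage: "it",
  pos: "noun",
  difficulty: "easy",
  limit: 10,
};
```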

Model returns neutral field names, not quiz semantics

getGameTerms returns sourceText / targetText / sourceGloss rather than prompt / answer / gloss. Quiz semantics are applied in the service layer. Keeps the model reusable for non-quiz features.

Asymmetric difficulty filter

Difficulty is filtered on the target (answer) side only. A word can be A2 in Italian but B1 in English, and what matters is the difficulty of the word being learned.

optionId as integer 0-3, not UUID

Options only need uniqueness within a single question; cheating prevented by shuffling, not opaque IDs.

questionId and sessionId as UUIDs

Globally unique, opaque, natural Valkey keys when storage moves later.

gloss is string | null rather than optional

Predictable shape on the frontend — always present, sometimes null.

GameSessionStore stores only the answer key

Minimal payload (questionId → correctOptionId) for easy Valkey migration. All methods are async even for the in-memory implementation, so the service layer is already written for Valkey.
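A sketch of what an async-by-default store could look like. The method names (create, getCorrectOptionId, delete) and the class name are assumptions for illustration, not the project's actual interface.

```typescript
// Hypothetical interface: every method returns a Promise, so a Valkey-backed
// implementation can replace the in-memory one without touching callers.
interface GameSessionStore {
  create(sessionId: string, answerKey: Record<string, number>): Promise<void>;
  getCorrectOptionId(sessionId: string, questionId: string): Promise<number | null>;
  delete(sessionId: string): Promise<void>;
}

class InMemoryGameSessionStore implements GameSessionStore {
  // sessionId -> (questionId -> correctOptionId)
  private readonly sessions = new Map<string, Map<string, number>>();

  async create(sessionId: string, answerKey: Record<string, number>): Promise<void> {
    this.sessions.set(sessionId, new Map(Object.entries(answerKey)));
  }

  async getCorrectOptionId(sessionId: string, questionId: string): Promise<number | null> {
    // ?? (not ||) so that a valid optionId of 0 is not mistaken for "missing".
    return this.sessions.get(sessionId)?.get(questionId) ?? null;
  }

  async delete(sessionId: string): Promise<void> {
    this.sessions.delete(sessionId);
  }
}
```

Because callers already await every call, switching to a networked store later should only require a new class implementing the same interface.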

Distractors fetched per-question (N+1 queries)

Correct shape for the problem; 10 queries on local Postgres is negligible latency.

No fallback logic for insufficient distractors

Data volumes are sufficient; strict query throws if something is genuinely broken.

Distractor query excludes both term ID and answer text

Prevents duplicate options from different terms with the same translation.

Submit-before-send flow on frontend

User selects, then confirms. Prevents misclicks.

Multiplayer mechanic: simultaneous answers (not buzz-first)

All players see the same question at the same time and submit independently. The server waits for all answers or a 15-second timeout, then broadcasts the result. Keeps the experience symmetric.
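The round-closing rule can be sketched as a pure function of the round state and the current time. RoundState, submitAnswer, and isRoundOver are hypothetical names, and the "first submission wins" rule is an assumption, not something this log specifies.

```typescript
const ROUND_TIMEOUT_MS = 15000; // the 15-second window described above

interface RoundState {
  playerIds: string[];
  startedAt: number; // epoch milliseconds
  answers: Map<string, number>; // playerId -> selected optionId
}

// Record an answer. Assumption: only the first submission from a known
// player counts; later or unknown submissions are rejected.
function submitAnswer(round: RoundState, playerId: string, optionId: number): boolean {
  if (!round.playerIds.includes(playerId) || round.answers.has(playerId)) {
    return false;
  }
  round.answers.set(playerId, optionId);
  return true;
}

// The round closes when every player has answered or the timeout has elapsed.
function isRoundOver(round: RoundState, now: number): boolean {
  const everyoneAnswered = round.playerIds.every((id) => round.answers.has(id));
  return everyoneAnswered || now - round.startedAt >= ROUND_TIMEOUT_MS;
}
```

Passing the current time in as an argument keeps the rule testable without real timers; the server would still need a timer to trigger the check when the window expires.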

Room model: room codes (not matchmaking queue)

Players create rooms and share a human-readable code (e.g. WOLF-42). Auto-matchmaking deferred.
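One possible way to generate codes in the WOLF-42 style is sketched below. The word list, the two-digit range, and the function names are assumptions for illustration; the log does not specify the actual format rules.

```typescript
// Assumed word list; the real one is not specified in this log.
const ROOM_WORDS = ["WOLF", "BEAR", "LYNX", "HAWK"] as const;

// rng is injectable so the generator can be tested deterministically.
function generateRoomCode(rng: () => number = Math.random): string {
  const word = ROOM_WORDS[Math.floor(rng() * ROOM_WORDS.length)] ?? ROOM_WORDS[0];
  const num = 10 + Math.floor(rng() * 90); // two digits, 10 to 99
  return `${word}-${num}`;
}

// Retry until the code is not already used by an active room.
function generateUniqueRoomCode(
  isTaken: (code: string) => boolean,
  rng: () => number = Math.random,
  maxAttempts: number = 20,
): string {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const code = generateRoomCode(rng);
    if (!isTaken(code)) {
      return code;
    }
  }
  throw new Error("Could not generate a free room code");
}
```

With a small code space like this one, collisions are likely once many rooms are active, so the uniqueness check against existing rooms matters more than the randomness itself.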


Error Handling

AppError base class over error code maps

A statusCode on the error itself means the middleware doesn't need a lookup table. New error types are self-contained — one class, one status code. ValidationError (400) and NotFoundError (404) extend AppError.
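A minimal sketch of this hierarchy. The class names match the ones above; the toErrorResponse helper and the response body shape are hypothetical stand-ins for what the error middleware does.

```typescript
class AppError extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    this.statusCode = statusCode;
    this.name = new.target.name;
    // Keeps instanceof working when compiling to ES5 targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

// Hypothetical helper: maps any thrown value to a status and body
// without a lookup table, because the status travels with the error.
function toErrorResponse(err: unknown): { status: number; body: { error: string } } {
  if (err instanceof AppError) {
    return { status: err.statusCode, body: { error: err.message } };
  }
  return { status: 500, body: { error: "Internal server error" } };
}
```

Adding a new error type then only requires one more subclass; the mapping function stays unchanged.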

next(error) over res.status().json() in controllers

Express 4 does not catch rejected promises from async handlers, so errors must be passed on explicitly with next(error) (Express 5 forwards them automatically). This centralises all error formatting in one middleware. Controllers stay clean: validate, call service, send response.

Zod .message over .issues[0]?.message

Returns all validation failures at once, not just the first. Output is verbose (raw JSON string) — revisit formatting post-MVP if the frontend needs structured { field, message }[] error objects.

Where errors are thrown

ValidationError is thrown in the controller (the layer that runs safeParse). NotFoundError is thrown in the service (the layer that knows whether a session or question exists). The service doesn't know about HTTP — it throws a typed error, and the middleware maps it to a status code.


Testing

Mocked DB for unit tests (not test database)

Unit tests mock @lila/db via vi.mock — the real database is never touched. Tests run in milliseconds with no infrastructure dependency. Integration tests with a real test DB are deferred post-MVP.

Co-located test files

gameService.test.ts lives next to gameService.ts, not in a separate __tests__/ directory. Convention matches the vitest default and keeps related files together.

supertest for endpoint tests

Uses createApp() factory directly — no server started. Tests the full HTTP layer (routing, middleware, error handler) with real request/response assertions.


TypeScript Configuration

Base config: no lib, module, or moduleResolution

Intentionally omitted from tsconfig.base.json because different packages need different values — apps/api uses NodeNext, apps/web uses ESNext/bundler (Vite). Each package declares its own.

outDir: "./dist" per package

The base config originally had outDir: "dist" which resolved relative to the base file location, pointing to the root dist folder. Overridden in each package with "./dist".

apps/web tsconfig: deferred to Vite scaffold

Filled in after pnpm create vite generated tsconfig files. The generated files were trimmed to remove options already covered by the base.

rootDir: "." on apps/api

Set explicitly to allow vitest.config.ts (outside src/) to be included in the TypeScript program.

Type naming: PascalCase

supportedLanguageCode → SupportedLanguageCode. TypeScript convention.

Primitive types: always lowercase

number not Number, string not String. The uppercase versions are object wrappers and not assignable to Drizzle's expected primitive types.

globals: true with "types": ["vitest/globals"]

Using Vitest globals requires "types": ["vitest/globals"] in each package's tsconfig. Added to apps/api, packages/shared, packages/db, and apps/web/tsconfig.app.json.


ESLint

Two-config approach for apps/web

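Root eslint.config.mjs handles TypeScript linting across all packages. apps/web/eslint.config.js adds React-specific plugins only. Flat config does not cascade: ESLint uses a single resolved config file and does not merge config files from different directories, so the apps/web config must import and spread the root config for both rule sets to apply there.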

Coverage config at root only

Vitest coverage configuration lives in the root vitest.config.ts only. Produces a single aggregated report.


Data Model

Users: Better Auth manages the user table

Better Auth creates and owns the user table (plus session, account, verification). The account table links social provider identities to users — one user can have both Google and GitHub linked. Other tables (rooms, stats) reference user.id via FK. No need to design a custom user schema or handle provider-specific claims manually.

Rooms: updated_at for stale recovery only

Most tables omit updated_at. rooms.updated_at is kept specifically for identifying rooms stuck in in_progress status after server crashes.

Translations: UNIQUE (term_id, language_code, text)

Allows multiple synonyms per language per term (e.g. "dog", "hound" for same synset). Prevents exact duplicate rows.

One gloss per term per language

The unique constraint on term_glosses was tightened from (term_id, language_code, text) to (term_id, language_code) to prevent left joins from multiplying question rows. Revisit if multiple glosses per language are ever needed.

Decks: source_language + validated_languages (not pair_id)

One deck can serve multiple target languages as long as translations exist for all its terms. source_language is the language the wordlist was curated from. validated_languages is recalculated on every generation script run. Enforced via CHECK: source_language is never in validated_languages.

Decks: wordlist tiers as scope (not POS-split decks)

One deck per frequency tier per source language (e.g. en-core-1000). POS, difficulty, and category are query filters applied inside that boundary. Decks must not overlap — each term appears in exactly one tier.

Decks: SUBTLEX as wordlist source (not manual curation)

The most common 1000 nouns in English are not the same 1000 nouns that are most common in Italian. SUBTLEX exists in per-language editions derived from subtitle corpora using the same methodology — making them comparable. en-core-1000 built from SUBTLEX-EN, it-core-1000 from SUBTLEX-IT.

language_pairs table: dropped

Valid pairs are implicitly defined by decks.source_language + decks.validated_languages. The table was redundant.

Terms: synset_id nullable (not NOT NULL)

Non-WordNet terms won't have a synset ID. Postgres UNIQUE on a nullable column allows multiple NULL values.

Terms: source + source_id columns

Once multiple import pipelines exist (OMW, Wiktionary), synset_id alone is insufficient as an idempotency key. Unique constraint on the pair. Postgres allows multiple NULL pairs. synset_id remains for now — deprecate during a future pipeline refactor.

cefr_level on translations (not terms)

CEFR difficulty is language-relative, not concept-relative. "House" in English is A1, "domicile" is also English but B2 — same concept, different words, different difficulty. Added as nullable varchar(2) with CHECK.

Categories + term_categories: empty for MVP

Schema exists. Grammar maps to POS (already on terms), Media maps to deck membership. Thematic categories require a metadata source still under research.

CHECK over pgEnum for extensible value sets

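In PostgreSQL versions before 12, ALTER TYPE enum_name ADD VALUE cannot run inside a transaction block, so a failed migration cannot roll it back. From version 12 it can run in a transaction, but the new value cannot be used until that transaction commits, and there is no statement to remove an enum value afterwards. CHECK constraints are fully transactional. Rule: pgEnum for truly static sets, CHECK for any set tied to a growing constant.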

language_code always CHECK-constrained

Unlike source (only written by import scripts), language_code is a query-critical filter column. A typo would silently produce missing data. Rule: any column game queries filter on should be CHECK-constrained.

Unique constraints make explicit FK indexes redundant

Postgres automatically creates an index to enforce a unique constraint. A separate index on the leading column of an existing unique constraint adds no value.


Data Pipeline

Seeding v1: batch, truncate-based

For dev/first-time setup. Read JSON, batch inserts in groups of 500, truncate tables before each run. Simple and fast.

Key pitfalls encountered:

  • Duplicate key on re-run: truncate before seeding
  • onConflictDoNothing breaks FK references: when it skips a terms insert, the in-memory UUID is never written, causing FK violations on translations
  • forEach doesn't await: use for...of
  • Final batch not flushed: guard with if (termsArray.length > 0) after loop
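The last two pitfalls can be sketched together as a batching loop. seedInBatches and the insertBatch callback are hypothetical stand-ins for the real seeding code and database insert.

```typescript
const BATCH_SIZE = 500;

// insertBatch is a hypothetical stand-in for the real database insert.
async function seedInBatches<T>(
  rows: readonly T[],
  insertBatch: (batch: T[]) => Promise<void>,
  batchSize: number = BATCH_SIZE,
): Promise<number> {
  let batch: T[] = [];
  let inserted = 0;

  // for...of lets each insert be awaited in order;
  // Array.prototype.forEach would fire the callbacks without waiting.
  for (const row of rows) {
    batch.push(row);
    if (batch.length === batchSize) {
      await insertBatch(batch);
      inserted += batch.length;
      batch = [];
    }
  }

  // Flush the final partial batch, which the loop above never sends.
  if (batch.length > 0) {
    await insertBatch(batch);
    inserted += batch.length;
  }

  return inserted;
}
```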

Seeding v2: incremental upsert, multi-file

For production / adding languages. Extends the database without truncating. Each synset processed individually (no batching — need real term.id from DB before inserting translations). Filename convention: sourcelang-targetlang-pos.json.

CEFR enrichment pipeline

Staged ETL: extract-*.py → compare-*.py (quality gate) → merge-*.py (resolve conflicts) → enrich.ts (write to DB). Source priority: English en_m3 > cefrj > octanove > random; Italian it_m3 > italian.
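The source-priority rule can be illustrated as follows. This is a TypeScript rendering for illustration only: the real merge step is a Python script and may resolve conflicts differently. The CefrCandidate shape and resolveCefrLevel name are assumptions; the source names and their order come from the priority list above.

```typescript
type CefrLevel = "A1" | "A2" | "B1" | "B2" | "C1" | "C2";

// One level proposed by one source for a given word (hypothetical shape).
interface CefrCandidate {
  source: string;
  level: CefrLevel;
}

// Priority lists from this document; earlier entries win.
const SOURCE_PRIORITY: Record<string, readonly string[]> = {
  en: ["en_m3", "cefrj", "octanove", "random"],
  it: ["it_m3", "italian"],
};

// Return the level from the highest-priority source that has an opinion,
// or null when no known source covers the word.
function resolveCefrLevel(
  language: string,
  candidates: readonly CefrCandidate[],
): CefrLevel | null {
  const priority = SOURCE_PRIORITY[language] ?? [];
  for (const source of priority) {
    const match = candidates.find((c) => c.source === source);
    if (match) {
      return match.level;
    }
  }
  return null;
}
```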

Enrichment results: English 42,527/171,394 (~25%), Italian 23,061/54,603 (~42%). Both sufficient for MVP. Italian C2 has only 242 terms — noted as constraint for distractor algorithm.

Term glosses: Italian coverage is sparse

OMW gloss data is primarily English. English glosses: 95,882 (~100%), Italian: 1,964 (~2%). UI falls back to English gloss when no gloss exists for the user's language.
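The fallback rule is small enough to sketch directly. pickGloss and the map-of-glosses shape are hypothetical; the real UI may receive glosses in a different form.

```typescript
// glosses maps language code -> gloss text for a single term (assumed shape).
// Returns the user's-language gloss, else the English one, else null.
function pickGloss(
  glosses: Record<string, string | undefined>,
  userLanguage: string,
): string | null {
  return glosses[userLanguage] ?? glosses["en"] ?? null;
}
```

Returning null rather than undefined matches the "gloss is string | null" decision above: the field is always present, sometimes null.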

Glosses can leak answers

Some WordNet glosses contain the target-language word in the definition text (e.g. "Padre" in the English gloss for "father"). Address during post-MVP data enrichment — clean glosses, replace with custom definitions, or filter at service layer.

packages/db exports fix

The exports field must be an object, not an array:

"exports": {
  ".": "./src/index.ts",
  "./schema": "./src/db/schema.ts"
}

API Development: Problems & Solutions

  1. Messy API structure. Responsibilities bleeding across layers. Fixed with strict layered architecture.
  2. No shared contract. API could return different shapes silently. Fixed with Zod schemas in packages/shared.
  3. Type safety gaps. any types, Number vs number. Fixed with derived types from constants.
  4. getGameTerms in wrong package. Model queries in apps/api meant direct drizzle-orm dependency. Moved to packages/db/src/models/.
  5. Deck generation complexity. 12 decks assumed, only 2 needed. Then skipped entirely for MVP — query terms table directly.
  6. GAME_ROUNDS type conflict. z.enum() only accepts strings. Keep as strings, convert to number in service.
  7. Gloss join multiplied rows. Multiple glosses per term per language. Fixed by tightening unique constraint.
  8. Model leaked quiz semantics. Return fields named prompt/answer. Renamed to neutral sourceText/targetText.
  9. AnswerResult wasn't self-contained. Frontend needed selectedOptionId but schema didn't include it. Added.
  10. Distractor could duplicate correct answer. Different terms with same translation. Fixed with ne(translations.text, excludeText).
  11. TypeScript strict mode flagged Fisher-Yates shuffle. noUncheckedIndexedAccess treats result[i] as T | undefined. Fixed with non-null assertion + temp variable.
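Item 11 can be sketched as below: a Fisher-Yates shuffle written so that it compiles under noUncheckedIndexedAccess. The function name and the injectable rng parameter are assumptions for illustration.

```typescript
// Returns a shuffled copy; the input array is left untouched.
function shuffle<T>(items: readonly T[], rng: () => number = Math.random): T[] {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    // Pick j uniformly from 0..i inclusive.
    const j = Math.floor(rng() * (i + 1));
    // Under noUncheckedIndexedAccess, result[i] is typed T | undefined.
    // Both indices are in bounds by construction, so the assertions are safe.
    const temp = result[i]!;
    result[i] = result[j]!;
    result[j] = temp;
  }
  return result;
}
```

The non-null assertions are confined to the two reads inside the loop, where the bounds are guaranteed by the loop condition, rather than disabling the compiler flag for the whole file.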

Known Issues / Dev Notes

lila-web has no healthcheck

Vite's dev server has no built-in health endpoint. depends_on uses API healthcheck as proxy. For production (Nginx), add a health endpoint or TCP port check.

Valkey memory overcommit warning

Harmless in dev. Fix before production: add vm.overcommit_memory = 1 to host /etc/sysctl.conf.


Open Research

Semantic category metadata source

Categories (animals, kitchen, etc.) are in the schema but empty. Options researched:

  1. WordNet domain labels — already in OMW, coarse and patchy
  2. Princeton WordNet Domains — ~200 hierarchical domains, freely available, meaningfully better
  3. Kelly Project — CEFR levels AND semantic fields, designed for language learning. Could solve frequency tiers and categories in one shot
  4. BabelNet / WikiData — rich but complex integration, licensing issues
  5. LLM-assisted categorization — fast and cheap at current term counts, not reproducible without saving output
  6. Hybrid (WordNet Domains + LLM gap-fill) — likely most practical
  7. Manual curation — full control, too expensive at scale

Current recommendation: research Kelly Project first. If coverage is insufficient, go with Option 6.

SUBTLEX → cefr_level mapping strategy

Raw frequency ranks need mapping to A1-C2 bands before tiered decks are meaningful. Decision pending.

Future extensions: morphology and pronunciation

All deferred post-MVP, purely additive (new tables referencing existing terms):

  • noun_forms — gender, singular, plural, articles per language (source: Wiktionary)
  • verb_forms — conjugation tables per language (source: Wiktionary)
  • term_pronunciations — IPA and audio URLs per language (source: Wiktionary / Forvo)