2026-04-10 20:20:09 +02:00

Glossa — Architecture & API Development Summary

A record of all architectural discussions, decisions, and outcomes from the initial API design through the quiz model implementation.

Project Overview

Glossa is a vocabulary trainer (Duolingo-style) built as a pnpm monorepo. Users see a word and pick the correct translation from 4 options. It supports single-player and multiplayer modes. Stack: Express API, React frontend, Drizzle ORM, Postgres, Valkey, WebSockets.

Architectural Foundation

The Layered Architecture

The core mental model established for the entire API:

HTTP Request
     ↓
  Router        — maps URL + HTTP method to a controller
     ↓
 Controller     — handles HTTP only: validates input, calls service, sends response
     ↓
  Service       — business logic only: no HTTP, no direct DB access
     ↓
  Model         — database queries only: no business logic
     ↓
  Database

The rule: each layer only talks to the layer directly below it. A controller never touches the database. A service never reads req.body. A model never knows what a quiz is.
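
The layering rule can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the project's actual code: the in-memory `TERM_ROWS` array stands in for the database, `HttpRequestLike` stands in for the Express request object, and every function name here is hypothetical.

```typescript
// Hypothetical stand-ins: a real controller would receive Express req/res,
// and a real model would run a Drizzle query against Postgres.
type HttpRequestLike = { body: unknown };
type HttpResponseLike = { status: number; json: unknown };
type TermRow = { id: number; text: string; pos: string };

// "Database": an in-memory table used only for this sketch.
const TERM_ROWS: TermRow[] = [
  { id: 1, text: "cane", pos: "noun" },
  { id: 2, text: "gatto", pos: "noun" },
  { id: 3, text: "correre", pos: "verb" },
];

// Model: data access only. It filters rows and knows nothing about games.
function findTermsByPos(pos: string, limit: number): TermRow[] {
  return TERM_ROWS.filter((row) => row.pos === pos).slice(0, limit);
}

// Service: business logic only. No request object, no direct table access.
function prepareRound(pos: string, rounds: number): string[] {
  return findTermsByPos(pos, rounds).map((row) => row.text);
}

// Controller: HTTP concerns only. Validates input, calls the service,
// and shapes the response.
function startGameController(req: HttpRequestLike): HttpResponseLike {
  const { pos, rounds } = (req.body ?? {}) as { pos?: unknown; rounds?: unknown };
  if (typeof pos !== "string" || typeof rounds !== "number") {
    return { status: 400, json: { error: "Invalid request body" } };
  }
  return { status: 200, json: { prompts: prepareRound(pos, rounds) } };
}
```

Each function only calls the one directly below it, so swapping the in-memory array for a real query would touch `findTermsByPos` and nothing else.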

Monorepo Package Responsibilities

| Package | Owns |
| --- | --- |
| `packages/shared` | Zod schemas, constants, derived TypeScript types |
| `packages/db` | Drizzle schema, DB connection, all model/query functions |
| `apps/api` | Router, controllers, services |
| `apps/web` | React frontend, consumes types from shared |

Key principle: all database code lives in packages/db. apps/api never imports drizzle-orm for queries — it only calls functions exported from packages/db.

Problems Faced & Solutions

Problem 1: Messy API structure
Symptom: responsibilities bleeding across layers, with DB code in controllers and business logic in routes.
Solution: strict layered architecture with one responsibility per layer.

Problem 2: No shared contract between API and frontend
Symptom: the API could silently return different shapes, and the frontend would break at runtime.
Solution: Zod schemas in packages/shared as the single source of truth. Both the API (validation) and the frontend (type inference) consume the same schemas.

Problem 3: Type safety gaps
Symptom: TypeScript any types on model parameters, Number vs number confusion.
Solution: derived types from constants using the typeof CONSTANT[number] pattern. All valid values are defined once in constants, and the types are derived automatically.

Problem 4: getGameTerms in wrong package
Symptom: model queries living in apps/api/src/models/ meant apps/api had a direct drizzle-orm dependency and was accessing the DB itself.
Solution: moved the models folder to packages/db/src/models/. All Drizzle code now lives in one package.

Problem 5: Deck generation complexity
Initial assumption: 12 decks needed (nouns/verbs × easy/intermediate/hard × en/it).
Correction: decks are pools, not presets. POS and difficulty are query filters applied at runtime, not deck properties. Only 2 decks needed (en-core, it-core).
Final decision: skip deck generation entirely for MVP. Query the terms table directly with difficulty + POS filters. Revisit post-MVP when spaced repetition or progression features require curated pools.

Problem 6: GAME_ROUNDS type conflict
Symptom: z.enum() only accepts strings. GAME_ROUNDS = ["3", "10"] works with z.enum() but requires a Number(rounds) conversion in the service.
Decision: keep the values as strings and convert to number in the service before passing to the model. The coupling is documented with a code comment.

Problem 7: Gloss join could multiply question rows
Symptom: the schema allowed multiple glosses per term per language, so the left join could duplicate question rows.
Solution: tightened the unique constraint to one gloss per term per language.

Problem 8: Model leaked quiz semantics
Symptom: return fields were named prompt / answer, baking quiz (service-layer) concepts into the database layer.
Solution: renamed to neutral field names (sourceText / targetText / sourceGloss).
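
The derived-type pattern from Problem 3 can be sketched as follows. The constant values are assumptions based on the languages and difficulty levels mentioned in this document, and `isSupportedLanguageCode` and `describeFilter` are hypothetical helpers added only to show the types in use.

```typescript
// Assumed values for illustration; the real lists live in packages/shared.
const SUPPORTED_LANGUAGE_CODES = ["en", "it"] as const;
const DIFFICULTY_LEVELS = ["easy", "intermediate", "hard"] as const;

// Indexing the tuple type with `number` yields the union of its elements.
type SupportedLanguageCode = (typeof SUPPORTED_LANGUAGE_CODES)[number]; // "en" | "it"
type DifficultyLevel = (typeof DIFFICULTY_LEVELS)[number];

// Hypothetical runtime guard built from the same constant, so the type and
// the runtime check cannot drift apart.
function isSupportedLanguageCode(value: string): value is SupportedLanguageCode {
  return SUPPORTED_LANGUAGE_CODES.some((code) => code === value);
}

// Model-style signature typed with the derived unions instead of `any`.
function describeFilter(language: SupportedLanguageCode, difficulty: DifficultyLevel): string {
  return `${language}:${difficulty}`;
}
```

Adding a new language then means editing one array: the union type, the guard, and every signature that uses `SupportedLanguageCode` follow automatically.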

Decisions Made

Zod schemas belong in packages/shared. Both the API and the frontend import the same schemas. If a shape changes, TypeScript compilation fails in both places at once, so drift surfaces at build time instead of silently at runtime.
Server-side answer evaluation. The correct answer is never sent to the frontend in QuizQuestion. It is only revealed in AnswerResult after the client submits. This prevents cheating and keeps game logic authoritative on the server.
safeParse over parse in controllers. parse throws a raw Zod error, which surfaces as an ugly 500 response. safeParse returns a result object, which allows a clean 400 with an early return. The global error handler planned for Step 6 of the roadmap will centralise this pattern.
POST, not GET, for game start. A GET request body has no defined semantics and is not reliably supported. Game configuration is submitted as a JSON body, so POST is semantically correct.
express.json() middleware required. Without it, req.body is undefined. Added to createApp() in app.ts.
Type naming: PascalCase. This follows TypeScript convention: supportedLanguageCode → SupportedLanguageCode, etc.
Primitive types: always lowercase. Use number not Number, string not String. The uppercase versions are object wrapper types and are not assignable to the primitive types Drizzle expects.
Model parameters use shared types, not GameRequestType. The model layer should not know about GameRequestType, which is an HTTP boundary concern. Instead, parameters are typed using the derived constant types (SupportedLanguageCode, SupportedPos, DifficultyLevel) exported from packages/shared.
One gloss per term per language. The unique constraint on term_glosses was tightened from (term_id, language_code, text) to (term_id, language_code) to prevent the left join from multiplying question rows. Revisit if multiple glosses per language are ever needed (e.g. register or domain variants).
Model returns neutral field names, not quiz semantics. getGameTerms returns sourceText / targetText / sourceGloss rather than prompt / answer / gloss. Quiz semantics are applied in the service layer. Keeps the model reusable for non-quiz features.
Asymmetric difficulty filter. Difficulty is filtered on the target (answer) side only. A word can be A2 in Italian but B1 in English, and what matters is the difficulty of the word being learned.
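
The last two decisions (neutral field names and the target-side difficulty filter) can be illustrated with an in-memory version of the double join. This is only a sketch under assumptions: the real query is written with Drizzle against Postgres, and the row shape, the sample words, and their difficulty values below are invented for the example.

```typescript
// In-memory stand-in for the translations table. Column names and the
// difficulty values are made up for illustration, not taken from the schema.
type TranslationRow = { termId: number; languageCode: string; text: string; difficulty: string };
type GameTermRow = { termId: number; sourceText: string; targetText: string };

const TRANSLATION_ROWS: TranslationRow[] = [
  { termId: 1, languageCode: "en", text: "dog", difficulty: "easy" },
  { termId: 1, languageCode: "it", text: "cane", difficulty: "easy" },
  { termId: 2, languageCode: "en", text: "hedgehog", difficulty: "hard" },
  { termId: 2, languageCode: "it", text: "riccio", difficulty: "easy" },
  { termId: 3, languageCode: "en", text: "nevertheless", difficulty: "hard" },
  { termId: 3, languageCode: "it", text: "tuttavia", difficulty: "hard" },
];

// Joins the table to itself on termId: once for the source language, once
// for the target. Difficulty is checked on the target row only.
function selectGameTerms(source: string, target: string, difficulty: string, limit: number): GameTermRow[] {
  const result: GameTermRow[] = [];
  for (const targetRow of TRANSLATION_ROWS) {
    if (targetRow.languageCode !== target || targetRow.difficulty !== difficulty) continue;
    for (const sourceRow of TRANSLATION_ROWS) {
      if (sourceRow.termId === targetRow.termId && sourceRow.languageCode === source) {
        result.push({ termId: targetRow.termId, sourceText: sourceRow.text, targetText: targetRow.text });
      }
    }
  }
  return result.slice(0, limit);
}

// Service-side mapping: quiz vocabulary (prompt / answer) appears only here.
function toPromptAndAnswer(row: GameTermRow): { prompt: string; answer: string } {
  return { prompt: row.sourceText, answer: row.targetText };
}
```

With source `en`, target `it`, and difficulty `easy`, term 2 is included even though its English side is marked `hard`, because only the Italian (target) row is filtered.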

Data Pipeline Work (Pre-API)

CEFR Enrichment Pipeline (completed)

A staged ETL pipeline was built to enrich translation records with CEFR levels and difficulty ratings:

Raw source files
      ↓
extract-*.py      — normalise each source to standard JSON
      ↓
compare-*.py      — quality gate: surface conflicts between sources (read-only)
      ↓
merge-*.py        — resolve conflicts by source priority, derive difficulty
      ↓
enrich.ts         — write cefr_level + difficulty to DB translations table

Source priority:

English: en_m3 > cefrj > octanove > random
Italian: it_m3 > italian
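
The priority rule can be sketched as below. The actual merge step is a Python script (merge-*.py); this TypeScript version only illustrates the rule, and the `SourceVote` shape and `resolveCefr` name are hypothetical.

```typescript
// Priority lists copied from the text above; earlier entries win.
const SOURCE_PRIORITY: Record<string, string[]> = {
  en: ["en_m3", "cefrj", "octanove", "random"],
  it: ["it_m3", "italian"],
};

// Hypothetical record shape: one CEFR vote per source for a single word.
type SourceVote = { source: string; cefrLevel: string };

// Returns the vote from the highest-priority source, or undefined when no
// listed source has an entry for the word.
function resolveCefr(language: string, votes: SourceVote[]): SourceVote | undefined {
  const priority = SOURCE_PRIORITY[language] ?? [];
  for (const source of priority) {
    const match = votes.filter((vote) => vote.source === source)[0];
    if (match !== undefined) return match;
  }
  return undefined;
}
```

For an English word rated B2 by octanove and B1 by cefrj, the sketch returns the cefrj vote, since cefrj ranks higher in the list.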

Enrichment results:

| Language | Enriched | Total | Coverage |
| --- | --- | --- | --- |
| English | 42,527 | 171,394 | ~25% |
| Italian | 23,061 | 54,603 | ~42% |

Both languages have sufficient coverage for MVP. Italian C2 has only 242 terms — noted as a potential constraint for the distractor algorithm at high difficulty.

API Schemas (packages/shared)

`GameRequestSchema` (implemented)

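import { z } from "zod";

// The constants are the arrays defined in packages/shared.
export const GameRequestSchema = z.object({
  source_language: z.enum(SUPPORTED_LANGUAGE_CODES),
  target_language: z.enum(SUPPORTED_LANGUAGE_CODES),
  pos: z.enum(SUPPORTED_POS),
  difficulty: z.enum(DIFFICULTY_LEVELS),
  rounds: z.enum(GAME_ROUNDS),
});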
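
Because `rounds` arrives as a string enum (see Problem 6), the service has to convert it before calling the model. A minimal sketch of that conversion, assuming the `["3", "10"]` values quoted earlier; `roundsToLimit` is a hypothetical helper name.

```typescript
// Assumed value, taken from Problem 6: rounds are string literals.
const GAME_ROUNDS = ["3", "10"] as const;
type GameRounds = (typeof GAME_ROUNDS)[number]; // "3" | "10"

// Hypothetical service helper: the string enum stops at the service boundary
// and the model only ever sees a number (for example as a query limit).
// Coupling: this relies on every GAME_ROUNDS entry being a numeric string.
function roundsToLimit(rounds: GameRounds): number {
  const limit = Number(rounds);
  // Guard against a future non-numeric entry such as "unlimited".
  if (!isFinite(limit) || Math.floor(limit) !== limit || limit <= 0) {
    throw new Error(`GAME_ROUNDS entry is not a positive integer: ${rounds}`);
  }
  return limit;
}
```

The guard never fires for the current values; it exists so that a careless edit to `GAME_ROUNDS` fails loudly in the service instead of reaching the query as `NaN`.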

Planned schemas (not yet implemented)

QuizQuestion      — prompt, optional gloss, 4 options (no correct answer)
QuizOption        — optionId + text
AnswerSubmission  — questionId + selectedOptionId
AnswerResult      — correct boolean, correctOptionId, selectedOptionId

API Endpoints

POST /api/v1/game/start     GameRequest → QuizQuestion[]
POST /api/v1/game/answer    AnswerSubmission → AnswerResult
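
A sketch of how the planned types and server-side evaluation might fit together. The field names follow the planned schema list above; the `questionId` field on `QuizQuestion`, the in-memory `answerKey`, and `registerQuestion` are assumptions, and where the key actually lives depends on the open session-statefulness question.

```typescript
type QuizOption = { optionId: string; text: string };
type QuizQuestion = { questionId: string; prompt: string; gloss?: string; options: QuizOption[] };
type AnswerSubmission = { questionId: string; selectedOptionId: string };
type AnswerResult = { correct: boolean; correctOptionId: string; selectedOptionId: string };

// Hypothetical server-side answer key: questionId -> correct optionId.
// It never leaves the server.
const answerKey: Record<string, string> = {};

// Records the correct option and returns the client payload unchanged.
// QuizQuestion has no correct-answer field, so nothing can leak.
function registerQuestion(question: QuizQuestion, correctOptionId: string): QuizQuestion {
  answerKey[question.questionId] = correctOptionId;
  return question;
}

// Compares the submission with the stored key. Returns undefined for an
// unknown questionId so the caller can answer with a 404.
function evaluateAnswer(submission: AnswerSubmission): AnswerResult | undefined {
  if (!Object.prototype.hasOwnProperty.call(answerKey, submission.questionId)) {
    return undefined;
  }
  const correctOptionId = answerKey[submission.questionId] as string;
  return {
    correct: submission.selectedOptionId === correctOptionId,
    correctOptionId,
    selectedOptionId: submission.selectedOptionId,
  };
}
```

The correct option is only revealed inside `AnswerResult`, after the client has committed to a choice.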

Current File Structure (apps/api)

apps/api/src/
├── app.ts                  — Express app, express.json() middleware
├── server.ts               — starts server on PORT
├── routes/
│   ├── apiRouter.ts        — mounts /health and /game routers
│   ├── gameRouter.ts       — POST /start → createGame controller
│   └── healthRouter.ts
├── controllers/
│   └── gameController.ts   — validates GameRequest, calls service
└── services/
    └── gameService.ts      — calls getGameTerms, returns raw rows

Current File Structure (packages/db)

packages/db/src/
├── db/
│   └── schema.ts           — Drizzle schema (terms, translations, users, decks...)
├── models/
│   └── termModel.ts        — getGameTerms() query
└── index.ts                — exports db connection + getGameTerms

Completed Tasks

Layered architecture established and understood
GameRequestSchema defined in packages/shared
Derived types (SupportedLanguageCode, SupportedPos, DifficultyLevel) exported from constants
getGameTerms() model implemented with POS / language / difficulty / limit filters
Model correctly placed in packages/db
prepareGameQuestions() service skeleton calling the model
createGame controller with Zod safeParse validation
POST /api/v1/game/start route wired
End-to-end pipeline verified with test script — returns correct rows
CEFR enrichment pipeline complete for English and Italian
Double join on translations implemented (source + target language)
Gloss left join implemented
Model return type uses neutral field names (sourceText, targetText, sourceGloss)
Schema: gloss unique constraint tightened to one gloss per term per language

Roadmap Ahead

Step 1 — Learn SQL fundamentals (in progress)

Concepts needed: SELECT, FROM, JOIN, WHERE, LIMIT.
Resources: sqlzoo.net or Khan Academy SQL section.
Required before: implementing the double join for source language prompt.

Step 2 — Complete the model layer

Double join on translations — once for source language (prompt), once for target language (answer)
GlossModel.getGloss(termId, languageCode) — fetch gloss if available

Step 3 — Define remaining Zod schemas

QuizQuestion, QuizOption, AnswerSubmission, AnswerResult in packages/shared

Step 4 — Complete the service layer

QuizService.buildSession() — assemble raw rows into QuizQuestion[]
- Generate questionId per question
- Map source language translation as prompt
- Attach gloss if available
- Fetch 3 distractors (same POS, different term, same difficulty)
- Shuffle options so correct answer is not always in same position
QuizService.evaluateAnswer() — validate correctness, return AnswerResult
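
The buildSession() steps above could look roughly like this. It is a sketch, not the planned implementation: the type names are hypothetical, the distractor pool is assumed to be pre-filtered by POS and difficulty, and throwing when fewer than 3 distractors exist is a placeholder until the fallback question under Open Questions is decided. Recording which option is correct is left out.

```typescript
// Neutral row shape returned by the model (field names from this document).
type SessionTermRow = { termId: number; sourceText: string; targetText: string; sourceGloss: string | null };
type SessionOption = { optionId: string; text: string };
type SessionQuestion = { questionId: string; prompt: string; gloss?: string; options: SessionOption[] };

// Fisher-Yates shuffle on a copy; `random` is injectable so tests can be
// deterministic.
function shuffled<T>(items: T[], random: () => number): T[] {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = copy[i] as T;
    copy[i] = copy[j] as T;
    copy[j] = tmp;
  }
  return copy;
}

// Assumes `pool` already matches the POS and difficulty of `rows`.
function buildSession(
  rows: SessionTermRow[],
  pool: SessionTermRow[],
  random: () => number = Math.random,
): SessionQuestion[] {
  return rows.map((row, index) => {
    // Exclude the term itself and any duplicate option text.
    const seen: Record<string, boolean> = Object.create(null);
    seen[row.targetText] = true;
    const candidates = pool.filter((p) => {
      if (p.termId === row.termId || seen[p.targetText]) return false;
      seen[p.targetText] = true;
      return true;
    });
    const distractors = shuffled(candidates, random).slice(0, 3);
    if (distractors.length < 3) {
      throw new Error(`Not enough distractors for term ${row.termId}`);
    }
    // Shuffle again so the correct answer is not always in the same position.
    const texts = shuffled([row.targetText].concat(distractors.map((d) => d.targetText)), random);
    const question: SessionQuestion = {
      questionId: `q${index + 1}`,
      prompt: row.sourceText,
      options: texts.map((text, i) => ({ optionId: `q${index + 1}-o${i + 1}`, text })),
    };
    if (row.sourceGloss !== null) question.gloss = row.sourceGloss;
    return question;
  });
}
```

Passing `random` as a parameter keeps the shuffle testable: a unit test can supply a fixed function and assert that the correct text appears exactly once among four distinct options.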

Step 5 — Implement answer endpoint

POST /api/v1/game/answer route, controller, service method

Step 6 — Global error handler

Typed error classes (ValidationError, NotFoundError)
Central error middleware in app.ts
Remove temporary safeParse error handling from controllers
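
One possible shape for the typed errors and the central mapping, kept free of Express so it runs on its own. `AppError`, `isAppError`, and `toErrorResponse` are hypothetical names; in the real app, an Express error-handling middleware (the four-argument `(err, req, res, next)` form) would call the mapping function and send its result.

```typescript
// Hypothetical base class: every application error carries its HTTP status.
class AppError extends Error {
  readonly statusCode: number;
  constructor(statusCode: number, message: string) {
    super(message);
    this.statusCode = statusCode;
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(400, message);
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(404, message);
  }
}

// Duck-typed check on the status field, so it still works if the class is
// loaded twice across packages or compiled to an older target.
function isAppError(err: unknown): err is AppError {
  return err instanceof Error && typeof (err as { statusCode?: unknown }).statusCode === "number";
}

// Pure mapping from a thrown value to a response description.
function toErrorResponse(err: unknown): { status: number; body: { error: string } } {
  if (isAppError(err)) {
    return { status: err.statusCode, body: { error: err.message } };
  }
  // Unknown errors: do not leak internals to the client.
  return { status: 500, body: { error: "Internal server error" } };
}
```

With this in place, a controller can throw `new ValidationError(...)` and rely on the central handler to choose the status code, instead of building each 400 response by hand.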

Step 7 — Tests

Unit tests for QuizService — correct POS filtering, distractor never equals correct answer
Unit tests for evaluateAnswer — correct and incorrect cases
Integration tests for both endpoints

Step 8 — Auth (Phase 2 from original roadmap)

OpenAuth integration
JWT validation middleware
GET /api/auth/me endpoint
Frontend auth guard

Open Questions

Distractor algorithm: when Italian C2 has only 242 terms, should the difficulty filter fall back gracefully or return an error? Decision needed before implementing buildSession().
Session statefulness: game loop is currently stateless (fetch all questions upfront). Confirm this is still the intended MVP approach before building buildSession().
