Glossa — Architecture & API Development Summary
A record of all architectural discussions, decisions, and outcomes from the initial API design through the quiz model implementation.
Project Overview
Glossa is a vocabulary trainer (Duolingo-style) built as a pnpm monorepo. Users see a word and pick from 4 possible translations. Supports singleplayer and multiplayer. Stack: Express API, React frontend, Drizzle ORM, Postgres, Valkey, WebSockets.
Architectural Foundation
The Layered Architecture
The core mental model established for the entire API:
HTTP Request
↓
Router — maps URL + HTTP method to a controller
↓
Controller — handles HTTP only: validates input, calls service, sends response
↓
Service — business logic only: no HTTP, no direct DB access
↓
Model — database queries only: no business logic
↓
Database
The rule: each layer only talks to the layer directly below it. A controller never
touches the database. A service never reads req.body. A model never knows what a quiz is.
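The rule can be sketched without any framework. This is a minimal illustration, not the project's code: `findTerms`, `prepareQuestions`, `startGameController`, and the `Req` / `Res` shapes are hypothetical names, and an in-memory array stands in for the database.

```typescript
// Hypothetical sketch of the layering rule; no Express or Drizzle involved.

// Model: data access only. Knows about rows, not about quizzes or HTTP.
type TermRow = { id: number; pos: string; text: string };
const TERMS: TermRow[] = [
  { id: 1, pos: "noun", text: "cane" },
  { id: 2, pos: "verb", text: "correre" },
  { id: 3, pos: "noun", text: "gatto" },
];
function findTerms(pos: string, limit: number): TermRow[] {
  return TERMS.filter((t) => t.pos === pos).slice(0, limit);
}

// Service: business logic only. Receives plain values, never req or res.
function prepareQuestions(pos: string, rounds: number): { prompt: string }[] {
  return findTerms(pos, rounds).map((row) => ({ prompt: row.text }));
}

// Controller: HTTP only. Validates input, calls the service, shapes the response.
type Req = { body: unknown };
type Res = { status: number; json: unknown };
function startGameController(req: Req): Res {
  const body = (req.body ?? {}) as { pos?: unknown; rounds?: unknown };
  const { pos, rounds } = body;
  if (typeof pos !== "string" || typeof rounds !== "number") {
    return { status: 400, json: { error: "Invalid request body" } };
  }
  return { status: 200, json: prepareQuestions(pos, rounds) };
}
```

Only the controller sees the request object, and only the model sees the data store, so each layer can be tested on its own.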
Monorepo Package Responsibilities
| Package | Owns |
|---|---|
| packages/shared | Zod schemas, constants, derived TypeScript types |
| packages/db | Drizzle schema, DB connection, all model/query functions |
| apps/api | Router, controllers, services |
| apps/web | React frontend, consumes types from shared |
Key principle: all database code lives in packages/db. apps/api never imports
drizzle-orm for queries — it only calls functions exported from packages/db.
Problems Faced & Solutions
- Problem 1: Messy API structure
  Symptom: responsibilities bleeding across layers, with DB code in controllers and business logic in routes.
  Solution: strict layered architecture with one responsibility per layer.
- Problem 2: No shared contract between API and frontend
  Symptom: API could return different shapes silently, frontend breaks at runtime.
  Solution: Zod schemas in `packages/shared` as the single source of truth. Both API (validation) and frontend (type inference) consume the same schemas.
- Problem 3: Type safety gaps
  Symptom: TypeScript `any` types on model parameters, `Number` vs `number` confusion.
  Solution: derived types from constants using the `typeof CONSTANT[number]` pattern. All valid values defined once in constants, types derived automatically.
- Problem 4: `getGameTerms` in wrong package
  Symptom: model queries living in `apps/api/src/models/` meant `apps/api` had a direct `drizzle-orm` dependency and was accessing the DB itself.
  Solution: moved models folder to `packages/db/src/models/`. All Drizzle code now lives in one package.
- Problem 5: Deck generation complexity
  Initial assumption: 12 decks needed (nouns/verbs × easy/intermediate/hard × en/it).
  Correction: decks are pools, not presets. POS and difficulty are query filters applied at runtime, not deck properties. Only 2 decks needed (en-core, it-core).
  Final decision: skip deck generation entirely for MVP. Query the terms table directly with difficulty + POS filters. Revisit post-MVP when spaced repetition or progression features require curated pools.
- Problem 6: GAME_ROUNDS type conflict
  Problem: `z.enum()` only accepts strings. `GAME_ROUNDS = ["3", "10"]` works with `z.enum()` but requires `Number(rounds)` conversion in the service.
  Decision: keep as strings, convert to number in the service before passing to the model. The coupling is documented with a comment.
- Problem 7: Gloss join could multiply question rows. Schema allowed multiple glosses per term per language, so the left join would duplicate rows. Fixed by tightening the unique constraint.
- Problem 8: Model leaked quiz semantics. Return fields were named prompt / answer, baking HTTP-layer concepts into the database layer. Renamed to neutral field names.
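The derived-type pattern from Problem 3 and the string-to-number conversion from Problem 6 can be sketched together. Only `GAME_ROUNDS = ["3", "10"]` is given in these notes; the other constant values are illustrative guesses, and `GameRounds`, `TermFilters`, `toRoundCount`, and `isSupportedLanguageCode` are hypothetical names.

```typescript
// Values are illustrative assumptions, except GAME_ROUNDS, which the notes give.
const SUPPORTED_LANGUAGE_CODES = ["en", "it"] as const;
const SUPPORTED_POS = ["noun", "verb"] as const;
const DIFFICULTY_LEVELS = ["easy", "intermediate", "hard"] as const;
const GAME_ROUNDS = ["3", "10"] as const;

// Each union type is derived from its constant, so valid values are defined once.
type SupportedLanguageCode = (typeof SUPPORTED_LANGUAGE_CODES)[number]; // "en" | "it"
type SupportedPos = (typeof SUPPORTED_POS)[number];
type DifficultyLevel = (typeof DIFFICULTY_LEVELS)[number];
type GameRounds = (typeof GAME_ROUNDS)[number]; // "3" | "10"

// Hypothetical service-side helper: rounds arrive as a string enum value and
// become a number before being passed to the model as a row limit.
function toRoundCount(rounds: GameRounds): number {
  return Number(rounds);
}

// Hypothetical type guard that narrows untrusted input to the derived union.
function isSupportedLanguageCode(value: string): value is SupportedLanguageCode {
  return SUPPORTED_LANGUAGE_CODES.some((code) => code === value);
}

// A model-facing parameter object typed with the derived unions (assumed shape).
type TermFilters = {
  targetLanguage: SupportedLanguageCode;
  pos: SupportedPos;
  difficulty: DifficultyLevel;
  limit: number;
};
const exampleFilters: TermFilters = {
  targetLanguage: "it",
  pos: "noun",
  difficulty: "easy",
  limit: toRoundCount("3"),
};
```

Adding a value to a constant array widens the derived type automatically, so there is no second list to keep in sync.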
Decisions Made
- Zod schemas belong in `packages/shared`
  Both the API and frontend import from the same schemas. If the shape changes, TypeScript compilation fails in both places simultaneously, so drift is caught at compile time instead of at runtime.
- Server-side answer evaluation
  The correct answer is never sent to the frontend in `QuizQuestion`. It is only revealed in `AnswerResult` after the client submits. Prevents cheating and keeps game logic authoritative on the server.
- `safeParse` over `parse` in controllers
  `parse` throws a raw Zod error → ugly 500 response. `safeParse` returns a result object → clean 400 with early return. Global error handler to be implemented later (Step 6 of roadmap) will centralise this pattern.
- POST not GET for game start
  `GET` requests conventionally carry no body. Game configuration is submitted as a JSON body → `POST` is semantically correct.
- `express.json()` middleware required
  Without it, `req.body` is `undefined`. Added to `createApp()` in `app.ts`.
- Type naming: PascalCase
  TypeScript convention. `supportedLanguageCode` → `SupportedLanguageCode` etc.
- Primitive types: always lowercase
  `number` not `Number`, `string` not `String`. The uppercase versions are object wrappers and not assignable to Drizzle's expected primitive types.
- Model parameters use shared types, not `GameRequestType`
  The model layer should not know about `GameRequestType`; that's an HTTP boundary concern. Instead, parameters are typed using the derived constant types (`SupportedLanguageCode`, `SupportedPos`, `DifficultyLevel`) exported from `packages/shared`.
- One gloss per term per language. The unique constraint on term_glosses was tightened from (term_id, language_code, text) to (term_id, language_code) to prevent the left join from multiplying question rows. Revisit if multiple glosses per language are ever needed (e.g. register or domain variants).
- Model returns neutral field names, not quiz semantics. getGameTerms returns sourceText / targetText / sourceGloss rather than prompt / answer / gloss. Quiz semantics are applied in the service layer. Keeps the model reusable for non-quiz features.
- Asymmetric difficulty filter. Difficulty is filtered on the target (answer) side only. A word can be A2 in Italian but B1 in English, and what matters is the difficulty of the word being learned.
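The one-gloss-per-language decision is easier to see with a small in-memory simulation of the left join. This is only an illustration of the row-multiplication effect: the row shapes, sample data, and `leftJoinGlosses` are assumptions, not the actual Drizzle query.

```typescript
// In-memory simulation of a left join; row shapes are illustrative assumptions.
type Term = { id: number; text: string };
type Gloss = { termId: number; languageCode: string; text: string };
type JoinedRow = { termId: number; text: string; gloss: string | null };

// Left join: every term appears at least once, and once per matching gloss.
function leftJoinGlosses(terms: Term[], glosses: Gloss[], languageCode: string): JoinedRow[] {
  const rows: JoinedRow[] = [];
  for (const term of terms) {
    const matches = glosses.filter(
      (g) => g.termId === term.id && g.languageCode === languageCode,
    );
    if (matches.length === 0) {
      rows.push({ termId: term.id, text: term.text, gloss: null });
    }
    for (const g of matches) {
      rows.push({ termId: term.id, text: term.text, gloss: g.text });
    }
  }
  return rows;
}

const sampleTerms: Term[] = [
  { id: 1, text: "bank" },
  { id: 2, text: "dog" },
];
const glossA: Gloss = { termId: 1, languageCode: "en", text: "financial institution" };
const glossB: Gloss = { termId: 1, languageCode: "en", text: "side of a river" };

// Allowed under unique (term_id, language_code, text): term 1 yields two rows.
const looseRows = leftJoinGlosses(sampleTerms, [glossA, glossB], "en");
// Under unique (term_id, language_code) only one gloss can exist: one row per term.
const tightRows = leftJoinGlosses(sampleTerms, [glossA], "en");
```

With the looser constraint, a single term could surface as two quiz questions in one session; the tighter constraint makes that impossible at the schema level.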
Data Pipeline Work (Pre-API)
CEFR Enrichment Pipeline (completed)
A staged ETL pipeline was built to enrich translation records with CEFR levels and difficulty ratings:
Raw source files
↓
extract-*.py — normalise each source to standard JSON
↓
compare-*.py — quality gate: surface conflicts between sources (read-only)
↓
merge-*.py — resolve conflicts by source priority, derive difficulty
↓
enrich.ts — write cefr_level + difficulty to DB translations table
Source priority:
- English: `en_m3` > `cefrj` > `octanove` > `random`
- Italian: `it_m3` > `italian`
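Conflict resolution by source priority can be sketched as follows. The pipeline scripts themselves are Python; this sketch uses TypeScript only to match the rest of the codebase. The source names come from the priority list above, while the record shape and `resolveCefrLevel` are assumptions.

```typescript
// Sketch of conflict resolution by source priority (English sources only).
const ENGLISH_PRIORITY = ["en_m3", "cefrj", "octanove", "random"] as const;
type EnglishSource = (typeof ENGLISH_PRIORITY)[number];

// CEFR levels reported for one word, keyed by the source that reported them.
type SourceLevels = Partial<Record<EnglishSource, string>>;

// Returns the level from the highest-priority source that has one, or null.
function resolveCefrLevel(levels: SourceLevels): string | null {
  for (const source of ENGLISH_PRIORITY) {
    const level = levels[source];
    if (level !== undefined) {
      return level;
    }
  }
  return null;
}

// cefrj and octanove disagree; cefrj wins because it ranks higher.
const resolvedLevel = resolveCefrLevel({ octanove: "B2", cefrj: "B1" });
```

Because the priority order lives in one array, changing it does not require touching the resolution logic.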
Enrichment results:
| Language | Enriched | Total | Coverage |
|---|---|---|---|
| English | 42,527 | 171,394 | ~25% |
| Italian | 23,061 | 54,603 | ~42% |
Both languages have sufficient coverage for MVP. Italian C2 has only 242 terms — noted as a potential constraint for the distractor algorithm at high difficulty.
API Schemas (packages/shared)
GameRequestSchema (implemented)
{
  source_language: z.enum(SUPPORTED_LANGUAGE_CODES),
  target_language: z.enum(SUPPORTED_LANGUAGE_CODES),
  pos: z.enum(SUPPORTED_POS),
  difficulty: z.enum(DIFFICULTY_LEVELS),
  rounds: z.enum(GAME_ROUNDS),
}
Planned schemas (not yet implemented)
QuizQuestion — prompt, optional gloss, 4 options (no correct answer)
QuizOption — optionId + text
AnswerSubmission — questionId + selectedOptionId
AnswerResult — correct boolean, correctOptionId, selectedOptionId
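A rough sketch of how these shapes support server-side evaluation. The types are plain TypeScript stand-ins for the planned Zod schemas; the ID types, the sample data, and the in-memory `answerKey` are assumptions, and where the answer key actually lives is not decided in these notes.

```typescript
// Plain TypeScript shapes mirroring the planned schemas (ID types are assumed).
type QuizOption = { optionId: number; text: string };
type QuizQuestion = {
  questionId: string;
  prompt: string;
  gloss?: string;
  options: QuizOption[]; // four options, none marked as correct
};
type AnswerSubmission = { questionId: string; selectedOptionId: number };
type AnswerResult = {
  correct: boolean;
  correctOptionId: number;
  selectedOptionId: number;
};

// What the client would receive from the start endpoint: no correct answer inside.
const exampleQuestion: QuizQuestion = {
  questionId: "q1",
  prompt: "dog",
  options: [
    { optionId: 0, text: "gatto" },
    { optionId: 1, text: "casa" },
    { optionId: 2, text: "cane" },
    { optionId: 3, text: "libro" },
  ],
};

// Hypothetical server-side answer key (questionId -> correct optionId).
const answerKey = new Map<string, number>([["q1", 2]]);

// Returns null for an unknown questionId; a real handler might respond with 404.
function evaluateAnswer(submission: AnswerSubmission): AnswerResult | null {
  const correctOptionId = answerKey.get(submission.questionId);
  if (correctOptionId === undefined) {
    return null;
  }
  return {
    correct: submission.selectedOptionId === correctOptionId,
    correctOptionId,
    selectedOptionId: submission.selectedOptionId,
  };
}
```

The correct option ID appears only in `AnswerResult`, after a submission, which matches the decision to keep answer evaluation on the server.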
API Endpoints
POST /api/v1/game/start GameRequest → QuizQuestion[]
POST /api/v1/game/answer AnswerSubmission → AnswerResult
Current File Structure (apps/api)
apps/api/src/
├── app.ts — Express app, express.json() middleware
├── server.ts — starts server on PORT
├── routes/
│ ├── apiRouter.ts — mounts /health and /game routers
│ ├── gameRouter.ts — POST /start → createGame controller
│ └── healthRouter.ts
├── controllers/
│ └── gameController.ts — validates GameRequest, calls service
└── services/
└── gameService.ts — calls getGameTerms, returns raw rows
Current File Structure (packages/db)
packages/db/src/
├── db/
│ └── schema.ts — Drizzle schema (terms, translations, users, decks...)
├── models/
│ └── termModel.ts — getGameTerms() query
└── index.ts — exports db connection + getGameTerms
Completed Tasks
- Layered architecture established and understood
- `GameRequestSchema` defined in `packages/shared`
- Derived types (`SupportedLanguageCode`, `SupportedPos`, `DifficultyLevel`) exported from constants
- `getGameTerms()` model implemented with POS / language / difficulty / limit filters
- Model correctly placed in `packages/db`
- `prepareGameQuestions()` service skeleton calling the model
- `createGame` controller with Zod `safeParse` validation
- `POST /api/v1/game/start` route wired
- End-to-end pipeline verified with test script: returns correct rows
- CEFR enrichment pipeline complete for English and Italian
- Double join on translations implemented (source + target language)
- Gloss left join implemented
- Model return type uses neutral field names (sourceText, targetText, sourceGloss)
- Schema: gloss unique constraint tightened to one gloss per term per language
Roadmap Ahead
Step 1 — Learn SQL fundamentals (in progress)
- Concepts needed: SELECT, FROM, JOIN, WHERE, LIMIT
- Resources: sqlzoo.net or Khan Academy SQL section
- Required before: implementing the double join for source language prompt
Step 2 — Complete the model layer
- Double join on `translations`: once for source language (prompt), once for target language (answer)
- `GlossModel.getGloss(termId, languageCode)`: fetch gloss if available
Step 3 — Define remaining Zod schemas
- `QuizQuestion`, `QuizOption`, `AnswerSubmission`, `AnswerResult` in `packages/shared`
Step 4 — Complete the service layer
- `QuizService.buildSession()`: assemble raw rows into `QuizQuestion[]`
  - Generate `questionId` per question
  - Map source language translation as prompt
  - Attach gloss if available
  - Fetch 3 distractors (same POS, different term, same difficulty)
  - Shuffle options so correct answer is not always in same position
- `QuizService.evaluateAnswer()`: validate correctness, return `AnswerResult`
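One possible shape for `buildSession()`, as a sketch under stated assumptions. The row fields `sourceText` / `targetText` / `sourceGloss` come from these notes; `termId`, the `pool` argument (assumed pre-filtered to the same POS and difficulty), the `q1`-style IDs, the injectable `rng`, and the `BuiltQuestion` wrapper are all hypothetical. The shuffle used here is Fisher-Yates.

```typescript
type GameTermRow = { termId: number; sourceText: string; targetText: string; sourceGloss: string | null };
type QuizOption = { optionId: number; text: string };
type QuizQuestion = { questionId: string; prompt: string; gloss?: string; options: QuizOption[] };
// The correct option ID is kept next to the question, not inside it.
type BuiltQuestion = { question: QuizQuestion; correctOptionId: number };

// Fisher-Yates shuffle on a copy; rng is injectable so tests can be deterministic.
function shuffle<T>(items: T[], rng: () => number): T[] {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    const tmp = out[i];
    out[i] = out[j];
    out[j] = tmp;
  }
  return out;
}

function buildSession(
  rows: GameTermRow[],
  pool: GameTermRow[],
  rng: () => number = Math.random,
): BuiltQuestion[] {
  return rows.map((row, index) => {
    // Candidate distractors: other terms, different text, duplicates removed.
    const candidateTexts = pool
      .filter((p) => p.termId !== row.termId && p.targetText !== row.targetText)
      .map((p) => p.targetText)
      .filter((text, i, all) => all.indexOf(text) === i);
    // If the pool is small, fewer than 3 distractors are returned here.
    const distractors = shuffle(candidateTexts, rng).slice(0, 3);
    const texts = shuffle([row.targetText, ...distractors], rng);
    const question: QuizQuestion = {
      questionId: `q${index + 1}`,
      prompt: row.sourceText,
      options: texts.map((text, optionId) => ({ optionId, text })),
    };
    if (row.sourceGloss !== null) {
      question.gloss = row.sourceGloss;
    }
    return { question, correctOptionId: texts.indexOf(row.targetText) };
  });
}
```

This sketch silently returns fewer options when the pool is too small; whether to fall back or fail in that case is still an open question.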
Step 5 — Implement answer endpoint
- `POST /api/v1/game/answer` route, controller, service method
Step 6 — Global error handler
- Typed error classes (`ValidationError`, `NotFoundError`)
- Central error middleware in `app.ts`
- Remove temporary `safeParse` error handling from controllers
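A framework-free sketch of what Step 6 might look like. The class names `ValidationError` and `NotFoundError` come from the roadmap; the `AppError` base class, the status codes, and `toErrorResponse` are assumptions. In Express, an error-handling middleware (the four-argument `(err, req, res, next)` form) could call a function like this.

```typescript
// Hypothetical base class carrying the HTTP status for each error type.
class AppError extends Error {
  readonly statusCode: number;
  constructor(message: string, statusCode: number) {
    super(message);
    this.statusCode = statusCode;
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

type ErrorResponse = { status: number; body: { error: string } };

// Single place that turns any thrown value into an HTTP-shaped response.
function toErrorResponse(err: unknown): ErrorResponse {
  if (err instanceof AppError) {
    return { status: err.statusCode, body: { error: err.message } };
  }
  // Unknown errors: hide internal details from the client.
  return { status: 500, body: { error: "Internal server error" } };
}
```

Controllers would then throw typed errors and leave status codes to the central handler.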
Step 7 — Tests
- Unit tests for `QuizService`: correct POS filtering, distractor never equals correct answer
- Unit tests for `evaluateAnswer`: correct and incorrect cases
- Integration tests for both endpoints
Step 8 — Auth (Phase 2 from original roadmap)
- OpenAuth integration
- JWT validation middleware
- `GET /api/auth/me` endpoint
- Frontend auth guard
Open Questions
- Distractor algorithm: when Italian C2 has only 242 terms, should the difficulty filter fall back gracefully or return an error? Decision needed before implementing `buildSession()`.
- Session statefulness: game loop is currently stateless (fetch all questions upfront). Confirm this is still the intended MVP approach before building `buildSession()`.