lila 874dd5e4c7 adding documentation and roadmap for the most minimal mvp

2026-04-02 18:28:44 +02:00

glossa mvp

This document is the single source of truth for the project. It is written to be handed to any LLM as context. It contains the project vision, the current MVP scope, the tech stack, the working methodology, and the roadmap.

1. Project Overview

A vocabulary trainer for English–Italian words. The quiz format is Duolingo-style: one word is shown as a prompt, and the user picks the correct translation from four choices (1 correct + 3 distractors of the same part-of-speech). The long-term vision is a multiplayer competitive game, but the MVP is a polished singleplayer experience.

The core learning loop: Show word → pick answer → see result → next word → final score

The vocabulary data comes from WordNet + the Open Multilingual Wordnet (OMW). A one-time Python script extracts English–Italian noun pairs and seeds the database. The data model is language-pair agnostic by design — adding a new language later requires no schema changes.

2. What the Full Product Looks Like (Long-Term Vision)

Users log in via Google or GitHub (OpenAuth)
Singleplayer mode: 10-round quiz, score screen
Multiplayer mode: create a room, share a code, 2–4 players answer simultaneously in real time, live scores, winner screen
1000+ English–Italian nouns seeded from WordNet

This is documented in spec.md and the full roadmap.md. The MVP deliberately ignores most of it.

3. MVP Scope

Goal: A working, presentable singleplayer quiz that can be shown to real people.

What is IN the MVP

Vocabulary data in a PostgreSQL database (already seeded)
REST API that returns quiz terms with distractors
Singleplayer quiz UI: 10 questions, answer feedback, score screen
Clean, mobile-friendly UI (Tailwind + shadcn/ui)
Local dev only (no deployment for MVP)

What is CUT from the MVP

Feature	Why cut
Authentication (OpenAuth)	No user accounts needed for a demo
Multiplayer (WebSockets, rooms)	Core quiz works without it
Valkey / Redis cache	Only needed for multiplayer room state
Deployment to Hetzner	Ship to people locally first
User stats / profiles	Needs auth
Testing suite	Add after the UI stabilises

These are not deleted from the plan; they are deferred. The architecture is already designed to support them. See Section 12 (Post-MVP Ladder).

4. Technology Stack

The monorepo structure and tooling are already set up (Phase 0 complete). This is the full stack — the MVP uses a subset of it.

Layer	Technology	MVP?
Monorepo	pnpm workspaces	✅
Frontend	React 18, Vite, TypeScript	✅
Routing	TanStack Router	✅
Server state	TanStack Query	✅
Client state	Zustand	✅
Styling	Tailwind CSS + shadcn/ui	✅
Backend	Node.js, Express, TypeScript	✅
Database	PostgreSQL + Drizzle ORM	✅
Validation	Zod (shared schemas)	✅
Auth	OpenAuth (Google + GitHub)	❌ post-MVP
Realtime	WebSockets (`ws` library)	❌ post-MVP
Cache	Valkey	❌ post-MVP
Testing	Vitest, React Testing Library	❌ post-MVP
Deployment	Docker Compose, Hetzner, Nginx	❌ post-MVP

Repository Structure (actual, as of Phase 1 data pipeline complete)

vocab-trainer/
├── apps/
│   ├── api/
│   │   └── src/
│   │       ├── app.ts        # createApp() factory — routes registered here
│   │       └── server.ts     # calls app.listen()
│   └── web/
│       └── src/
│           ├── routes/
│           │   ├── __root.tsx
│           │   ├── index.tsx  # placeholder landing page
│           │   └── about.tsx
│           ├── main.tsx
│           └── index.css
├── packages/
│   ├── shared/
│   │   └── src/
│   │       ├── index.ts      # empty — Zod schemas go here next
│   │       └── constants.ts
│   └── db/
│       ├── drizzle/          # migration SQL files
│       └── src/
│           ├── db/schema.ts          # full Drizzle schema
│           ├── seeding-datafiles.ts  # seeds terms + translations
│           ├── generating-deck.ts    # builds curated decks
│           └── index.ts
├── documentation/            # all project docs live here
│   ├── spec.md
│   ├── roadmap.md
│   ├── decisions.md
│   ├── mvp.md                # this file
│   └── CLAUDE.md
├── scripts/
│   ├── extract-en-it-nouns.py
│   └── datafiles/en-it-noun.json
├── docker-compose.yml
└── pnpm-workspace.yaml

What does not exist yet (to be built in MVP phases):

apps/api/src/routes/ — no route handlers yet
apps/api/src/services/ — no business logic yet
apps/api/src/repositories/ — no DB queries yet
apps/web/src/components/ — no UI components yet
apps/web/src/stores/ — no Zustand store yet
apps/web/src/lib/api.ts — no TanStack Query wrappers yet
packages/shared/src/schemas/ — no Zod schemas yet

packages/shared is the contract between frontend and backend. All request/response shapes are defined there as Zod schemas — never duplicated.

5. Data Model (relevant tables for MVP)

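import { sql } from "drizzle-orm";
import {
  boolean, check, index, pgTable, primaryKey,
  text, timestamp, unique, uuid, varchar,
} from "drizzle-orm/pg-core";
// SUPPORTED_POS and SUPPORTED_LANGUAGE_CODES are constant arrays; their import
// is omitted here (assumed to live in packages/shared/src/constants.ts).

export const terms = pgTable(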
  "terms",
  {
    id: uuid().primaryKey().defaultRandom(),
    synset_id: text().unique().notNull(),
    pos: varchar({ length: 20 }).notNull(),
    created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [
    check(
      "pos_check",
      sql`${table.pos} IN (${sql.raw(SUPPORTED_POS.map((p) => `'${p}'`).join(", "))})`,
    ),
    index("idx_terms_pos").on(table.pos),
  ],
);

export const translations = pgTable(
  "translations",
  {
    id: uuid().primaryKey().defaultRandom(),
    term_id: uuid()
      .notNull()
      .references(() => terms.id, { onDelete: "cascade" }),
    language_code: varchar({ length: 10 }).notNull(),
    text: text().notNull(),
    created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [
    unique("unique_translations").on(
      table.term_id,
      table.language_code,
      table.text,
    ),
    index("idx_translations_lang").on(table.language_code, table.term_id),
  ],
);

export const decks = pgTable(
  "decks",
  {
    id: uuid().primaryKey().defaultRandom(),
    name: text().notNull(),
    description: text(),
    source_language: varchar({ length: 10 }).notNull(),
    validated_languages: varchar({ length: 10 }).array().notNull().default([]),
    is_public: boolean().default(false).notNull(),
    created_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [
    check(
      "source_language_check",
      sql`${table.source_language} IN (${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))})`,
    ),
    check(
      "validated_languages_check",
      sql`validated_languages <@ ARRAY[${sql.raw(SUPPORTED_LANGUAGE_CODES.map((l) => `'${l}'`).join(", "))}]::varchar[]`,
    ),
    check(
      "validated_languages_excludes_source",
      sql`NOT (${table.source_language} = ANY(${table.validated_languages}))`,
    ),
    unique("unique_deck_name").on(table.name, table.source_language),
  ],
);

export const deck_terms = pgTable(
  "deck_terms",
  {
    deck_id: uuid()
      .notNull()
      .references(() => decks.id, { onDelete: "cascade" }),
    term_id: uuid()
      .notNull()
      .references(() => terms.id, { onDelete: "cascade" }),
    added_at: timestamp({ withTimezone: true }).defaultNow().notNull(),
  },
  (table) => [primaryKey({ columns: [table.deck_id, table.term_id] })],
);

The seed + deck-build scripts have already been run. Data exists in the database.
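
Because terms carry no language and every text is stored per language_code, a prompt/answer pair for any language pair is just two translation rows joined on term_id. The following is a minimal in-memory sketch of that idea in plain TypeScript, not a Drizzle query. The row type mirrors only the translations columns shown above, and pairsFor is a hypothetical helper name.

```typescript
// Hypothetical in-memory mirror of the `translations` columns shown above.
type TranslationRow = { term_id: string; language_code: string; text: string };

type Pair = { termId: string; prompt: string; answer: string };

// Joins translations on term_id for one (source, target) language pair.
// A term with several texts per language yields one pair per combination.
function pairsFor(rows: TranslationRow[], source: string, target: string): Pair[] {
  const byTerm: Record<string, { src: string[]; tgt: string[] }> = {};
  for (const row of rows) {
    const entry = byTerm[row.term_id] ?? { src: [], tgt: [] };
    if (row.language_code === source) entry.src.push(row.text);
    if (row.language_code === target) entry.tgt.push(row.text);
    byTerm[row.term_id] = entry;
  }
  const pairs: Pair[] = [];
  for (const termId of Object.keys(byTerm)) {
    const entry = byTerm[termId];
    if (!entry) continue;
    for (const prompt of entry.src) {
      for (const answer of entry.tgt) pairs.push({ termId, prompt, answer });
    }
  }
  return pairs;
}

const sampleRows: TranslationRow[] = [
  { term_id: "t1", language_code: "en", text: "dog" },
  { term_id: "t1", language_code: "it", text: "cane" },
  { term_id: "t2", language_code: "en", text: "house" }, // no Italian text yet
];
const enIt = pairsFor(sampleRows, "en", "it");
const itEn = pairsFor(sampleRows, "it", "en");
```

Swapping the two language arguments reverses the quiz direction without touching the data, which is the property the schema is designed around.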

6. API Endpoints (MVP)

All endpoints are prefixed with /api. Schemas live in packages/shared and are validated with Zod on both sides.

Method	Path	Description
GET	`/api/health`	Health check (already done)
GET	`/api/language-pairs`	List active language pairs
GET	`/api/decks`	List available decks
GET	`/api/decks/:id/terms`	Fetch terms with distractors for a quiz

Distractor Logic

The QuizService picks 3 distractors server-side:

Same part-of-speech as the correct answer
Never the correct answer
Never repeated within a session
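
The three rules above can be sketched as a pure function. This is an illustration under assumptions, not the QuizService implementation: the Candidate shape and the pickDistractors name are hypothetical, and "never repeated within a session" is interpreted here as "a distractor text is not reused in later questions of the same session".

```typescript
// Hypothetical shape; only `pos` and the translated text matter for this sketch.
type Candidate = { termId: string; pos: string; text: string };

// Picks up to `count` distractors for `correct` from `pool`:
// same POS, never the correct term or its text, nothing already in `used`.
// Chosen texts are added to `used` so later questions in the session skip them.
function pickDistractors(
  correct: Candidate,
  pool: Candidate[],
  used: Set<string>,
  count = 3,
  random: () => number = Math.random,
): string[] {
  const seen = new Set<string>();
  const eligible = pool.filter((c) => {
    if (c.pos !== correct.pos) return false;
    if (c.termId === correct.termId || c.text === correct.text) return false;
    if (used.has(c.text) || seen.has(c.text)) return false;
    seen.add(c.text);
    return true;
  });
  const picked: string[] = [];
  while (picked.length < count && eligible.length > 0) {
    // Remove a random eligible candidate so it cannot be picked twice.
    const i = Math.floor(random() * eligible.length);
    const chosen = eligible.splice(i, 1)[0];
    if (!chosen) break;
    picked.push(chosen.text);
    used.add(chosen.text);
  }
  return picked; // may be shorter than `count` if the pool runs dry
}

const correctTerm: Candidate = { termId: "t1", pos: "noun", text: "cane" };
const pool: Candidate[] = [
  correctTerm,
  { termId: "t2", pos: "noun", text: "gatto" },
  { termId: "t3", pos: "noun", text: "casa" },
  { termId: "t4", pos: "noun", text: "libro" },
  { termId: "t5", pos: "verb", text: "correre" },
];
const usedInSession = new Set<string>();
const distractors = pickDistractors(correctTerm, pool, usedInSession);
```

The function returns fewer than three entries when the pool is exhausted, so a caller would need to decide whether to relax the session rule or fail the request in that case.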

7. Frontend Structure (MVP)

apps/web/src/
├── routes/
│   ├── index.tsx             # Landing page / mode select
│   └── singleplayer/
│       └── index.tsx         # The quiz
├── components/
│   ├── quiz/
│   │   ├── QuestionCard.tsx  # Prompt word + 4 answer buttons
│   │   ├── OptionButton.tsx  # idle / correct / wrong states
│   │   └── ScoreScreen.tsx   # Final score + play again
│   └── ui/                   # shadcn/ui wrappers
├── stores/
│   └── gameStore.ts          # Zustand: question index, score, answers
└── lib/
    └── api.ts                # TanStack Query fetch wrappers

State Management

TanStack Query handles fetching quiz data from the API. Zustand handles the local quiz session (current question index, score, selected answers). There is no overlap between the two.
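
To make the split concrete, here is a sketch of the session state and transitions that the local store would own. It is written as plain TypeScript functions rather than an actual Zustand store, so the names (QuizSession, startSession, recordAnswer, nextQuestion, isFinished) are illustrative assumptions and not the planned gameStore.ts API.

```typescript
// Local quiz session state: nothing here comes from the server.
type QuizSession = {
  total: number; // number of questions in the session
  index: number; // current question index, 0-based
  score: number; // +1 per correct answer
  answers: boolean[]; // one entry per answered question
};

function startSession(total = 10): QuizSession {
  return { total, index: 0, score: 0, answers: [] };
}

function isFinished(s: QuizSession): boolean {
  return s.index >= s.total;
}

// Records one answer for the current question; a second answer for the
// same question (or any answer after the session ended) is ignored.
function recordAnswer(s: QuizSession, isCorrect: boolean): QuizSession {
  const alreadyAnswered = s.answers.length > s.index;
  if (alreadyAnswered || isFinished(s)) return s;
  return {
    ...s,
    score: s.score + (isCorrect ? 1 : 0),
    answers: [...s.answers, isCorrect],
  };
}

// Advances only once the current question has been answered.
function nextQuestion(s: QuizSession): QuizSession {
  const answered = s.answers.length > s.index;
  return answered ? { ...s, index: s.index + 1 } : s;
}

let session = startSession(2);
session = recordAnswer(session, true);
session = recordAnswer(session, false); // ignored: question already answered
session = nextQuestion(session);
session = recordAnswer(session, false);
session = nextQuestion(session);
```

Keeping the transitions pure makes them easy to reason about on their own; wrapping them in a store afterwards is a separate step.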

8. Working Methodology

Read this section before asking for help with any task.

This project is a learning exercise. The goal is to understand the code, not just to ship it.

How tasks are structured

The roadmap (Section 10) lists broad phases. When work starts on a phase, it gets broken into smaller, concrete subtasks with clear done-conditions before any code is written.

How to use an LLM for help

When asking an LLM for help:

Paste this document (or the relevant sections) as context
Describe what you're working on and what specifically you're stuck on
Ask for hints, not solutions. Example prompts:
- "I'm trying to implement X. My current approach is Y. What am I missing conceptually?"
- "Here is my code. What would you change about the structure and why?"
- "Can you point me to the relevant docs for Z?"

Refactoring workflow

After completing a task or a block of work:

Share the current state of the code with the LLM
Ask: "What would you refactor here, and why? Don't show me the code — point me in the right direction and link relevant documentation."
The LLM should explain the what and why, link to relevant docs/guides, and let you implement the fix yourself

The LLM should never write the implementation for you. If it does, ask it to delete it and explain the concept instead.

Decisions log

Keep a decisions.md file in the documentation/ folder (see the repository structure in Section 4). When you make a non-obvious choice (a library, a pattern, a trade-off), write one short paragraph explaining what you chose and why. This is also useful context for any LLM session.

9. Game Mechanics

Format: source-language word prompt + 4 target-language choices
Distractors: same POS, server-side, never the correct answer, no repeats in a session
Session length: 10 questions
Scoring: +1 per correct answer (no speed bonus for MVP)
Timer: none in singleplayer MVP
No auth required: anonymous users
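
As an illustration of the question format, the four choices can be assembled by mixing the correct answer with its three distractors in random order. This is a sketch only: buildOptions is a hypothetical helper, and whether the shuffle happens on the server or in the browser is not specified in this document.

```typescript
// Returns the correct answer and the distractors in random order.
// `random` is injectable so the shuffle can be made deterministic.
function buildOptions(
  correct: string,
  distractors: string[],
  random: () => number = Math.random,
): string[] {
  const remaining = [correct, ...distractors];
  const options: string[] = [];
  while (remaining.length > 0) {
    // Draw one remaining item uniformly at random.
    const i = Math.floor(random() * remaining.length);
    const next = remaining.splice(i, 1)[0];
    if (next !== undefined) options.push(next);
  }
  return options;
}

const options = buildOptions("cane", ["gatto", "casa", "libro"]);
const fixedOrder = buildOptions("cane", ["gatto", "casa", "libro"], () => 0);
```

With a constant random source of 0 the items come out in their input order, which can be handy for predictable manual checks.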

10. MVP Roadmap

Tasks are written at a high level. When starting a phase, break it into smaller subtasks before writing any code.

Current Status

Phase 0 (Foundation): ✅ Complete
Phase 1 (Vocabulary Data): 🔄 Data pipeline complete. API layer is the immediate next step.

What is already in the database:

999 unique English words (nouns), fully seeded from WordNet/OMW
3171 term IDs (synsets) resolved; higher than the word count because one word can belong to several synsets
Full Italian translation coverage (3171/3171 terms)
Decks created and populated via packages/db/src/generating-deck.ts
34 words from the source wordlist had no WordNet match (expected, not a bug)

Phase 1 — Finish the API Layer

Goal: The frontend can fetch quiz data from the API.

Done when: GET /api/decks/:id/terms?limit=10 (with a real deck UUID as :id) returns 10 terms, each with 3 distractors of the same POS attached.
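
The done-condition can be expressed as a small check over a parsed response. The response shape below (prompt, answer, pos, distractors) is an assumption made for this sketch, since the real Zod schemas are not defined yet, and checkQuizResponse is a hypothetical helper.

```typescript
// Assumed response shape; the real one will come from packages/shared.
type Distractor = { text: string; pos: string };
type QuizTerm = { prompt: string; answer: string; pos: string; distractors: Distractor[] };

// Returns a list of human-readable problems; an empty list means "done".
function checkQuizResponse(terms: QuizTerm[], expectedCount = 10): string[] {
  const problems: string[] = [];
  if (terms.length !== expectedCount) {
    problems.push(`expected ${expectedCount} terms, got ${terms.length}`);
  }
  terms.forEach((term, i) => {
    const texts = term.distractors.map((d) => d.text);
    if (texts.length !== 3) problems.push(`term ${i}: expected 3 distractors, got ${texts.length}`);
    if (term.distractors.some((d) => d.pos !== term.pos)) problems.push(`term ${i}: distractor POS mismatch`);
    if (texts.indexOf(term.answer) !== -1) problems.push(`term ${i}: correct answer among distractors`);
    if (new Set(texts).size !== texts.length) problems.push(`term ${i}: duplicate distractors`);
  });
  return problems;
}

const goodTerm: QuizTerm = {
  prompt: "dog",
  answer: "cane",
  pos: "noun",
  distractors: [
    { text: "gatto", pos: "noun" },
    { text: "casa", pos: "noun" },
    { text: "libro", pos: "noun" },
  ],
};
const badTerm: QuizTerm = {
  ...goodTerm,
  distractors: [
    { text: "cane", pos: "noun" }, // the correct answer
    { text: "gatto", pos: "noun" },
    { text: "gatto", pos: "verb" }, // duplicate text and wrong POS
  ],
};
const goodProblems = checkQuizResponse([goodTerm], 1);
const badProblems = checkQuizResponse([badTerm], 1);
```

A check like this could be pasted into a scratch script while testing the endpoint manually with curl or a REST client.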

Broadly, what needs to happen:

Define Zod response schemas in packages/shared for terms, decks, and language pairs
Implement a repository layer that queries the DB for terms belonging to a deck
Implement a service layer that attaches distractors to each term (same POS, no duplicates, no correct answer included)
Wire up the REST endpoints (GET /language-pairs, GET /decks, GET /decks/:id/terms)
Manually test the endpoints (curl or a REST client like Bruno/Insomnia)

Key concepts to understand before starting:

Drizzle ORM query patterns (joins, where clauses)
The repository pattern (data access separated from business logic)
Zod schema definition and inference
How pnpm workspace packages reference each other

Phase 2 — Singleplayer Quiz UI

Goal: A user can complete a full 10-question quiz in the browser.

Done when: User visits /singleplayer, answers 10 questions, sees a score screen, and can play again.

Broadly, what needs to happen:

Build the QuestionCard component (prompt word + 4 answer buttons)
Build the OptionButton component with three visual states: idle, correct, wrong
Build the ScoreScreen component (score summary + play again)
Implement a Zustand store to track quiz session state (current question index, score, whether an answer has been picked)
Wire up TanStack Query to fetch terms from the API on mount
Create the /singleplayer route and assemble the components
Handle the between-question transition (brief delay showing result → next question)
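
The three OptionButton states and the input lock can be derived from two values: the correct answer and the option the user picked, if any. The sketch below assumes that the correct option is revealed even when the user picked a wrong one, which is a design choice this document does not settle. optionState and isLocked are hypothetical helper names.

```typescript
type OptionState = "idle" | "correct" | "wrong";

// Derives the visual state of one answer button.
// `selected` is null until the user has picked an option.
function optionState(option: string, correctAnswer: string, selected: string | null): OptionState {
  if (selected === null) return "idle"; // nothing picked yet
  if (option === correctAnswer) return "correct"; // always reveal the right answer
  if (option === selected) return "wrong"; // the user's wrong pick
  return "idle"; // untouched wrong options stay neutral
}

// Once an option is picked, further clicks are ignored until the next question.
function isLocked(selected: string | null): boolean {
  return selected !== null;
}

const beforePick = optionState("cane", "cane", null);
const wrongPick = optionState("gatto", "cane", "gatto");
const revealed = optionState("cane", "cane", "gatto");
const bystander = optionState("casa", "cane", "gatto");
```

Deriving the state instead of storing it per button means only one value (the selected option) has to be reset when the next question appears.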

Key concepts to understand before starting:

TanStack Query: useQuery, loading/error states
Zustand: defining a store, reading and writing state from components
TanStack Router: defining routes, navigating between them
React component composition
Controlled state for the answer selection (which button is selected, when to lock input)

Phase 3 — UI Polish

Goal: The app looks good enough to show to people.

Done when: The quiz is usable on mobile, readable on desktop, and has a coherent visual style.

Broadly, what needs to happen:

Apply Tailwind utility classes and shadcn/ui components consistently
Make the layout mobile-first (touch-friendly buttons, readable font sizes)
Add a simple landing page (/) with a "Start Quiz" button
Add loading and error states for the API fetch
Visual feedback on correct/wrong answers (colour, maybe a brief animation)
Deck selection: let the user pick a deck from a list before starting

Key concepts to understand before starting:

Tailwind CSS utility-first approach
shadcn/ui component library and how to add components
Responsive design with Tailwind breakpoints
CSS transitions for simple animations

11. Key Technical Decisions

These are the non-obvious decisions already made. Any LLM helping with this project should be aware of them and not suggest alternatives without good reason.

Architecture

Express app: factory function pattern

app.ts exports createApp(). server.ts imports it and calls .listen(). This keeps tests isolated: a test can import the app without starting a server.

Layered architecture: routes → services → repositories

Business logic lives in services, not route handlers or repositories. Each layer only talks to the layer directly below it. For the MVP API, this means:

routes/ — parse request, call service, return response
services/ — business logic (e.g. attaching distractors)
repositories/ — all DB queries live here, nowhere else
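
The layering can be sketched without Express or Drizzle by passing each layer the one below it. Everything here is a stand-in under assumptions: the in-memory repository replaces real DB queries (which would be asynchronous), the returned object replaces Express route registration, and all type and function names except createApp are hypothetical.

```typescript
type TermRow = { id: string; deckId: string; pos: string; text: string };
type QuizTerm = { termId: string; pos: string; answer: string };
type HttpResult = { status: number; body: unknown };

// Repository layer: the only place that knows where the data lives.
interface TermRepository {
  findByDeck(deckId: string, limit: number): TermRow[];
}

function createInMemoryRepository(rows: TermRow[]): TermRepository {
  return {
    findByDeck: (deckId, limit) => rows.filter((r) => r.deckId === deckId).slice(0, limit),
  };
}

// Service layer: business logic only, talks to the repository interface.
function createQuizService(repo: TermRepository) {
  return {
    getQuizTerms(deckId: string, limit: number): QuizTerm[] {
      return repo.findByDeck(deckId, limit).map((r) => ({ termId: r.id, pos: r.pos, answer: r.text }));
    },
  };
}

// Route layer: parse the request, call the service, shape the response.
// The factory wires the layers but does not start listening anywhere.
function createApp(repo: TermRepository) {
  const service = createQuizService(repo);
  return {
    getDeckTerms(params: { id: string }, query: { limit?: string }): HttpResult {
      const limit = Number(query.limit ?? "10");
      if (!Number.isInteger(limit) || limit < 1) {
        return { status: 400, body: { error: "limit must be a positive integer" } };
      }
      return { status: 200, body: service.getQuizTerms(params.id, limit) };
    },
  };
}

const app = createApp(
  createInMemoryRepository([
    { id: "t1", deckId: "d1", pos: "noun", text: "cane" },
    { id: "t2", deckId: "d1", pos: "noun", text: "gatto" },
    { id: "t3", deckId: "d1", pos: "noun", text: "casa" },
    { id: "t4", deckId: "d2", pos: "noun", text: "libro" },
  ]),
);
const okResult = app.getDeckTerms({ id: "d1" }, { limit: "2" });
const badResult = app.getDeckTerms({ id: "d1" }, { limit: "abc" });
```

Because the repository is passed in, the route and service layers can be exercised against fake data long before any DB query exists.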

Shared Zod schemas in packages/shared

All request/response shapes are defined once as Zod schemas in packages/shared and imported by both apps/api and apps/web. Types are inferred from schemas (z.infer<typeof Schema>), never written by hand.

Data Model

Decks separate from terms (not frequency-rank filtering)

Terms are raw WordNet data. Decks are curated lists. This separation exists because WordNet frequency data is unreliable for learning: common chemical element symbols rank highly, for example. Bad words are excluded at the deck level, not filtered from terms.

Deck language model: source_language + validated_languages array

A deck is not tied to a single language pair. source_language is the language the wordlist was curated from. validated_languages is an array of target languages with full translation coverage, calculated and updated by the deck generation script on every run.
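
One way to read "full translation coverage" is: a language qualifies when every term in the deck has at least one translation in it. The sketch below computes that in memory. It is an assumption about what the deck generation script does, not a copy of it, and validatedLanguages is a hypothetical helper name.

```typescript
// Only the two columns needed for the coverage calculation.
type TranslationRow = { term_id: string; language_code: string };

// Returns the languages (excluding the source) that cover every deck term.
function validatedLanguages(
  deckTermIds: string[],
  translations: TranslationRow[],
  sourceLanguage: string,
): string[] {
  const deckTerms = new Set(deckTermIds);
  if (deckTerms.size === 0) return [];
  // language_code -> set of deck term IDs that have a translation in it
  const covered = new Map<string, Set<string>>();
  for (const t of translations) {
    if (!deckTerms.has(t.term_id) || t.language_code === sourceLanguage) continue;
    const seen = covered.get(t.language_code) ?? new Set<string>();
    seen.add(t.term_id);
    covered.set(t.language_code, seen);
  }
  const result: string[] = [];
  covered.forEach((seen, lang) => {
    if (seen.size === deckTerms.size) result.push(lang);
  });
  return result.sort();
}

const coverage = validatedLanguages(
  ["t1", "t2"],
  [
    { term_id: "t1", language_code: "en" },
    { term_id: "t2", language_code: "en" },
    { term_id: "t1", language_code: "it" },
    { term_id: "t2", language_code: "it" },
    { term_id: "t1", language_code: "es" }, // Spanish covers only one term
  ],
  "en",
);
```

Skipping the source language in the loop keeps the result consistent with the validated_languages_excludes_source check constraint in the decks table.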

Tooling

Drizzle ORM (not Prisma): No binary, no engine. Queries map closely to SQL. Works naturally with Zod. Migrations are plain SQL files.

tsx as TypeScript runner (not ts-node): Faster, zero config, uses esbuild. Does not type-check — that is handled by tsc and the editor.

pnpm workspaces (not Turborepo): Two apps don't need the extra build caching complexity.

12. Post-MVP Ladder

These phases are deferred but planned. The architecture already supports them.

Phase	What it adds
Auth	OpenAuth (Google + GitHub), JWT middleware, user rows in DB
User Stats	Games played, score history, profile page
Multiplayer Lobby	Room creation, join by code, WebSocket connection
Multiplayer Game	Simultaneous answers, server timer, live scores, winner screen
Deployment	Docker Compose prod config, Nginx, Let's Encrypt, Hetzner VPS
Hardening	Rate limiting, error boundaries, CI/CD, DB backups

Each of these maps to a phase in the full roadmap.md.

13. Definition of Done (MVP)

GET /api/decks/:id/terms returns terms with correct distractors
User can complete a 10-question quiz without errors
Score screen shows final result and a play-again option
App is usable on a mobile screen
No hardcoded data — everything comes from the database
