00 — Project Overview
Purpose: Give any LLM instant context on what Lila is, what makes it different, and what's currently built vs. planned. Concatenate this file with domain-specific files (01–06) and 99-current-task.md before handing to an LLM.
Last updated: 2026-05-15
Depends on: Nothing (this is the entry point)
What Lila Is
Lila is a vocabulary learning app with two core differentiators:
- Media-based practice — Users learn vocabulary extracted from real media they love: a Shakira song, the first chapter of Harry Potter, an episode of Breaking Bad. The app extracts vocabulary from subtitles/lyrics/text and turns it into quiz questions.
- Multiplayer modes — Users practice vocabulary together or competitively in real-time sessions (2–4 players, simultaneous answers, live scoring).
The core learning loop is Duolingo-style: a word appears in one language, the user picks the correct translation from four choices.
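A minimal sketch of that loop's question shape, in TypeScript. The names (`QuizQuestion`, `BuiltQuestion`, `buildQuestion`) are hypothetical and chosen for illustration; the real shapes in the codebase may differ. One prompt word is paired with its correct translation plus three distractors, and the four choices are shuffled.

```typescript
// Hypothetical shapes for illustration; not taken from the Lila codebase.
interface QuizQuestion {
  prompt: string;    // word shown in the source language
  choices: string[]; // four candidate translations, shuffled
}

interface BuiltQuestion {
  question: QuizQuestion; // the part a player would see
  correctIndex: number;   // position of the correct translation in choices
}

// Fisher-Yates shuffle on a copy; the random source is injectable for testing.
function shuffle<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}

// Combine the correct translation with exactly three distractors.
function buildQuestion(
  prompt: string,
  correct: string,
  distractors: readonly string[],
  random: () => number = Math.random,
): BuiltQuestion {
  if (distractors.length !== 3) {
    throw new Error("expected exactly three distractors");
  }
  const choices = shuffle([correct, ...distractors], random);
  return { question: { prompt, choices }, correctIndex: choices.indexOf(correct) };
}
```

For example, `buildQuestion("cane", "dog", ["cat", "horse", "bird"])` yields a four-choice question for an Italian-to-English round.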
Live at lilastudy.com.
Current State (2026-05-15)
What Works Today
- Singleplayer quiz — 5 languages (en↔it/de/es/fr), 3 or 10 rounds, POS + difficulty filters
- Multiplayer — Create/join lobby by room code, 2–4 players, simultaneous answers, 15s server timer, live scoring, winner screen
- Auth — Google + GitHub via Better Auth
- Deployment — Live on Hetzner VPS, Caddy HTTPS, Docker Compose, CI/CD via Forgejo Actions
- Database — PostgreSQL with Drizzle ORM, daily backups
What's In Progress / Blocked
- Kaikki data pipeline migration — Replacing OpenWordNet/OMW with sense-disambiguated Kaikki data. Stage 1 (extract) and Stage 2 (reverse link) complete on sample data. Stage 3 (enrich) being rewritten for sub-stage architecture.
- Guest play — No try-before-signup flow yet. Auth required for all game routes.
- Game session store — Still in-memory. Valkey container exists locally but not wired up.
- Media ingestion — Not started. No pipeline for subtitles/lyrics → vocab extraction yet.
The Strategic Gap
The app is currently a generic vocabulary quiz. The media-based practice feature (the differentiator) does not exist yet. It depends on:
- Kaikki pipeline reaching production (fixes translation quality)
- A media ingestion prototype (subtitles/lyrics → text → vocab extraction → quiz)
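To make the second dependency concrete, here is one possible shape for the first two steps of such a prototype (subtitles → text → candidate words). Nothing below exists in the codebase: `srtToText` and `extractCandidates` are hypothetical names, SRT is assumed as the subtitle format, and a real pipeline would still need lemmatization and matching against the vocabulary data.

```typescript
// Hypothetical helper: keep only dialogue text from an SRT subtitle file.
// Caveat: a dialogue line consisting only of digits is dropped along with cue indexes.
function srtToText(srt: string): string {
  return srt
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")          // blank separators
    .filter((line) => !/^\d+$/.test(line.trim()))  // cue index lines
    .filter((line) => !line.includes("-->"))       // timestamp lines
    .map((line) => line.replace(/<[^>]+>/g, ""))   // inline tags such as <i>
    .join(" ");
}

// Hypothetical helper: count lowercase word frequencies.
// The character range is a rough match for Latin letters, including accented ones.
function extractCandidates(text: string, minLength = 3): Map<string, number> {
  const counts = new Map<string, number>();
  const words = text.toLowerCase().match(/[a-z\u00c0-\u024f]+/g) ?? [];
  for (const word of words) {
    if (word.length < minLength) continue; // skip very short tokens
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}
```

The resulting frequency map could then feed whatever step selects quiz-worthy vocabulary; that step is left open here.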
Tech Stack
| Layer | Technology |
|---|---|
| Monorepo | pnpm workspaces |
| Frontend | React 18, Vite, TanStack Router, TanStack Query, Tailwind CSS |
| Backend | Node.js, Express, TypeScript, WebSockets (ws library) |
| Database | PostgreSQL + Drizzle ORM |
| Auth | Better Auth (Google + GitHub) |
| Validation | Zod (shared between frontend and backend in packages/shared) |
| Testing | Vitest, supertest |
| Deployment | Docker Compose, Caddy, Hetzner VPS |
| CI/CD | Forgejo Actions |
| Data Pipeline | Kaikki (Wiktionary) → SQLite (pipeline.db) → PostgreSQL |
Repository Structure
lila/
├── apps/
│ ├── api/ — Express backend (HTTP + WebSocket)
│ └── web/ — React frontend (Vite, TanStack Router)
├── packages/
│ ├── shared/ — Zod schemas + constants (API/web contract)
│ └── db/ — Drizzle schema, migrations, models, seeding
├── data-pipeline/ — Kaikki extraction → enrichment → PostgreSQL sync
└── documentation/ — Project docs (human + AI-context branches)
Key rule: packages/shared is the single source of truth for all data shapes crossing the API boundary. Both frontend and backend import from it. If a schema changes, TypeScript compilation fails in both places simultaneously.
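As an illustration of what a shared contract buys, the sketch below defines one data shape crossing the boundary: a message tagged by a `type` field, with a runtime check and an exhaustive switch. It is deliberately dependency-free, so a hand-written validator stands in for what a Zod schema would do in `packages/shared`; the message names and fields are assumptions, not the project's actual protocol.

```typescript
// Hypothetical message shapes; the real schemas are defined with Zod in packages/shared.
type ClientMessage =
  | { type: "lobby:join"; roomCode: string }
  | { type: "game:answer"; questionId: string; choiceIndex: number };

// Hand-written stand-in for schema validation: returns null for anything malformed.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const { type, roomCode, questionId, choiceIndex } = data as Record<string, unknown>;
  switch (type) {
    case "lobby:join":
      return typeof roomCode === "string" ? { type: "lobby:join", roomCode } : null;
    case "game:answer":
      return typeof questionId === "string" &&
        typeof choiceIndex === "number" &&
        Number.isInteger(choiceIndex)
        ? { type: "game:answer", questionId, choiceIndex }
        : null;
    default:
      return null; // unknown or missing type
  }
}

// Switching on `type` narrows the union, so each branch sees only its own fields.
// Adding a new variant to ClientMessage makes this function fail to compile
// until a matching case is added.
function routeMessage(msg: ClientMessage): string {
  switch (msg.type) {
    case "lobby:join":
      return `join ${msg.roomCode}`;
    case "game:answer":
      return `answer ${msg.choiceIndex} for ${msg.questionId}`;
  }
}
```

Because both sides would import the same `ClientMessage` type, a renamed field breaks the build on the sender and the receiver at once, which is the property the key rule describes.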
Key Architecture Principles
- Layered architecture — Router → Controller → Service → Model → Database. Each layer only talks to the layer below it.
- Server-side answer evaluation — The correct answer is never sent to the frontend. All evaluation happens server-side.
- Zod discriminated unions for WebSockets — All WS messages are typed via Zod schemas in `packages/shared`. The router switches on the `type` field.
- GameSessionStore abstraction — Session state is stored through an interface (`InMemoryGameSessionStore` now, `ValkeyGameSessionStore` planned).
- Language-neutral data model — `terms` are concepts; `translations` are per-language words. Adding a language requires no schema changes.
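Two of these principles, the session store abstraction and server-side answer evaluation, can be sketched together. The interface and class names come from this overview, but the method set, the `GameSession` fields, and the helpers `toClientQuestion` and `evaluateAnswer` are assumptions made for illustration.

```typescript
// Hypothetical session shape; field names are assumptions.
interface GameSession {
  id: string;
  prompt: string;
  choices: string[];
  correctIndex: number; // stays on the server
}

// Assumed method set. Async so that a networked store (e.g. Valkey) could implement it too.
interface GameSessionStore {
  get(id: string): Promise<GameSession | undefined>;
  set(session: GameSession): Promise<void>;
  delete(id: string): Promise<void>;
}

class InMemoryGameSessionStore implements GameSessionStore {
  private readonly sessions = new Map<string, GameSession>();

  async get(id: string): Promise<GameSession | undefined> {
    return this.sessions.get(id);
  }

  async set(session: GameSession): Promise<void> {
    this.sessions.set(session.id, session);
  }

  async delete(id: string): Promise<void> {
    this.sessions.delete(id);
  }
}

// The view sent to the frontend omits correctIndex entirely.
type ClientQuestion = Pick<GameSession, "id" | "prompt" | "choices">;

function toClientQuestion(session: GameSession): ClientQuestion {
  return { id: session.id, prompt: session.prompt, choices: session.choices };
}

// Evaluation reads the stored answer on the server; the client only submits an index.
async function evaluateAnswer(
  store: GameSessionStore,
  sessionId: string,
  choiceIndex: number,
): Promise<boolean> {
  const session = await store.get(sessionId);
  if (!session) throw new Error(`unknown session: ${sessionId}`);
  return session.correctIndex === choiceIndex;
}
```

Since callers depend only on `GameSessionStore`, swapping the in-memory class for a Valkey-backed one would not change `evaluateAnswer`.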
Key Decisions (Summary)
| Topic | Decision | Why |
|---|---|---|
| ORM | Drizzle, not Prisma | No binary, no engine, closer to SQL |
| WebSocket | ws library, not Socket.io | 2–4 players, explicit Zod protocol sufficient |
| Auth | Better Auth, not Keycloak | Embedded middleware, no separate service |
| Answer eval | Server-side only | Correct answer never sent to frontend |
| Data source | Kaikki, not OMW | Sense-disambiguated translations |
Further Reading (AI-Context Files)
| File | What it covers |
|---|---|
| 01-architecture.md | Monorepo structure, layered architecture, data flow diagrams |
| 02-data-model.md | Database schema, tables, relationships, constraints |
| 03-api-contract.md | REST endpoints, request/response schemas, Zod types |
| 04-websocket-protocol.md | WS message types, game flow, auth, state management |
| 05-data-pipeline.md | Kaikki pipeline stages, enrich sub-stages, sync |
| 06-deployment.md | Docker, Caddy, CI/CD, backups |
| prompts/meta.md | How to work with LLMs on this codebase |
| 99-current-task.md | Template: fill this out before giving a task to an LLM |