Commit graph

65 commits

Author SHA1 Message Date
lila
b3b32167c9 formatting 2026-04-10 20:09:46 +02:00
lila
b59fac493d feat(api): implement game terms query with double join
- Add double join on translations for source/target languages
- Left join term_glosses for optional source-language glosses
- Filter difficulty on target side only (intentionally asymmetric:
  a word's difficulty can differ between languages, and what matters
  is the difficulty of the word being learned)
- Return neutral field names (sourceText, targetText, sourceGloss)
  instead of quiz semantics; service layer maps to prompt/answer
- Tighten term_glosses unique constraint to (term_id, language_code)
  to prevent the left join from multiplying question rows
- Add TODO for ORDER BY RANDOM() scaling post-MVP
2026-04-10 18:02:03 +02:00
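The join shape described in this commit can be illustrated with a small in-memory sketch. It is not the actual query in packages/db: the row shapes, the sample data, the function name selectGameTerms, and the assumption that difficulty is stored per translation are all hypothetical.

```typescript
// Hypothetical row shapes; the real tables in packages/db may differ.
interface Translation {
  termId: number;
  languageCode: string;
  text: string;
  difficulty: string | null; // assumed to be stored per translation
}

interface TermGloss {
  termId: number;
  languageCode: string;
  text: string;
}

// Neutral field names, as in the commit message (no prompt/answer semantics).
interface GameTermRow {
  termId: number;
  sourceText: string;
  targetText: string;
  sourceGloss: string | null;
}

function selectGameTerms(
  translations: readonly Translation[],
  glosses: readonly TermGloss[],
  sourceLang: string,
  targetLang: string,
  difficulty: string,
): GameTermRow[] {
  const rows: GameTermRow[] = [];
  for (const target of translations) {
    // First side of the join: target language, difficulty checked here only.
    if (target.languageCode !== targetLang || target.difficulty !== difficulty) continue;
    for (const source of translations) {
      // Second side of the join: same term, source language, any difficulty.
      if (source.termId !== target.termId || source.languageCode !== sourceLang) continue;
      // Left join: find() yields at most one gloss here; in SQL the unique
      // constraint on (term_id, language_code) is what guarantees that.
      const gloss = glosses.find(
        (g) => g.termId === source.termId && g.languageCode === sourceLang,
      );
      rows.push({
        termId: target.termId,
        sourceText: source.text,
        targetText: target.text,
        sourceGloss: gloss?.text ?? null,
      });
    }
  }
  return rows;
}

// Made-up sample data for illustration only.
const sampleTranslations: Translation[] = [
  { termId: 1, languageCode: "en", text: "dog", difficulty: "A1" },
  { termId: 1, languageCode: "it", text: "cane", difficulty: "A1" },
  { termId: 2, languageCode: "en", text: "house", difficulty: "A1" },
  { termId: 2, languageCode: "it", text: "casa", difficulty: "B1" },
  { termId: 3, languageCode: "en", text: "bank", difficulty: "B1" },
  { termId: 3, languageCode: "it", text: "banca", difficulty: "A1" },
];
const sampleGlosses: TermGloss[] = [
  { termId: 1, languageCode: "en", text: "a domesticated animal" },
];

const gameRows = selectGameTerms(sampleTranslations, sampleGlosses, "en", "it", "A1");
```

In this sample, term 3 is returned even though its English side is B1, and term 2 is excluded even though its English side is A1, which is the asymmetry the commit describes. Term 3 has no gloss, so its sourceGloss is null rather than the row being dropped.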
lila
9fc3ba375a feat: scaffold quiz API vertical slice
- Add GameRequestSchema and derived types to packages/shared
- Add SupportedLanguageCode, SupportedPos, DifficultyLevel type exports
- Add getGameTerms() model to packages/db with pos/language/difficulty/limit filters
- Add prepareGameQuestions() service skeleton in apps/api
- Add createGame controller with Zod safeParse validation
- Wire POST /api/v1/game/start route
- Add scripts/gametest/test-game.ts for manual end-to-end testing
2026-04-09 13:47:01 +02:00
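The validation step in the controller can be sketched without any dependency. The commit uses a Zod schema with safeParse; the code below is a hand-written stand-in that only mirrors the success/data/error result shape. The field names, the supported values, and the function parseGameRequest are assumptions for illustration, not the contents of packages/shared.

```typescript
// Assumed values; the real lists live in packages/shared and may differ.
const SUPPORTED_LANGUAGE_CODES = ["en", "it"] as const;
const SUPPORTED_POS = ["noun", "verb"] as const;

// Types derived from the constant lists, so each list is defined only once.
type SupportedLanguageCode = (typeof SUPPORTED_LANGUAGE_CODES)[number];
type SupportedPos = (typeof SUPPORTED_POS)[number];

// Hypothetical request shape.
interface GameRequest {
  sourceLanguage: SupportedLanguageCode;
  targetLanguage: SupportedLanguageCode;
  pos: SupportedPos;
  limit: number;
}

type ParseResult =
  | { success: true; data: GameRequest }
  | { success: false; error: string };

// Type guard: narrows an unknown value to one of the allowed string literals.
function isOneOf<T extends string>(allowed: readonly T[], value: unknown): value is T {
  return typeof value === "string" && allowed.some((a) => a === value);
}

// Returns a result object instead of throwing, so the caller decides how to respond.
function parseGameRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { success: false, error: "body must be an object" };
  }
  const { sourceLanguage, targetLanguage, pos, limit } = body as Record<string, unknown>;
  if (!isOneOf(SUPPORTED_LANGUAGE_CODES, sourceLanguage)) {
    return { success: false, error: "unsupported sourceLanguage" };
  }
  if (!isOneOf(SUPPORTED_LANGUAGE_CODES, targetLanguage)) {
    return { success: false, error: "unsupported targetLanguage" };
  }
  if (!isOneOf(SUPPORTED_POS, pos)) {
    return { success: false, error: "unsupported pos" };
  }
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1) {
    return { success: false, error: "limit must be a positive integer" };
  }
  return { success: true, data: { sourceLanguage, targetLanguage, pos, limit } };
}

const validRequest = parseGameRequest({
  sourceLanguage: "en",
  targetLanguage: "it",
  pos: "noun",
  limit: 10,
});
const invalidRequest = parseGameRequest({
  sourceLanguage: "en",
  targetLanguage: "de",
  pos: "noun",
  limit: 10,
});
```

A controller built on this pattern would typically answer with a 400 response when success is false and pass data to the service layer otherwise.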
lila
13cc709b09 adding scripts to check cefr coverage between json files and the database and to write cefr levels from json to the db 2026-04-09 10:25:20 +02:00
lila
3374bd8b20 feat(scripts): add Italian CEFR data pipeline
- Add extractors for Italian sources: it_m3.xls and italian.json
- Add comparison script (compare-italian.py) to report source overlaps and conflicts
- Add merge script (merge-italian-json.py) with priority order ['italian', 'it_m3']
- Output authoritative dataset to datafiles/italian-merged.json
- Update README to document both English and Italian pipelines
2026-04-08 18:32:03 +02:00
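A priority-ordered merge like the one named in this commit can be sketched as follows. The actual script is Python (merge-italian-json.py); this TypeScript version only illustrates the idea. The word-to-level data shape, the sample entries, and the function name mergeByPriority are assumptions.

```typescript
// Assumed shape: each source maps a word to its CEFR level.
type LevelsByWord = Map<string, string>;

// Walks the sources in priority order; the first source that knows a word wins,
// so lower-priority sources only fill in words that are still missing.
function mergeByPriority(
  sources: Map<string, LevelsByWord>,
  priority: readonly string[],
): LevelsByWord {
  const merged: LevelsByWord = new Map();
  for (const name of priority) {
    const levels = sources.get(name);
    if (levels === undefined) continue; // source not available, skip it
    for (const [word, level] of levels) {
      if (!merged.has(word)) merged.set(word, level);
    }
  }
  return merged;
}

// Made-up sample entries for illustration only.
const italianSources = new Map<string, LevelsByWord>([
  ["italian", new Map([["casa", "A1"], ["cane", "A2"]])],
  ["it_m3", new Map([["casa", "B1"], ["gatto", "A1"]])],
]);

const mergedItalian = mergeByPriority(italianSources, ["italian", "it_m3"]);
```

With the priority order ['italian', 'it_m3'], the conflicting entry for "casa" resolves to the level from italian, while "gatto" is taken from it_m3 because no higher-priority source lists it.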
lila
59152950d6 extraction, comparison and merging scripts for english are done, final english.json exists 2026-04-08 17:50:25 +02:00
lila
3596f76492 extraction datafiles with cefr annotations 2026-04-08 13:09:47 +02:00
lila
e79fa6922b updating schema 2026-04-07 01:03:22 +02:00
lila
0cb9fe1485 adding datafiles + updating documentation 2026-04-07 00:00:58 +02:00
lila
60cf48ef97 updating documentation 2026-04-06 17:01:34 +02:00
lila
570dbff25e updating seeding script 2026-04-06 17:01:17 +02:00
lila
aa1a332226 removing files 2026-04-06 17:01:04 +02:00
lila
6cb0068d1a adding datafiles for all english and italian nouns and verbs 2026-04-05 19:35:52 +02:00
lila
88691a345e extracted all english and italian nouns and verbs from OMW 2026-04-05 19:34:11 +02:00
lila
2a8630660e generating and migrating new schema 2026-04-05 19:30:05 +02:00
lila
e3c05b5596 updating seeding pipeline 2026-04-05 19:29:47 +02:00
lila
dfeb6a4cb0 updating seeding pipeline 2026-04-05 19:29:17 +02:00
lila
c49c2fe2c3 updating docs 2026-04-05 19:28:53 +02:00
lila
e80f291c41 refactoring data model 2026-04-05 18:57:09 +02:00
lila
b16b5db3f7 updating data models 2026-04-05 01:21:32 +02:00
lila
bfc09180f1 updating documentation 2026-04-05 01:21:18 +02:00
lila
7d80b20390 wip version of the api 2026-04-05 00:33:34 +02:00
lila
c24967dc74 updating docs 2026-04-05 00:33:05 +02:00
lila
1accb10f49 typo 2026-04-04 03:37:58 +02:00
lila
5180ecc864 installing zod + adding zod schemas 2026-04-02 20:02:26 +02:00
lila
874dd5e4c7 adding documentation and roadmap for the most minimal mvp 2026-04-02 18:28:44 +02:00
lila
a9cbcb719c refactoring schema + generate + migrate 2026-04-02 15:48:48 +02:00
lila
38a62ca3a4 refactoring 2026-04-02 15:48:31 +02:00
lila
cdedbc44cd refactoring 2026-04-02 13:37:54 +02:00
lila
b0c0baf9ab updating documentation 2026-04-01 18:02:12 +02:00
lila
3bb8bfdb39 feat(db): complete deck generation script for top english nouns
- add deck_terms to schema imports
- add addTermsToDeck — diffs source term IDs against existing deck_terms,
  inserts only new ones, returns count of inserted terms
- add updateValidatedLanguages — recalculates and persists validated_languages
  on every run so coverage stays accurate as translation data grows
- wire both functions into main with isNewDeck guard to avoid redundant
  validated_languages update on deck creation
- add final summary report
- fix possible undefined on result[0] in createDeck
- tick off remaining roadmap items
2026-04-01 17:56:31 +02:00
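The diff-then-insert behaviour described for addTermsToDeck can be sketched in memory. The real function works against the deck_terms table; here an array stands in for it, and the string IDs and both function names are hypothetical.

```typescript
// Returns the source IDs that are not yet in the deck, in source order,
// skipping duplicates within the source list itself.
function diffNewTermIds(
  sourceTermIds: readonly string[],
  existingDeckTermIds: readonly string[],
): string[] {
  const seen = new Set(existingDeckTermIds);
  const fresh: string[] = [];
  for (const id of sourceTermIds) {
    if (!seen.has(id)) {
      seen.add(id);
      fresh.push(id);
    }
  }
  return fresh;
}

// In-memory stand-in for the insert: appends only the new IDs to the deck
// and returns how many were added.
function addTermsToDeckSketch(deckTermIds: string[], sourceTermIds: readonly string[]): number {
  const toInsert = diffNewTermIds(sourceTermIds, deckTermIds);
  deckTermIds.push(...toInsert);
  return toInsert.length;
}

const deckTermIds = ["a", "b"];
const firstRunInserted = addTermsToDeckSketch(deckTermIds, ["b", "c", "d", "c"]);
const secondRunInserted = addTermsToDeckSketch(deckTermIds, ["b", "c", "d", "c"]);
```

Because the diff is computed against the current deck contents, a second run with the same source list inserts nothing, which keeps the script safe to re-run.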
lila
7fdcedd1dd wip 2026-04-01 02:43:55 +02:00
lila
a49bce4a5a adding tasks 2026-04-01 01:22:21 +02:00
lila
4ef70b3876 updating decks to include source language 2026-04-01 01:03:41 +02:00
lila
5603f15fe3 adding bug description as todo comment 2026-03-31 18:34:23 +02:00
lila
488f0dab11 wip 2026-03-31 18:28:29 +02:00
lila
9d1a82bdf0 reviewing and updating deck generation 2026-03-31 16:48:40 +02:00
lila
521ffe3b6e adding migration script 2026-03-31 10:09:30 +02:00
lila
e3a2136720 formatting 2026-03-31 10:06:06 +02:00
lila
20fa6a9331 adding datafiles and seeding script 2026-03-31 10:05:36 +02:00
lila
068949b4cb adjusting path where the database file is saved, so the data persists after reboot 2026-03-31 10:04:50 +02:00
lila
2b177aad5b feat(db): add incremental upsert seed script for WordNet vocabulary
Implements packages/db/src/seed.ts — reads all JSON files from
scripts/datafiles/, validates filenames against supported language
codes and POS, and upserts synsets into  and
via onConflictDoNothing. Safe to re-run; produces 0 writes on
a duplicate run.
2026-03-30 15:58:01 +02:00
lila
55885336ba feat(db): add drizzle schema for vocabulary and deck tables
- terms, translations, term_glosses with cascade deletes and pos check constraint
- language_pairs with source/target language check constraints and no-self-pair guard
- users with openauth_sub as identity provider key
- decks and deck_terms with composite PK and position ordering
- indexes on all hot query paths (distractor generation, deck lookups, FK joins)
- SUPPORTED_POS and SUPPORTED_LANGUAGE_CODES as single source of truth in @glossa/shared
2026-03-28 19:02:10 +01:00
lila
be7a7903c5 refactor: migrate to deck-based vocabulary curation
Database Schema:
- Add decks table for curated word lists (A1, Most Common, etc.)
- Add deck_terms join table with position ordering
- Link rooms to decks via rooms.deck_id FK
- Remove frequency_rank from terms (now deck-scoped)
- Change users.id to uuid, add openauth_sub for auth mapping
- Add room_players.left_at for disconnect tracking
- Add rooms.updated_at for stale room recovery
- Add CHECK constraints for data integrity (pos, status, etc.)

Extraction Script:
- Rewrite extract.py to mirror complete OMW dataset
- Extract all 25,204 bilingual noun synsets (en-it)
- Remove frequency filtering and block lists
- Output all lemmas per synset for full synonym support
- Seed data now uncurated; decks handle selection

Architecture:
- Separate concerns: raw OMW data in DB, curation in decks
- Enables user-created decks and multiple difficulty levels
- Rooms select vocabulary by choosing a deck
2026-03-27 16:53:26 +01:00
lila
e9e750da3e setting up python env, download word data 2026-03-26 11:41:46 +01:00
lila
a4a14828e8 no isPrimary 2026-03-26 10:11:25 +01:00
lila
c1b90b9643 chore: complete phase 0 - update decisions.md and mark phase complete 2026-03-26 09:51:03 +01:00
lila
5561d54a24 feat(infra): add docker-compose and dockerfiles for all services 2026-03-26 09:43:39 +01:00
lila
2ebf0d0a83 infra: add Docker Compose setup for local development
- Configure PostgreSQL 18 and Valkey 9.1 services
- Create multi-stage Dockerfiles for API and Web apps
- Set up pnpm workspace support in container builds
- Configure hot reload via volume mounts for both services
- Add healthchecks for service orchestration
- Support dev/production stage targets (tsx watch vs compiled)
2026-03-25 18:56:04 +01:00
lila
671d542d2d chore(db): add drizzle migration pipeline with empty schema 2026-03-24 11:04:40 +01:00