feat(db): add incremental upsert seed script for WordNet vocabulary

Implements packages/db/src/seed.ts — reads all JSON files from
scripts/datafiles/, validates filenames against supported language
codes and POS, and upserts synsets into  and
via onConflictDoNothing. Safe to re-run; produces 0 writes on
a duplicate run.
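
The filename validation step described above could look roughly like the sketch below. It is only an illustration, not the code in seed.ts: the `<source>-<target>-<pos>.json` pattern is inferred from `en-it-nouns.json`, and the allow-lists, `SeedFileInfo`, and `parseSeedFilename` are hypothetical names.

```typescript
// Hypothetical allow-lists; the real seed.ts may support other values.
const SUPPORTED_LANGS = ["en", "it"] as const;
const SUPPORTED_POS = ["nouns"] as const;

type Lang = (typeof SUPPORTED_LANGS)[number];
type Pos = (typeof SUPPORTED_POS)[number];

interface SeedFileInfo {
  source: Lang;
  target: Lang;
  pos: Pos;
}

const isLang = (v: string): v is Lang =>
  (SUPPORTED_LANGS as readonly string[]).includes(v);
const isPos = (v: string): v is Pos =>
  (SUPPORTED_POS as readonly string[]).includes(v);

// Parses names shaped like "en-it-nouns.json" (assumed convention).
// Returns null for any name that does not match or uses unsupported values.
function parseSeedFilename(filename: string): SeedFileInfo | null {
  const match = /^([a-z]{2})-([a-z]{2})-([a-z]+)\.json$/.exec(filename);
  if (!match) return null;

  const source = match[1] ?? "";
  const target = match[2] ?? "";
  const pos = match[3] ?? "";

  if (!isLang(source) || !isLang(target) || !isPos(pos)) return null;
  if (source === target) return null; // a pair needs two distinct languages

  return { source, target, pos };
}
```

Files rejected by a check like this would be skipped before any rows are built, so only recognized language pairs and POS values reach the insert with onConflictDoNothing.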
lila 2026-03-30 15:58:01 +02:00
parent 55885336ba
commit 2b177aad5b
12 changed files with 1349 additions and 10 deletions

@@ -26,17 +26,17 @@ Done when: `GET /api/decks/1/terms?limit=10` returns 10 terms from a specific de
[x] Run `extract-en-it-nouns.py` locally → generates `datafiles/en-it-nouns.json`
-- Import ALL available OMW noun synsets (no frequency filtering)
[ ] Write Drizzle schema: `terms`, `translations`, `language_pairs`, `term_glosses`, `decks`, `deck_terms`
[ ] Write and run migration (includes CHECK constraints for `pos`, `gloss_type`)
[ ] Write `packages/db/src/seed.ts` (imports ALL terms + translations, NO decks)
[ ] Write `scripts/build_decks.ts` (reads external CEFR lists, matches to DB, creates decks)
[x] Write Drizzle schema: `terms`, `translations`, `language_pairs`, `term_glosses`, `decks`, `deck_terms`
[x] Write and run migration (includes CHECK constraints for `pos`, `gloss_type`)
[x] Write `packages/db/src/seed.ts` (imports ALL terms + translations, NO decks)
[ ] Download CEFR A1/A2 noun lists (from GitHub repos)
[ ] Write `scripts/build_decks.ts` (reads external CEFR lists, matches to DB, creates decks)
[ ] Run `pnpm db:seed` → populates terms
[ ] Run `pnpm db:build-decks` → creates curated decks
[ ] Define Zod response schemas in `packages/shared`
[ ] Implement `DeckRepository.getTerms(deckId, limit, offset)`
[ ] Implement `QuizService.attachDistractors(terms)` — same POS, server-side, no duplicates
[ ] Implement `GET /language-pairs`, `GET /decks`, `GET /decks/:id/terms` endpoints
[ ] Define Zod response schemas in `packages/shared`
[ ] Unit tests for `QuizService` (correct POS filtering, never includes the answer)
[ ] update decisions.md