- Add stage-1-extract/scripts/extract.ts: streams Kaikki JSONL, filters to supported POS and languages, and skips abbreviations and senses with no translations in supported languages
- Rewrite db/import.ts for the Kaikki flat model: tracks sense_index offsets per headword+pos to handle duplicate JSONL entries
- Rewrite db/schema.sql for the Kaikki model: entries, translations, LLM vote tables, resolved tables
- Add extract and db:import scripts to package.json
- Sample mode is hardcoded to 500 entries for development

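
The sense_index offset tracking mentioned for db/import.ts can be illustrated with a minimal sketch. It is not the code from this commit: the `Entry` and `SenseRow` shapes, the field names, and the `assignSenseIndexes` helper are all hypothetical, and it assumes a duplicate JSONL entry is one that repeats the same headword and POS. The idea is to remember, per headword+pos key, the next free index, so that senses from a later duplicate continue the numbering instead of restarting at 0.

```typescript
// Hypothetical shapes for illustration only; not the repository's actual types.
interface Entry {
  word: string;
  pos: string;
  senses: { gloss: string }[];
}

interface SenseRow {
  headword: string;
  pos: string;
  senseIndex: number;
  gloss: string;
}

// Assign sense indexes that keep counting across duplicate (headword, pos) entries.
function assignSenseIndexes(entries: Entry[]): SenseRow[] {
  // Next unused sense index for each headword+pos key.
  const nextIndex = new Map<string, number>();
  const rows: SenseRow[] = [];

  for (const entry of entries) {
    // JSON-encode the pair so "a b" + "c" can never collide with "a" + "b c".
    const key = JSON.stringify([entry.word, entry.pos]);
    let offset = nextIndex.get(key) ?? 0;

    for (const sense of entry.senses) {
      rows.push({
        headword: entry.word,
        pos: entry.pos,
        senseIndex: offset,
        gloss: sense.gloss,
      });
      offset += 1;
    }

    // Remember where the next duplicate of this key should resume.
    nextIndex.set(key, offset);
  }

  return rows;
}

// Usage: the second "bank"/"noun" entry resumes at index 2.
const demoRows = assignSenseIndexes([
  { word: "bank", pos: "noun", senses: [{ gloss: "financial institution" }, { gloss: "river edge" }] },
  { word: "bank", pos: "verb", senses: [{ gloss: "to deposit money" }] },
  { word: "bank", pos: "noun", senses: [{ gloss: "row of switches" }] },
]);
console.log(demoRows.map((r) => `${r.headword}/${r.pos}#${r.senseIndex}`).join(", "));
// prints "bank/noun#0, bank/noun#1, bank/verb#0, bank/noun#2"
```

Keeping the counter in an in-memory map works for a single import run; if imports were resumed across runs, the offsets would presumably need to be read back from the database instead.
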
| Name |
|---|
| db |
| sample |
| stage-1-extract/scripts |
| stage-3-enrich |
| .env.example |
| audit.ts |
| package.json |
| pipeline.ts |
| tsconfig.json |
| vitest.config.ts |