- Add stage-1-extract/scripts/extract.ts: streams Kaikki JSONL, filters to supported POS and languages, skips abbreviations and senses with no translations in supported languages
- Rewrite db/import.ts for Kaikki flat model: tracks sense_index offsets per headword+pos to handle duplicate JSONL entries
- Rewrite db/schema.sql for Kaikki model: entries, translations, LLM vote tables, resolved tables
- Add extract and db:import scripts to package.json
- Sample mode hardcoded to 500 entries for development

||
|---|---|---|
| import.ts | ||
| index.ts | ||
| init.ts | ||
| pipeline.db-shm | ||
| pipeline.db-wal | ||
| schema.sql | ||
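
The commit message mentions tracking sense_index offsets per headword+pos so that duplicate JSONL entries do not collide. A minimal sketch of that idea follows. The `ExtractedEntry` and `IndexedSense` shapes and the `assignSenseIndexes` helper are hypothetical names chosen for illustration, not the actual code in import.ts.

```typescript
// Hypothetical shape of one extracted JSONL entry; the field names are
// assumptions, not the real schema used by db/import.ts.
interface ExtractedEntry {
  headword: string;
  pos: string;
  senses: string[]; // glosses, in the order they appear in the JSONL line
}

// Hypothetical row shape for a sense after indexing.
interface IndexedSense {
  headword: string;
  pos: string;
  senseIndex: number;
  gloss: string;
}

// Assigns a running sense_index per headword+pos, so a second JSONL line for
// the same headword and POS continues the numbering instead of restarting at 0.
function assignSenseIndexes(entries: ExtractedEntry[]): IndexedSense[] {
  const nextIndex = new Map<string, number>();
  const out: IndexedSense[] = [];

  for (const entry of entries) {
    // NUL separator keeps "ab" + "c" distinct from "a" + "bc".
    const key = `${entry.headword}\u0000${entry.pos}`;
    let offset = nextIndex.get(key) ?? 0;

    for (const gloss of entry.senses) {
      out.push({
        headword: entry.headword,
        pos: entry.pos,
        senseIndex: offset,
        gloss,
      });
      offset += 1;
    }

    // Remember where the next duplicate entry should resume.
    nextIndex.set(key, offset);
  }

  return out;
}
```

With two "bank"/"noun" entries holding two senses and one sense respectively, the third sense would receive index 2 rather than colliding with index 0. A composite key in a `Map` keeps the bookkeeping in memory during a single import pass; whether the real importer does this in memory or in SQL is not stated in the commit message.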