lila/data-pipeline/db
lila 0cc643e308 feat: update extractor for all 5 languages, update import for multi-language
- Extract.ts now processes all 5 language files, filters non-English
  entries by lang_code, skips translation extraction for non-English
  (no translations in source files)
- Import.ts now imports all 5 language output files, uses language
  field from ExtractedSense instead of hardcoding en
- Sample limit hardcoded to 500 entries per language for development
2026-05-05 18:46:32 +02:00
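A minimal sketch of the behaviour the commit message describes, under one possible reading: entries in non-English files are kept only when their lang_code matches the file's language, translations are collected only for English, and each language is capped at 500 entries. Only lang_code, ExtractedSense, and its language field come from the commit message; RawEntry, extractSenses, SAMPLE_LIMIT, and every other field name are hypothetical and may not match the real Extract.ts.

```typescript
// Hypothetical input shape: only `lang_code` is named in the commit message.
interface RawEntry {
  word: string;
  lang_code: string;
  translations?: { lang_code: string; word: string }[];
}

// `ExtractedSense` and its `language` field are named in the commit message;
// the remaining fields are assumptions for illustration.
interface ExtractedSense {
  word: string;
  language: string;
  translations: string[];
}

// Development-only cap per language, as stated in the commit message.
const SAMPLE_LIMIT = 500;

function extractSenses(
  entries: RawEntry[],
  fileLang: string,
  limit: number = SAMPLE_LIMIT,
): ExtractedSense[] {
  const senses: ExtractedSense[] = [];
  for (const entry of entries) {
    if (senses.length >= limit) break;
    // Non-English files: keep only entries tagged with the file's own language.
    if (fileLang !== "en" && entry.lang_code !== fileLang) continue;
    senses.push({
      word: entry.word,
      // Carry the language on the sense so the importer need not hardcode "en".
      language: fileLang,
      // Non-English source files carry no translations, so skip extraction there.
      translations:
        fileLang === "en" ? (entry.translations ?? []).map((t) => t.word) : [],
    });
  }
  return senses;
}
```

An importer written against this shape would read sense.language for each row instead of a fixed "en" value.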
import.ts feat: update extractor for all 5 languages, update import for multi-language 2026-05-05 18:46:32 +02:00
index.ts feat: add db schema, init, and vitest config 2026-05-03 17:56:29 +02:00
init.ts feat: add pipeline orchestrator skeleton with startup checks, stage runners, shutdown handler, and report generation 2026-05-03 23:01:29 +02:00
pipeline.db-shm removing db from git tracking, adding it to gitignore, add db import validation tests 2026-05-03 22:16:43 +02:00
pipeline.db-wal removing db from git tracking, adding it to gitignore, add db import validation tests 2026-05-03 22:16:43 +02:00
schema.sql feat: add Kaikki extraction and import scripts for stage 1 2026-05-05 18:11:53 +02:00