- Extract.ts now processes all 5 language files, filters entries in the
non-English files by lang_code, and skips translation extraction for
non-English entries (their source files contain no translations)
- Import.ts now imports all 5 language output files and uses the language
field from ExtractedSense instead of hardcoding "en"
- Sample limit hardcoded to 500 entries per language for development
- Remove foreign key on run_status.source_id to support sentinel rows
for tracking one-time pipeline steps (compile_candidates, compile_votes,
merge, compare)
- Add stage-3-enrich/config.ts with all provider configurations,
ALL_PROVIDERS ordered local-first, and validateProviderKey() for
startup key checks
- Add .env.example with required API keys for OpenRouter and Anthropic
- Add pipeline:run script to package.json using --env-file .env
- Extend root .gitignore to cover data-pipeline/.env
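
For reviewers, a minimal sketch of how the local-first ordering and the
startup key check in stage-3-enrich/config.ts might fit together. The
ProviderConfig shape, the provider names, and the OPENROUTER_API_KEY /
ANTHROPIC_API_KEY variable names are assumptions for illustration; only
ALL_PROVIDERS and validateProviderKey() are named in this change, and
their real signatures may differ.

```typescript
// Hypothetical provider shape; the real config.ts may carry more fields.
interface ProviderConfig {
  name: string;
  local: boolean; // true when the provider runs on this machine
  apiKeyEnv?: string; // env var holding the API key; omitted for local providers
}

// Illustrative entries only, deliberately listed out of order.
const PROVIDERS: ProviderConfig[] = [
  { name: "openrouter", local: false, apiKeyEnv: "OPENROUTER_API_KEY" },
  { name: "local-model", local: true },
  { name: "anthropic", local: false, apiKeyEnv: "ANTHROPIC_API_KEY" },
];

// Local providers first. Array.prototype.sort is stable, so the original
// order is preserved within the local and remote groups.
const ALL_PROVIDERS: ProviderConfig[] = [...PROVIDERS].sort(
  (a, b) => Number(b.local) - Number(a.local),
);

// Throws when a provider needs a key and the environment lacks it.
// The environment is passed in so the check is easy to test.
function validateProviderKey(
  provider: ProviderConfig,
  env: Record<string, string | undefined>,
): void {
  if (provider.apiKeyEnv === undefined) return; // local provider, no key needed
  const key = env[provider.apiKeyEnv];
  if (key === undefined || key.trim() === "") {
    throw new Error(
      `Missing ${provider.apiKeyEnv} for provider "${provider.name}"`,
    );
  }
}
```

At startup, a caller could loop over ALL_PROVIDERS and pass process.env
as the second argument, so a missing key fails fast before any pipeline
step runs.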