Commit graph

8 commits

Author SHA1 Message Date
lila
04a581efe1 WIP: checkpoint before stage-3 sub-stage rewrite 2026-05-12 22:13:14 +02:00
lila
6f9a42c707 feat: add stage 2 reverse link sync script 2026-05-05 18:57:55 +02:00
lila
0cc643e308 feat: update extractor for all 5 languages, update import for multi-language
- Extract.ts now processes all 5 language files, filters non-English
  entries by lang_code, skips translation extraction for non-English
  (no translations in source files)
- Import.ts now imports all 5 language output files, uses language
  field from ExtractedSense instead of hardcoding en
- Sample limit hardcoded to 500 entries per language for development
2026-05-05 18:46:32 +02:00
lila
080fad1998 feat: enrich stage foundation — provider config, env setup, schema fix
- Remove foreign key on run_status.source_id to support sentinel rows
  for tracking one-time pipeline steps (compile_candidates, compile_votes,
  merge, compare)
- Add stage-3-enrich/config.ts with all provider configurations,
  ALL_PROVIDERS ordered local-first, and validateProviderKey() for
  startup key checks
- Add .env.example with required API keys for OpenRouter and Anthropic
- Add pipeline:run script to package.json using --env-file .env
- Add .env to root .gitignore coverage for data-pipeline/.env
2026-05-03 22:44:14 +02:00
lila
f59399be02 feat: add db import script, fix duplicate translations in extract, add annotate script 2026-05-03 22:05:10 +02:00
lila
4fa3073412 feat: add db schema, init, and vitest config 2026-05-03 17:56:29 +02:00
lila
9ea35568e5 updating config 2026-04-21 12:01:29 +02:00
lila
a3d19d36f6 adding the data-pipeline to ts and pnpm workspaces 2026-04-20 09:05:27 +02:00
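
Note on commit 080fad1998: the message describes a provider list ordered local-first (ALL_PROVIDERS) and a startup key check (validateProviderKey()) in stage-3-enrich/config.ts, but the log does not show the code. The following is only a minimal sketch of how such a config might look. The ProviderConfig type, the provider names, the `local` flag, the environment variable names, and the function signature are all assumptions for illustration, not taken from the repository.

```typescript
// Hypothetical sketch only. ALL_PROVIDERS and validateProviderKey are named in
// the commit message; every other name and shape here is an assumption.
type ProviderConfig = {
  name: string;
  local: boolean;      // true when the provider runs on the local machine
  apiKeyEnv?: string;  // environment variable holding the API key, if one is needed
};

// Declaration order is arbitrary; the local-first order is derived below.
const PROVIDERS: ProviderConfig[] = [
  { name: "openrouter", local: false, apiKeyEnv: "OPENROUTER_API_KEY" },
  { name: "local-model", local: true },
  { name: "anthropic", local: false, apiKeyEnv: "ANTHROPIC_API_KEY" },
];

// Local providers sort ahead of remote ones. Array.prototype.sort is stable
// (ES2019), so providers within each group keep their declaration order.
const ALL_PROVIDERS: ProviderConfig[] = [...PROVIDERS].sort(
  (a, b) => Number(b.local) - Number(a.local),
);

// Throws when a provider that requires a key has none set (or only whitespace).
// The environment is passed in explicitly so the check is easy to test.
function validateProviderKey(
  provider: ProviderConfig,
  env: Record<string, string | undefined>,
): void {
  if (provider.apiKeyEnv === undefined) {
    return; // local providers in this sketch need no key
  }
  const value = env[provider.apiKeyEnv];
  if (value === undefined || value.trim() === "") {
    throw new Error(
      `Missing ${provider.apiKeyEnv} for provider "${provider.name}"`,
    );
  }
}
```

Under these assumptions, a startup routine would loop over ALL_PROVIDERS and call validateProviderKey(provider, process.env) for each one, so a missing key fails fast before any pipeline step runs.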