docs: update roadmap — stage 1 in progress, sample extraction complete
parent ba2635e3f7
commit b5a76ee178
1 changed file with 11 additions and 7 deletions
@@ -314,9 +314,9 @@ These are not part of the current pipeline but are worth considering as the data
 
 ## Roadmap
 
-**Current state:** Data source migrated from OMW to Kaikki. Production schema and pipeline being rewritten on `feat/kaikki-vocabulary-schema`. Pipeline infrastructure (orchestrator, db init, reporting, tests) is in place and carries forward.
+**Current state:** Production schema migrated to Kaikki flat model. Stage 1 extraction scripts written and sample run complete (500 entries per language). pipeline.db initialised and imported with sample data. Stage 2 reverse link sync not yet written. llama.cpp not installed.
 
-**Next action:** Rewrite production schema in `packages/db`, then rewrite pipeline extraction stage for Kaikki.
+**Next action:** Write the stage 2 reverse link sync script.
 
 | Stage | Status |
 | --------------- | -------------- |
@@ -328,12 +328,16 @@ These are not part of the current pipeline but are worth considering as the data
 | 5. Compare / QA | 🔲 not started |
 | 6. Sync | 🔲 not started |
 
-### Stage 1 — Extract `🔲 not started`
+### Stage 1 — Extract `🔄 in progress`
 
-- [ ] Download Kaikki JSONL files for all 5 languages
-- [ ] Write extraction script
-- [ ] Write stage 1 validation tests
-- [ ] Run extraction → `pipeline.db`
+- [x] Download Kaikki JSONL files for all 5 languages
+- [x] Write extraction script
+- [x] Write stage 1 validation tests
+- [x] Write db schema, init, and import scripts
+- [x] Write db import validation tests
+- [x] Run sample extraction → `stage-1-extract/output/{lang}.json`
+- [ ] Remove sample limit and run full extraction
+- [ ] Re-run full import → `pipeline.db`
 
 ### Stage 2 — Reverse link sync `🔲 not started`
 