Commit graph

3 commits

Author SHA1 Message Date
lila
3374bd8b20 feat(scripts): add Italian CEFR data pipeline
- Add extractors for Italian sources: it_m3.xls and italian.json
- Add comparison script (compare-italian.py) to report source overlaps and conflicts
- Add merge script (merge-italian-json.py) with priority order ['italian', 'it_m3']
- Output authoritative dataset to datafiles/italian-merged.json
- Update README to document both English and Italian pipelines
2026-04-08 18:32:03 +02:00
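The comparison and merge steps described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in compare-italian.py or merge-italian-json.py: the function names, and the assumption that each extracted source maps a word to a single CEFR level, are hypothetical. Only the priority order ['italian', 'it_m3'] comes from the commit message.

```python
# Hypothetical sketch: function names and data shapes are assumptions,
# not taken from the repository's scripts.

# Priority order as stated in the commit message (earlier entries win).
PRIORITY = ["italian", "it_m3"]


def compare_sources(sources):
    """Report words found in more than one source and words whose levels disagree.

    `sources` is assumed to map a source name to a dict of word -> CEFR level.
    Returns (overlaps, conflicts), each mapping word -> {source name: level}.
    """
    seen = {}
    for name, entries in sources.items():
        for word, level in entries.items():
            seen.setdefault(word, {})[name] = level
    # Overlap: the word appears in at least two sources.
    overlaps = {w: lv for w, lv in seen.items() if len(lv) > 1}
    # Conflict: an overlapping word with differing levels.
    conflicts = {w: lv for w, lv in overlaps.items() if len(set(lv.values())) > 1}
    return overlaps, conflicts


def merge_by_priority(sources, priority=PRIORITY):
    """Merge sources so that the first source in `priority` wins on conflict."""
    merged = {}
    for name in priority:
        for word, level in sources.get(name, {}).items():
            # setdefault keeps the level from the higher-priority source.
            merged.setdefault(word, level)
    return merged


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = {
        "italian": {"casa": "A1", "cane": "A1"},
        "it_m3": {"casa": "A2", "cane": "A1", "gatto": "A1"},
    }
    overlaps, conflicts = compare_sources(sample)
    print("overlaps:", sorted(overlaps))
    print("conflicts:", sorted(conflicts))
    print("merged:", merge_by_priority(sample))
```

Under these assumptions, a word present in both sources takes its level from italian, and it_m3 only contributes words that italian lacks.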
lila
59152950d6 extraction, comparison and merging scripts for English are done; final english.json exists 2026-04-08 17:50:25 +02:00
lila
3596f76492 extraction datafiles with CEFR annotations 2026-04-08 13:09:47 +02:00