updating documentation
parent
570dbff25e
commit
60cf48ef97
3 changed files with 243 additions and 31 deletions
@@ -325,6 +325,25 @@ Exercise types split naturally into Type A (translation, current model) and Type

---

### Term glosses: Italian coverage is sparse (expected)

OMW gloss data is primarily in English. After a full import:

- English glosses: 95,882 (~100% of terms)
- Italian glosses: 1,964 (~2% of terms)

This is not a data pipeline problem; it reflects the actual state of OMW. Italian
glosses simply don't exist for most synsets in the dataset.

**Handling in the UI:** fall back to the English gloss when no gloss exists for the
user's language. This is acceptable UX: a definition in the wrong language is better
than no definition at all.
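
The fallback rule above can be sketched as a small lookup. This is a minimal illustration, not the project's actual code: the function name `pick_gloss`, the `glosses` mapping keyed by language code, and the `"en"`/`"it"` codes are all assumptions made for the example. It returns the language alongside the text so the UI can label a gloss that is not in the user's language.

```python
from typing import Mapping, Optional, Tuple

# Assumed fallback language code; English has near-complete gloss coverage.
FALLBACK_LANG = "en"


def pick_gloss(
    glosses: Mapping[str, str],
    user_lang: str,
    fallback_lang: str = FALLBACK_LANG,
) -> Optional[Tuple[str, str]]:
    """Return (language, gloss text), preferring the user's language.

    `glosses` is a hypothetical mapping of language code -> gloss text
    for a single term. Missing or empty glosses are skipped.
    """
    for lang in (user_lang, fallback_lang):
        text = glosses.get(lang)
        if text:
            return lang, text
    # Neither the user's language nor the fallback has a gloss.
    return None
```

For a term with only an English gloss, `pick_gloss({"en": "a domesticated canine"}, "it")` yields the English entry tagged as `"en"`, which the UI could render with a small language marker.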

If Italian gloss coverage needs to improve in the future, Wiktionary is the most
likely source, since it has broader multilingual definition coverage than OMW.

---

## Open Research

### Semantic category metadata source