Welcome to BOUQuET 💐, the Benchmark and Open-initiative for Universal Quality Evaluation in Translation.
Let's make machine translation available for any written language!
Please take part in shaping the future: your help will be greatly appreciated.
We invite everyone to contribute to BOUQuET 💐, a project that aims to build an open-source evaluation dataset for massively multilingual text-to-text machine translation systems.
You are welcome to contribute translations into your language, working from whichever source language you are most comfortable with, including English, Egyptian Arabic, Mandarin Chinese, German, French, Hindi, Indonesian, Russian, or Spanish. Please read the Contributor guidelines, which explain how to proceed.
You can find more details on the scientific context and purpose of BOUQuET 💐 in the BOUQuET paper. An extensive example of using it for benchmarking can be found in the Omnilingual MT paper.
Dataset
The dataset is accessible at https://huggingface.co/datasets/facebook/bouquet. We will update it regularly as contributions in new languages are completed and validated.
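As a rough illustration of how the dataset might be pulled from the Hub, here is a minimal sketch using the Hugging Face `datasets` library. The repository id `facebook/bouquet` comes from the URL above. The helper names (`dataset_url`, `list_configs`, `list_splits`, `load_split`) are hypothetical and exist only for this example. Because the configuration and split names are not stated here, the sketch queries them from the Hub rather than assuming any.

```python
# Minimal sketch, not an official loader. Requires `pip install datasets`
# and network access for everything except `dataset_url`.

DATASET_ID = "facebook/bouquet"  # repository id taken from the dataset URL


def dataset_url(dataset_id=DATASET_ID):
    """Return the Hugging Face Hub page for a dataset repository id."""
    return f"https://huggingface.co/datasets/{dataset_id}"


def list_configs(dataset_id=DATASET_ID):
    """List the configurations currently published for the dataset."""
    from datasets import get_dataset_config_names
    return get_dataset_config_names(dataset_id)


def list_splits(config_name, dataset_id=DATASET_ID):
    """List the splits available for one configuration."""
    from datasets import get_dataset_split_names
    return get_dataset_split_names(dataset_id, config_name)


def load_split(config_name, split_name, dataset_id=DATASET_ID):
    """Load one split of one configuration.

    Both names should come from `list_configs` and `list_splits`;
    this sketch does not assume what they are called.
    """
    from datasets import load_dataset
    return load_dataset(dataset_id, config_name, split=split_name)
```

A typical session would call `list_configs()` first, pick one of the returned names, pass it to `list_splits(...)`, and then call `load_split(...)` with the chosen pair. Since the dataset is updated regularly, querying the names at run time is likely more robust than hard-coding them.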
Leaderboard
To see how the various translation systems perform on BOUQuET, refer to the "Leaderboard" tab!
If you want another system evaluated, please open a discussion in the "Community" tab, or evaluate it yourself using the code at https://github.com/facebookresearch/bouquet.
Contribute
If you want to contribute dataset translations for a new language or validate existing translations, check out our crowdsourcing system: https://bouquet.metademolab.com.
License
The dataset collected by the BOUQuET initiative and your contributions to this dataset will be released under the Creative Commons Attribution 4.0 license. Full text: https://choosealicense.com/licenses/cc-by-4.0/.
References
- [Omnilingual MT Team et al., 2025] Omnilingual MT Team, BOUQuET 💐: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation, ArXiv, 2025
- [Omnilingual MT Team et al., 2026] Omnilingual MT Team, Omnilingual MT: Machine Translation for 1,600 Languages, ArXiv, 2026