Welcome to BOUQuET 💐, Benchmark and Open-initiative for Universal Quality Evaluation in Translation.

Let's make machine translation available for any written language!

Please take part in shaping the future - your help will be greatly appreciated.

We invite everyone to contribute to BOUQuET 💐, a project aimed at building an open-source evaluation dataset for massively multilingual text-to-text machine translation systems.

You are welcome to contribute translations into your language, working from whichever source language you are most comfortable with: English, Egyptian Arabic, Mandarin Chinese, German, French, Hindi, Indonesian, Russian or Spanish. Please read the Contributor guidelines, which explain how to proceed.

You can find more details on the scientific context and purpose of BOUQuET 💐 in the paper [Omnilingual MT Team et al., 2025].

Dataset

The dataset is accessible at https://huggingface.co/datasets/facebook/bouquet. We will update it regularly as contributions in new languages are completed and validated.
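
For readers who prefer to work with the data programmatically, the following is a minimal sketch of loading it with the Hugging Face `datasets` library. Only the repository id `facebook/bouquet` is taken from the URL above. The helper name `load_bouquet` is hypothetical, and the configuration names, splits and columns are not described on this page, so the sketch lists the configurations instead of assuming one. Depending on the repository settings, you may also need to log in to the Hub first.

```python
# Minimal sketch: load BOUQuET from the Hugging Face Hub.
# REPO_ID comes from the dataset URL; everything else is illustrative.
REPO_ID = "facebook/bouquet"


def load_bouquet(config_name=None):
    """Load one configuration of the dataset (hypothetical helper)."""
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import get_dataset_config_names, load_dataset

    if config_name is None:
        # Assumption: the repository exposes at least one configuration.
        available = get_dataset_config_names(REPO_ID)
        print("Available configurations:", available)
        config_name = available[0]
    return load_dataset(REPO_ID, config_name)


if __name__ == "__main__":
    data = load_bouquet()
    # Printing the result shows the splits and columns actually present.
    print(data)
```

Because the dataset is updated as new languages are validated, re-running the listing step is a simple way to check which configurations are currently available.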

Leaderboard

To see how the various translation systems perform on BOUQuET, refer to the "Leaderboard" tab!

If you want another system evaluated, please open a discussion in the "Community" tab.

Contribute

If you want to contribute dataset translations for a new language or validate existing translations, check out our crowdsourcing system: https://bouquet.metademolab.com.

Reference

  • [Omnilingual MT Team et al., 2025] Omnilingual MT Team, BOUQuET 💐: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation, arXiv, 2025