Notice

This page shows a previous version of the article.


FAQ

What should I do if I have trouble logging in?

Try clicking the "Remember Me" checkbox before you type in your username and password. If you still have problems logging in when you do that, try clearing your browser's cache and cookies.

If you still have problems, send an email to Team Tatoeba (team@tatoeba.org) telling us your browser and other details that might be important.

Why are some translations in grey?

Grey translations are indirect translations. In other words, they are translations of the translations, and not translations of the main sentence (the main sentence is the sentence in big letters).

We display them because they can be useful, but you should be careful. Their meaning may differ a little from the main sentence.

Why do I not see all the translations I expect to see?

If you have listed a series of language codes in your settings, Tatoeba will only display translations in the languages you indicated. Leave the field empty to display translations in all languages.

Why are some sentences in red?

Red sentences are not approved. They raise copyright issues or are otherwise problematic.

You should not translate them.

A sentence is not marked with the right language. How do I fix it?

Click on the language icon to the right of the sentence and select the correct language from the list.

How can I add tags to a sentence?

To add tags, you must be an advanced contributor.

=> See article: advanced-contributors

How can I link or unlink sentences?

To link or unlink sentences, you must be an advanced contributor.

=> See article: advanced-contributors

How can I become an advanced contributor?

=> See article: advanced-contributors

How can I help translate the website?

=> See article: interface-translation

How can I request a new language?

=> See article: new-language-request

How do I contribute audio to Tatoeba?

=> See article: contribute-audio

When contributing in Chinese, should I use simplified or traditional characters?

You can use whichever you like. We have a tool that will automatically convert simplified into traditional, and traditional into simplified.

When browsing sentences, if you set the Chinese sentence as the main sentence, you will see an additional icon at the top of the sentence.

  • traditional
  • simplified

Below each Chinese sentence, you will also see the transcription in pinyin, and below the pinyin, the conversion into simplified or traditional.

You can browse the Chinese sentences to see what they look like.

How do I delete my account?

=> See article: delete-account

Does Tatoeba provide an API?

No, it does not (yet).

We unfortunately do not have the proper infrastructure to host a public API. Nonetheless, please do not hesitate to contact us to let us know that you would be interested in this.

With more and more people asking us, we will eventually start something and we will be happy to hear more details about the needs of your application/project.

Meanwhile, what you can do is download our sentences from the Downloads page, then build your own API from there.

I would like to use Tatoeba's data for my project. How do I give proper attribution?

For the textual data

You just need to state somewhere that some or all of your sentences are from Tatoeba, link to https://tatoeba.org, and mention that Tatoeba's data is released under CC-BY 2.0 FR.

Here's an example of good attribution: https://www.clozemaster.com/about#where-are-the-sentences-from

For the audio data

Our audio corpus has a wider range of licenses and isn't just restricted to CC-BY. You should therefore be more careful about which audio you are using, especially if your project/app is commercial.

You can check the license of each audio recording from the file we release under "Sentences with audio" on our Downloads page.

We recommend that you mention the username of each member whose audio you are reusing, as well as the license they chose.

Here's an example of attribution:

All the audio comes from Tatoeba (https://tatoeba.org), more specifically from the following members of Tatoeba:
  - userA (license: CC-BY-SA)
  - userB (license: CC-BY-NC)
  - userC (license: CC-BY)
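
If you reuse audio from many members, you may prefer to generate this notice rather than write it by hand. Here is a minimal sketch in Python. The function name and the (username, license) input format are only assumptions for illustration, not something Tatoeba provides.

```python
def format_audio_attribution(contributors):
    """Build an attribution notice from (username, license) pairs.

    `contributors` is a hypothetical input: any iterable of
    (username, license) tuples that you collected yourself.
    """
    lines = [
        "All the audio comes from Tatoeba (https://tatoeba.org), "
        "more specifically from the following members of Tatoeba:"
    ]
    # Deduplicate and sort so that each member/license pair is listed once.
    for username, license_name in sorted(set(contributors)):
        lines.append(f"  - {username} (license: {license_name})")
    return "\n".join(lines)
```

You would call it with the usernames and licenses of the recordings you actually use, then paste the result into your "About" or "Credits" page.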

Where can I download Tatoeba's audio data?

Currently the only way you can download audio is by fetching each audio file one by one. We don't have one big ZIP file that contains all our audio.

We only have one ZIP file, for English: tatoeba_audio_eng.zip (3.8 GB). It was generated back in November 2017 at the request of the Common Voice project, which wanted to mention our data on its Datasets page.

If you'd like something more up-to-date or in other languages, you would have to do some scripting, using the files on our Downloads page, more specifically under the sections:

  • "Sentences with audio": to get the IDs of all the sentences that have audio
  • "Sentences": to know the language of each sentence

Once you have the language and the ID, the URL to download the audio file is: https://audio.tatoeba.org/sentences/{lang}/{id}.mp3

For instance: https://audio.tatoeba.org/sentences/eng/7347611.mp3

Note: if you are going to use this data in one of your projects/apps, please be mindful about the license!
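
The scripting described above could look roughly like the following Python sketch. It assumes that both files are tab-separated, that the sentence ID is the first column of each, and that the language code is the second column of the "Sentences" file. Please check the format described on the Downloads page before relying on it. The function names are hypothetical.

```python
AUDIO_URL = "https://audio.tatoeba.org/sentences/{lang}/{id}.mp3"


def read_languages(sentence_lines):
    """Map sentence ID -> language code from lines of the "Sentences" file.

    Assumes tab-separated lines: ID, language code, text.
    """
    langs = {}
    for line in sentence_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            langs[fields[0]] = fields[1]
    return langs


def audio_urls(audio_lines, langs, wanted_lang=None):
    """Yield one audio URL per sentence ID found in "Sentences with audio".

    Assumes the sentence ID is the first tab-separated column.
    IDs with no known language are skipped, and an ID that appears
    on more than one line produces a single URL.
    """
    seen = set()
    for line in audio_lines:
        sentence_id = line.rstrip("\n").split("\t")[0]
        lang = langs.get(sentence_id)
        if lang is None or sentence_id in seen:
            continue
        if wanted_lang is not None and lang != wanted_lang:
            continue
        seen.add(sentence_id)
        yield AUDIO_URL.format(lang=lang, id=sentence_id)
```

Both functions accept any iterable of lines, so you can pass them files opened with open(path, encoding="utf-8"). If you then fetch the files, do it one by one at a gentle pace, and keep the license column of "Sentences with audio" next to each file you download.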

How can I download all sentences and translations in specific languages?

We don't provide such a feature yet. You will need to write your own script to process the "Sentences" and "Links" files that can be found on our Downloads page.
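
As a starting point, here is a minimal Python sketch of such a script. It assumes that "Sentences" is tab-separated with the columns ID, language code, text, and that each line of "Links" is a tab-separated pair of sentence ID and translation ID. Please check the format described on the Downloads page. The function names are hypothetical.

```python
def load_sentences(sentence_lines, languages):
    """Return {sentence_id: (lang, text)} for the requested languages only.

    Assumes tab-separated lines: ID, language code, text.
    """
    sentences = {}
    for line in sentence_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[1] in languages:
            sentences[fields[0]] = (fields[1], fields[2])
    return sentences


def translation_pairs(link_lines, sentences, source_lang, target_lang):
    """Yield (source_text, target_text) for every link whose first ID is a
    source_lang sentence and whose second ID is a target_lang sentence.

    Assumes tab-separated lines: sentence ID, translation ID.
    """
    for line in link_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        source = sentences.get(fields[0])
        target = sentences.get(fields[1])
        if source is None or target is None:
            continue
        if source[0] == source_lang and target[0] == target_lang:
            yield source[1], target[1]
```

Only the sentences in the languages you asked for are kept in memory, which matters because the full "Sentences" file is large. The sketch reads links in one direction only. If the file you downloaded lists each link once rather than in both directions, you would also need to check the reverse order.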

If you are looking for sentences and translations from/to English, you may find what you need on this page: http://www.manythings.org/anki/

Otherwise, some people may have already written scripts that do what you want to do. You can search Google to find these scripts.

You can also check our related Google Group thread. If you have written a script that you want to share, feel free to post a reply to this thread.