Version at: 29/04/2013, 03:02

# Using the Tatoeba Corpus for Your Own Projects

## Terms of Use

* Read the [Terms of Use](http://tatoeba.org/eng/terms_of_use).

## Warning: The Tatoeba Corpus is not error-free.

* Due to the nature of a public collaborative project, this data will never be 100% free of errors.
* Be aware of the following:
   * We allow non-native speakers to contribute in languages they are learning.
   * We ask our members not to change archaic language to something that currently sounds natural.
   * We allow our members to submit book titles and other things you might not consider sentences.
   * Translations may not always be accurate, even when each linked sentence is itself a correct sentence.

## Suggestions for Those Planning to Use the Corpus

* Don't use the whole corpus as-is. Filter out obviously suspect items first, such as those tagged @need native check, @change, archaic, or non-sentence. [Browse Tags](http://tatoeba.org/eng/tags/view_all) to find others.
* You may want to eliminate all sentences not "owned" by native speakers.  However, even this will not guarantee perfect data.
* You should inform your audience that the data may contain errors and explain what steps you have taken to help minimize the errors.
* Since corrections are being made all the time, you should frequently update your project so your audience benefits from these corrections.




Version at: 29/04/2013, 03:11

# Using the Tatoeba Corpus for Your Own Projects

## Terms of Use

* Read the [Terms of Use](http://tatoeba.org/eng/terms_of_use).

## Warning: The Tatoeba Corpus is not error-free.

* Due to the nature of a public collaborative project, this data will never be 100% free of errors.
* Be aware of the following:
   * We allow non-native speakers to contribute in languages they are learning.
   * We ask our members not to change archaic language to something that currently sounds natural.
   * We allow our members to submit book titles and other things you might not consider sentences.
   * Translations may not always be accurate, even when each linked sentence is itself a correct sentence.

## Suggestions for Those Planning to Use the Corpus

* Don't use the whole corpus as-is. Filter out obviously suspect items first, such as those tagged @need native check, @change, archaic, or non-sentence. [Browse Tags](http://tatoeba.org/eng/tags/view_all) to find others.
* You may want to eliminate all sentences not "owned" by native speakers.  However, even this will not guarantee perfect data.
* You should inform your audience that the data may contain errors and explain what steps you have taken to help minimize the errors.
* Since corrections are being made all the time, you should frequently update your project so your audience benefits from these corrections.
* If you are creating materials for people studying a foreign language, you might want to use only sentences you have personally proofread, to help make sure you are not teaching people a mistake.
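
The two filtering suggestions above (dropping items with suspect tags and keeping only sentences owned by native speakers) could be sketched roughly as follows. This is only an illustration: the `Sentence` record, its field names, and the `native_langs` mapping are hypothetical and do not describe any official Tatoeba data format, and the tag spellings are copied from the list above. You would need to adapt it to however you load the corpus.

```python
from dataclasses import dataclass, field

# Tag names copied from the suggestions above; how they are spelled
# in any actual data export is an assumption.
SUSPECT_TAGS = {"@need native check", "@change", "archaic", "non-sentence"}


@dataclass
class Sentence:
    """Hypothetical in-memory record, not an official Tatoeba schema."""
    sentence_id: int
    lang: str
    text: str
    owner: str
    tags: set = field(default_factory=set)


def filter_sentences(sentences, native_langs, suspect_tags=SUSPECT_TAGS):
    """Return sentences that carry no suspect tag and whose owner is
    listed as a native speaker of the sentence's language.

    native_langs maps a username to the set of language codes that
    user is assumed to speak natively (hypothetical structure).
    """
    kept = []
    for s in sentences:
        if s.tags & suspect_tags:
            continue  # at least one suspect tag is present
        if s.lang not in native_langs.get(s.owner, set()):
            continue  # owner is not a known native speaker of this language
        kept.append(s)
    return kept
```

As noted above, even a filter like this will not guarantee perfect data. It only removes the most obviously suspect items.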
