Notice

This page shows a previous version of the article.

How to request a new language to be added

1) Contact Trang (by <a href="mailto:team@tatoeba.org">email</a> or private message) and indicate in the title the language(s) that you would like us to add.

IMPORTANT: We cannot add your language if it does not have an ISO 639-3 code. At this point we already have a lot of languages to deal with, and it's a bit too complicated to deal with languages that are not "officially" recognized.

2) In your email, tell us what icon we can use for each requested language. It does not necessarily have to be the flag of a country. We just want a picture that people can easily associate with the language. If you don't have a specific image, give Pharamp some ideas of what she could use to represent the language (a color, a symbol, an animal). Give all the ideas you have so that we can design a nice icon for the language. Keep in mind that our icons are only 30x20 pixels.

You do not have to create the icon yourself. For graphical consistency, it's better that we do it. Simply send us (or link to) an image from which we can create the icon.

3) Translate 5 sentences into your language(s). Don't worry if the language detection fails. For instance, your sentence may be detected as Hungarian even though it is not Hungarian. This is not a problem. If it happens, click on the language icon and select "other language" in the list that appears (it's the first option). You will be able to set the correct language later, once it has been added to Tatoeba.

4) Create a public list and name it with the name of your language. Add your 5 translations into that list. Feel free to translate more than 5 sentences, but always add them into the list, so that we can easily find them!

5) Don't hesitate to ask us if you don't know how to do something!

IMPORTANT: We will only add your language(s) if you have done all of this. And please forgive us if we take forever despite your going through all these steps.

Procedure for developers

Source: https://www.assembla.com/spaces/tatoeba2/wiki/Adding_a_language_in_Tatoeba

In the source code

app/model/sentence.php Add the language ISO code in the $validate array. Languages that are not part of this array are not allowed.

app/controllers/components/google_language_api.php Add the corresponding case in the google2TatoebaCode() method, if Google supports the detection for the language. See the Language enum.

app/views/helpers/languages.php Add the language ISO code and the name in the languagesArray() method.

app/webroot/img/ Add an icon for the new language. Dimensions 30 x 20. Format png. Modify luminosity so that it looks a bit more pale than the original and add a 1 pixel border on right and bottom (color #dcdcdc).

docs/generate_sphinx_conf.php Add the language ISO code and name in the $languages array. Also add the ISO code in the $cjkLanguages array if the language uses Chinese, Japanese or Korean characters.
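
The icon dimensions required above (30 x 20, PNG) can be checked before committing. What follows is a minimal shell sketch, not part of the Tatoeba code base. The function name png_size is hypothetical, and the sketch assumes the file really is a PNG, whose header stores the width and height as two 4-byte big-endian integers starting at byte offset 16.

```shell
#!/bin/sh
# Hypothetical helper: prints "WIDTHxHEIGHT" for a PNG file.
# Assumption: the file is a valid PNG; the signature is not verified.
png_size() {
    # Read the 8 bytes of the IHDR chunk that hold width and height,
    # as unsigned decimal byte values.
    set -- $(od -An -tu1 -j16 -N8 "$1")
    [ $# -eq 8 ] || return 1
    width=$(( (($1 * 256 + $2) * 256 + $3) * 256 + $4 ))
    height=$(( (($5 * 256 + $6) * 256 + $7) * 256 + $8 ))
    echo "${width}x${height}"
}
```

For a correctly sized icon, calling png_size on the file under app/webroot/img/ should print 30x20. The check covers only the dimensions; the paler luminosity and the #dcdcdc border still have to be verified by eye.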

In your local Tatoeba

  • Connect to mysql and select the database.
  • If you haven't done it yet, run the following script: docs/database/scripts/add_new_language.sql. It will create a procedure to easily add a new language and do the necessary updates in the database.
  • CALL add_new_language(iso_code, list_id, tag_name);
  • Read the comments in add_new_language.sql for examples of how to call the procedure.
  • Test that the language detection works (or can work) by adding a sentence with 'auto-detect'. Tatoeba should already have a list of sentences in the new language, named after that language.
  • Test that you can change the language of a sentence into the language in question.
  • Check that the count displays properly in the languages stats.
  • If it's all fine, commit and refer to ticket #225 in your commit comment (=> re #225), indicating the languages that were added.
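
The CALL step in the list above can be sketched as a small shell helper that assembles the statement and rejects obviously malformed arguments. This is only an illustration: the function name build_add_language_call is hypothetical, and the sketch assumes that iso_code and tag_name are string arguments and list_id is an integer. The comments in add_new_language.sql remain the reference for the actual argument types.

```shell
#!/bin/sh
# Hypothetical helper: prints a CALL statement for add_new_language.
# Assumption: iso_code and tag_name are strings, list_id is an integer.
build_add_language_call() {
    iso=$1
    list_id=$2
    tag=$3
    # ISO 639-3 codes are exactly three lowercase letters.
    # This checks the format only, not whether the code is registered.
    if ! printf '%s\n' "$iso" | LC_ALL=C grep -Eq '^[a-z]{3}$'; then
        echo "invalid ISO 639-3 code: $iso" >&2
        return 1
    fi
    # The list id must be a non-empty string of digits.
    case $list_id in
        ''|*[!0-9]*) echo "invalid list id: $list_id" >&2; return 1 ;;
    esac
    # Double any single quote so the SQL string literal stays valid.
    escaped_tag=$(printf '%s' "$tag" | sed "s/'/''/g")
    printf "CALL add_new_language('%s', %s, '%s');\n" "$iso" "$list_id" "$escaped_tag"
}
```

For the Cornish request (ISO code cor, list 1199), and assuming the tag is simply named Cornish, `build_add_language_call cor 1199 Cornish` prints `CALL add_new_language('cor', 1199, 'Cornish');`, which can then be pasted into the mysql prompt.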

On the dev

  • Go to the 'dev' directory.
  • svn up
  • Connect to the mysql database of the dev version.
  • CALL add_new_language(iso_code, list_id, tag_name);
  • Test the same things you tested locally.

On the prod

  • If everything is fine on the dev, go to the 'prod' directory.
  • htop
  • Check that the load is below 2.
  • svn up
  • Connect to the mysql database of the prod version.
  • CALL add_new_language(iso_code, list_id, tag_name);
  • exit
  • Check that the sentences from the list and the tag now have the appropriate icon.
  • Check that the language appears in the languages stats.
  • cp /usr/local/etc/sphinx.conf /usr/local/etc/sphinx.conf.old
  • php generate_sphinx_conf.php > /usr/local/etc/sphinx.conf
  • Change the necessary things in the new config file (user, password, database and port). Look at the old conf file for reference.
  • indexer --all --rotate & disown
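
The load check and the Sphinx configuration steps above can be sketched in shell as follows. This is a hedged illustration rather than the procedure actually used on the server: the function names load_is_below and regenerate_conf are hypothetical. Unlike the plain redirect in the list, the sketch writes the generator output to a temporary file first, so a failing generator does not leave an empty configuration behind.

```shell
#!/bin/sh
# Hypothetical helpers; only the paths and the threshold of 2 come
# from the steps above.

# Succeeds when the load value ($1) is strictly below the maximum ($2).
load_is_below() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit (load + 0 < max + 0) ? 0 : 1 }'
}

# Backs up CONF to CONF.old, then replaces CONF with the output of the
# given generator command, but only if that command succeeds.
regenerate_conf() {
    conf=$1
    shift
    cp "$conf" "$conf.old" || return 1
    if "$@" > "$conf.new"; then
        mv "$conf.new" "$conf"
    else
        rm -f "$conf.new"
        return 1
    fi
}

# Example (not executed here), with 1.4 being the load read from htop:
#   load_is_below 1.4 2 &&
#       regenerate_conf /usr/local/etc/sphinx.conf php generate_sphinx_conf.php
```

The user, password, database and port in the regenerated file still have to be edited by hand, using sphinx.conf.old for reference, before running the indexer.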

Pending requests

Cornish

  • Requested by: Sisial, Gulliver
  • ISO code: cor
  • icon: http://www.stratton-gardens.co.uk/flag.jpg
  • list: http://tatoeba.org/sentences_lists/show/1199

Azerbaijani

  • Requested by: LanguageExpert, Imp
  • ISO code: ?
  • icon: http://en.wikipedia.org/wiki/File:Flag_of_Azerbaijan.svg
  • list: 1897911, 1897912, 1897913, 499687, 499684

Khmer

  • Requested by: LanguageExpert
  • ISO code: khm
  • icon: http://www.senojflags.com/images/country-flag-icons/Cambodia-Flag.png
  • list: 1897914, 1897915, 1897916, 1897917, 1897918, http://tatoeba.org/eng/sentences_lists/show/765

Haitian Creole

  • Requested by: LanguageExpert
  • ISO code: ?
  • icon: ?
  • list: 1897919, 1897920, 1897921, 1897923, 1387888

Lao

  • Requested by: LanguageExpert
  • ISO code: ?
  • icon: ?
  • list: 1897933, 1897940, 1897943, 1897946, 1897951

Sinhala

  • Requested by: honolulu
  • ISO code: sin
  • icon: Use Sri Lankan national flag, as this is the most frequently associated icon with this language.
  • list: ?

Sango

  • Requested by: ambroise
  • ISO code: sag
  • icon: http://www.pays-monde.fr/continent-afrique-1/drapeau-national-republique-centrafricaine-41.html
  • list: ?

Turkmen

  • Requested by: Imp
  • ISO code: tuk
  • icon: http://en.wikipedia.org/wiki/File:Flag_of_Turkmenistan.svg
  • list: ?

Tajik

  • Requested by: Imp
  • ISO code: tgk
  • icon: http://en.wikipedia.org/wiki/File:Flag_of_Tajikistan.svg
  • list: ?