Version at: 14/02/2015, 20:57
# GSoC 2015 Project ideas
This page lists project ideas for students who would like to take part in [Google Summer of Code 2015](http://www.google-melange.com/gsoc/homepage/google/gsoc2015) and be mentored by [Tatoeba](http://tatoeba.org).
## About Tatoeba
[Tatoeba](http://tatoeba.org) is a platform that aims to build a large **database of sentences** translated into as many languages as possible. The initial idea was to have a tool in which you could search for certain words, and it would return example sentences containing these words with their translations in the desired languages. The name Tatoeba resulted from this concept, because **tatoeba** means **for example** in Japanese.
You can browse the [blog](http://blog.tatoeba.org/) or the [wiki](http://en.wiki.tatoeba.org/) for more information about the project.
## Contact
* Google group: [tatoebaproject](https://groups.google.com/forum/#!forum/tatoebaproject)
* IRC: [#tatoeba on freenode](irc://irc.freenode.net/tatoeba), [Webchat](http://webchat.freenode.net?channels=tatoeba)
* XMPP: [Tatoeba conference room on chat.tatoeba.org](xmpp:tatoeba@chat.tatoeba.org?join)
To get a feeling for the discussions taking place within the Tatoeba contributor community, visit the [Tatoeba Wall page](http://tatoeba.org/wall/index).
## How to submit ideas
If you would like to submit an idea and do not have access to the wiki, please [contact us](#contact) and send us the information below.
If you have access to the wiki, simply edit this page and add the information in the [Ideas](#ideas) section.
<pre>
### Project title
#### Description
Brief description of the project. If you have already specified a lot of things about the project, do not write all the details here. Create a separate wiki page for it and only write a summary here, with a link to that wiki page.
#### Deliverables
What the student is expected to deliver at the end of the summer.
#### Prerequisite knowledge
Technical knowledge required to be able to complete the project. If you do not know what the prerequisite knowledge is for the project you are proposing, you can leave this blank; someone else will complete it.
#### Possible mentors
People from the team that may be able to mentor that idea. You can leave this section blank if you’re a student. Please only add a mentor’s name if you are that person or if the person explicitly agrees.
</pre>
## A note for students
If you are a student and are interested in working on one of the projects listed below, note that at this stage Google has not yet chosen which organizations will participate in GSoC 2015. The list of accepted mentoring organizations will be published on [**March 2**](http://www.google-melange.com/gsoc/events/google/gsoc2015). Until that date, Tatoeba is not officially part of GSoC 2015.
Of course this should not stop you from getting started on a project ahead of time. If you do so, we recommend the following:
1. Make sure that you have read the [GSoC FAQ](http://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2015/help_page) and that you understand how the program works. Please check the [calendar](http://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2015/help_page) for the various deadlines.
2. If the project you are interested in involves implementing code in the current version of Tatoeba, [install Tatoeba on your machine](https://github.com/Tatoeba/tatoeba2), explore the code, and experiment with it.
3. If you would like to contribute code to get familiar with the project before GSoC, but don't know how to get started, you can read [this guide](guide-for-new-developers).
4. Start preparing your [proposal](http://en.flossmanuals.net/GSoCStudentGuide/ch008_writing-a-proposal/). You won't be implementing anything (at least not anything related to a GSoC project) until you are officially a GSoC student for Tatoeba.
## Ideas
### Mobile friendly user interface
#### Description
Around 30% of Tatoeba's visitors browse the website from a mobile device, but the usability of the current website on mobile devices is very poor. The idea of this project is to redesign the UI to improve the user experience for visitors who are using a mobile device.
Discussion in Google group: [GSoC 2015 - Mobile friendly user interface](https://groups.google.com/forum/#!topic/tatoebaproject/ssK6N3T6in4)
#### Deliverables
Implementation in Tatoeba's source code.
#### Prerequisite knowledge
PHP, HTML, CSS
#### Possible mentors
Trang
### Extension of the search feature
#### Description
The search feature is currently available only for sentences and the search criteria are limited to the source/target language and the sentence's text. The goal of this project would be:
1. To implement more search criteria (tags, username, audio, date...) (See [issue #53](https://github.com/Tatoeba/tatoeba2/issues/53))
2. To extend the search feature to comments, wall messages, and possibly other contents (private messages, profile...).
Here are some examples of searches we would like to be able to do:
* Get all sentences in a given language by a given user that have not been translated into a given language. For example: Show me all English sentences by user "CK" not yet translated into Japanese.
* Same as above, but limited to sentences with audio. For example: Show me all English sentences by "CK" with audio that have not been translated into Japanese.
* Get all sentences in a given language with a certain tag not translated into a given language. For example: Show me all Georgian sentences with the tag "restaurant" not translated into Armenian.
* Same as above, but limited to sentences by native speakers not translated into a given language. For example: Show me all Korean sentences by native speakers with the tag "weather" not translated into Japanese.
* Get all sentences in a given language under a certain length not yet translated into a given language. For example: Show me all Japanese sentences fewer than 50 characters in length not translated into French.
* Same as above, but limited to sentences by a given user.
* Get all sentences of a given language that match a given search keyword that have not been translated into a given language. For example: Show all English sentences with the word "mountain" not translated into Japanese.
* Same as above, but limited to sentences by a given user.
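The example searches above all combine a "not translated into language X" condition with optional extra criteria. The following is only a minimal in-memory sketch of that combination, not Tatoeba's actual schema or its Sphinx-based search: the `Sentence` record, its field names, and the `search` function are hypothetical, and the "native speaker" criterion is left out because it depends on user profile data.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    # Hypothetical in-memory record; not Tatoeba's actual schema.
    id: int
    lang: str
    text: str
    owner: str
    tags: set = field(default_factory=set)
    has_audio: bool = False
    # Languages this sentence already has a translation in.
    translation_langs: set = field(default_factory=set)


def search(sentences, lang, not_translated_into, owner=None, tag=None,
           with_audio=False, max_length=None, keyword=None):
    """Return sentences in `lang` that lack a translation into
    `not_translated_into`, narrowed by whichever optional criteria are given."""
    results = []
    for s in sentences:
        if s.lang != lang or not_translated_into in s.translation_langs:
            continue
        if owner is not None and s.owner != owner:
            continue
        if tag is not None and tag not in s.tags:
            continue
        if with_audio and not s.has_audio:
            continue
        # "Fewer than N characters" means strictly shorter than max_length.
        if max_length is not None and len(s.text) >= max_length:
            continue
        if keyword is not None and keyword.lower() not in s.text.lower():
            continue
        results.append(s)
    return results


corpus = [
    Sentence(1, "eng", "The mountain is high.", "CK", has_audio=True),
    Sentence(2, "eng", "I like tea.", "CK", translation_langs={"jpn"}),
    Sentence(3, "eng", "We climbed the mountain.", "alice"),
]

# All English sentences by "CK" not yet translated into Japanese.
untranslated_by_ck = search(corpus, "eng", "jpn", owner="CK")
```

In a real implementation each criterion would more likely become an indexed attribute in the search engine rather than a Python loop; the sketch only shows how the criteria compose.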
#### Deliverables
Implementation in Tatoeba's source code.
#### Prerequisite knowledge
CakePHP, [Sphinx](http://sphinxsearch.com/)
#### Possible mentors
gillux
### Wish list for words and expressions
#### Description
The wish list for words and expressions lets users add words and expressions to a list, so that other users can fulfill the wishes by adding sentences that contain these words and expressions.
The implementation of this feature consists of three new views/pages: "Add to wish list", "Browse wish list", and "Wish: xxx in language by user username".
* "Add to wish list" is a page where users can submit new wishes.
* "Browse wish list" is a page where users can browse the wishes that other users have submitted.
* "Wish: xxx in language by user username" is a page for each individual wish where the original submitter of the wish can modify the wish, other users can fulfill the wish, and all users can discuss the wish.
At the top of each of these pages there are two tabs/links, "Add to wish list" and "Browse wish list", for easy access from page to page.
See the more detailed description: [Wish list for words and expressions](wish_list)
#### Deliverables
Implementation in Tatoeba's source code.
#### Prerequisite knowledge
CakePHP
#### Possible mentors
gillux
### Achievement system
#### Description
The goal of this project is to implement a system of achievements that would give users specific tasks to do and reward them with a badge/medal when they complete the tasks.
Such a system would be particularly helpful for new contributors. Tatoeba is indeed still not very intuitive. At the moment, when a user registers, they are redirected to a "Getting started" page where the information is too dense and which most of them probably don't read.
The badge system would guide these new contributors into learning about the features of Tatoeba progressively.
This can of course also make contributing more engaging for the more advanced contributors.
#### Deliverables
Implementation in Tatoeba's source code.
#### Prerequisite knowledge
CakePHP, MySQL, knowledge about gamification
#### Possible mentors
Trang
### Improvement of communication tools
#### Description
The Wall is the main place for members to communicate with each other publicly. There are however no categories like in a regular forum. All the topics are mixed together. As a result, one cannot easily find all the posts where people introduce themselves, or all the posts where people submit suggestions, or all the posts that are announcements from the admins.
The private messages are very old style. There is no notion of a discussion thread, and therefore each message is displayed alone, even if it was a reply to a previous message. This makes it rather impractical to have a conversation with private messages.
The goal of this project is:
1. to improve the Wall, or possibly replace it with a forum, or implement a forum in addition to the Wall.
2. to change the private messages system so that it displays all the messages from the same discussion in the same thread, rather than as several separate private messages.
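As a rough illustration of the second goal, the sketch below groups private messages into discussion threads. It assumes each message stores the id of the message it replies to; the `Message` class, its `reply_to` field, and `group_into_threads` are hypothetical and do not describe the current database.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    # Hypothetical fields; the current schema may differ.
    id: int
    sender: str
    text: str
    reply_to: Optional[int] = None  # id of the message being answered


def group_into_threads(messages):
    """Map each thread's root message id to its messages, in id order."""
    by_id = {m.id: m for m in messages}

    def root_of(message):
        # Follow the reply chain up to the first message of the discussion.
        while message.reply_to is not None and message.reply_to in by_id:
            message = by_id[message.reply_to]
        return message.id

    threads = defaultdict(list)
    for m in sorted(messages, key=lambda m: m.id):
        threads[root_of(m)].append(m)
    return dict(threads)


inbox = [
    Message(1, "alice", "Can you check sentence 42?"),
    Message(2, "bob", "Done.", reply_to=1),
    Message(3, "carol", "Welcome to Tatoeba!"),
    Message(4, "alice", "Thanks!", reply_to=2),
]
threads = group_into_threads(inbox)
```

Here messages 1, 2 and 4 end up in one thread and message 3 in another. Storing an explicit thread id on each message would avoid walking the reply chain, at the cost of a schema change.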
#### Deliverables
Implementation in Tatoeba's source code.
#### Prerequisite knowledge
CakePHP, MySQL
#### Possible mentors
?
### Permissions management
#### Description
The permissions of a user are based mostly on the user's status: depending on whether you are a contributor, advanced contributor, corpus maintainer or admin, you will have access to more or fewer features. For instance, advanced contributors can add tags to a sentence, while regular contributors cannot. Corpus maintainers can delete sentences, while other contributors cannot.
The goal of this project is to design and implement a more refined permission system, with an interface to manage these permissions.
Here are examples of things that we cannot do at the moment, and that could be part of the project:
* Prevent a user from adding new sentences, but still allow them to translate sentences.
* Restrict the languages in which a user can contribute.
* Prevent a user from posting on the Wall, but still allow them to comment on sentences.
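One possible way to picture such fine-grained rules is a per-user set of denied actions plus an optional language whitelist. This is a sketch under assumptions, not a proposed design: the action names and the `UserPermissions` class are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical action names, chosen only for this example.
ADD_SENTENCE = "add_sentence"
TRANSLATE = "translate"
POST_ON_WALL = "post_on_wall"
COMMENT_ON_SENTENCE = "comment_on_sentence"


@dataclass
class UserPermissions:
    # Actions this user is explicitly barred from.
    denied_actions: set = field(default_factory=set)
    # None means "no language restriction".
    allowed_langs: Optional[set] = None

    def can(self, action, lang=None):
        """Return True if the user may perform `action` (optionally in `lang`)."""
        if action in self.denied_actions:
            return False
        if lang is not None and self.allowed_langs is not None:
            return lang in self.allowed_langs
        return True


# A user who may translate but not add sentences, only in French and English,
# and who may comment on sentences but not post on the Wall.
restricted = UserPermissions(
    denied_actions={ADD_SENTENCE, POST_ON_WALL},
    allowed_langs={"fra", "eng"},
)
```

For example, `restricted.can(TRANSLATE, lang="fra")` is allowed while `restricted.can(ADD_SENTENCE, lang="fra")` and `restricted.can(TRANSLATE, lang="jpn")` are not.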
#### Deliverables
Implementation in Tatoeba's source code.
#### Prerequisite knowledge
CakePHP, MySQL
#### Possible mentors
?
### Audio
#### Description
Tatoeba provides [audio](http://tatoeba.org/eng/sentences/with_audio) for some sentences. These recordings are made by volunteers, and the process of contributing audio is a bit complicated. This is due to the fact that audio was not at the core of the project.
Audio is still a great addition to the project and Tatoeba has received more and more audio contributions over the years. But the audio content lacks the structure that the sentences in the textual corpus benefit from.
* There is no way to know (from the website) who is the author of an audio file, nor when it was contributed (cf. [Github issue #547](https://github.com/Tatoeba/tatoeba2/issues/547)).
* It is not possible either to attach several recordings to the same sentence (to illustrate different accents of the same language, for instance).
* Last but not least, it is a bit tedious to update and maintain the audio.
The goal of this project would be to implement the necessary features for better management of the audio content in Tatoeba.
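To make the missing structure concrete, here is a small sketch of an audio record that carries an author and a contribution date, and that allows several recordings per sentence. The `AudioRecording` class, its fields, and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AudioRecording:
    # Hypothetical record; none of these fields exist in this form today.
    sentence_id: int
    author: str
    created: datetime
    accent: str  # free-form label, e.g. "British" or "American"
    path: str    # where the audio file is stored


def recordings_for(recordings, sentence_id):
    """Return every recording of one sentence, oldest first."""
    matching = [r for r in recordings if r.sentence_id == sentence_id]
    return sorted(matching, key=lambda r: r.created)


library = [
    AudioRecording(7, "bob", datetime(2015, 2, 1), "American", "eng/7-bob.mp3"),
    AudioRecording(7, "alice", datetime(2014, 6, 3), "British", "eng/7-alice.mp3"),
    AudioRecording(9, "alice", datetime(2014, 6, 4), "British", "eng/9-alice.mp3"),
]
sentence_7_audio = recordings_for(library, 7)
```

Because the sentence id is no longer the only key, one sentence can hold several recordings, and the author and date become available for display on the website.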
#### Deliverables
Could be either implementation in Tatoeba's source code, or development of a separate application.
#### Prerequisite knowledge
CakePHP if implementation in Tatoeba.
#### Possible mentors
?
### API
#### Description
Projects that are built upon Tatoeba's data currently have to use the CSV dumps that we provide through our [Downloads](http://tatoeba.org/eng/downloads) page.
Even though it has been requested for a long time, we still do not have an official API that lets other projects plug their applications directly into our data (developers of mobile apps in particular ask for this).
There is the beginning of an API, based on Django, that was developed by one of our students from GSoC 2014, but the project did not reach a stable enough state to be released.
The goal of this project is to release an API.
#### Deliverables
A web application that provides a set of API calls for data stored in the current database.
Ideally this should be done with pytoeba, but we do not exclude other solutions if there's a good reason for it.
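As an illustration of what a single API call might return, the sketch below serializes a sentence and its translations to JSON using only Python's standard library. The payload shape, the field names, and the `sentence_to_json` function are assumptions for the example; they are not taken from pytoeba or from any existing Tatoeba endpoint.

```python
import json


def sentence_to_json(sentence_id, lang, text, translations):
    """Serialize one sentence and its translations to a JSON string.

    `translations` is a list of (id, lang, text) tuples.
    """
    payload = {
        "id": sentence_id,
        "lang": lang,
        "text": text,
        "translations": [
            {"id": t_id, "lang": t_lang, "text": t_text}
            for t_id, t_lang, t_text in translations
        ],
    }
    # ensure_ascii=False keeps non-Latin scripts readable in the output.
    return json.dumps(payload, ensure_ascii=False)


# Made-up ids and sentences, used only to show the shape of the response.
body = sentence_to_json(
    1, "eng", "For example.",
    [(2, "jpn", "たとえば。"), (3, "fra", "Par exemple.")],
)
```

A client such as a mobile app could then parse `body` with any JSON library instead of downloading and filtering the CSV dumps.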
#### Prerequisite knowledge
A web application language (Python or PHP preferred), MySQL, and a data exchange format such as JSON or XML.
#### Possible mentors
?