Notice

This page shows a previous version of the article.

How to Prepare a Development Environment for Tatoeba Using a Pre-made Virtual Machine

Installing the VM

  • Download the VM files from http://mirrors.bouah.net/pub/tatoeba/Tatovm/

  • Untar the file:

    • On Windows:

      • Download both the .vbox file and the .gz file

      • Use 7-Zip [http://www.7-zip.org/] to extract the Tatovm.vmdk file from the .gz file (using the "Extract here" item from the right-click menu)

    • On Linux: use file-roller or, from the terminal, type:

    tar -xvf vmfile

  • Get and install VirtualBox [https://www.virtualbox.org/wiki/Downloads]

  • Load the VM files in VirtualBox:

    From the GUI: Machine -> Add, then browse to the location of the .vbox file

    From the command line: VBoxManage registervm /path/to/vm.vbox
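On Linux, the extraction and registration steps above could be combined into a small helper. This is only a sketch: ARCHIVE and VBOX_FILE are placeholder names (not from the original instructions) that you would point at the files you actually downloaded, and the function name is hypothetical.

```shell
# Sketch: unpack the downloaded archive and register the VM with VirtualBox.
# ARCHIVE and VBOX_FILE are placeholders; set them to your real file names.
ARCHIVE="${ARCHIVE:-vmfile}"
VBOX_FILE="${VBOX_FILE:-/path/to/vm.vbox}"

install_tatovm() {
    # Stop at the first failing step so a bad download is noticed early.
    tar -xvf "$ARCHIVE" || return 1
    VBoxManage registervm "$VBOX_FILE" || return 1
    # List the registered VMs to confirm the new one appears.
    VBoxManage list vms
}
```

After setting the two variables, calling install_tatovm runs all three steps in order and returns a non-zero status if extraction or registration fails.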

Accessing the VM

  • The default http port is 8080 and the default SSH port is 4242.

  • On Windows, you may want to download PuTTY to use as your SSH client.

  • To SSH into the machine, use the username tatoeba and password tatoeba:

    ssh -p 4242 tatoeba@127.0.0.1

  • Now you can see the website running in your browser by pointing it to the following address:

    127.0.0.1:8080

  • Steps such as installing packages require superuser privileges. Prior to such steps, execute:

    su -

    Type in the password tatovm when prompted.

    After performing your operation, execute "exit" to end superuser access.

  • The MySQL user is root and the password is tatoeba, in case you need to operate directly on the tables or import more data.
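If you connect often, the host, ports, and username listed above can be wrapped in a couple of shell functions on the host machine. The function and variable names below are hypothetical conveniences, not part of the VM.

```shell
# Connection settings taken from the list above.
TATOVM_HOST=127.0.0.1
TATOVM_SSH_PORT=4242
TATOVM_HTTP_PORT=8080
TATOVM_USER=tatoeba

# Open an SSH session, or run a single command if arguments are given.
tatovm_ssh() {
    ssh -p "$TATOVM_SSH_PORT" "$TATOVM_USER@$TATOVM_HOST" "$@"
}

# Print the URL of the website served by the VM.
tatovm_url() {
    printf 'http://%s:%s/\n' "$TATOVM_HOST" "$TATOVM_HTTP_PORT"
}
```

For example, tatovm_ssh with no arguments opens an interactive session, and curl -I "$(tatovm_url)" checks from the host that the web server is answering.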

Performing Additional Configuration Steps

These steps will eventually be folded into a new VM, but for now, they must be performed after the VM is installed.

  • Optional: add the following to .bashrc:

    export TERM=xterm-256color

  • Log in as superuser (see above) and execute these commands:

    apt-get update

    apt-get install git

    apt-get install php5-curl

    apt-get install bzr

    apt-get install libpq5    # used with Sphinx

    apt-get install poedit    # used with UI translations

    apt-get install flac      # used with audio

    apt-get install lame      # used with audio

  • The current VM was assembled shortly before we made the transition from a Subversion repository on Assembla to a Git repository on GitHub, so execute the following steps to update your code from the new repository:

    • Log in as superuser.

    • Rename /var/http/tatoeba to /var/http/tatoeba-bak .

    • Execute this line: chmod 0777 /var/http

    • Log out as superuser (important).

    • In the /var/http directory, pull the code from the GitHub Tatoeba repository as follows:

    git clone https://github.com/Tatoeba/tatoeba2.git /var/http/tatoeba

    • If you will be committing code, configure your user.name and user.email. (You can do this retroactively after your first commit, but it's better to do it beforehand.) For instance, if your username at GitHub is ghuser, and your e-mail address is address@example.com, you'll execute:

      git config --global user.name "ghuser"

      git config --global user.email address@example.com

    • Your new directory /var/http/tatoeba should have the same directory structure as the old /var/http/tatoeba-bak. You can now delete /var/http/tatoeba-bak .

  • Execute SQL scripts as follows:

    mysql -u root -ptatoeba tatoeba < /var/http/tatoeba/docs/database/updates/2013-05-31.sql

    mysql -u root -ptatoeba tatoeba < /var/http/tatoeba/docs/database/updates/2013-08-13.sql

    mysql -u root -ptatoeba tatoeba < /var/http/tatoeba/docs/database/scripts/create_fill_langStats.sql

  • Also, log into mysql and execute the following statement:

DELETE FROM sentences_translations WHERE translation_id > 2976558 OR sentence_id > 2976558;

The purpose of the statement is to get rid of dangling links that point to sentences whose IDs are higher than the maximum that is actually in the database. If you do not execute this statement, the new sentences that you add may match these dangling pointers and end up having incorrect links.
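The threshold 2976558 matches the data shipped with this VM. If your database contents differ, one possible approach is to look up the current maximum sentence ID and build the statement from it. The sketch below assumes that sentences live in a table named sentences with an id column; that schema detail is an assumption and should be checked against your database. The function names are hypothetical.

```shell
# Build the cleanup statement for a given maximum sentence ID.
dangling_links_sql() {
    max_id="$1"
    # Refuse anything that is not a plain non-negative integer.
    case "$max_id" in
        ''|*[!0-9]*) echo "expected a numeric ID" >&2; return 1 ;;
    esac
    printf 'DELETE FROM sentences_translations WHERE translation_id > %s OR sentence_id > %s;\n' "$max_id" "$max_id"
}

# Look up the highest sentence ID currently in the database.
# ASSUMPTION: the table is named "sentences" and has an "id" column.
current_max_sentence_id() {
    mysql -u root -ptatoeba -N -B tatoeba -e 'SELECT MAX(id) FROM sentences;'
}
```

Running dangling_links_sql "$(current_max_sentence_id)" prints the statement so that you can review it before executing it in mysql.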

  • As superuser, make tmp/cache and its subfolders writable:

    su

    chmod 0777 /var/http/tatoeba/app/tmp/cache/

    chmod 0777 /var/http/tatoeba/app/tmp/cache/*

  • You may find it useful at this point to back up your databases so that you can later restore them to their original state. Make a directory (for instance, /backup ; this may require root permission) and then execute a command such as this one:

    mysqldump -u root -ptatoeba -A > /backup/all_dbs.sql
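A backup is only useful if it can be restored, so here is a minimal sketch that pairs the mysqldump command above with a matching restore. The dated file name, the BACKUP_DIR variable, and the function names are illustrative choices, not part of the original setup.

```shell
# Directory that holds the dumps; /backup is the example used above.
BACKUP_DIR="${BACKUP_DIR:-/backup}"

# Name each dump by date so that several backups can coexist.
backup_file() {
    printf '%s/all_dbs_%s.sql\n' "$BACKUP_DIR" "$(date +%Y-%m-%d)"
}

# Dump every database into today's backup file.
backup_all_dbs() {
    mysqldump -u root -ptatoeba -A > "$(backup_file)"
}

# Restore all databases from the dump file given as the first argument.
restore_all_dbs() {
    mysql -u root -ptatoeba < "$1"
}
```

Because a dump made with -A contains the statements that create and select each database, restore_all_dbs does not need a database name on the command line.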

Install and Configure Sphinx Search

Follow the instructions under Installing and Configuring Sphinx Search.

Customizing Your Installation

  • There are three ways to access the codebase via your favorite editor in the comfort of your host computer:

    • Mount a drive over SSH:

      • On Windows: download NetDrive [www.netdrive.net] and use the aforementioned credentials and port

      • On Linux: install SSHFS and then mount it using:

      sshfs -p 4242 tatoeba@127.0.0.1:/var/http/tatoeba /path/to/mountpoint

    • Mount a drive over WebDAV:

      • On Windows: use NetDrive. The user and password are tatoeba, and the port is 8080.

      • On Linux: use your favorite file manager with WebDAV support, or install cadaver and connect using the above credentials.

    • Mount a shared folder (slow and not recommended):

      • Install the VirtualBox Guest Additions [https://help.ubuntu.com/community/VirtualBox/GuestAdditions]

      • In the GUI select Devices -> Shared Folders -> Add

      • Browse to the folder you want to share from your host and select it

      • Select the Make permanent option

      • Now mount the shared folder on the guest system as superuser, using the folder name you chose in the Shared Folders dialog:

      mount -t vboxsf sharefoldername /path/to/mountpoint

  • You can also install a graphical environment (GNOME or any other desktop environment) to work directly from the VM:

    apt-get install task-gnome-desktop
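For the SSHFS option on a Linux host, mounting and unmounting can be wrapped in two small functions. This is a sketch under a few assumptions: the remote path /var/http/tatoeba is the code directory used earlier in this guide, the function names are hypothetical, and fusermount is the usual FUSE unmount tool on Linux.

```shell
# Remote location of the code inside the VM and the forwarded SSH port.
TATOVM_REMOTE="tatoeba@127.0.0.1:/var/http/tatoeba"
TATOVM_SSH_PORT=4242

# Mount the VM's code directory at the given local mount point.
tatovm_mount() {
    if [ -z "$1" ]; then
        echo "usage: tatovm_mount /path/to/mountpoint" >&2
        return 2
    fi
    mkdir -p "$1" || return 1
    sshfs -p "$TATOVM_SSH_PORT" "$TATOVM_REMOTE" "$1"
}

# Unmount a directory previously mounted with tatovm_mount.
tatovm_unmount() {
    fusermount -u "$1"
}
```

Note that sshfs takes the port through the -p option; the text after the colon in the remote argument is the remote directory, not a port number.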

Logging Into Tatoeba on the VM

The users provided by default are:

admin

corpus_maintainer

advanced_contributor

contributor

inactive

spammer

The default password for each user is '123456'.

In addition, you can register new users.