See "Contribute audio for Tatoeba" for an overview of the tools designed for recording audio for Tatoeba.
Shtooka Recorder runs on Windows. We have verified that it works on Windows XP and Windows 7, and it should run on Windows 8 as well. It is distributed in a kit that contains other tools, but only the Shtooka Recorder is of interest to us. You can find the Shtooka Recorder within the "Kit Shtooka" folder on your Start Menu once you have installed it.
Installing and Configuring Shtooka Recorder
1) Download the Shtooka installer from the Tatoeba download site.
2) On the "Words" tab:
a) Select the ISO 639-3 code for your language (for example, "eng" for English). If you are unsure of the code, see the SIL ISO 639-3 finder.
b) Set "Silence between words" to 0.50.
4) On the "Speaker" tab, fill in your name or username and your native language. The other fields are optional.
5) On the "Output" tab:
a) Fill in the "Target" field with the location to which you want your audio files to be written. It's best to choose a top-level folder here under which you will add an individual subfolder for each recording session.
b) In the "Mask" field, write:
%1
c) Leave "Ogg compression" unchecked, but check "Flac compression", "Delete .wav file after compression", and "Delete .txt file after tagging".
Obtaining a list
If you would like to obtain a ready-made list, send a private message to CK.
Otherwise, follow these instructions to prepare your own list.
Select sentences that have been written by a native speaker. Make sure that no audio has already been recorded for these sentences. Refer to CK's list of self-identified native speakers if you do not already have a speaker in mind.
Create a list by going to "Browse"/"Browse by List". Type the name of a new list (for example, "English-for-recording-A-00") underneath "Create a new list".
Either add your own sentences to the list, or (better yet) find sentences by a particular user who is a native speaker, and use the "Add to list" button above each sentence that you want to add to your list. Make sure to click "OK", or your sentence will not be added to your list. The first time you record, construct a list of only two or three sentences. Your later lists can be longer, but they should contain no more than 100 sentences each, since longer lists cannot be downloaded.
Once you have your list, click on the "Download this list" button on the right.
In the "Download" form, make sure that you check the "Insert id" checkbox, but leave "Translations" set to "None".
Click the "Download" button.
From your browser, either save the file or open it immediately in a text editor. A program like Notepad++ or Emacs (both of which are free) is good, while Notepad (which comes with Windows) is not. A word processor like Microsoft Word or OpenOffice.org Writer should only be used if you remember to save the output as plain text.
The columns should be in this format:
id tab text
Convert them (manually or with a macro) to this format (note the semicolon):
text;id
Copy the contents of the list to the Clipboard.
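If you prefer a script to a macro, the conversion from "id tab text" to "text;id" described above can be sketched in Python. This is only a minimal illustration, not an official Tatoeba tool: the function names are made up for this example, and it assumes the downloaded file has exactly one tab between the id and the text on each line.

```python
def to_shtooka_line(line: str) -> str:
    """Turn one 'id<TAB>text' line into 'text;id'."""
    sentence_id, text = line.split("\t", 1)
    # Remove a byte-order mark in case the file starts with one (assumption).
    sentence_id = sentence_id.strip().lstrip("\ufeff")
    return f"{text.strip()};{sentence_id}"


def convert_list(downloaded: str) -> str:
    """Convert every non-empty line of a downloaded list."""
    lines = [ln for ln in downloaded.splitlines() if ln.strip()]
    return "\n".join(to_shtooka_line(ln) for ln in lines)


if __name__ == "__main__":
    # Hypothetical sample data; real lists come from "Download this list".
    sample = "15949\tFirst example sentence.\n16262\tSecond example sentence.\n"
    print(convert_list(sample))
```

The printed result is what you would copy to the Clipboard in place of the manually converted list.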
Recording the list of sentences
Start Shtooka Recorder if it is not already running.
Go to the "Words" tab.
Paste the contents of the Clipboard into the "Words to record" field if they are not already there.
Make sure that your output location (from the "Output" tab) is correct. Note that if you are entering the path manually, you must put a trailing backslash at the end.
Click on the "Continue" button on the "Words" tab.
Press the space bar to indicate that you are ready, and read the sentence that is highlighted in red. The software will detect when to start and stop recording, and will jump to the next sentence automatically. All you have to do is read what's highlighted in red. If you want to take a break, you can press space to pause, so that it doesn't record something unrelated. If you need to listen to a sentence's audio, simply select it and press "Enter". (You can navigate with the directional keys: up, down, left, right.)
Close the recording window if you wish.
Your output directory will now contain one .flac file for each sentence you have read. For instance, if you recorded sentences 15949, 16262, and 16480, the files will be 15949.flac, 16262.flac, and 16480.flac.
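To check that no sentence was skipped during a session, you can compare the list you pasted into Shtooka Recorder with the files in your output directory. The following Python sketch assumes the list is in the "text;id" format described earlier and that each file is named after its sentence id; the function names are hypothetical.

```python
import tempfile
from pathlib import Path


def expected_ids(list_text: str) -> list[str]:
    """Collect the id after the last semicolon of each 'text;id' line."""
    ids = []
    for line in list_text.splitlines():
        if ";" in line:
            ids.append(line.rsplit(";", 1)[1].strip())
    return ids


def missing_recordings(list_text: str, output_dir: str) -> list[str]:
    """Return the ids that have no matching <id>.flac file in output_dir."""
    folder = Path(output_dir)
    return [i for i in expected_ids(list_text)
            if not (folder / f"{i}.flac").is_file()]


if __name__ == "__main__":
    # Demonstration with a temporary folder standing in for the output directory.
    demo_dir = tempfile.mkdtemp()
    Path(demo_dir, "15949.flac").write_bytes(b"")
    demo_list = "First example sentence.;15949\nSecond example sentence.;16262\n"
    print(missing_recordings(demo_list, demo_dir))
```

An empty result means that every sentence in the list has a recording.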
Sending the files to us
If you would like to save space and also spare the admin an extra step, convert the FLAC files to MP3 files (which are about 4-5 times smaller), using any tool you like. However, this is optional.
If you can upload your files somewhere, please do, and use the e-mail address in the next item to notify the correct person of the link. Otherwise, send the files by e-mail. If there are more than two or three files, zip them up (in ZIP format). You can use 7-Zip (which is free) or WinZip to perform the zipping.
Whether you are sending the FLAC files to be assessed for sound quality or are uploading files after your sound quality has already been approved, send them to ck@tatoeba.org.
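The optional MP3 conversion and the zipping step can also be scripted. The Python sketch below is only one possible approach and rests on assumptions: it uses ffmpeg as the conversion tool (the instructions above do not prescribe any tool, and ffmpeg must be installed separately), and the function names are hypothetical. The zipping part uses only Python's standard library.

```python
import shutil
import subprocess
import tempfile
import zipfile
from pathlib import Path


def ffmpeg_command(flac_path: Path) -> list[str]:
    """Build an ffmpeg command that writes <id>.mp3 next to <id>.flac."""
    mp3_path = flac_path.with_suffix(".mp3")
    return ["ffmpeg", "-i", str(flac_path),
            "-codec:a", "libmp3lame", "-qscale:a", "2", str(mp3_path)]


def convert_all(folder: Path) -> None:
    """Convert every .flac file in folder, if ffmpeg is available."""
    if shutil.which("ffmpeg") is None:
        return  # Conversion is optional, so do nothing without ffmpeg.
    for flac in sorted(folder.glob("*.flac")):
        subprocess.run(ffmpeg_command(flac), check=True)


def zip_recordings(folder: Path, zip_path: Path, suffix: str = ".flac") -> list[str]:
    """Write all files with the given suffix into a ZIP archive."""
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for audio in sorted(folder.glob(f"*{suffix}")):
            # Store only the file name so the archive has no folder structure.
            archive.write(audio, arcname=audio.name)
            names.append(audio.name)
    return names


if __name__ == "__main__":
    # Demonstration with placeholder files in a temporary folder.
    session = Path(tempfile.mkdtemp())
    for sentence_id in ("15949", "16262", "16480"):
        (session / f"{sentence_id}.flac").write_bytes(b"")
    print(zip_recordings(session, session / "session.zip"))
```

To zip MP3 files instead, pass suffix=".mp3" to zip_recordings after running convert_all on the same folder.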
Videos and additional documentation
Note that the instructions given above will differ in several details from those given in the videos. The most notable difference may be the format of the sentence lists. Please adhere to the instructions above unless you are an expert and have your own procedure for providing us with a set of sentence_id.flac or sentence_id.mp3 files.
- youtube.com/watch?v=AcJoLBjUOaY (video by AmberShadow)
- bit.ly/shtooka (made by CK) = video and documentation