Version at: 23/06/2015, 15:38 vs. version at: 23/06/2015, 15:39

Version at: 23/06/2015, 15:38

# How to Search for Text

## Tatoeba.org uses [Sphinx Search](http://sphinxsearch.com/docs/current.html#boolean-syntax)

These instructions tell you how to use the search bar at the top of every Tatoeba page. Our search works much like a search engine such as Google, but has some important differences. 

* To find English sentences with "live", "lives", "living" or "lived", search for the word "live". (This will also find sentences with "Live", "Living", etc., since capitalization is ignored.)

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* To match a word exactly (ignoring capitalization), put an equals sign (=) before it. 

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* Put a $ after a word to find sentences ending with that word. The example finds English sentences ending with "Tom".

  * [Tom$](http://tatoeba.org/eng/sentences/search?query=Tom%24&from=eng&to=und)

* Put a ^ before a word to find sentences beginning with that word. The example finds English sentences beginning with "Tom".

  * [^Tom](http://tatoeba.org/eng/sentences/search?query=%5ETom&from=eng&to=und)

* This example finds English sentences beginning with "Tom" and ending with "Mary".

  * [^Tom Mary$](http://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und)

* This example finds English sentences beginning with either "Tom" or "He".

  * [(^Tom|^He)](http://tatoeba.org/eng/sentences/search?query=%28%5ETom%7C%5EHe%29&from=eng&to=und)


* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+boston%22&from=eng&to=und)

  * The following search will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3Dboston%22&from=eng&to=und)

  * This search will only find sentences consisting of the exact words "I live in Boston" with no other content.

      * ["^I =live =in =Boston$"](http://tatoeba.org/eng/sentences/search?query=%22%5EI+%3Dlive+%3Din+%3DBoston%24%22&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin with "Tom".

  * [-^Tom Tom](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin or end with "Tom".

  * [-^Tom Tom -Tom$](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom+-Tom%24&from=eng&to=und)

* This example finds English sentences that have "Tom", then 2 words, then "Mary", then 1 word, then "John".

  * ["Tom * * Mary * John"](https://tatoeba.org/eng/sentences/search?query=%22Tom+*+*+Mary+*+John%22&from=eng&to=und)


* Use a minus sign (-) to mean "no" (to find sentences without certain words). The following search will find sentences with "cheek" (in any form: cheeks, etc.) that don't include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)

* Use a star (*) to stand for a string of zero or more characters that you don't need to match exactly. At least three characters must precede the star, and none may follow it. The following search finds sentences with words that begin with "break", such as "breaks" and "breakfast", as well as sentences that contain the word "break" itself.

  * [break*](https://tatoeba.org/eng/sentences/search?query=break*&from=eng&to=und)

* Leave punctuation out of your search string. Most punctuation will be ignored, but a final exclamation mark will actually interfere with the search. 
  * The following yields no results:

      * [how strange!](http://tatoeba.org/eng/sentences/search?query=how+strange!&from=eng&to=und)

  * But this search will find *How strange!* among other results:

      * [how strange](http://tatoeba.org/eng/sentences/search?query=how+strange&from=eng&to=und)

## Languages without word boundaries

For languages that don't use spaces to separate words, such as Japanese and Chinese, the search engine treats each character as a separate word. For instance, searching for 逆に returns the same results as 逆 に: it matches sentences that merely *include* both characters, not necessarily in that order or next to each other. To match the characters as a contiguous sequence, surround them with quotes: ["逆に"](http://tatoeba.org/jpn/sentences/search?query=%22%E9%80%86%E3%81%AB%22&from=jpn).

## More Details

The search ignores capitalization and punctuation (unless the punctuation happens to match one of the special characters described elsewhere on the page). An apostrophe within a word is not treated as punctuation, so you can find such words as "don't" by including them in an ordinary search string. 

In some languages, including English, the search engine **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *live* will also find *lived* and *living*.

The languages in which the search engine stems words are: German, English, Finnish, French, Italian, Dutch, Portuguese, Russian, Spanish, Swedish and Turkish.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=live*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Sphinx, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain the words next to each other and in the specified order. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, place an equals sign before each word in the phrase for which you want to suppress stemming.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.* 

## Other search operators

* A vertical bar (representing "or") finds examples where either of the words appears:
  * *hate | detest* will match sentences with either *hate* or *detest* (or both).

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses:
  * *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both).

* A dash (or exclamation point) before a word prevents matches with sentences where the word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.* 

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *This is the best day of my life.* but not *Life means nothing without friends.*

* If you want to search for sentences that contain nothing other than the specified words, use double quotes, a caret, and a dollar sign in combination: *"^i love you$"* will find *I love you.* and *I love you!* but not *I love you more than you love me.* (However, it will find *I loved you.* To prevent this match, use *"^i =love you$"*.)


See the [Sphinx documentation](http://sphinxsearch.com/docs/current.html#boolean-syntax) for other functionality. Note that the documentation mentions keywords pertaining to specific fields in a document, but these are not relevant to Tatoeba.

Version at: 23/06/2015, 15:39

# How to Search for Text

## Tatoeba.org uses [Sphinx Search](http://sphinxsearch.com/docs/current.html#boolean-syntax)

These instructions tell you how to use the search bar at the top of every Tatoeba page. Our search works much like a search engine such as Google, but has some important differences. 

* To find English sentences with "live", "lives", "living" or "lived", search for the word "live". (This will also find sentences with "Live", "Living", etc., since capitalization is ignored.)

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* To match a word exactly (ignoring capitalization), put an equals sign (=) before it. 

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* Put a $ after a word to find sentences ending with that word. The example finds English sentences ending with "Tom".

  * [Tom$](http://tatoeba.org/eng/sentences/search?query=Tom%24&from=eng&to=und)

* Put a ^ before a word to find sentences beginning with that word. The example finds English sentences beginning with "Tom".

  * [^Tom](http://tatoeba.org/eng/sentences/search?query=%5ETom&from=eng&to=und)

* This example finds English sentences beginning with "Tom" and ending with "Mary".

  * [^Tom Mary$](http://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und)

* This example finds English sentences beginning with either "Tom" or "He".

  * [(^Tom|^He)](http://tatoeba.org/eng/sentences/search?query=%28%5ETom%7C%5EHe%29&from=eng&to=und)


* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+boston%22&from=eng&to=und)

  * The following search will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3Dboston%22&from=eng&to=und)

  * This search will only find sentences consisting of the exact words "I live in Boston" with no other content.

      * ["^I =live =in =Boston$"](http://tatoeba.org/eng/sentences/search?query=%22%5EI+%3Dlive+%3Din+%3DBoston%24%22&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin with "Tom".

  * [-^Tom Tom](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin or end with "Tom".

  * [-^Tom Tom -Tom$](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom+-Tom%24&from=eng&to=und)

* This example finds English sentences that have "Tom", then 2 words, then "Mary", then 1 word, and then "John".

  * ["Tom * * Mary * John"](https://tatoeba.org/eng/sentences/search?query=%22Tom+*+*+Mary+*+John%22&from=eng&to=und)

* Use a minus sign (-) to mean "no" (to find sentences without certain words). The following search will find sentences with "cheek" (in any form: cheeks, etc.) that don't include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)

* Use a star (*) to stand for a string of zero or more characters that you don't need to match exactly. At least three characters must precede the star, and none may follow it. The following search finds sentences with words that begin with "break", such as "breaks" and "breakfast", as well as sentences that contain the word "break" itself.

  * [break*](https://tatoeba.org/eng/sentences/search?query=break*&from=eng&to=und)

* Leave punctuation out of your search string. Most punctuation will be ignored, but a final exclamation mark will actually interfere with the search. 
  * The following yields no results:

      * [how strange!](http://tatoeba.org/eng/sentences/search?query=how+strange!&from=eng&to=und)

  * But this search will find *How strange!* among other results:

      * [how strange](http://tatoeba.org/eng/sentences/search?query=how+strange&from=eng&to=und)
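
The example links above all follow the same pattern: the search string goes into a `query` parameter, with `from` and `to` carrying language codes. As a rough sketch, assuming that pattern holds in general, such a link could be built in Python as follows. The function name `build_search_url` and its parameter names are hypothetical; only the endpoint and the `query`, `from`, and `to` parameters are taken from the links on this page.

```python
from urllib.parse import urlencode

# Endpoint copied from the example links on this page.
SEARCH_ENDPOINT = "http://tatoeba.org/eng/sentences/search"


def build_search_url(query, source_lang="eng", target_lang="und"):
    """Return a search link for `query` (hypothetical helper, not a Tatoeba API)."""
    params = {"query": query, "from": source_lang}
    if target_lang is not None:
        params["to"] = target_lang
    # urlencode percent-encodes special characters such as ^ $ = " and
    # turns spaces into plus signs, as in the links above.
    return SEARCH_ENDPOINT + "?" + urlencode(params)


print(build_search_url("^Tom Mary$"))
print(build_search_url('"=live =in =boston"'))
```

Note that `urlencode` also percent-encodes characters such as `*` and `!`, which the hand-written links leave as they are; both forms decode to the same search string.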

## Languages without word boundaries

For languages that don't use spaces to separate words, such as Japanese and Chinese, the search engine treats each character as a separate word. For instance, searching for 逆に returns the same results as 逆 に: it matches sentences that merely *include* both characters, not necessarily in that order or next to each other. To match the characters as a contiguous sequence, surround them with quotes: ["逆に"](http://tatoeba.org/jpn/sentences/search?query=%22%E9%80%86%E3%81%AB%22&from=jpn).
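
If you generate search strings in a script, the quoting advice can be applied automatically. The following is a minimal sketch, not part of Tatoeba: the helper names are hypothetical, and the character ranges checked (hiragana, katakana, and the basic CJK ideograph block) are an assumption that covers common Japanese and Chinese text but not every script written without spaces.

```python
def has_cjk(text):
    """True if `text` contains kana or a basic CJK ideograph (assumed ranges)."""
    return any(
        "\u3040" <= ch <= "\u30ff"      # hiragana and katakana
        or "\u4e00" <= ch <= "\u9fff"   # CJK unified ideographs (basic block)
        for ch in text
    )


def quote_if_unsegmented(query):
    """Wrap a kana/ideograph query in double quotes unless it already has them."""
    already_quoted = len(query) >= 2 and query.startswith('"') and query.endswith('"')
    if has_cjk(query) and not already_quoted:
        return '"' + query + '"'
    return query


print(quote_if_unsegmented("逆に"))
print(quote_if_unsegmented("live"))
```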

## More Details

The search ignores capitalization and punctuation (unless the punctuation happens to match one of the special characters described elsewhere on the page). An apostrophe within a word is not treated as punctuation, so you can find such words as "don't" by including them in an ordinary search string. 

In some languages, including English, the search engine **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *live* will also find *lived* and *living*.

The languages in which the search engine stems words are: German, English, Finnish, French, Italian, Dutch, Portuguese, Russian, Spanish, Swedish and Turkish.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=live*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Sphinx, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain the words next to each other and in the specified order. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, place an equals sign before each word in the phrase for which you want to suppress stemming.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.* 
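
The interaction between stemming, equals signs, and quotes in the *like thing* example can be imitated with a small toy model. This is only an illustration of the rules described above, not how Sphinx is implemented: `toy_stem` is a deliberately crude suffix stripper invented for this sketch (it is not the stemmer Sphinx uses), and the model ignores every other operator.

```python
import re


def toy_stem(word):
    """Crude suffix stripper for illustration; NOT the stemmer Sphinx uses."""
    word = word.lower()
    for suffix in ("ing", "ly", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


def term_matches(term, token):
    """A leading '=' demands the exact word; otherwise compare toy stems."""
    if term.startswith("="):
        return token == term[1:].lower()
    return toy_stem(token) == toy_stem(term)


def matches(query, sentence):
    """Toy model of bare terms, =exact terms, and "quoted phrases"."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    is_phrase = len(query) >= 2 and query[0] == '"' and query[-1] == '"'
    terms = query.strip('"').split()
    if not is_phrase:
        # Every term must occur somewhere in the sentence, in any order.
        return all(any(term_matches(t, tok) for tok in tokens) for t in terms)
    # Phrase search: the terms must line up with consecutive tokens.
    return any(
        all(term_matches(t, tok) for t, tok in zip(terms, tokens[i:i + len(terms)]))
        for i in range(len(tokens) - len(terms) + 1)
    )


print(matches("like thing", "things like"))      # unquoted: order is free
print(matches('"like thing"', "things like"))    # quoted: order matters
print(matches('"=like =thing"', "like things"))  # equals signs: no stemming
```

Under this toy stemmer, the three calls reproduce the behaviour described in the paragraph above: the unquoted search accepts the reversed order, the quoted search rejects it, and the equals signs reject *things*.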

## Other search operators

* A vertical bar (representing "or") finds examples where either of the words appears:
  * *hate | detest* will match sentences with either *hate* or *detest* (or both).

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses:
  * *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both).

* A dash (or exclamation point) before a word prevents matches with sentences where the word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.* 

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *This is the best day of my life.* but not *Life means nothing without friends.*

* If you want to search for sentences that contain nothing other than the specified words, use double quotes, a caret, and a dollar sign in combination: *"^i love you$"* will find *I love you.* and *I love you!* but not *I love you more than you love me.* (However, it will find *I loved you.* To prevent this match, use *"^i =love you$"*.)
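
When search strings are assembled in code, small helpers can make the operators above easier to combine without typos. The sketch below is one possible way to do it; all function names are hypothetical, and the helpers only build strings using the syntax described on this page. The `prefix` helper also enforces the rule that at least three characters must precede a star.

```python
def exact(word):
    """=word : match the word without stemming."""
    return "=" + word


def exclude(word):
    """-word : reject sentences containing the word."""
    return "-" + word


def starts_with(word):
    """^word : the sentence must begin with the word."""
    return "^" + word


def ends_with(word):
    """word$ : the sentence must end with the word."""
    return word + "$"


def any_of(*terms):
    """(a|b) : an or-expression, parenthesized so it combines with other terms."""
    return "(" + "|".join(terms) + ")"


def phrase(*terms):
    """"a b c" : the terms must appear next to each other in this order."""
    return '"' + " ".join(terms) + '"'


def prefix(start):
    """start* : words beginning with `start` (needs three or more characters)."""
    if len(start) < 3:
        raise ValueError("at least three characters must precede the star")
    return start + "*"


print(phrase(starts_with("i"), exact("love"), ends_with("you")))
print(any_of("red", "blue") + " house")
print(" ".join(["cheek", exclude("tear"), exclude("slap")]))
```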


See the [Sphinx documentation](http://sphinxsearch.com/docs/current.html#boolean-syntax) for other functionality. Note that the documentation mentions keywords pertaining to specific fields in a document, but these are not relevant to Tatoeba.
