Version at: 16/11/2013, 17:17 vs. version at: 16/11/2013, 17:42

Version at: 16/11/2013, 17:17

# How to Search for Text

## Briefly, Tatoeba.org uses  [Sphinx Search](http://sphinxsearch.com/docs/current.html#boolean-syntax) 

* To find English sentences with "live", "lives", "living" or "lived", you can search using the word "live".

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* For exact matches, you need to use an equals sign (=) before a word. 

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in Boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+Boston%22&from=eng&to=und)

  * The following search will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =Boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3DBoston%22&from=eng&to=und)


* Use a minus sign (-) to mean "no" (to find sentences without certain words). The following search will find sentences with "cheek" (in any form: cheeks, etc.) that don't also include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)

## Warning

When searching to see if a sentence already exists before submitting a new sentence, it's a good idea to remove the final punctuation.

* Searching for the following is likely to return no results.

  * [How strange!](http://tatoeba.org/eng/sentences/search?query=How+strange!&from=eng&to=und)

* However, you'll find that sentence if you search for those words without the punctuation.

  * [How strange](http://tatoeba.org/eng/sentences/search?query=How+strange&from=eng&to=und)

This happens because the exclamation point (!) has a special function in Sphinx Search: like the minus sign, it excludes a word.


## More Details

Each page on Tatoeba features a box that allows you to search for text within the collection of sentences. The search will only find sentences that have been indexed by a script that is run every few months. Sentences that have been added more recently will not appear in the results. However, you can find the latest sentences added by a particular user (perhaps yourself), either by looking at the user's profile and selecting "Show latest activity" or by going to an address like [http://tatoeba.org/eng/sentences/of_user/trang](http://tatoeba.org/eng/sentences/of_user/trang) (where you replace "trang" with the name of the user whose sentences you want to see).

The search engine used on Tatoeba is Sphinx. In many languages, including English, Sphinx **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *pare* will also find *pared* and *paring*.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=pare*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Sphinx, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain words in the specified continuous sequence. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, you will need to place an equals sign before each word in the phrase for which you want to suppress stemming.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.* 

You may be interested in other features, such as the following:

* A vertical bar (representing "or") finds examples where either of the words appears:
  *    *hate | detest* will match sentences with either *hate* or *detest* (or both). 

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses: 
  *    *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both).

* A minus sign (or exclamation point) before a word excludes sentences in which that word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.* 

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *Life resembles a novel more often than novels resemble life.* but not *Life never ends but earthly life does.*

See the [Sphinx documentation](http://sphinxsearch.com/docs/current.html#boolean-syntax) for other functionality. Note that keywords pertaining to specific fields in a document are not relevant to Tatoeba.

version at: 16/11/2013, 17:42

# How to Search for Text

## Briefly, Tatoeba.org uses  [Sphinx Search](http://sphinxsearch.com/docs/current.html#boolean-syntax) 

* To find English sentences with "live", "lives", "living" or "lived", you can search using the word "live".

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* For exact matches, you need to use an equals sign (=) before a word. 

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in Boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+Boston%22&from=eng&to=und)

  * The following search will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =Boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3DBoston%22&from=eng&to=und)


* Use a minus sign (-) to mean "no" (to find sentences without certain words). The following search will find sentences with "cheek" (in any form: cheeks, etc.) that don't also include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)
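
The example links above all follow one URL pattern, with the query text percent-encoded. As a minimal sketch, assuming that pattern also holds for other queries, the following Python builds such a link. The function name `build_search_url` and its parameter names are illustrative only and are not part of Tatoeba.

```python
from urllib.parse import urlencode

# Base address copied from the example links; assumed to be the search endpoint.
SEARCH_URL = "http://tatoeba.org/eng/sentences/search"

def build_search_url(query, source_lang="eng", target_lang="und"):
    """Return a search link for `query` (hypothetical helper)."""
    params = {"query": query, "from": source_lang, "to": target_lang}
    # urlencode percent-encodes quotes and equals signs and turns spaces into "+".
    return SEARCH_URL + "?" + urlencode(params)

print(build_search_url('"=live =in =Boston"'))
print(build_search_url("cheek -tear -slap -burn -red -hollow"))
```

The defaults `eng` and `und` simply mirror the `from` and `to` values used in the links above.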

## Warning

When searching to see if a sentence already exists before submitting a new sentence, it's a good idea to remove the final punctuation.

* Searching for the following is likely to return no results.

  * [How strange!](http://tatoeba.org/eng/sentences/search?query=How+strange!&from=eng&to=und)

* However, you'll find that sentence if you search for those words without the punctuation.

  * [How strange](http://tatoeba.org/eng/sentences/search?query=How+strange&from=eng&to=und)

This happens because the exclamation point (!) has a special function in Sphinx Search: like the minus sign, it excludes a word.
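
If you check for existing sentences from a script, one way to follow this advice is to strip the final punctuation before searching. This is only a sketch: the helper name and the set of characters treated as final punctuation are assumptions, and the set is certainly incomplete for languages other than English.

```python
# Characters treated as sentence-final punctuation (an assumed, incomplete set).
FINAL_PUNCTUATION = ".!?"

def strip_final_punctuation(sentence):
    """Remove trailing whitespace and final punctuation marks from a sentence."""
    return sentence.rstrip().rstrip(FINAL_PUNCTUATION).rstrip()

print(strip_final_punctuation("How strange!"))  # How strange
```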


## More Details

Each page on Tatoeba features a box that allows you to search for text within the collection of sentences. The search will only find sentences that have been indexed by a script that is run every few months. Sentences that have been added more recently will not appear in the results. However, you can find the latest sentences added by a particular user (perhaps yourself), either by looking at the user's profile and selecting "Show latest activity" or by going to an address like [http://tatoeba.org/eng/sentences/of_user/trang](http://tatoeba.org/eng/sentences/of_user/trang) (where you replace "trang" with the name of the user whose sentences you want to see).

The search engine used on Tatoeba is Sphinx. In many languages, including English, Sphinx **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *pare* will also find *pared* and *paring*.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=pare*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Sphinx, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain words in the specified continuous sequence. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, you will need to place an equals sign before each word in the phrase for which you want to suppress stemming.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.* 
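
The rules in this example can be imitated with a small toy model. The Python below is not Sphinx and does not use Sphinx's stemmer: `toy_stem` is a crude suffix stripper invented for this illustration, and `matches` handles only bare words, `=word`, and a single quoted phrase. It is meant only to make the interaction between quotes and equals signs concrete.

```python
import re

def toy_stem(word):
    """Crude stand-in for a stemmer (an assumption, not Sphinx's algorithm)."""
    for suffix in ("ing", "ly", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a final "e" so that "like" and "likely" reduce to the same stem.
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word

def term_matches(term, token):
    """An '=' prefix demands the exact word; otherwise stems are compared."""
    if term.startswith("="):
        return term[1:].lower() == token
    return toy_stem(term.lower()) == toy_stem(token)

def matches(query, sentence):
    """Return True if the toy model considers `sentence` a hit for `query`."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    is_phrase = len(query) >= 2 and query.startswith('"') and query.endswith('"')
    terms = query.strip('"').split()
    if is_phrase:
        # Quoted: the terms must line up with consecutive words, in order.
        n = len(terms)
        return any(
            all(term_matches(t, tok) for t, tok in zip(terms, tokens[i:i + n]))
            for i in range(len(tokens) - n + 1)
        )
    # Unquoted: every term must match some word, in any position.
    return all(any(term_matches(t, tok) for tok in tokens) for t in terms)

silly = "What made you do a silly thing like that?"
likely = "Such a strange thing is not likely to happen."
print(matches("like thing", silly))       # True: word order is ignored
print(matches('"like thing"', silly))     # False: the words are in the wrong order
print(matches("=like =thing", silly))     # True: both exact words occur
print(matches("like =thing", likely))     # True: "likely" shares a stem with "like"
print(matches("=like =thing", likely))    # False: the exact word "like" is absent
```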

You may be interested in other features, such as the following:

* A vertical bar (representing "or") finds examples where either of the words appears:
  *    *hate | detest* will match sentences with either *hate* or *detest* (or both). 

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses: 
  *    *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both).

* A minus sign (or exclamation point) before a word excludes sentences in which that word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.* 

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *Life resembles a novel more often than novels resemble life.* but not *Life never ends but earthly life does.*

* If you want to search for sentences that contain nothing other than the specified words, use double quotes, a caret, and a dollar sign in combination: *"^i love you$"* will find *I love you.* and *I love you!* but not *I love you more than you love me.* (However, it will find *I loved you.* To prevent this match, use *"^i =love you$"*.)
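
If you assemble queries in a program, the operators listed above can be wrapped in small string helpers. The following Python is a sketch; every function name in it is hypothetical, and the helpers do nothing more than put together the query text described in this section.

```python
def exact(word):
    """Prefix a word with '=' to suppress stemming."""
    return "=" + word

def phrase(*words):
    """Wrap words in double quotes so they must appear in sequence."""
    return '"' + " ".join(words) + '"'

def any_of(*words):
    """Build a parenthesized or-expression such as (red|blue)."""
    return "(" + "|".join(words) + ")"

def without(*words):
    """Prefix each word with '-' to exclude sentences containing it."""
    return " ".join("-" + word for word in words)

def whole_sentence(*words):
    """Combine quotes, caret, and dollar sign to match the entire sentence."""
    return '"^' + " ".join(words) + '$"'

print(phrase(exact("live"), exact("in"), exact("Boston")))  # "=live =in =Boston"
print(any_of("red", "blue") + " house")                     # (red|blue) house
print("cheek " + without("tear", "slap"))                   # cheek -tear -slap
print(whole_sentence("i", exact("love"), "you"))            # "^i =love you$"
```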


See the [Sphinx documentation](http://sphinxsearch.com/docs/current.html#boolean-syntax) for other functionality. Note that keywords pertaining to specific fields in a document are not relevant to Tatoeba.
