Version at: 13/12/2019, 06:08 vs. version at: 13/12/2019, 06:11

Version at: 13/12/2019, 06:08

# How to Search for Text

Return to [tatoeba.org](https://tatoeba.org/eng/sentences/advanced_search).


## Important Notes

The search engine on tatoeba.org (Manticore, previously Sphinx) doesn't work like other standard search engines.

You can't use ? or ! in your searches in the way you would normally expect to use them, so you need to search for sentences without using these punctuation marks.

Also, if you are searching for sentences in a language (such as Japanese or Chinese) that does not put spaces between words, be sure to see the section [Languages without word boundaries](https://en.wiki.tatoeba.org/articles/show/text-search#languages-without-word-boundaries) below.

## Tatoeba.org uses [Manticore Search](http://manticoresearch.com/) 

These instructions tell you how to use the search bar at the top of every Tatoeba page. Our search works much like a search engine such as Google, but has some important differences. 

* To find English sentences with "live", "lives", "living" or "lived", search for the word "live". (This will also find sentences with "Live", "Living", etc., since capitalization is ignored.)

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* To match a word exactly (ignoring capitalization), put an equals sign (=) before it. 

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* Leave punctuation out of your search string. Most punctuation will be ignored, but a final exclamation mark (!) or question mark (?) will actually interfere with the search. These symbols have other purposes, as described later on this page.
  * The following yields no results:

      * [how strange!](http://tatoeba.org/eng/sentences/search?query=how+strange!&from=eng&to=und)

  * but this search will find *How strange!* among other results:

      * [how strange](http://tatoeba.org/eng/sentences/search?query=how+strange&from=eng&to=und)

* Put a $ after a word to find sentences ending with that word. The example finds English sentences ending with "Tom".

  * [Tom$](http://tatoeba.org/eng/sentences/search?query=Tom%24&from=eng&to=und)

* Most punctuation symbols cannot be found via a search. However, $ and _ are special. You can search for sentences containing either of these characters by putting a backslash before the symbol.

  * [\$](https://tatoeba.org/eng/sentences/search?query=%5C%24&from=und&to=und)
  * [\\_](https://tatoeba.org/eng/sentences/search?query=%5C_&from=und&to=und)

* Put a ^ before a word to find sentences beginning with that word. The example finds English sentences beginning with "Tom".

  * [^Tom](http://tatoeba.org/eng/sentences/search?query=%5ETom&from=eng&to=und)

* This example finds English sentences beginning with "Tom" and ending with "Mary".

  * [^Tom Mary$](http://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und)

* This example finds English sentences beginning with either "Tom" or "He".

  * [(^Tom|^He)](http://tatoeba.org/eng/sentences/search?query=%28%5ETom%7C%5EHe%29&from=eng&to=und)

* This example finds English sentences including any of the following words: fasting, fasted, or fasts.  Using the equals sign means you'll get exact matches, thus you will avoid the adjective forms: fast, faster and fastest.

  * [(=fasting|=fasted|=fasts)](https://tatoeba.org/eng/sentences/search?query=%28%3Dfasting%7C%3Dfasted%7C%3Dfasts%29&from=eng&to=und)

* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly. Or put an equals sign directly before the quotes to match every word in the quotes.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+boston%22&from=eng&to=und)

  * The following searches will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3Dboston%22&from=eng&to=und)
      * [="live in boston"](http://tatoeba.org/eng/sentences/search?query=%3D%22live+in+boston%22&from=eng&to=und)

  * This search will only find sentences consisting of the exact words "I live in Boston", without any additional words.

      * ["^I =live =in =Boston$"](http://tatoeba.org/eng/sentences/search?query=%22%5EI+%3Dlive+%3Din+%3DBoston%24%22&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin with "Tom."

  * [-^Tom Tom](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin or end with "Tom."

  * [-^Tom Tom -Tom$](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom+-Tom%24&from=eng&to=und)

* The question mark (?) as part of a word is a one-letter wildcard.

    * The following will find sentences with either "whenever" or "wherever".

        * [whe?ever](https://tatoeba.org/eng/sentences/search?query=whe%3Fever&from=und&to=und)

    * The following will find sentences with 6-letter words that have 2 letters, then "eve", and then one more letter, such as "clever", "eleven", "peeves", "uneven", ...

        * [??eve?](https://tatoeba.org/eng/sentences/search?query=%3F%3Feve%3F&from=eng&to=und)

* This example finds English sentences that have "Tom", then 2 words, then "Mary", then 1 word, and then "John."

  * ["Tom * * Mary * John"](https://tatoeba.org/eng/sentences/search?query=%22Tom+*+*+Mary+*+John%22&from=eng&to=und)

* This example finds English sentences that start with "Tom", continue with exactly 3 words, and end with "Mary".

  * ["^Tom * * * Mary$"](https://tatoeba.org/eng/sentences/search?query=%22%5ETom+*+*+*+Mary%24%22&from=und&to=und)

* This example finds English sentences that have words beginning with "red", including the word "red".  (3 letters or more are required.)

  * [red*](https://tatoeba.org/eng/sentences/search?query=red*&from=eng&to=und)

* This example finds English sentences that have words ending with "red", including the word "red".

  * [*red](https://tatoeba.org/eng/sentences/search?query=*red&from=eng&to=und)

* This example finds English sentences that have words containing the string "red", including the word "red" itself.

  * [\*red\*](https://tatoeba.org/eng/sentences/search?query=*red*&from=eng&to=und)

* This example finds English sentences that have the word "French", but don't have the word "Tom".

  * [French -Tom](https://tatoeba.org/eng/sentences/search?query=French+-Tom&from=eng&to=und)

* This example will find sentences with "cheek" (in any form: cheeks, etc.) that don't include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)

* This example finds sentences in which the word "cat" comes before the word "dog."

  * [cat << dog](https://tatoeba.org/eng/sentences/search?query=cat+%3C%3C+dog&from=eng&to=und)

* This example finds sentences that contain at least two of the words "cat", "dog", and "fish" (a "quorum search").

  * ["cat dog fish"/2](https://tatoeba.org/eng/sentences/search?query=%22cat+dog+fish%22/2&from=eng&to=und)
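The example links in the list above all follow the same pattern: the query is percent-encoded and passed in the `query` parameter, with `from` and `to` selecting the languages. As a minimal sketch, assuming you want to generate such links from a script, the hypothetical helper `build_search_url` below reproduces that pattern with Python's standard library. The helper name and its defaults are illustrative assumptions, not part of Tatoeba.

```python
from urllib.parse import urlencode

# Base URL copied from the example links above; the helper itself is hypothetical.
SEARCH_URL = "https://tatoeba.org/eng/sentences/search"

def build_search_url(query, from_lang="eng", to_lang="und", **extra):
    """Return a search URL with the query percent-encoded."""
    params = {"query": query, "from": from_lang, "to": to_lang, **extra}
    return SEARCH_URL + "?" + urlencode(params)

print(build_search_url("^Tom Mary$"))
# https://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und
print(build_search_url('"=live =in =boston"'))
print(build_search_url("cat << dog"))
```

Special characters such as `^`, `$`, `=`, and `"` are encoded automatically, so the query can be written exactly as you would type it into the search bar.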


### How to limit sentences to "I can" without getting "I can't".

* This shows just sentences beginning with "I can't."

  * ["^I =can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%27t%22)

* However, this search shows both the "I can" and "I can't" sentences.

  * ["^I =can"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22)

* To just get "I can" sentences, without the "I can't" sentences, use this search. (Note that the quotes are necessary.)

  * ["^I =can" -"can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22+-"can%27t")


### How do I search for "of" followed by words ending in "ing" without any intervening words?

 * [of NEAR/1 *ing -"*ing of"](https://tatoeba.org/eng/sentences/search?query=of+NEAR%2F1+*ing+-%22*ing+of%22&from=eng&to=none&user=&orphans=no&unapproved=no&has_audio=&tags=&list=&native=&trans_filter=limit&trans_to=und&trans_link=&trans_user=&trans_orphan=&trans_unapproved=&trans_has_audio=&sort=words&sort_reverse=)  Use the advanced search, sorting by "fewest words." 

 * Notes
   * The above link works correctly, but a bug in the wiki prevents the asterisks in its text from being displayed properly. If you type this or a similar query yourself, put an asterisk before both of the "ing" strings.
   * The -"ing of" part is necessary to avoid matching an -ing word that comes before "of" rather than after it.
   * The "sort by fewest words" option is necessary to avoid search results that favor sentences that contain multiple occurrences of *ing.

## Using the "Advanced Search" to Find Sentences to Translate

You can find several different ways to do this on the following page.

[Create a Dashboard of Customized Links for Tatoeba.org](http://study.aitech.ac.jp/tatoeba/translate/links.php)

This page has a number of pre-set searches that you can use.
If you like this page, bookmark it for future use.


## Languages without word boundaries

For languages that don't use space characters to separate words, such as Japanese and Chinese, the search engine interprets each character as a single word. For instance, searching for 逆に will return the same results as 逆 に: it matches any sentence that *includes* both characters, even if they are not adjacent or appear in a different order. To match the characters as a unit, surround the keyword with quotes: ["逆に"](http://tatoeba.org/jpn/sentences/search?query=%22%E9%80%86%E3%81%AB%22&from=jpn).
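The difference between the unquoted and the quoted search can be illustrated with a toy model. This is only a sketch of the behavior described above, not how Manticore is implemented; the two function names and the two sample sentences are made up for illustration and are not taken from the Tatoeba corpus.

```python
def matches_unquoted(query, sentence):
    # Toy model: every non-space character is its own word, and order is ignored.
    return all(ch in sentence for ch in query if not ch.isspace())

def matches_quoted(query, sentence):
    # Toy model of a quoted phrase: the characters must be adjacent and in order.
    return query in sentence

for sentence in ["逆に言えば", "私に逆らうな"]:
    print(sentence, matches_unquoted("逆に", sentence), matches_quoted("逆に", sentence))
```

The second sentence contains both characters, but as に followed by 逆, so only the unquoted search matches it.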


## More details

The search ignores capitalization and punctuation (unless the punctuation happens to match one of the special characters described elsewhere on the page). 

In some languages, including English, the search engine **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *live* will also find *lived* and *living*.

The languages in which the search engine stems words are: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian (Bokmål), Portuguese, Romanian, Russian, Spanish, Swedish, and Turkish.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=live*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Manticore, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain words in the specified continuous sequence. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, you will need to place an equals sign before each word in the phrase for which you want to suppress stemming, or directly before the first quotation mark to suppress stemming for each word. If you want to put both an equals sign and a caret before the same word, the equals sign should precede the caret. For instance, to find sentences that begin with the exact word *Noise*, search for *=^noise*, not *^=noise*.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.* 
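The *like thing* walkthrough above can be reproduced with a small toy model. The sketch below is an assumption-laden illustration, not Manticore's algorithm: `TOY_STEMS` is a hypothetical lookup table standing in for the real stemmer, and the `search` function only understands bare words, `=word`, and a phrase wrapped in double quotes.

```python
import re

# Hypothetical stem table standing in for the real stemmer.
TOY_STEMS = {"things": "thing", "likely": "like", "likes": "like", "liked": "like"}

def tokenize(text):
    # Lowercase and drop punctuation, as the search does.
    return re.findall(r"[a-z']+", text.lower())

def stem(word):
    return TOY_STEMS.get(word, word)

def term_matches(term, word):
    # "=word" demands the exact form; a bare word is compared by stem.
    if term.startswith("="):
        return term[1:] == word
    return stem(term) == stem(word)

def search(query, sentence):
    words = tokenize(sentence)
    is_phrase = query.startswith('"') and query.endswith('"')
    terms = query.strip('"').lower().split()
    if is_phrase:
        # Phrase: the terms must match consecutive words in the given order.
        n = len(terms)
        return any(
            all(term_matches(t, w) for t, w in zip(terms, words[i:i + n]))
            for i in range(len(words) - n + 1)
        )
    # No quotes: every term must match some word, in any position.
    return all(any(term_matches(t, w) for w in words) for t in terms)

print(search("like thing", "things like"))        # True
print(search('"like thing"', "things like"))      # False
print(search('"like thing"', "likely things"))    # True
print(search('"=like =thing"', "like things"))    # False
```

Under these assumptions, quotes control order and adjacency, while the equals sign controls whether stemming applies to a given word.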

Note that a star (*) can be placed at the beginning and/or end of a string representing a word, but if it is placed in the middle, the search will always fail. Also, a string beginning and/or ending with a star must be at least three characters long.
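To see which words a wildcard pattern would accept, it can help to translate the pattern into a regular expression. The sketch below is an illustration under stated assumptions: the function names are hypothetical, a `?` is treated as exactly one character, a `*` is allowed only at the start or end of the pattern, and the minimum-length rule is not enforced.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a ?/* word pattern into a compiled, case-insensitive regex."""
    inner = pattern.strip("*")
    if "*" in inner:
        raise ValueError("a star in the middle of a word is not supported")
    # "?" stands for exactly one character; everything else is literal.
    body = "".join("." if ch == "?" else re.escape(ch) for ch in inner)
    prefix = ".*" if pattern.startswith("*") else ""
    suffix = ".*" if pattern.endswith("*") else ""
    return re.compile(prefix + body + suffix, re.IGNORECASE)

def word_matches(pattern, word):
    return wildcard_to_regex(pattern).fullmatch(word) is not None

print(word_matches("whe?ever", "wherever"))  # True
print(word_matches("??eve?", "clever"))      # True
print(word_matches("red*", "reduce"))        # True
print(word_matches("*red", "reduce"))        # False
```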


## Other search operators

* A vertical bar (representing "or") finds examples where either of the words appears:
  *    *hate | detest* will match sentences with either *hate* or *detest* (or both). 

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses: 
  *    *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both).

* A dash (or exclamation point) before a word prevents matches with sentences where the word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.*

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *This is the best day of my life.* but not *Life means nothing without friends.*

* If you want to search for sentences that contain nothing other than the specified words, use double quotes, a caret, and a dollar sign in combination: *"^i love you$"* will find *I love you.* and *I love you!* but not *I love you more than you love me.* (However, it will find *I loved you.* To prevent this match, use *"^i =love you$"*.)

* The strict order operator (<<) between two words will find sentences where the first word occurs before the second but not where the second word comes before the first. Thus _dog << cat_ will find examples where _dog_ precedes _cat_, but not vice versa.

* The proximity operator (~_N_, where _N_ is a positive number) following a phrase will limit the number of words that can separate the specified words to fewer than _N_. Thus _"you are *ble"~1_ will find *You are irresistible.* but not *You are partially responsible.*
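
The strict order operator described above and the quorum search shown earlier on this page can be pictured with a toy model. The sketch below is an assumption for illustration only: it compares exact lowercase words, ignores stemming, and uses hypothetical function names that are not part of Manticore.

```python
import re

def words_of(sentence):
    # Lowercase and drop punctuation.
    return re.findall(r"[a-z']+", sentence.lower())

def strict_order(first, second, sentence):
    """Toy model of `first << second`: some `first` occurs before some `second`."""
    words = words_of(sentence)
    if first not in words or second not in words:
        return False
    # The earliest `first` must come before the latest `second`.
    last_second = len(words) - 1 - words[::-1].index(second)
    return words.index(first) < last_second

def quorum(terms, threshold, sentence):
    """Toy model of `"a b c"/N`: at least N of the terms are present."""
    present = set(words_of(sentence))
    return sum(term in present for term in terms) >= threshold

print(strict_order("dog", "cat", "The dog chased the cat."))   # True
print(strict_order("dog", "cat", "The cat chased the dog."))   # False
print(quorum(["cat", "dog", "fish"], 2, "My cat eats fish."))  # True
```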
 
See the [Manticore documentation](https://docs.manticoresearch.com/latest/html/) for other functionality. Note that the documentation mentions keywords pertaining to specific fields in a document, but these are not relevant to Tatoeba.

Version at: 13/12/2019, 06:11

# How to Search for Text

Return to [tatoeba.org](https://tatoeba.org/eng/sentences/advanced_search).


## Important Notes

The search engine on tatoeba.org (Manticore, previously Sphinx) doesn't work like other standard search engines.

You can't use ? or ! in your searches in the way you would normally expect to use them, so you need to search for sentences without using these punctuation marks.

Also, if you are searching for sentences in a language (such as Japanese or Chinese) that does not put spaces between words, be sure to see the section [Languages without word boundaries](https://en.wiki.tatoeba.org/articles/show/text-search#languages-without-word-boundaries) below.

## Tatoeba.org uses [Manticore Search](http://manticoresearch.com/) 

These instructions tell you how to use the search bar at the top of every Tatoeba page. Our search works much like a search engine such as Google, but has some important differences. 

* To find English sentences with "live", "lives", "living" or "lived", search for the word "live". (This will also find sentences with "Live", "Living", etc., since capitalization is ignored.)

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* To match a word exactly (ignoring capitalization), put an equals sign (=) before it. 

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* Leave punctuation out of your search string. Most punctuation will be ignored, but a final exclamation mark (!) or question mark (?) will actually interfere with the search. These symbols have other purposes, as described later on this page.
  * The following yields no results:

      * [how strange!](http://tatoeba.org/eng/sentences/search?query=how+strange!&from=eng&to=und)

  * but this search will find *How strange!* among other results:

      * [how strange](http://tatoeba.org/eng/sentences/search?query=how+strange&from=eng&to=und)

* Put a $ after a word to find sentences ending with that word. The example finds English sentences ending with "Tom".

  * [Tom$](http://tatoeba.org/eng/sentences/search?query=Tom%24&from=eng&to=und)

* Most punctuation symbols cannot be found via a search. However, $ and _ are special. You can search for sentences containing either of these characters by putting a backslash before the symbol.

  * [\$](https://tatoeba.org/eng/sentences/search?query=%5C%24&from=und&to=und)
  * [\\_](https://tatoeba.org/eng/sentences/search?query=%5C_&from=und&to=und)
  * [\$(1|2|3|4|5|6|7|8|9)](https://tatoeba.org/eng/sentences/search?query=%5C%24%281%7C2%7C3%7C4%7C5%7C6%7C7%7C8%7C9%29&from=eng&to=und)
 finds sentences with a $ followed by a number.

* Put a ^ before a word to find sentences beginning with that word. The example finds English sentences beginning with "Tom".

  * [^Tom](http://tatoeba.org/eng/sentences/search?query=%5ETom&from=eng&to=und)

* This example finds English sentences beginning with "Tom" and ending with "Mary".

  * [^Tom Mary$](http://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und)

* This example finds English sentences beginning with either "Tom" or "He".

  * [(^Tom|^He)](http://tatoeba.org/eng/sentences/search?query=%28%5ETom%7C%5EHe%29&from=eng&to=und)

* This example finds English sentences including any of the following words: fasting, fasted, or fasts. Using the equals sign means you'll get exact matches, so you will avoid the adjective forms: fast, faster and fastest.

  * [(=fasting|=fasted|=fasts)](https://tatoeba.org/eng/sentences/search?query=%28%3Dfasting%7C%3Dfasted%7C%3Dfasts%29&from=eng&to=und)

* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly. Or put an equals sign directly before the quotes to match every word in the quotes.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+boston%22&from=eng&to=und)

  * The following searches will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3Dboston%22&from=eng&to=und)
      * [="live in boston"](http://tatoeba.org/eng/sentences/search?query=%3D%22live+in+boston%22&from=eng&to=und)

  * This search will only find sentences consisting of the exact words "I live in Boston", without any additional words.

      * ["^I =live =in =Boston$"](http://tatoeba.org/eng/sentences/search?query=%22%5EI+%3Dlive+%3Din+%3DBoston%24%22&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin with "Tom."

  * [-^Tom Tom](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin or end with "Tom."

  * [-^Tom Tom -Tom$](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom+-Tom%24&from=eng&to=und)

* The question mark (?) as part of a word is a one-letter wildcard.

    * The following will find sentences with either "whenever" or "wherever."

        * [whe?ever](https://tatoeba.org/eng/sentences/search?query=whe%3Fever&from=und&to=und)

    * The following will find sentences with 6-letter words that have 2 letters, then "eve", and then one more letter, such as "clever", "eleven", "peeves", "uneven", ...

        * [??eve?](https://tatoeba.org/eng/sentences/search?query=%3F%3Feve%3F&from=eng&to=und)

* This example finds English sentences that have "Tom", then 2 words, then "Mary", then 1 word, and then "John."

  * ["Tom * * Mary * John"](https://tatoeba.org/eng/sentences/search?query=%22Tom+*+*+Mary+*+John%22&from=eng&to=und)

* This example finds English sentences that start with "Tom", continue with 3 words, and end with "Mary".

  * ["^Tom * * * Mary$"](https://tatoeba.org/eng/sentences/search?query=%22%5ETom+*+*+*+Mary%24%22&from=und&to=und)

* This example finds English sentences that have words beginning with "red", including the word "red".  (3 letters or more are required.)

  * [red*](https://tatoeba.org/eng/sentences/search?query=red*&from=eng&to=und)

* This example finds English sentences that have words ending with "red", including the word "red".

  * [*red](https://tatoeba.org/eng/sentences/search?query=*red&from=eng&to=und)

* This example finds English sentences that have words containing the string "red", including the word "red" itself.

  * [\*red\*](https://tatoeba.org/eng/sentences/search?query=*red*&from=eng&to=und)

* This example finds English sentences that have the word "French", but don't have the word "Tom".

  * [French -Tom](https://tatoeba.org/eng/sentences/search?query=French+-Tom&from=eng&to=und)

* This example will find sentences with "cheek" (in any form: cheeks, etc.) that don't include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)

* This example finds sentences in which the word "cat" comes before the word "dog."

  * [cat << dog](https://tatoeba.org/eng/sentences/search?query=cat+%3C%3C+dog&from=eng&to=und)

* This example finds sentences that contain at least two of the words "cat", "dog", and "fish" (a "quorum search").

  * ["cat dog fish"/2](https://tatoeba.org/eng/sentences/search?query=%22cat+dog+fish%22/2&from=eng&to=und)
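
The example links above share one URL shape: the query text is URL-encoded into a `query` parameter, next to `from` and `to` language codes. As a minimal sketch, assuming that shape stays stable, the following Python builds such links. The helper names `strip_final_punctuation` and `tatoeba_search_url` are made up for illustration and are not part of any Tatoeba API.

```python
from urllib.parse import urlencode

# Base address copied from the example links on this page.
SEARCH_URL = "https://tatoeba.org/eng/sentences/search"


def strip_final_punctuation(text: str) -> str:
    """Drop trailing ! and ? (and spaces), which would interfere with the search."""
    return text.rstrip("!? ")


def tatoeba_search_url(query: str, from_lang: str = "eng", to_lang: str = "und") -> str:
    """Build a search link (hypothetical helper, not an official API)."""
    params = {"query": query, "from": from_lang, "to": to_lang}
    # urlencode percent-encodes operators such as ^ $ = " and turns spaces into +.
    return f"{SEARCH_URL}?{urlencode(params)}"


print(tatoeba_search_url(strip_final_punctuation("how strange!")))
print(tatoeba_search_url("^Tom Mary$"))
```

Encoding the query this way keeps operators such as `^`, `$`, and `=` intact when the link is pasted into a browser or a wiki page.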


### How to limit sentences to "I can" without getting "I can't"

* This shows just sentences beginning with "I can't."

  * ["^I =can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%27t%22)

* However, this search shows both the "I can" and "I can't" sentences.

  * ["^I =can"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22)

* To just get "I can" sentences, without the "I can't" sentences, use this search. (Note that the quotes are necessary.)

  * ["^I =can" -"can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22+-"can%27t")


### How do I search for "of" followed by words ending in "ing" without any intervening words?

 * [of NEAR/1 *ing -"*ing of"](https://tatoeba.org/eng/sentences/search?query=of+NEAR%2F1+*ing+-%22*ing+of%22&from=eng&to=none&user=&orphans=no&unapproved=no&has_audio=&tags=&list=&native=&trans_filter=limit&trans_to=und&trans_link=&trans_user=&trans_orphan=&trans_unapproved=&trans_has_audio=&sort=words&sort_reverse=)  Use the advanced search, sorting by "fewest words." 

 * Notes
   * The above link works correctly, but a bug in the wiki prevents the asterisks from being displayed properly in the link text. If you enter this or a similar query yourself, put an asterisk before both of the "ing" strings.
   * The -"ing of" part is necessary to avoid matching an -ing word that comes before "of."
   * The "sort by fewest words" option is necessary to avoid search results that favor sentences containing multiple occurrences of *ing.

## Using the "Advanced Search" to Find Sentences to Translate

You can find several different ways to do this on the following page.

[Create a Dashboard of Customized Links for Tatoeba.org](http://study.aitech.ac.jp/tatoeba/translate/links.php)

This page has a number of pre-set searches that you can use.
If you like this page, bookmark it for future use.


## Languages without word boundaries

For languages that don't use space characters to separate words, such as Japanese and Chinese, the search engine interprets each character as a single word. For instance, searching for 逆に will return the same results as 逆 に, which matches any sentence that *includes* both characters, even if they are out of order or not adjacent. To match them as a unit, surround keywords with quotes: ["逆に"](http://tatoeba.org/jpn/sentences/search?query=%22%E9%80%86%E3%81%AB%22&from=jpn).
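
The difference between the two forms can be sketched in a few lines of Python. This only illustrates how the query strings are formed and encoded; it does not reproduce the search engine itself, and the function names are hypothetical.

```python
from urllib.parse import quote_plus


def as_separate_characters(text: str) -> str:
    """Show how an unquoted query is read: every character becomes its own word."""
    return " ".join(text)


def as_phrase(text: str) -> str:
    """Wrap the text in double quotes so it is searched as one contiguous phrase."""
    return f'"{text}"'


print(as_separate_characters("逆に"))  # how the unquoted query is interpreted
print(as_phrase("逆に"))               # the quoted form to type in the search bar
print(quote_plus(as_phrase("逆に")))   # the same query as it appears inside a link
```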


## More details

The search ignores capitalization and punctuation (unless the punctuation happens to match one of the special characters described elsewhere on the page). 

In some languages, including English, the search engine **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *live* will also find *lived* and *living*.

The languages in which the search engine stems words are: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian (Bokmål), Portuguese, Romanian, Russian, Spanish, Swedish, and Turkish.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=live*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Manticore, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain words in the specified continuous sequence. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, you will need to place an equals sign before each word in the phrase for which you want to suppress stemming, or directly before the first quotation mark to suppress stemming for each word. If you want to put both an equals sign and a caret before the same word, the equals sign should precede the caret. For instance, to find sentences that begin with the exact word *Noise*, search for *=^noise*, not *^=noise*.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.* 
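
The query variants in this example differ only in quoting and equals signs, so they are easy to generate. The following is a minimal sketch with a hypothetical helper named `phrase`. It only builds the query text; what each query matches is decided by the search engine as described above.

```python
def phrase(words, exact=False):
    """Join words into a quoted phrase; with exact=True, prefix each word with '='."""
    terms = [f"={word}" if exact else word for word in words]
    return '"' + " ".join(terms) + '"'


words = ["like", "thing"]
print(" ".join(words))            # like thing
print(phrase(words))              # "like thing"
print(phrase(words, exact=True))  # "=like =thing"
```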

Note that a star (*) can be placed at the beginning and/or end of a string representing a word, but if it is placed in the middle, the search will always fail. Also, a string beginning and/or ending with a star must be at least three characters long.
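
These two rules can be checked before a query is submitted. The sketch below uses a hypothetical function, `is_valid_star_pattern`, and assumes that the three-character minimum counts the characters other than the star, as the `red*` example suggests. That reading is an assumption, not something confirmed by the Manticore documentation.

```python
def is_valid_star_pattern(word: str, min_chars: int = 3) -> bool:
    """Check one search word against the star rules described above."""
    core = word.strip("*")  # remove stars at the beginning and end
    if "*" in core:
        return False        # a star in the middle always fails
    if core == word:
        return True         # no star at all, so no length rule applies
    # Assumption: the minimum length counts the non-star characters.
    return len(core) >= min_chars


for candidate in ["red*", "*red", "*red*", "r*d", "re*"]:
    print(candidate, is_valid_star_pattern(candidate))
```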


## Other search operators

* A vertical bar (representing "or") finds examples where either of the words appears:
  *    *hate | detest* will match sentences with either *hate* or *detest* (or both). 

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses: 
  *    *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both) 

* A dash (or exclamation point) before a word prevents matches with sentences where the word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.*

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *This is the best day of my life.* but not *Life means nothing without friends.*

* If you want to search for sentences that contain nothing other than the specified words, use double quotes, a caret, and a dollar sign in combination: *"^i love you$"* will find *I love you.* and *I love you!* but not *I love you more than you love me.* (However, it will find *I loved you.* To prevent this match, use *"^i =love you$"*.)

* The strict order operator (<<) between two words will find sentences where the first word occurs before the second but not where the second word comes before the first. Thus _dog << cat_ will find examples where _dog_ precedes _cat_, but not vice versa.

* The proximity operator (~_N_, where _N_ is a positive number) following a phrase will limit the number of words that can separate the specified words to fewer than _N_. Thus _"you are *ble"~1_ will find *You are irresistible.* but not *You are partially responsible.*
 
See the [Manticore documentation](https://docs.manticoresearch.com/latest/html/) for other functionality. Note that the documentation mentions keywords pertaining to specific fields in a document, but these are not relevant to Tatoeba.
