Version at: 16/12/2019, 10:56 vs. version at: 21/12/2019, 18:24
# How to Search for Text

Return to [Advanced Search at tatoeba.org](https://tatoeba.org/eng/sentences/advanced_search).

## Introduction

Tatoeba provides two ways to search for sentences:

* the regular search bar at the top of every page
* [advanced search](https://en.wiki.tatoeba.org/articles/show/advanced-search#), which you can reach from the **Advanced search** link above the regular search bar

### Regular search

For regular search, there are three fields:

* the main field, which selects the word or words that you're looking for
* the **From** field, which selects the language you're looking for matches in
* the **To** field, which limits the search to sentences that have been directly or indirectly translated into the language you choose

#### Main search field

If you leave the main search field empty, the search will find all sentences that match the **From** and **To** values that you've chosen. Otherwise, it will search for sentences containing the word or words that you type in.

The search engine that Tatoeba uses ([Manticore](https://manticoresearch.com/)) is a little different from other search engines that you may have used, such as Google's. Please note the following:

(1) Punctuation marks like _?_ and _!_ have special purposes in our search engine (Manticore, previously Sphinx). If you don't want to use those special functions, you should leave them out.

(2) In Turkish and many European languages (Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian [Bokmål], Portuguese, Romanian, Russian, Spanish, and Swedish), a search for a word such as _live_ will also find similar words, such as _lived_ and _living_. If you want to indicate that a word should be matched exactly, you must put an equals sign before it: _=live_

(3) If you are searching for sentences in a language (such as Japanese or Chinese) that does not put spaces between words, be sure to see the section [Languages without word boundaries](https://en.wiki.tatoeba.org/articles/show/text-search#languages-without-word-boundaries) below.

(4) You can use quotation marks to group words into phrases. For instance, _met him_ will find matches where the words _met_ and _him_ occur anywhere in the sentence, but _"met him"_ will only find matches where the words occur together in that order.

(5) For more information, read the section [Examples in English](https://en.wiki.tatoeba.org/articles/show/text-search#Examples in English).

#### From

The "From" field can be set to "Any language", in which case the search will find matching sentences in any language. Otherwise, the search will only find matches in the language you choose.

#### To

The "To" field can be set to "Any language", in which case it will be ignored. Otherwise, the search will only find sentences that are linked to sentences in the language you choose. They can either be directly linked, in which case they will be shown in black, or indirectly linked, in which case they will be shown in gray. Two sentences are indirectly linked when there is a chain of translations between them but no one has put a link between those two sentences themselves. This means you cannot be sure that the sentences are translations of each other.
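The example links on this page pass the three fields to the search page as the `query`, `from`, and `to` URL parameters. The following Python sketch builds such a link. The function name `build_search_url` is made up for illustration, and the defaults (`eng`, and `und`, which the links on this page appear to use for "Any language") are assumptions taken from those links rather than from any documented API.

```python
from urllib.parse import urlencode

# Base URL copied from the example links on this page.
SEARCH_URL = "https://tatoeba.org/eng/sentences/search"


def build_search_url(query: str, from_lang: str = "eng", to_lang: str = "und") -> str:
    """Return a search link for the given query text and language codes."""
    params = {"query": query, "from": from_lang, "to": to_lang}
    # urlencode percent-encodes operators such as ^, $ and = and turns spaces into +.
    return f"{SEARCH_URL}?{urlencode(params)}"


print(build_search_url("^Tom Mary$"))
# https://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und
```

Letting `urlencode` do the escaping avoids having to remember that `^` becomes `%5E` and `$` becomes `%24`.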

## Examples in English

* To find English sentences with "live", "lives", "living" or "lived", search for the word "live". (This will also find sentences with "Live", "Living", etc., since capitalization is ignored.)

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* To match a word exactly (ignoring capitalization), put an equals sign (=) before it.

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* Leave punctuation out of your search string. Most punctuation will be ignored, but a final exclamation mark (!) or question mark (?) will actually interfere with the search. (See [Sentences with punctuation marks](https://en.wiki.tatoeba.org/articles/show/text-search#Sentences with punctuation marks) for an example.) These symbols have other purposes, as described later on this page.

* Put a $ after a word to find sentences ending with that word. The example finds English sentences ending with "Tom".

  * [Tom$](http://tatoeba.org/eng/sentences/search?query=Tom%24&from=eng&to=und)

* Most punctuation symbols cannot be found via a search. However, $ and _ are special. You can search for sentences containing either of these characters by putting a backslash before the symbol.

  * [\$](https://tatoeba.org/eng/sentences/search?query=%5C%24&from=und&to=und)
  * [\\_](https://tatoeba.org/eng/sentences/search?query=%5C_&from=und&to=und)
  * [\$(1|2|3|4|5|6|7|8|9)](https://tatoeba.org/eng/sentences/search?query=%5C%24%281%7C2%7C3%7C4%7C5%7C6%7C7%7C8%7C9%29&from=eng&to=und) finds sentences with a $ followed by a number.

* Put a ^ before a word to find sentences beginning with that word. The example finds English sentences beginning with "Tom".

  * [^Tom](http://tatoeba.org/eng/sentences/search?query=%5ETom&from=eng&to=und)

* This example finds English sentences beginning with "Tom" and ending with "Mary".

  * [^Tom Mary$](http://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und)

* This example finds English sentences beginning with either "Tom" or "He".

  * [(^Tom|^He)](http://tatoeba.org/eng/sentences/search?query=%28%5ETom%7C%5EHe%29&from=eng&to=und)

* This example finds English sentences including any of the following words: fasting, fasted, or fasts. Using the equals sign means you'll get exact matches, thus you will avoid the adjective forms: fast, faster and fastest.

  * [(=fasting|=fasted|=fasts)](https://tatoeba.org/eng/sentences/search?query=%28%3Dfasting%7C%3Dfasted%7C%3Dfasts%29&from=eng&to=und)

* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly. Or put an equals sign directly before the quotes to match every word in the quotes.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+boston%22&from=eng&to=und)

  * The following searches will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3Dboston%22&from=eng&to=und)
      * [="live in boston"](http://tatoeba.org/eng/sentences/search?query=%3D%22live+in+boston%22&from=eng&to=und)

  * This search will only find sentences consisting of the exact words "I live in Boston", without any additional words.

      * ["^I =live =in =Boston$"](http://tatoeba.org/eng/sentences/search?query=%22%5EI+%3Dlive+%3Din+%3DBoston%24%22&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin with "Tom."

  * [-^Tom Tom](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin or end with "Tom."

  * [-^Tom Tom -Tom$](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom+-Tom%24&from=eng&to=und)

* The question mark (?) as part of a word is a one-letter wildcard.

    * The following will find sentences with either "whenever" or "wherever."

        * [whe?ever](https://tatoeba.org/eng/sentences/search?query=whe%3Fever&from=und&to=und)

    * The following will find sentences with 6-letter words that have 2 letters, then "eve", and then one more letter, such as "clever", "eleven", "peeves", "uneven", ...

        * [??eve?](https://tatoeba.org/eng/sentences/search?query=%3F%3Feve%3F&from=eng&to=und)

* This example finds English sentences that have "Tom", then 2 words, then "Mary", then 1 word, and then "John."

  * ["Tom * * Mary * John"](https://tatoeba.org/eng/sentences/search?query=%22Tom+*+*+Mary+*+John%22&from=eng&to=und)

* This example finds English sentences that start with "Tom", then have 3 words, then end with "Mary".

  * ["^Tom * * * Mary$"](https://tatoeba.org/eng/sentences/search?query=%22%5ETom+*+*+*+Mary%24%22&from=und&to=und)

* This example finds English sentences that have words beginning with "red", including the word "red". (3 letters or more are required.)

  * [red*](https://tatoeba.org/eng/sentences/search?query=red*&from=eng&to=und)

* This example finds English sentences that have words ending with "red", including the word "red".

  * [*red](https://tatoeba.org/eng/sentences/search?query=*red&from=eng&to=und)

* This example finds English sentences that have words containing the string "red", including the word "red" itself.

  * [\*red\*](https://tatoeba.org/eng/sentences/search?query=*red*&from=eng&to=und)

* This example finds English sentences that have the word "French", but don't have the word "Tom".

  * [French -Tom](https://tatoeba.org/eng/sentences/search?query=French+-Tom&from=eng&to=und)

* This example will find sentences with "cheek" (in any form: cheeks, etc.) that don't include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)

* This example finds sentences in which the word "cat" comes before the word "dog."

  * [cat << dog](https://tatoeba.org/eng/sentences/search?query=cat+%3C%3C+dog&from=eng&to=und)

* This example finds sentences that contain at least two of the words "cat", "dog", and "fish" (a "quorum search").

  * ["cat dog fish"/2](https://tatoeba.org/eng/sentences/search?query=%22cat+dog+fish%22/2&from=eng&to=und)
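To get a feel for how the one-letter wildcard (?) in the list above behaves, the following Python sketch mimics it locally with a regular expression. It is only an approximation written for this page: the function names are made up, and it ignores stemming and all the other operators that the real search engine applies.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a pattern such as 'whe?ever' into a regex for one whole word."""
    # Each ? stands for exactly one word character; everything else is literal.
    parts = [r"\w" if ch == "?" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts), re.IGNORECASE)


def words_matching(pattern: str, sentence: str) -> list:
    """Return the words of the sentence that the wildcard pattern matches."""
    regex = wildcard_to_regex(pattern)
    return [word for word in re.findall(r"\w+", sentence) if regex.fullmatch(word)]


print(words_matching("whe?ever", "Come whenever or wherever you like."))
# ['whenever', 'wherever']
print(words_matching("??eve?", "Eleven clever men"))
# ['Eleven', 'clever']
```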


### How to limit sentences to "I can" without getting "I can't"

* This shows just sentences beginning with "I can't."

  * ["^I =can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%27t%22)

* However, this search shows both the "I can" and "I can't" sentences.

  * ["^I =can"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22)

* To just get "I can" sentences, without the "I can't" sentences, use this search. (Note that the quotes are necessary.)

  * ["^I =can" -"can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22+-"can%27t")

### How to find English sentences without "the", "a" or "an"

* -the -a -an

  * This will get an error message, since you must specify at least one word that you want to include, not only words that you want to exclude. If you are determined to get as many results as possible, you can search for words that start with any letter of the alphabet, after putting a minus before each word that you do not want. However, this query will take a long time.

  * [-the -a -an a\*|b\*|c\*|d\*|e\*|f\*|g\*|h\*|i\*|j\*|k\*|l\*|m\*|n\*|o\*|p\*|q\*|r\*|s\*|t\*|u\*|v\*|w\*|x\*|y\*|z\*](https://tatoeba.org/eng/sentences/search?query=-the+-a+-an+a*%7Cb*%7Cc*%7Cd*%7Ce*%7Cf*%7Cg*%7Ch*%7Ci*%7Cj*%7Ck*%7Cl*%7Cm*%7Cn*%7Co*%7Cp*%7Cq*%7Cr*%7Cs*%7Ct*%7Cu*%7Cv*%7Cw*%7Cx*%7Cy*%7Cz*&from=eng)
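Typing out a query like the one above by hand is tedious, so here is a small Python sketch that assembles it. The helper name `build_query` is hypothetical. It simply refuses to build a query that has only excluded words, mirroring the error described above, and joins the alternatives with a vertical bar.

```python
import string


def build_query(alternatives: list, excluded: list) -> str:
    """Join excluded words (with a minus) and alternatives (with |) into one query."""
    if not alternatives:
        # A query made only of excluded words gets an error message from the search.
        raise ValueError("specify at least one word that you want to include")
    negative = " ".join(f"-{word}" for word in excluded)
    positive = "|".join(alternatives)
    return f"{negative} {positive}".strip()


# One prefix pattern per letter of the alphabet: a*, b*, ..., z*
any_letter = [f"{letter}*" for letter in string.ascii_lowercase]
print(build_query(any_letter, ["the", "a", "an"]))
```

The printed string can be pasted into the main search field.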


### How to find sentences with "of" followed by words ending in "ing" without any intervening words

 * [of NEAR/1 \*ing -"\*ing of"](https://tatoeba.org/eng/sentences/search?query=of+NEAR%2F1+*ing+-%22*ing+of%22&from=eng&to=none&user=&orphans=no&unapproved=no&has_audio=&tags=&list=&native=&trans_filter=limit&trans_to=und&trans_link=&trans_user=&trans_orphan=&trans_unapproved=&trans_has_audio=&sort=words&sort_reverse=)

 * Notes
   * The -"\*ing of" part is necessary to avoid getting results where the -ing word comes before "of."
   * The search results will favor sentences that contain multiple occurrences of \*ing. If you don't want this, change the sort order.

### Sentences with punctuation marks
  * The following yields no results:

      * [how strange!](http://tatoeba.org/eng/sentences/search?query=how+strange!&from=eng&to=und)

  * but this search will find *How strange!* among other results:

      * [how strange](http://tatoeba.org/eng/sentences/search?query=how+strange&from=eng&to=und)


## Languages without word boundaries

For languages that don't use space characters to separate words, such as Japanese and Chinese, the search engine interprets each character as a single word. For instance, searching for 逆に will return the same results as 逆 に, which matches any sentence that *includes* both characters, not necessarily next to each other or in that order. So you should surround keywords with quotes, as in this example:

["逆に"](http://tatoeba.org/jpn/sentences/search?query=%22%E9%80%86%E3%81%AB%22&from=jpn).
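The difference between the unquoted and the quoted search can be illustrated with a few lines of Python. This is a simplified local model, not the search engine's actual code: the function names are invented, and the two sample sentences are made up for the illustration rather than taken from the corpus.

```python
def matches_unquoted(query: str, sentence: str) -> bool:
    """Each character counts as its own word, so all of them must merely appear."""
    return all(ch in sentence for ch in query if not ch.isspace())


def matches_quoted(query: str, sentence: str) -> bool:
    """In quotes, the characters must appear next to each other and in order."""
    return query in sentence


contiguous = "逆に考える"  # contains 逆に as written
scattered = "彼に逆らう"   # contains に and 逆, but in the opposite order

print(matches_unquoted("逆に", scattered))   # True
print(matches_quoted("逆に", scattered))     # False
print(matches_quoted("逆に", contiguous))    # True
```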


## More details

The search ignores capitalization and punctuation (unless the punctuation happens to match one of the special characters described elsewhere on the page).

In some languages, including English, the search engine **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *live* will also find *lived* and *living*.

The languages in which the search engine stems words are: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian (Bokmål), Portuguese, Romanian, Russian, Spanish, Swedish, and Turkish.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=live*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Manticore, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain words in the specified continuous sequence. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, you will need to place an equals sign before each word in the phrase for which you want to suppress stemming, or directly before the first quotation mark to suppress stemming for each word.

If you want to put both an equals sign and a caret before the same word, the equals sign should precede the caret. For instance, to find sentences that begin with the exact word *Noise*, search for *=^noise*, not *^=noise*.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.*
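The two ways of suppressing stemming inside a phrase can be produced mechanically. The following Python sketch (the helper name `exact_phrase` is made up for this page) turns a plain phrase into either form, matching the "live in boston" examples given earlier.

```python
def exact_phrase(phrase: str, per_word: bool = True) -> str:
    """Wrap a phrase in quotes and mark its words for exact matching."""
    if per_word:
        # Put an equals sign before each individual word: "=live =in =boston"
        marked = " ".join(f"={word}" for word in phrase.split())
        return f'"{marked}"'
    # Put a single equals sign directly before the first quotation mark.
    return f'="{phrase}"'


print(exact_phrase("live in boston"))                  # "=live =in =boston"
print(exact_phrase("live in boston", per_word=False))  # ="live in boston"
```

The per-word form is handy as a starting point when you want to remove the equals sign from just one or two words afterwards.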

Note that a star (\*) can be placed at the beginning and/or end of a string representing a word, but if it is placed in the middle, the search will always fail. Also, a string beginning and/or ending with a star must be at least three characters long.


## Other search operators

* A vertical bar (representing "or") finds examples where either of the words appears:
  * *hate | detest* will match sentences with either *hate* or *detest* (or both).

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses:
  * *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both).

* A dash (or exclamation point) before a word prevents matches with sentences where the word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.*

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *This is the best day of my life.* but not *Life means nothing without friends.*

* If you want to search for sentences that contain nothing other than the specified words, use double quotes, a caret, and a dollar sign in combination: *"^i love you$"* will find *I love you.* and *I love you!* but not *I love you more than you love me.* (However, it will find *I loved you.* To prevent this match, use *"^i =love you$"*.)

* The strict order operator (<<) between two words will find sentences where the first word occurs before the second but not where the second word comes before the first. Thus _dog << cat_ will find examples where _dog_ precedes _cat_, but not vice versa.

* The proximity operator (~_N_, where _N_ is a positive number) following a phrase will limit the number of words that can separate the specified words to fewer than _N_. Thus _"you are *ble"~1_ will find *You are irresistible.* but not *You are partially responsible.*
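As a rough illustration of the strict order operator described in the list above, the following Python sketch checks the word order of a sentence locally. It is an assumption-laden simplification: `occurs_before` is an invented name, stemming is ignored, and the handling of repeated words may differ from what the search engine does.

```python
import re


def occurs_before(first: str, second: str, sentence: str) -> bool:
    """Return True if some occurrence of `first` precedes some occurrence of `second`."""
    words = [word.lower() for word in re.findall(r"\w+", sentence)]
    first_positions = [i for i, word in enumerate(words) if word == first.lower()]
    second_positions = [i for i, word in enumerate(words) if word == second.lower()]
    if not first_positions or not second_positions:
        return False  # both words must be present at all
    return min(first_positions) < max(second_positions)


print(occurs_before("dog", "cat", "The dog chased the cat."))  # True
print(occurs_before("dog", "cat", "The cat chased the dog."))  # False
```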

See the [Manticore documentation](https://docs.manticoresearch.com/latest/html/) for other functionality. Note that the documentation mentions keywords pertaining to specific fields in a document, but these are not relevant to Tatoeba.
237235
238236
diff view generated by jsdifflib

Version at: 16/12/2019, 10:56

#How to Search for Text

Return to [Advanced Search at tatoeba.org](https://tatoeba.org/eng/sentences/advanced_search).

## Introduction

Tatoeba provides two ways to search for sentences: 

* the regular search bar at the top of every page
* [advanced search](https://en.wiki.tatoeba.org/articles/show/advanced-search#), which you can reach from the **Advanced search** link above the regular search bar 

### Regular search

For regular search, there are three fields:

* the main field, which selects the word or words that you're looking for
* the **From** field, which selects the language you're looking for matches in
* the **To** field, which limits the search to sentences that have been directly or indirectly translated into the language you choose

#### Main search field

If you leave the main search field empty, it will find all sentences that match the **From** and **To** values that you've chosen. Otherwise, it will search for sentences containing the word or words that you type in. 

The search engine that Tatoeba uses ([Manticore](https://manticoresearch.com/)) is a little different from other search engines that you may have used, such as Google's. Please note the following:
 
(1) Punctuation marks like _?_ and _!_ have special purposes in our search engine (Manticore, previously Sphinx). If you don't want to use those special functions, you should leave them out.

(2) In Turkish and many European languages (Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian [Bokmål], Portuguese, Romanian, Russian, Spanish, and Swedish), a search for a word such as _live_ will also find similar words, such as _lived_ and _living_. If you want to indicate that a word should be matched exactly, you must put an equals sign before it: _=live_

(3) If you are searching for sentences in a language (such as Japanese or Chinese) that does not put spaces between words, be sure to see the section [Languages without word boundaries](https://en.wiki.tatoeba.org/articles/show/text-search#languages-without-word-boundaries) below.

(4) You can use quotation marks to group words into phrases. For instance, _met him_ will find matches where the words _met_ and _him_ will occur anywhere in the sentence, but _met him_ will only find matches where the words occur in that order.

(5) For more information, read the section [Examples in English](https://en.wiki.tatoeba.org/articles/show/text-search#Examples in English). 
#### To

The "To" field can be set to "Any language", in which case the search will find words in any language. Otherwise, the search will only find words in the language you choose.

#### From

The "From" field can be set to "Any language", in which case it will be ignored. Otherwise, the search will only find sentences that are linked to sentences in the language you choose. They can either be directly linked, in which case they will be shown in black, or indirectly linked, in which case they will be shown in gray. Two sentences are indirectly linked when there is a chain of translations between them but no one has put a link between those two sentences themselves. This means you cannot be sure that the sentences are translations of each other.

## Examples in English

* To find English sentences with "live", "lives", "living" or "lived", search for the word "live". (This will also find sentences with "Live", "Living", etc., since capitalization is ignored.)

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* To match a word exactly (ignoring capitalization), put an equals sign (=) before it. 

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* Leave punctuation out of your search string. Most punctuation will be ignored, but a final exclamation mark (!) or question mark (?) will actually interfere with the search. (See [Sentences with punctuation marks](https://en.wiki.tatoeba.org/articles/show/text-search#Sentences with punctuation marks) for an example.) These symbols have other purposes, as described later on this page.

* Put a $ after a word to find sentences ending with that word. The example finds English sentences ending with "Tom".

  * [Tom$](http://tatoeba.org/eng/sentences/search?query=Tom%24&from=eng&to=und)

* Most punctuation symbols cannot be found via a search. However, $ and _ are special. You can search for sentences containing either of these characters by putting a backslash before the symbol.

  * [\$](https://tatoeba.org/eng/sentences/search?query=%5C%24&from=und&to=und)
  * [\\_](https://tatoeba.org/eng/sentences/search?query=%5C_&from=und&to=und)
  * [\$(1|2|3|4|5|6|7|8|9)](https://tatoeba.org/eng/sentences/search?query=%5C%24%281%7C2%7C3%7C4%7C5%7C6%7C7%7C8%7C9%29&from=eng&to=und)
 finds sentences with a $ followed by a number.

* Put a ^ before a word to find sentences beginning with that word. The example finds English sentences beginning with "Tom".

  * [^Tom](http://tatoeba.org/eng/sentences/search?query=%5ETom&from=eng&to=und)

* This example finds English sentences beginning with "Tom" and ending with "Mary".

  * [^Tom Mary$](http://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und)

* This example finds English sentences beginning with either "Tom" or "He".

  * [(^Tom|^He)](http://tatoeba.org/eng/sentences/search?query=%28%5ETom%7C%5EHe%29&from=eng&to=und)

* This example finds English sentences including any of the following words: fasting, fasted, or fasts.  Using the equals sign means you'll get exact matches, thus you will avoid the adjective forms: fast, faster and fastest.

  * [(=fasting|=fasted|=fasts)](https://tatoeba.org/eng/sentences/search?query=%28%3Dfasting%7C%3Dfasted%7C%3Dfasts%29&from=eng&to=und)

* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly. Or put an equals sign directly before the quotes to match every word in the quotes.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+boston%22&from=eng&to=und)

  * The following searches will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3Dboston%22&from=eng&to=und)
      * [="live in boston"](http://tatoeba.org/eng/sentences/search?query=%3D%22live+in+boston%22&from=eng&to=und)

  * This search will only find sentences consisting of the exact words "I live in Boston", without any additional words.

      * ["^I =live =in =Boston$"](http://tatoeba.org/eng/sentences/search?query=%22%5EI+%3Dlive+%3Din+%3DBoston%24%22&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin with "Tom."

  * [-^Tom Tom](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin or end with "Tom."

  * [-^Tom Tom -Tom$](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom+-Tom%24&from=eng&to=und)

* The question mark (?) as part of a word is a one-letter wildcard.

    * The following will find sentences with either "whenever" and "wherever."

        * [whe?ever](https://tatoeba.org/eng/sentences/search?query=whe%3Fever&from=und&to=und)

    * The following will find sentences with with 6-letter words that have 2 letters, and then "eve" and then one more letter,  such as "clever" "eleven", "peeves", "uneven", ...

        * [??eve?](https://tatoeba.org/eng/sentences/search?query=%3F%3Feve%3F&from=eng&to=und)

* This example finds English sentences that have "Tom", then 2 words, then "Mary", then 1 word, and then "John."

  * ["Tom * * Mary * John"](https://tatoeba.org/eng/sentences/search?query=%22Tom+*+*+Mary+*+John%22&from=eng&to=und)

* This example finds English sentences that start with "Tom", then have 3 words, then end with "Mary".

  * ["^Tom * * * Mary$"](https://tatoeba.org/eng/sentences/search?query=%22%5ETom+*+*+*+Mary%24%22&from=und&to=und)

* This example finds English sentences that have words beginning with "red", including the word "red".  (3 letters or more are required.)

  * [red*](https://tatoeba.org/eng/sentences/search?query=red*&from=eng&to=und)

* This example finds English sentences that have words ending with "red", including the word "red".

  * [*red](https://tatoeba.org/eng/sentences/search?query=*red&from=eng&to=und)

* This example finds English sentences that have words containing the word "red", including the word "red".

  * [\*red\*](https://tatoeba.org/eng/sentences/search?query=*red*&from=eng&to=und)

* This example finds English sentences that have the word "French", but don't have the word "Tom".

  * [French -Tom](https://tatoeba.org/eng/sentences/search?query=French+-Tom&from=eng&to=und)

* This example will find sentences with "cheek" (in any form: cheeks, etc.) that don't include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)

* This example finds sentences in which the word "cat" comes before the word "dog."

  * [cat << dog](https://tatoeba.org/eng/sentences/search?query=cat+%3C%3C+dog&from=eng&to=und)

* This example finds sentences that contain at least two of the words "cat", "dog", and "fish" (a "quorum search").

  * ["cat dog fish"/2](https://tatoeba.org/eng/sentences/search?query=%22cat+dog+fish%22/2&from=eng&to=und)


### How to limit sentences to "I can" without getting "I can't".

* This shows just sentences beginning with "I can't."

  * ["^I =can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%27t%22)

* However, this search shows both the "I can" and "I can't" sentences.

  * ["^I =can"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22)

* To just get "I can" sentences, without the "I can't" sentences, use this search. (Note that the quotes are necessary.)

  * ["^I =can" -"can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22+-"can%27t")

### How to find English sentences without "the", "a" or "an"

* -the -a -an

  * This will get an error message, since you must have a "positive" search query, not only "negative" ones.

* One solution is to search for words with vowel sounds, after putting a minus before each word that you do not want in the results.

  * [-the -a -an a\*|\*a|\*a\*|\*e|e\*|\*e\*|i\*|\*i|\*i\*|o\*|\*o|\*o\*|u\*|\*u|\*u\*|y\*|\*y|\*y\*](https://tatoeba.org/eng/sentences/search?query=+-the+-a+-an+a*%7C*a%7C*a*%7C*e%7Ce*%7C*e*%7Ci*%7C*i%7C*i*%7Co*%7C*o%7C*o*%7Cu*%7C*u%7C*u*%7Cy*%7C*y%7C*y*&from=eng)


### How to find sentences with "of" followed by words ending in "ing" without any intervening words

 * [of NEAR/1 \*ing -"\*ing of"](https://tatoeba.org/eng/sentences/search?query=of+NEAR%2F1+*ing+-%22*ing+of%22&from=eng&to=none&user=&orphans=no&unapproved=no&has_audio=&tags=&list=&native=&trans_filter=limit&trans_to=und&trans_link=&trans_user=&trans_orphan=&trans_unapproved=&trans_has_audio=&sort=words&sort_reverse=)

 * Notes
   * The -"ing of" part is necessary to avoid getting results where the -ing word comes before "of."
   * The search results will favor sentences that contain multiple occurrences of *ing. If you don't want this, change the search order.

### Sentences with punctuation marks
  * The following yields no results:

      * [how strange!](http://tatoeba.org/eng/sentences/search?query=how+strange!&from=eng&to=und)

  * but this search will find *How strange!* among other results:

      * [how strange](http://tatoeba.org/eng/sentences/search?query=how+strange&from=eng&to=und)


## Languages without word boundaries

For languages that don't use space characters to separate words, like Japanese, Chinese etc. the search engine interprets each character as a single word. For instance, searching for 逆に will return the same results as 逆 に, which actually matches sentences that only *include* these characters, but not necessarily in that particular order, or not contiguously. So you should surround keywords with quotes, as in this example: 

["逆に"](http://tatoeba.org/jpn/sentences/search?query=%22%E9%80%86%E3%81%AB%22&from=jpn).


## More details

The search ignores capitalization and punctuation (unless the punctuation happens to match one of the special characters described elsewhere on the page). 

In some languages, including English, the search engine **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *live* will also find *lived* and *living*.

The languages in which the search engine stems words are: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian (Bokmål), Portuguese, Romanian, Russian, Spanish, Swedish, and Turkish.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=live*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Manticore, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain words in the specified continuous sequence. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, you will need to place an equals sign before each word in the phrase for which you want to suppress stemming, or directly before the first quotation mark to suppress stemming for each word. If you want to put both an equals sign and a caret before the same word, the equals sign should precede the caret. For instance, to find sentences that begin with the exact word *Noise*, search for *=^noise*, not *^=noise*.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.* 

Note that a star (*) can be placed at the beginning and/or end of a string representing a word, but if it is placed in the middle, the search will always fail. Also, a string beginning and/or ending with a star must be at least three characters long.


## Other search operators

* A vertical bar (representing "or") finds examples where either of the words appears:
  *    *hate | detest* will match sentences with either *hate* or *detest* (or both). 

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses: 
  *    *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both) 

* A dash (or exclamation point) before a word prevents matches with sentences where the word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.*

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *This is the best day of my life.* but not *Life means nothing without friends.*

* If you want to search for sentences that contain nothing other than the specified words, use double quotes, a caret, and a dollar sign in combination: *"^i love you$"* will find *I love you.* and *I love you!* but not *I love you more than you love me.* (However, it will find *I loved you.* To prevent this match, use *"^i =love you$"*.)

* The strict order operator (<<) between two words will find sentences where the first word occurs before the second but not where the second word comes before the first. Thus _dog << cat_ will find examples where _dog_ precedes _cat_, but not vice versa.

* The proximity operator (~_N_, where _N_ is a positive number) following a phrase will limit the number of words that can separate the specified words to fewer than _N_. Thus _"you are *ble"~1_ will find *You are irresistible.* but not *You are partially responsible.*
 
See the [Manticore documentation](https://docs.manticoresearch.com/latest/html/) for other functionality. Note that the documentation mentions keywords pertaining to specific fields in a document, but these are not relevant to Tatoeba.

version at: 21/12/2019, 18:24

#How to Search for Text

Return to [Advanced Search at tatoeba.org](https://tatoeba.org/eng/sentences/advanced_search).

## Introduction

Tatoeba provides two ways to search for sentences: 

* the regular search bar at the top of every page
* [advanced search](https://en.wiki.tatoeba.org/articles/show/advanced-search#), which you can reach from the **Advanced search** link above the regular search bar 

### Regular search

For regular search, there are three fields:

* the main field, which selects the word or words that you're looking for
* the **From** field, which selects the language you're looking for matches in
* the **To** field, which limits the search to sentences that have been directly or indirectly translated into the language you choose

#### Main search field

If you leave the main search field empty, it will find all sentences that match the **From** and **To** values that you've chosen. Otherwise, it will search for sentences containing the word or words that you type in. 

The search engine that Tatoeba uses ([Manticore](https://manticoresearch.com/)) is a little different from other search engines that you may have used, such as Google's. Please note the following:
 
(1) Punctuation marks like _?_ and _!_ have special purposes in our search engine (Manticore, previously Sphinx). If you don't want to use those special functions, you should leave them out.

(2) In Turkish and many European languages (Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian [Bokmål], Portuguese, Romanian, Russian, Spanish, and Swedish), a search for a word such as _live_ will also find similar words, such as _lived_ and _living_. If you want to indicate that a word should be matched exactly, you must put an equals sign before it: _=live_

(3) If you are searching for sentences in a language (such as Japanese or Chinese) that does not put spaces between words, be sure to see the section [Languages without word boundaries](https://en.wiki.tatoeba.org/articles/show/text-search#languages-without-word-boundaries) below.

(4) You can use quotation marks to group words into phrases. For instance, _met him_ will find matches where the words _met_ and _him_ occur anywhere in the sentence, but _"met him"_ will only find matches where the words occur next to each other in that order.

(5) For more information, read the section [Examples in English](https://en.wiki.tatoeba.org/articles/show/text-search#examples-in-english).

#### From

The "From" field can be set to "Any language", in which case the search will find words in any language. Otherwise, the search will only find words in the language you choose.

#### To

The "To" field can be set to "Any language", in which case it will be ignored. Otherwise, the search will only find sentences that are linked to sentences in the language you choose. They can either be directly linked, in which case they will be shown in black, or indirectly linked, in which case they will be shown in gray. Two sentences are indirectly linked when there is a chain of translations between them but no one has put a link between those two sentences themselves. This means you cannot be sure that the sentences are translations of each other.
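The three search fields also appear as parameters in the example links on this page. As a rough sketch, a link can be assembled like this. The helper name `build_search_url` is hypothetical, and the reading of `und` as "Any language" is an assumption inferred from those links rather than documented behaviour.

```python
from urllib.parse import urlencode

# Base path copied from the example links on this page.
SEARCH_URL = "https://tatoeba.org/eng/sentences/search"


def build_search_url(query, from_lang="und", to_lang="und"):
    """Return a search link for `query` (hypothetical helper).

    Assumption: "und" stands for "Any language", as the page's links suggest.
    """
    params = {"query": query, "from": from_lang, "to": to_lang}
    # safe="*" leaves the star wildcard unescaped, as in the page's own links.
    return SEARCH_URL + "?" + urlencode(params, safe="*")


print(build_search_url("Tom$", from_lang="eng"))
print(build_search_url('"=live =in =boston"', from_lang="eng"))
```

Special characters such as `$`, `=` and `"` are percent-encoded automatically, so the query can be written exactly as you would type it into the main search field.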

## Examples in English

* To find English sentences with "live", "lives", "living" or "lived", search for the word "live". (This will also find sentences with "Live", "Living", etc., since capitalization is ignored.)

  * [live](http://tatoeba.org/eng/sentences/search?query=live+&from=eng&to=und)

* To match a word exactly (ignoring capitalization), put an equals sign (=) before it. 

  * [=live](http://tatoeba.org/eng/sentences/search?query=%3Dlive+&from=eng&to=und)

* Leave punctuation out of your search string. Most punctuation will be ignored, but a final exclamation mark (!) or question mark (?) will actually interfere with the search. (See [Sentences with punctuation marks](https://en.wiki.tatoeba.org/articles/show/text-search#sentences-with-punctuation-marks) for an example.) These symbols have other purposes, as described later on this page.

* Put a $ after a word to find sentences ending with that word. The example finds English sentences ending with "Tom".

  * [Tom$](http://tatoeba.org/eng/sentences/search?query=Tom%24&from=eng&to=und)

* Most punctuation symbols cannot be found via a search. However, $ and _ are special. You can search for sentences containing either of these characters by putting a backslash before the symbol.

  * [\$](https://tatoeba.org/eng/sentences/search?query=%5C%24&from=und&to=und)
  * [\\_](https://tatoeba.org/eng/sentences/search?query=%5C_&from=und&to=und)
  * [\$(1|2|3|4|5|6|7|8|9)](https://tatoeba.org/eng/sentences/search?query=%5C%24%281%7C2%7C3%7C4%7C5%7C6%7C7%7C8%7C9%29&from=eng&to=und) finds sentences with a $ followed by a digit from 1 to 9.

* Put a ^ before a word to find sentences beginning with that word. The example finds English sentences beginning with "Tom".

  * [^Tom](http://tatoeba.org/eng/sentences/search?query=%5ETom&from=eng&to=und)

* This example finds English sentences beginning with "Tom" and ending with "Mary".

  * [^Tom Mary$](http://tatoeba.org/eng/sentences/search?query=%5ETom+Mary%24&from=eng&to=und)

* This example finds English sentences beginning with either "Tom" or "He".

  * [(^Tom|^He)](http://tatoeba.org/eng/sentences/search?query=%28%5ETom%7C%5EHe%29&from=eng&to=und)

* This example finds English sentences including any of the following words: fasting, fasted, or fasts. Using the equals sign means you'll get exact matches, so you will avoid the adjective forms: fast, faster and fastest.

  * [(=fasting|=fasted|=fasts)](https://tatoeba.org/eng/sentences/search?query=%28%3Dfasting%7C%3Dfasted%7C%3Dfasts%29&from=eng&to=und)

* To search for a phrase, put quotes (") around it. Put an equals sign in front of each word that you want to be matched exactly. Or put an equals sign directly before the quotes to match every word in the quotes.
  * If you want to see phrases like "live in Boston", "living in Boston", or "lives in Boston", use the following search:

      * ["live in boston"](http://tatoeba.org/eng/sentences/search?query=%22live+in+boston%22&from=eng&to=und)

  * The following searches will only find sentences with the exact phrase "live in Boston".

      * ["=live =in =boston"](http://tatoeba.org/eng/sentences/search?query=%22%3Dlive+%3Din+%3Dboston%22&from=eng&to=und)
      * [="live in boston"](http://tatoeba.org/eng/sentences/search?query=%3D%22live+in+boston%22&from=eng&to=und)

  * This search will only find sentences consisting of the exact words "I live in Boston", without any additional words.

      * ["^I =live =in =Boston$"](http://tatoeba.org/eng/sentences/search?query=%22%5EI+%3Dlive+%3Din+%3DBoston%24%22&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin with "Tom."

  * [-^Tom Tom](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom&from=eng&to=und)

* This example finds English sentences that have "Tom", but don't begin or end with "Tom."

  * [-^Tom Tom -Tom$](https://tatoeba.org/eng/sentences/search?query=-%5ETom+Tom+-Tom%24&from=eng&to=und)

* The question mark (?) as part of a word is a one-letter wildcard.

    * The following will find sentences with either "whenever" or "wherever."

        * [whe?ever](https://tatoeba.org/eng/sentences/search?query=whe%3Fever&from=und&to=und)

    * The following will find sentences with 6-letter words that have any 2 letters, then "eve", and then one more letter, such as "clever", "eleven", "peeves", "uneven", ...

        * [??eve?](https://tatoeba.org/eng/sentences/search?query=%3F%3Feve%3F&from=eng&to=und)

* This example finds English sentences that have "Tom", then 2 words, then "Mary", then 1 word, and then "John."

  * ["Tom * * Mary * John"](https://tatoeba.org/eng/sentences/search?query=%22Tom+*+*+Mary+*+John%22&from=eng&to=und)

* This example finds English sentences that start with "Tom", then have 3 words, then end with "Mary".

  * ["^Tom * * * Mary$"](https://tatoeba.org/eng/sentences/search?query=%22%5ETom+*+*+*+Mary%24%22&from=und&to=und)

* This example finds English sentences that have words beginning with "red", including the word "red".  (3 letters or more are required.)

  * [red*](https://tatoeba.org/eng/sentences/search?query=red*&from=eng&to=und)

* This example finds English sentences that have words ending with "red", including the word "red".

  * [*red](https://tatoeba.org/eng/sentences/search?query=*red&from=eng&to=und)

* This example finds English sentences that have words containing the string "red", including the word "red" itself.

  * [\*red\*](https://tatoeba.org/eng/sentences/search?query=*red*&from=eng&to=und)

* This example finds English sentences that have the word "French", but don't have the word "Tom".

  * [French -Tom](https://tatoeba.org/eng/sentences/search?query=French+-Tom&from=eng&to=und)

* This example will find sentences with "cheek" (in any form: cheeks, etc.) that don't include any of the words preceded by a minus sign (-).

  * [cheek -tear -slap -burn -red -hollow](http://tatoeba.org/eng/sentences/search?query=cheek+-tear+-slap+-burn+-red+-hollow&from=eng&to=und)

* This example finds sentences in which the word "cat" comes before the word "dog."

  * [cat << dog](https://tatoeba.org/eng/sentences/search?query=cat+%3C%3C+dog&from=eng&to=und)

* This example finds sentences that contain at least two of the words "cat", "dog", and "fish" (a "quorum search").

  * ["cat dog fish"/2](https://tatoeba.org/eng/sentences/search?query=%22cat+dog+fish%22/2&from=eng&to=und)
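The `?` and `*` wildcards described in the list above can be tried out locally before you run a search. The following is only an illustration of the matching rules as this page states them, using Python's `fnmatch` module. It is not how Manticore works internally, and `words_matching` is a hypothetical helper.

```python
import re
from fnmatch import fnmatchcase


def words_matching(pattern, sentence):
    """Return the words in `sentence` that fit a ?/* wildcard pattern.

    "?" stands for exactly one letter and "*" for any run of letters,
    mirroring the description on this page. Stemming is not modelled.
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    return [word for word in words if fnmatchcase(word, pattern.lower())]


print(words_matching("whe?ever", "Wherever you go, whenever you want, whatever you do."))
print(words_matching("??eve?", "A clever cat saw eleven uneven steps."))
print(words_matching("red*", "The red car was reduced."))
```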


### How to limit sentences to "I can" without getting "I can't".

* This shows just sentences beginning with "I can't."

  * ["^I =can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%27t%22)

* However, this search shows both the "I can" and "I can't" sentences.

  * ["^I =can"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22)

* To just get "I can" sentences, without the "I can't" sentences, use this search. (Note that the quotes are necessary.)

  * ["^I =can" -"can't"](https://tatoeba.org/eng/sentences/search?from=eng&to=und&has_audio=yes&sort=created&query=%22%5EI+%3Dcan%22+-"can%27t")

### How to find English sentences without "the", "a" or "an"

* -the -a -an

  * This will get an error message, since you must specify at least one word that you want to include, not only words that you want to exclude. If you are determined to get as many results as possible, you can search for words that start with any letter of the alphabet, after putting a minus before each word that you do not want. However, this query will take a long time.

  * [-the -a -an a\*|b\*|c\*|d\*|e\*|f\*|g\*|h\*|i\*|j\*|k\*|l\*|m\*|n\*|o\*|p\*|q\*|r\*|s\*|t\*|u\*|v\*|w\*|x\*|y\*|z\*](https://tatoeba.org/eng/sentences/search?query=-the+-a+-an+a*%7Cb*%7Cc*%7Cd*%7Ce*%7Cf*%7Cg*%7Ch*%7Ci*%7Cj*%7Ck*%7Cl*%7Cm*%7Cn*%7Co*%7Cp*%7Cq*%7Cr*%7Cs*%7Ct*%7Cu*%7Cv*%7Cw*%7Cx*%7Cy*%7Cz*&from=eng)
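Typing a query like this by hand is error-prone, so it may help to generate it. This is a minimal sketch, assuming the any-letter workaround described above. The function name `any_letter_query` is made up for this example.

```python
from string import ascii_lowercase


def any_letter_query(excluded):
    """Build a query that excludes `excluded` but still names words to include.

    The included part is "a*|b*|...|z*", that is, any word starting with
    any letter of the English alphabet (hypothetical helper).
    """
    exclusions = " ".join("-" + word for word in excluded)
    alternatives = "|".join(letter + "*" for letter in ascii_lowercase)
    return exclusions + " " + alternatives


print(any_letter_query(["the", "a", "an"]))
```

The output can be pasted straight into the main search field. Expect the same long running time as noted above.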


### How to find sentences with "of" followed by words ending in "ing" without any intervening words

 * [of NEAR/1 \*ing -"\*ing of"](https://tatoeba.org/eng/sentences/search?query=of+NEAR%2F1+*ing+-%22*ing+of%22&from=eng&to=none&user=&orphans=no&unapproved=no&has_audio=&tags=&list=&native=&trans_filter=limit&trans_to=und&trans_link=&trans_user=&trans_orphan=&trans_unapproved=&trans_has_audio=&sort=words&sort_reverse=)

 * Notes
   * The -"\*ing of" part is necessary to avoid getting results where the -ing word comes before "of."
   * The search results will favor sentences that contain multiple occurrences of *ing. If you don't want this, change the sort order.

### Sentences with punctuation marks
  * The following yields no results:

      * [how strange!](http://tatoeba.org/eng/sentences/search?query=how+strange!&from=eng&to=und)

  * but this search will find *How strange!* among other results:

      * [how strange](http://tatoeba.org/eng/sentences/search?query=how+strange&from=eng&to=und)


## Languages without word boundaries

For languages that don't use spaces to separate words, such as Japanese and Chinese, the search engine interprets each character as a single word. For instance, searching for 逆に will return the same results as 逆 に: it matches any sentence that *includes* both characters, not necessarily in that order or next to each other. So you should surround keywords with quotes, as in this example: 

["逆に"](http://tatoeba.org/jpn/sentences/search?query=%22%E9%80%86%E3%81%AB%22&from=jpn).
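If you build such searches in a script, the quoting step is easy to forget. The sketch below wraps a keyword in double quotes and shows the percent-encoded form that appears in the link above. `as_phrase` is a hypothetical helper, not part of any Tatoeba tool.

```python
from urllib.parse import quote_plus


def as_phrase(keyword):
    """Wrap `keyword` in double quotes so its characters must appear in order."""
    return '"' + keyword + '"'


keyword = "逆に"
# Without quotes, each character would be treated as a separate word.
print(list(keyword))
print(as_phrase(keyword))
print(quote_plus(as_phrase(keyword)))
```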


## More details

The search ignores capitalization and punctuation (unless the punctuation happens to match one of the special characters described elsewhere on the page). 

In some languages, including English, the search engine **stems** the search words by default. This means that it removes certain trailing sequences from both search words and indexed words. Thus a search for *live* will also find *lived* and *living*.

The languages in which the search engine stems words are: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian (Bokmål), Portuguese, Romanian, Russian, Spanish, Swedish, and Turkish.

If you want to find an exact match for a word, you must precede it with an equals sign, as in *=live*. This may come as a surprise to users who are accustomed to Google Search, where wrapping a word or phrase in double quotes forces an exact match. In Manticore, double quotes have a different function, which only affects multiword (phrase) searches: wrapping a phrase in double quotes requires matching sentences to contain words in the specified continuous sequence. Simply placing a phrase in quotes does not suppress stemming of its individual words. To do that, you will need to place an equals sign before each word in the phrase for which you want to suppress stemming, or directly before the first quotation mark to suppress stemming for each word. If you want to put both an equals sign and a caret before the same word, the equals sign should precede the caret. For instance, to find sentences that begin with the exact word *Noise*, search for *=^noise*, not *^=noise*.

As an example, take the search *like thing*. This will find *like things*, *likely things*, and even *things like*. Adding quotes, as in *"like thing"*, will prevent a match against *things like* (where the words appear in the wrong order), but it will continue to match *like things*, *likely things*, and so on. By contrast, *"=like =thing"* will only match *like thing* (which does not occur in the Tatoeba corpus). Removing the double quotes, *=like =thing*, will match *What made you do a silly thing like that?* Removing one of the equals signs, as in *like =thing*, will find *Such a strange thing is not likely to happen.* 
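The query variants in this example differ only in where the equals signs and quotes go, which a few small helpers can make explicit. This is a sketch for building query strings only. The function names are invented for illustration, and nothing here contacts the search engine.

```python
def exact_words(text):
    """Put an equals sign before every word: no stemming, any order."""
    return " ".join("=" + word for word in text.split())


def phrase(text):
    """Wrap text in double quotes: words must appear in this sequence."""
    return '"' + text + '"'


def exact_phrase(text):
    """Equals sign before the opening quote: exact words, in sequence."""
    return "=" + phrase(text)


print(exact_words("like thing"))          # exact words, any order
print(phrase("like thing"))               # stemmed words, fixed order
print(phrase(exact_words("like thing")))  # exact words, fixed order
print(exact_phrase("like thing"))         # shorthand for the previous line
```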

Note that a star (*) can be placed at the beginning and/or end of a string representing a word, but if it is placed in the middle, the search will always fail. Also, a string beginning and/or ending with a star must be at least three characters long.


## Other search operators

* A vertical bar (representing "or") finds examples where either of the words appears:
  *    *hate | detest* will match sentences with either *hate* or *detest* (or both). 

* If you want to combine an or-expression with other terms, you need to put the or-expression in parentheses: 
  *    *(red|blue) house* will match sentences in which the word "house" appears together with either "red" or "blue" (or both) 

* A dash (or exclamation point) before a word prevents matches with sentences where the word appears: *like -thing* (or *like !thing*) will match *I like ice cream* but not *I like that red thing*.

* Putting a caret (^) before a word will match only sentences that begin with that word: *^great* will match *Great people are not always wise.* but not *You are the great love of my life.*

* Putting a dollar sign ($) after a word will match only sentences that end with that word: *life$* will match *This is the best day of my life.* but not *Life means nothing without friends.*

* If you want to search for sentences that contain nothing other than the specified words, use double quotes, a caret, and a dollar sign in combination: *"^i love you$"* will find *I love you.* and *I love you!* but not *I love you more than you love me.* (However, it will find *I loved you.* To prevent this match, use *"^i =love you$"*.)

* The strict order operator (<<) between two words will find sentences where the first word occurs before the second but not where the second word comes before the first. Thus _dog << cat_ will find examples where _dog_ precedes _cat_, but not vice versa.

* The proximity operator (~_N_, where _N_ is a positive number) following a phrase will limit the number of words that can separate the specified words to fewer than _N_. Thus _"you are *ble"~1_ will find *You are irresistible.* but not *You are partially responsible.*
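To get a feel for the caret, dollar sign, and strict order operators in the list above, you can model them on a single sentence. This toy model is an assumption-laden sketch: it ignores stemming and every other operator, it is not Manticore's implementation, and all function names are hypothetical.

```python
import re


def tokens(sentence):
    """Lowercase the sentence and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())


def begins_with(sentence, word):
    """Model of ^word: the sentence starts with `word`."""
    toks = tokens(sentence)
    return bool(toks) and toks[0] == word.lower()


def ends_with(sentence, word):
    """Model of word$: the sentence ends with `word`."""
    toks = tokens(sentence)
    return bool(toks) and toks[-1] == word.lower()


def strictly_before(sentence, first, second):
    """Model of first << second: some `first` occurs before some `second`."""
    toks = tokens(sentence)
    first, second = first.lower(), second.lower()
    if first not in toks or second not in toks:
        return False
    last_second = len(toks) - 1 - toks[::-1].index(second)
    return toks.index(first) < last_second


print(begins_with("Great people are not always wise.", "great"))
print(ends_with("Life means nothing without friends.", "life"))
print(strictly_before("The dog chased the cat.", "dog", "cat"))
```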
 
See the [Manticore documentation](https://docs.manticoresearch.com/latest/html/) for other functionality. Note that the documentation mentions keywords pertaining to specific fields in a document, but these are not relevant to Tatoeba.
