The match
query is the go-to query--the first query that you should
reach for whenever you need to query any field.((("match query")))((("full text search", "match query"))) It is a high-level full-text
query, meaning that it knows how to deal with both full-text fields and exact-value fields.
That said, the main use case for the match
query is for full-text search. So
let's take a look at how full-text search works with a simple example.
[[match-test-data]] ==== Index Some Data
First, we'll create a new index and index some((("full text search", "match query", "indexing data"))) documents using the
<
DELETE /my_index <1>
PUT /my_index { "settings": { "number_of_shards": 1 }} <2>
POST /my_index/my_type/_bulk { "index": { "_id": 1 }} { "title": "The quick brown fox" } { "index": { "_id": 2 }} { "title": "The quick brown fox jumps over the lazy dog" } { "index": { "_id": 3 }} { "title": "The quick brown fox jumps over the quick dog" } { "index": { "_id": 4 }}
// SENSE: 100_Full_Text_Search/05_Match_query.json
<1> Delete the index in case it already exists.
<2> Later, in <
==== A Single-Word Query
Our first example explains what((("full text search", "match query", "single word query")))((("match query", "single word query"))) happens when we use the match
query to
search within a full-text field for a single word:
GET /my_index/my_type/_search { "query": { "match": { "title": "QUICK!" } }
// SENSE: 100_Full_Text_Search/05_Match_query.json
Elasticsearch executes the preceding match
query((("analysis", "in single term match query"))) as follows:
Check the field type.
+
The title
field is a full-text (analyzed
) string
field, which means that
the query string should be analyzed too.
Analyze the query string.
+
The query string QUICK!
is passed through the standard analyzer, which
results in the single term quick
. Because we have a just a single term,
the match
query can be executed as a single low-level term
query.
Find matching docs.
+
The term
query looks up quick
in the inverted index and retrieves the
list of documents that contain that term--in this case, documents 1, 2, and
3.
Score each doc.
+
The term
query calculates the relevance _score
for each matching document,
by combining the((("relevance scores", "calculating for single term match query results"))) term frequency (how often quick
appears in the title
field of each document), with the inverse document frequency (how often
quick
appears in the title
field in all documents in the index), and the
length of each field (shorter fields are considered more relevant).
See <
This process gives us the following (abbreviated) results:
"hits": [ { "_id": "1", "_score": 0.5, <1> "_source": { "title": "The quick brown fox" } }, { "_id": "3", "_score": 0.44194174, <2> "_source": { "title": "The quick brown fox jumps over the quick dog" } }, { "_id": "2", "_score": 0.3125, <2> "_source": { "title": "The quick brown fox jumps over the lazy dog" } }
<1> Document 1 is most relevant because its title
field is short, which means
that quick
represents a large portion of its content.
<2> Document 3 is more relevant than document 2 because quick
appears twice.