Manipulating Relevance with Query Structure
The Elasticsearch query DSL is immensely flexible. You can move individual query clauses up and down the query hierarchy to make a clause more or less important. For instance, imagine the following query:
quick OR brown OR red OR fox
We could write this as a bool query with all terms at the same level:
GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "text": "quick" }},
        { "term": { "text": "brown" }},
        { "term": { "text": "red"   }},
        { "term": { "text": "fox"   }}
      ]
    }
  }
}
But this query might score a document that contains quick, red, and brown the same as another document that contains quick, red, and fox.
Red and brown are synonyms and we probably only need one of them to match.
Perhaps we really want to express the query as follows:
quick OR (brown OR red) OR fox
According to standard Boolean logic, this is exactly the same as the original query, but as we have already seen in Combining Queries, a bool query does not concern itself only with whether a document matches, but also with how well it matches.
A better way to write this query is as follows:
GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "text": "quick" }},
        { "term": { "text": "fox"   }},
        {
          "bool": {
            "should": [
              { "term": { "text": "brown" }},
              { "term": { "text": "red"   }}
            ]
          }
        }
      ]
    }
  }
}
Now, red and brown compete with each other at their own level, and quick, fox, and red OR brown are the top-level competitive terms.
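To make the effect of nesting concrete, here is a toy scorer in Python. It is only a sketch under loud assumptions: every matching term clause contributes a score of exactly 1 (real scores depend on TF/IDF and field norms), and each bool query multiplies the sum of its matching should clauses by the fraction of its should clauses that matched, in the spirit of the coordination factor in classic Lucene scoring. Newer Elasticsearch versions may score differently. The function toy_score and the two sample documents are hypothetical names used for illustration, not part of any Elasticsearch API.

```python
from fractions import Fraction

def toy_score(query, doc_terms):
    """Toy model only: a term clause scores 1 when the term is present.
    A bool query sums its matching should clauses, then multiplies by
    (matching clauses / total clauses), a simplified coordination factor."""
    if "term" in query:
        # A term clause looks like {"term": {"text": "quick"}}
        (_field, value), = query["term"].items()
        return Fraction(1) if value in doc_terms else Fraction(0)
    clauses = query["bool"]["should"]
    scores = [toy_score(clause, doc_terms) for clause in clauses]
    matched = [s for s in scores if s > 0]
    if not matched:
        return Fraction(0)
    return sum(matched) * Fraction(len(matched), len(clauses))

def term(value):
    return {"term": {"text": value}}

# All four terms at the same level.
flat_query = {"bool": {"should": [
    term("quick"), term("brown"), term("red"), term("fox"),
]}}

# brown and red grouped into their own nested bool clause.
nested_query = {"bool": {"should": [
    term("quick"),
    term("fox"),
    {"bool": {"should": [term("brown"), term("red")]}},
]}}

doc_a = {"quick", "red", "brown"}  # contains both synonyms
doc_b = {"quick", "red", "fox"}    # contains three distinct concepts

print(toy_score(flat_query, doc_a), toy_score(flat_query, doc_b))      # 9/4 9/4
print(toy_score(nested_query, doc_a), toy_score(nested_query, doc_b))  # 2 5/2
```

Under these assumptions, the flat query cannot tell the two documents apart, while the nested query ranks the document containing quick, red, and fox above the one that merely repeats a synonym.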
We have already discussed how the match, multi_match, term, bool, and dis_max queries can be used to manipulate scoring. In the rest of this chapter, we present three other scoring-related queries: the boosting query, the constant_score query, and the function_score query.