What are Elasticsearch's near real-time queries? How can you write them to make the most of this tool? These are the questions I’ll answer in this article, so we can explore Elasticsearch's possibilities together.
We already know that Elasticsearch is a powerful non-relational data store that executes its queries in near real time. If you don’t know anything about Elasticsearch yet, here’s an introduction to this great tool!
But what are these queries? And how can we write them for GET and POST operations so we can get the best out of ES?
To illustrate what I’ll be explaining, I’ll use Postman to query an existing, populated index and show the response, so you can understand what happened.
So, shall we explore Elasticsearch queries a bit further?
Query DSL - understanding what it is
When we talk about Query DSL (Domain Specific Language), it’s important to establish first that Elasticsearch exposes most of its functionality through APIs that accept JSON requests, used for operations such as searching and persisting data. Query DSL is the JSON-based language in which the queries themselves are written.
By default, search queries return a property called “_score” for each hit, which represents how well that record matches the query.
Within this context, there are two types of clauses:
- Leaf Query Clauses - These clauses search for specific values in specific fields of an index, using queries such as match, term, or range to retrieve data based on the specified value;
- Compound Query Clauses - These clauses wrap Leaf Query Clauses (or other compound clauses) and combine their results to express more complex conditions.
Types of query
Match All Query
This is the most basic query. Its purpose is to retrieve all the data from an index and, in this case, all the scores will be 1.0, since this query doesn’t depend on any condition.
It’s also worth noting that the Match All query can be simplified: calling the URL http://localhost:used_port/users/_search?pretty=true without a request body produces the same result.
This is the usage of the Match All Query:
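As a minimal sketch, the body sent from Postman for this query would look like the JSON printed below. It is built as a Python dictionary here only so it can be printed and checked; the “users” index name comes from the URL mentioned above, and the variable name is purely illustrative.

```python
import json

# Body for a search request such as GET /users/_search
# ("users" is the index name taken from the URL mentioned in the article).
match_all_query = {
    "query": {
        "match_all": {}  # no conditions: every document matches
    }
}

# Print the JSON exactly as it would be pasted into Postman.
print(json.dumps(match_all_query, indent=2))
```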
The result is returned in JSON format, as with every ES response:
As we can see in this response, the score of each record is 1.0, which means there is no condition against which the records could be ranked!
Match Query
Unlike the Match All Query, the Match Query applies a condition that the records must meet. It matches a text or phrase against the analyzed value of a field.
This is an example of the usage of a match query in the index that was queried above:
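A request body for this case might look like the sketch below. The field “category” and the value “Books” come from the example discussed in this section; the variable name is illustrative, and the body is built in Python only so it can be printed as JSON.

```python
import json

# Match Query: the search text is analyzed and compared with the
# analyzed content of the "category" field.
match_query = {
    "query": {
        "match": {
            "category": "Books"
        }
    }
}

print(json.dumps(match_query, indent=2))
```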
In the response, we can see that all the records whose “category” field has the value “Books” were returned, each with a score that represents how well the field value matches the searched term. Scores can differ from one record to another even when every record contains the term, because ES analyzes the text and computes relevance from factors such as how often the term appears and how long the field is.
Multi Match Query
This query is used to match a condition (a word or a phrase) against more than one field. In other words, the Multi Match Query looks for a term in as many fields as you want to specify.
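A Multi Match request could be sketched as follows. The term “soccer” and the fields “tags” and “name” are the ones used in this section's example; the variable name is illustrative.

```python
import json

# Multi Match Query: one search text, several target fields.
multi_match_query = {
    "query": {
        "multi_match": {
            "query": "soccer",           # the text to look for
            "fields": ["tags", "name"],  # every field to search in
        }
    }
}

print(json.dumps(multi_match_query, indent=2))
```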
Notice that this query searches for the term “soccer” in both “tags” and “name” fields. Now let’s look at the response:
It’s interesting to look at the score of each record and how it changes due to the similarity of the terms and the fields. In this case, the records that contain “soccer” in both queried fields have a higher score than the ones that match only one condition.
Term Level Queries
Term Level Queries look similar to the Match Query, but they do not analyze the search term: the value is compared exactly as provided. That makes them the recommended choice for structured data like numbers, dates, and enums.
In the example, we can see that the field in which we’re looking for a specific value is of a numerical type, and it needs to be exactly equal to the provided value.
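As an illustration, a term query on a numeric field might be written like the sketch below. The “price” field and the value 29.9 are the ones used in this section's example; the variable name is illustrative.

```python
import json

# Term query: the value is compared as-is, without text analysis,
# so the stored value has to be exactly 29.9 for a record to match.
term_query = {
    "query": {
        "term": {
            "price": {
                "value": 29.9
            }
        }
    }
}

print(json.dumps(term_query, indent=2))
```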
The response will return the single record matching the field “price” value equal to 29.9 with the score 1.0, because this is a numerical type with the value matching exactly what we are looking for. See it below:
Range Query
This query is used to filter data within a range of values, and for this we can use four operators:
- gte - greater than or equal to
This operator filters data whose values are greater than or equal to a specified value;
- gt - greater than
This operator filters data whose values are strictly greater than a specified value;
- lte - less than or equal to
This operator filters data whose values are less than or equal to a specified value;
- lt - less than
Unlike lte, the lt operator filters only by values that are strictly less than the value we’re searching for.
Now let’s do an example of each operator:
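The four requests differ only in the operator name, so they can be sketched with one small helper. The helper name is hypothetical, and using 29.9 as the threshold for every operator is an assumption made for illustration (the article only states that value for the lte case).

```python
import json

RANGE_OPERATORS = {"gte", "gt", "lte", "lt"}


def price_range_query(operator, value):
    """Build a range query body on the "price" field for a single operator."""
    if operator not in RANGE_OPERATORS:
        raise ValueError(f"unsupported range operator: {operator}")
    return {"query": {"range": {"price": {operator: value}}}}


# One request body per operator; 29.9 is an assumed threshold.
for op in ("gte", "gt", "lte", "lt"):
    print(op, json.dumps(price_range_query(op, 29.9)))
```

Several operators can also be placed in the same “price” object to describe a bounded interval.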
In this case, we’re searching for all the products whose price is greater than or equal to a specific value, and then we can notice that all the returned data respects this condition:
In this example, we can observe that, now that we’re searching only for the products whose price is strictly greater than a specific value, the query returns three records instead of four (one fewer than with gte):
This will return the only record in ES whose price is less than or equal to 29.9:
In this example, it’s really interesting to observe that when we restrict the value to be only less than the value we are filtering, no record is returned, meaning that the minimum price of all records in this index is 29.9:
Compound Queries
As said at the beginning of this article, a compound query combines Leaf Queries using boolean operators, so that the returned records meet the imposed conditions.
In the example below, we’re searching for records whose “category” field matches “Books” OR “Sports”. This is possible thanks to the “should” clause, which accepts as many conditions as we need. We’re also adding an AND condition through the “must” clause, which restricts the “price” to be greater than 50 and less than 100. Keep in mind that when a “must” clause is present, “should” conditions only raise the score by default; setting “minimum_should_match” to 1 makes at least one of them mandatory.
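One possible shape for this combined request is sketched below. The fields and values come from the description above; the “minimum_should_match” setting is an addition of this sketch (not necessarily part of the original request) so that at least one of the category conditions is actually required, and the variable name is illustrative.

```python
import json

bool_query = {
    "query": {
        "bool": {
            # OR part: the category may be "Books" or "Sports".
            "should": [
                {"match": {"category": "Books"}},
                {"match": {"category": "Sports"}},
            ],
            # Assumption of this sketch: require at least one "should" clause,
            # otherwise they are optional when "must" is present.
            "minimum_should_match": 1,
            # AND part: the price must be strictly between 50 and 100.
            "must": [
                {"range": {"price": {"gt": 50, "lt": 100}}},
            ],
        }
    }
}

print(json.dumps(bool_query, indent=2))
```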
It’s amazing to combine queries!
An introduction to Query DSL: creating queries in Elasticsearch - Final Thoughts
Besides being scalable, Elasticsearch is also flexible, because you can query the information inside an index in whatever way meets your needs: it can be a simple search for all records, or a complex query full of conditions.
I recommend that you create an index and practice beyond the examples shown here, so you can get the most out of Elasticsearch!