Elasticsearch: post_filter or filter?

Let's say I have the same situation as the one explained here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-post-filter.html

Before I came across this article, I used a filter instead of post_filter for such a scenario, and it produced the same output as post_filter.

My question is: are they the same thing? If not, which one is recommended or more efficient, and why?

3 answers

As far as the search hits are concerned, they are the same: the hits you get will be correctly filtered according to either the filter in your filtered query or the filter in your post_filter.

However, as far as aggregations are concerned, the end result will be different. The difference between the two comes down to which document set the aggregations are computed on.

If your filter is in a filtered query, then your aggregations will be computed on the set of documents selected by the query(ies) and the filter(s) in your filtered query, that is, on the same set of documents that you will get in the response.

If your filter is in a post_filter, your aggregations will be computed on the set of documents selected by your query(ies) alone. Once the aggregations have been computed on that set, it is further narrowed by the filter(s) in your post_filter before the matching documents are returned.
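The order of operations described above can be modeled with a small in-memory sketch. This is not Elasticsearch code: the `search` function, the `DOCS` list, and the `brand`/`color` fields are all hypothetical and only mimic the sequence "query, then aggregations, then post_filter".

```python
from collections import Counter

# Hypothetical in-memory "index"; field names are illustrative only.
DOCS = [
    {"id": 1, "brand": "gucci", "color": "red"},
    {"id": 2, "brand": "gucci", "color": "blue"},
    {"id": 3, "brand": "gucci", "color": "red"},
    {"id": 4, "brand": "other", "color": "red"},
]


def search(docs, query, query_filter=None, post_filter=None):
    """Toy model of the order of operations, not real Elasticsearch code."""
    # 1. The query, plus any filter placed inside it, selects the working set.
    matched = [
        d for d in docs
        if query(d) and (query_filter is None or query_filter(d))
    ]
    # 2. Aggregations are computed on that working set.
    colors = Counter(d["color"] for d in matched)
    # 3. post_filter narrows only the hits, after aggregations are done.
    hits = [d for d in matched if post_filter is None or post_filter(d)]
    return [d["id"] for d in hits], dict(colors)


def is_gucci(doc):
    return doc["brand"] == "gucci"


def is_red(doc):
    return doc["color"] == "red"


# Same filter, two placements.
hits_a, aggs_a = search(DOCS, is_gucci, query_filter=is_red)
hits_b, aggs_b = search(DOCS, is_gucci, post_filter=is_red)

print(hits_a, aggs_a)
print(hits_b, aggs_b)
```

Both calls return the same hits, but only the second one still counts the blue document in the color aggregation, because the filter was applied after the aggregation step.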

To sum up:

  • a filtered query affects both the search results and the aggregations
  • whereas a post_filter affects only the search results, not the aggregations
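As a rough illustration, the two placements could look like this when the request bodies are built as Python dictionaries. The field names and values (`brand`, `color`, `gucci`, `red`) are assumptions made up for the example. It also assumes a recent Elasticsearch version, where the `filtered` query no longer exists and a `bool` query with a `filter` clause plays the same role.

```python
import json

# Illustrative field names and values; adapt them to your own mapping.
brand_query = {"term": {"brand": "gucci"}}
color_filter = {"term": {"color": "red"}}
colors_agg = {"colors": {"terms": {"field": "color"}}}

# Filter inside the query: narrows both the hits and the aggregations.
filter_in_query = {
    "query": {"bool": {"must": [brand_query], "filter": [color_filter]}},
    "aggs": colors_agg,
}

# Filter in post_filter: narrows the hits only; the "colors" aggregation
# is still computed over every document matched by the query.
filter_after_aggs = {
    "query": {"bool": {"must": [brand_query]}},
    "aggs": colors_agg,
    "post_filter": color_filter,
}

print(json.dumps(filter_in_query, indent=2))
print(json.dumps(filter_after_aggs, indent=2))
```

Either dictionary can be serialized and sent as the body of a `_search` request. Only the second keeps the color counts for all colors while returning just the red hits.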

In my tests, I found that filter behaves exactly like post_filter: both of them affect only the hits section.


Another important difference between filter and post_filter that was not mentioned in any of the answers: performance.

TL;DR

Do not use post_filter unless you really need it for aggregations.

From the Definitive Guide:

ATTENTION: performance considerations

Use a post_filter only if you need to filter search results and aggregations differently. Sometimes people will use post_filter for regular searches.

Do not do this! The nature of post_filter means that it runs after the query, so any performance benefit of filtering (such as caching) is completely lost.

post_filter should be used only in combination with aggregations, and only when you need differential filtering.
