What is the difference between source filtering and the fields parameter in the Elasticsearch GET API?

I got confused between source filtering (i.e. using the _source_include parameter) and the fields parameter of the GET API in Elasticsearch. How do they differ in terms of performance? When should each be used?

3 answers

Update (re: fields):

Please note that this is 1.x documentation if you have just arrived here from the future.

For backwards compatibility, if the fields parameter specifies fields that are not stored (store mapping set to false), it will load the _source and extract the values from it. This functionality has been replaced by the source filtering parameter.
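The fallback described in that quote can be pictured with a small toy model. This is only a sketch of the described behaviour, not Elasticsearch code: `resolve_fields`, `stored_fields` and `load_source` are hypothetical names made up for the illustration.

```python
def resolve_fields(requested, stored_fields, load_source):
    """Toy model of the 1.x fallback: prefer stored values, else read _source.

    Returns the resolved values and a flag saying whether _source was loaded.
    """
    result = {}
    source = None  # loaded lazily, only if a requested field is not stored
    for name in requested:
        if name in stored_fields:
            result[name] = stored_fields[name]  # cheap path: stored field
            continue
        if source is None:
            source = load_source()  # expensive path: load and parse _source
        if name in source:
            result[name] = source[name]
    return result, source is not None


# Hypothetical document: only "title" is marked as stored in the mapping.
stored = {"title": "Hello"}
full_source = {"title": "Hello", "author": "kim", "body": "long text"}

print(resolve_fields(["title"], stored, lambda: full_source))
print(resolve_fields(["title", "author"], stored, lambda: full_source))
```

In this model, asking only for stored fields never touches the source, while asking for a single non-stored field forces the whole source to be loaded.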

- https://www.elastic.co/guide/en/elasticsearch/reference/1.7/search-request-fields.html#search-request-fields


AFAICT:

_source tells Elasticsearch whether to include the source of matching documents in the response. The "source" is the document data as it was indexed.

fields tells Elasticsearch to include the source, but only certain fields of it.

Performance: unless you have low bandwidth to the Elasticsearch server, the difference is probably negligible.
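For reference, here is roughly what the two variants look like as 1.x-style GET API URLs. This is a minimal sketch that only builds the URLs and sends nothing; the helper `get_url`, the host, and the index, type and field names are all made up for the example.

```python
from urllib.parse import urlencode


def get_url(index, doc_type, doc_id, **params):
    """Build a 1.x-style GET API URL; list values become comma-separated."""
    query = {
        key: ",".join(value) if isinstance(value, (list, tuple)) else value
        for key, value in params.items()
    }
    base = f"http://localhost:9200/{index}/{doc_type}/{doc_id}"
    # Keep commas and wildcards readable instead of percent-encoding them.
    return base + ("?" + urlencode(query, safe=",*") if query else "")


# Source filtering: return _source, trimmed to the listed fields.
source_filtered = get_url("blog", "post", 1, _source_include=["title", "author"])
# fields parameter: ask for individual (ideally stored) fields.
stored_fields = get_url("blog", "post", 1, fields=["title", "author"])
# Switch the source off entirely.
no_source = get_url("blog", "post", 1, _source="false")

for url in (source_filtered, stored_fields, no_source):
    print(url)
```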


I had the same doubt; here is what I found that might be the answer.

fields restricts the fields whose contents are parsed and returned.

Source filtering only restricts the fields that are returned.

Another way to see it: fields is used to optimize data transfer and CPU usage, while source filtering only optimizes data transfer.

Source filtering allows us to control which parts of the original JSON document are returned for each hit [...] It is worth keeping in mind that this only saves us bandwidth costs between the nodes participating in the search as well as the client, not CPU or disk, as was the case when using fields.
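To make the "bandwidth only" point concrete, here is a toy model of source filtering. It is not how Elasticsearch implements it: `filter_source` is a hypothetical helper that handles top-level keys only, but it shows that the full document has to be loaded and parsed before anything is dropped.

```python
from fnmatch import fnmatchcase


def filter_source(source, include=("*",), exclude=()):
    """Toy source filter over top-level keys, with wildcard patterns.

    The whole `source` dict is already in memory when this runs, so the
    filter can only shrink the response, not the work done to load it.
    """
    return {
        key: value
        for key, value in source.items()
        if any(fnmatchcase(key, pattern) for pattern in include)
        and not any(fnmatchcase(key, pattern) for pattern in exclude)
    }


# Hypothetical document, fully loaded before filtering.
doc = {"title": "t", "author": "a", "author_id": 3, "body": "b"}
print(filter_source(doc, include=["title", "author*"], exclude=["author_id"]))
```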

Moreover:

One of the lesser-known features of fields is the ability to select metadata fields. Of particular note is its ability to select the _ttl field, which actually returns the number of milliseconds until the document expires, not the original lifetime of the document. A very handy feature.
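As a rough sketch, a search body selecting that metadata field might look like the following. It assumes an old (1.x-era) cluster with _ttl enabled in the mapping; the variable name and the match_all query are only for illustration.

```python
import json

# Hypothetical 1.x-style search body: ask only for the _ttl metadata field.
# Assumes _ttl is enabled in the mapping of the index being searched.
ttl_request = {"query": {"match_all": {}}, "fields": ["_ttl"]}

print(json.dumps(ttl_request, indent=2))
```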


The fields parameter applies only to stored fields. From the 2.3 documentation:

In addition to indexing the values of a field, you can also choose to store the original field value for later retrieval. Users with a Lucene background use stored fields to choose which fields they would like to be able to return in their search results. In fact, the _source field is a stored field. In Elasticsearch, setting individual document fields to be stored is usually a false optimization. The whole document is already stored as the _source field. It is almost always better to just extract the fields that you need by using the _source parameter.
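A sketch of what this looks like in practice, assuming 2.x-style syntax. The type name "post" and the field names are invented for the example; the snippet only builds the JSON bodies and sends nothing.

```python
import json

# Hypothetical 2.x-style mapping: "title" is explicitly stored, "body" is not.
mapping = {
    "mappings": {
        "post": {
            "properties": {
                "title": {"type": "string", "store": True},
                "body": {"type": "string"},
            }
        }
    }
}

# Ask for the stored field via the fields parameter.
stored_query = {"query": {"match_all": {}}, "fields": ["title"]}

# What the quote recommends instead: filter the already-stored _source.
source_query = {"query": {"match_all": {}}, "_source": ["title"]}

for body in (mapping, stored_query, source_query):
    print(json.dumps(body))
```

With the second query no field needs store set to true at all, which is why storing individual fields is described as a false optimization.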

See source filtering to restrict the fields returned from _source.


Source: https://habr.com/ru/post/1214455/

