How to do a join in Elasticsearch, or at the Lucene level

What is the best way to do the equivalent of an SQL join in Elasticsearch?

I have an SQL setup with two large tables: Persons and Items. A Person can own many Items. Both Person and Item rows can change (i.e., be updated). I have to run searches that filter on aspects of both the person and the item.

In Elasticsearch, it looks like you could make Person a nested document of Item, then use has_child.

But: if you then update a Person, I think you would need to update every Item they own (which could be a lot).

Is that right? Is there a good way to handle this kind of query in Elasticsearch?

+7
join nosql elasticsearch lucene bigdata
2 answers

As already mentioned, the way to go is parent/child. Nested documents are extremely fast to query, but to update them you need to resubmit the whole structure (parent + nested documents). Although nested documents are implemented internally as separate Lucene documents, they are neither visible nor directly accessible. In fact, when using nested documents you need to use the dedicated queries to reach them (nested query, nested filter, nested facet, etc.).
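A minimal sketch of the nested approach, written as plain Python dictionaries holding the request bodies. The field names (`name`, `items`, `title`) are hypothetical, and the `text` field type assumes a reasonably recent Elasticsearch version; older releases used `string`. The point it illustrates is that the items live inside the person document, so they are only reachable through a `nested` query, and changing one item means reindexing the whole person document.

```python
import json

# Hypothetical mapping: each person document embeds its items as nested objects.
person_mapping = {
    "properties": {
        "name": {"type": "text"},
        "items": {
            "type": "nested",
            "properties": {"title": {"type": "text"}},
        },
    }
}


def nested_search(person_name, item_title):
    """Build a search body for persons matching a name who own a matching item."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"name": person_name}},
                    {
                        # Nested fields must be queried through a nested clause.
                        "nested": {
                            "path": "items",
                            "query": {"match": {"items.title": item_title}},
                        }
                    },
                ]
            }
        }
    }


print(json.dumps(nested_search("alice", "bicycle"), indent=2))
```

The hits of such a search are whole person documents, each carrying all of its items.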

Parent/child, on the other hand, lets you have separate documents that refer to each other and can be updated independently. It has a cost in terms of performance and memory usage, but it is far more flexible than nested documents.
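A sketch of what parent/child queries could look like for the question's data, again as plain Python dictionaries. The type names `person` and `item` and the fields `name` and `title` are assumptions. How the relation itself is declared depends on the Elasticsearch version (a `_parent` mapping on the child type in older releases, a `join` field in later ones), so only the query bodies are shown.

```python
import json


def persons_owning(item_title):
    """Search body returning persons that have at least one matching item."""
    return {
        "query": {
            "has_child": {
                "type": "item",
                "query": {"match": {"title": item_title}},
            }
        }
    }


def items_owned_by(person_name, item_title):
    """Search body returning matching items whose parent person also matches."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"title": item_title}},
                    {
                        "has_parent": {
                            "parent_type": "person",
                            "query": {"match": {"name": person_name}},
                        }
                    },
                ]
            }
        }
    }


print(json.dumps(items_owned_by("alice", "bicycle"), indent=2))
```

Because persons and items are separate documents here, updating a person's name touches only the person document; the items are left alone.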

As mentioned in this article, though, the fact that Elasticsearch helps you manage relations does not mean you should use those features. In many complex use cases it is simply better to have some custom logic in the application layer that handles the relations. In fact, parent/child has limitations too: for example, you can never get back both parent and children at the same time, whereas nested documents do not (for now) let you get back only the matching children.
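One way to picture "custom logic in the application layer" is a two-step lookup: find the matching persons first, then search items restricted to those owners. The sketch below uses in-memory lists as stand-ins for two separate indices; all data, field names, and function names are hypothetical, and a real implementation would replace the two `search_*` functions with actual search requests.

```python
# In-memory stand-ins for two separate indices (hypothetical data).
PERSONS = [
    {"id": 1, "name": "alice", "city": "Paris"},
    {"id": 2, "name": "bob", "city": "Berlin"},
]
ITEMS = [
    {"id": 10, "owner_id": 1, "title": "red bicycle"},
    {"id": 11, "owner_id": 2, "title": "red car"},
    {"id": 12, "owner_id": 1, "title": "blue car"},
]


def search_persons(city):
    """Stand-in for a search against the person index."""
    return [p for p in PERSONS if p["city"] == city]


def search_items(word, owner_ids):
    """Stand-in for a search against the item index, limited to given owners."""
    owners = set(owner_ids)
    return [
        i for i in ITEMS
        if word in i["title"].split() and i["owner_id"] in owners
    ]


def join_in_application(city, word):
    """Two-step join: resolve matching persons, then filter items by owner."""
    owner_ids = [p["id"] for p in search_persons(city)]
    if not owner_ids:
        return []  # no matching person, so no item can match
    return search_items(word, owner_ids)


print([i["id"] for i in join_in_application("Paris", "red")])
```

The trade-off is that the list of owner IDs passed to the second step can grow large when many persons match, but updating a person still touches a single document.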

+13

Take a look at my answer to: In Elasticsearch, can multiple top-level documents share a single subdocument?

It discusses using the _parent mapping as a way to avoid having to update every Item when you update a Person.

+2