You can do a general join by hand: run two searches, fetch all the results (instead of the top N), sort them on the join key and intersect the two ordered lists. But that is going to thrash your heap badly (if the lists even fit into it).
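To make the manual join concrete, here is a minimal plain-Java sketch of the sort-and-merge step. It does not use the Lucene API; the `Hit` class and the `join` method are hypothetical names, and the two input lists stand in for the complete result sets of the two searches. Note that both lists are copied and held in memory, which is exactly the heap cost described above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SortMergeJoin {
    /** One search hit: its join key value and its document id (hypothetical type). */
    static final class Hit {
        final String key;
        final int docId;
        Hit(String key, int docId) { this.key = key; this.docId = docId; }
    }

    /** Returns every {leftDocId, rightDocId} pair whose join keys are equal. */
    static List<int[]> join(List<Hit> left, List<Hit> right) {
        // Copy and sort both result sets on the join key.
        List<Hit> l = new ArrayList<>(left);
        List<Hit> r = new ArrayList<>(right);
        Comparator<Hit> byKey = (a, b) -> a.key.compareTo(b.key);
        l.sort(byKey);
        r.sort(byKey);

        List<int[]> pairs = new ArrayList<>();
        int i = 0, j = 0;
        while (i < l.size() && j < r.size()) {
            int cmp = l.get(i).key.compareTo(r.get(j).key);
            if (cmp < 0) {
                i++;                       // left key is smaller, advance left
            } else if (cmp > 0) {
                j++;                       // right key is smaller, advance right
            } else {
                // Equal keys: find the run of this key on both sides.
                String key = l.get(i).key;
                int iEnd = i, jEnd = j;
                while (iEnd < l.size() && l.get(iEnd).key.equals(key)) iEnd++;
                while (jEnd < r.size() && r.get(jEnd).key.equals(key)) jEnd++;
                // Emit the cross product of the two runs.
                for (int a = i; a < iEnd; a++) {
                    for (int b = j; b < jEnd; b++) {
                        pairs.add(new int[] {l.get(a).docId, r.get(b).docId});
                    }
                }
                i = iEnd;
                j = jEnd;
            }
        }
        return pairs;
    }
}
```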
Optimizations are possible, but only under very specific conditions.
For example, you are doing a self-join and use only (random-access) Filters for filtering, no Queries. Then you can manually iterate over the terms of your two join fields (in parallel), intersect the docId lists for each term, filter them - and there is your join.
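The parallel term walk can be sketched in plain Java as follows. This is only a model under stated assumptions, not Lucene code: each join field is represented as a `SortedMap` (natural String ordering) from term to docId list, standing in for the field's term dictionary and postings, and an `IntPredicate` stands in for a random-access Filter. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.function.IntPredicate;

final class TermParallelJoin {
    /**
     * Walks the terms of both fields in sorted order and, for each term present
     * in both, pairs up the docIds that pass the filter.
     * Assumes both maps use natural String ordering.
     */
    static List<int[]> join(SortedMap<String, int[]> leftField,
                            SortedMap<String, int[]> rightField,
                            IntPredicate filter) {
        List<int[]> pairs = new ArrayList<>();
        Iterator<Map.Entry<String, int[]>> li = leftField.entrySet().iterator();
        Iterator<Map.Entry<String, int[]>> ri = rightField.entrySet().iterator();
        if (!li.hasNext() || !ri.hasNext()) {
            return pairs;
        }
        Map.Entry<String, int[]> l = li.next();
        Map.Entry<String, int[]> r = ri.next();
        while (true) {
            int cmp = l.getKey().compareTo(r.getKey());
            if (cmp == 0) {
                // Same term in both fields: pair the filtered docIds.
                for (int leftDoc : l.getValue()) {
                    if (!filter.test(leftDoc)) continue;
                    for (int rightDoc : r.getValue()) {
                        if (filter.test(rightDoc)) {
                            pairs.add(new int[] {leftDoc, rightDoc});
                        }
                    }
                }
            }
            // Advance whichever side holds the smaller term (both on a match).
            if (cmp <= 0) {
                if (!li.hasNext()) break;
                l = li.next();
            }
            if (cmp >= 0) {
                if (!ri.hasNext()) break;
                r = ri.next();
            }
        }
        return pairs;
    }
}
```

Only the docId lists of one term per field are looked at at any time, which is why this avoids materializing and sorting the full result sets.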
There is an approach that handles the popular use case of simple parent-child relationships with a relatively small number of children per document - https://issues.apache.org/jira/browse/LUCENE-2454
Unlike the flattening method mentioned by @ntziolis, this approach correctly handles cases like this: you have a number of resumes, each with several work_experience children, and you try to find someone who worked for company NNN in year YYY. If you simply flatten, you will get back resumes of people who worked for NNN in any year and worked somewhere in year YYY.
An alternative for handling simple parent-child cases is indeed to flatten your document, but to make sure the values for different children are separated by a large posIncrement gap, and then use a SpanNear query to prevent your subqueries from matching across children. There was a LinkedIn presentation about this a few years ago, but I could not find it.
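The effect of the gap can be illustrated with a small plain-Java model. It is not Lucene code: `flatten` assigns token positions the way a position increment gap between children would, and `near` is a simplified stand-in for a SpanNear match (it only checks that two tokens lie within `slop` positions of each other, which is looser than Lucene's actual slop semantics). The `GAP` value, the token format and all names are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PositionGapDemo {
    /** Hypothetical gap inserted between children; must exceed any slop used. */
    static final int GAP = 1000;

    /** Flattened token stream with the position of each token. */
    static final class Flattened {
        final List<String> tokens = new ArrayList<>();
        final List<Integer> positions = new ArrayList<>();
    }

    /** Flattens all children into one stream, leaving GAP positions between children. */
    static Flattened flatten(List<List<String>> children) {
        Flattened f = new Flattened();
        int pos = 0;
        for (List<String> child : children) {
            for (String token : child) {
                f.tokens.add(token);
                f.positions.add(pos++);
            }
            pos += GAP; // tokens of the next child start far away
        }
        return f;
    }

    /** True if some occurrence of a and some occurrence of b are at most slop positions apart. */
    static boolean near(Flattened f, String a, String b, int slop) {
        for (int i = 0; i < f.tokens.size(); i++) {
            if (!f.tokens.get(i).equals(a)) continue;
            for (int j = 0; j < f.tokens.size(); j++) {
                if (f.tokens.get(j).equals(b)
                        && Math.abs(f.positions.get(i) - f.positions.get(j)) <= slop) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With one resume whose children are `[company:NNN, year:2001]` and `[company:MMM, year:2005]`, `near(f, "company:NNN", "year:2001", 10)` matches, while `near(f, "company:NNN", "year:2005", 10)` does not, because the two tokens sit on opposite sides of the gap. In Lucene itself, the gap between multiple values of a field is controlled by `Analyzer.getPositionIncrementGap`, and the proximity check is done by `SpanNearQuery`.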
Earwin