I am working on a project that will hold a lot of data. The data needs to be searchable in several ways that are expressed very effectively as SQL queries, but it also needs to be searchable using natural language processing.
My plan is to build a Lucene index for this form of search.
My question is: if I do this and run a search, Lucene will return the IDs of the matching documents in the index, and I then have to look up those entities in the relational database.
There are two ways to do this (that I can think of so far):
- Issue N separate queries, one per ID (awful)
- Pass all of the IDs to a stored procedure at once (possibly as a comma-delimited parameter). This has the disadvantages of a limit on the maximum parameter size and the slow performance of the UDF needed to split the string into a temporary table.
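To make the trade-off concrete, here is a minimal sketch of a possible middle ground between the two options above: sending the IDs in a few batches, each as a parameterized IN list, and then restoring the search engine's ranking order. It uses Python's standard-library sqlite3 module as a stand-in for the real database. The table `entities`, its columns, the helper `fetch_by_ids`, and the batch size are all hypothetical and chosen only for illustration.

```python
import sqlite3
from typing import List, Sequence, Tuple


def fetch_by_ids(
    conn: sqlite3.Connection, ids: Sequence[int], chunk_size: int = 500
) -> List[Tuple[int, str]]:
    """Fetch rows for `ids` in batches and return them in the order of `ids`.

    Assumption: a hypothetical table entities(id, title). IDs that are not
    present in the table are silently skipped.
    """
    rows_by_id = {}
    for start in range(0, len(ids), chunk_size):
        chunk = list(ids[start:start + chunk_size])
        # One "?" placeholder per ID keeps the query parameterized.
        placeholders = ",".join("?" * len(chunk))
        query = f"SELECT id, title FROM entities WHERE id IN ({placeholders})"
        for row in conn.execute(query, chunk):
            rows_by_id[row[0]] = row
    # The database returns rows in arbitrary order, so re-sort by hit order.
    return [rows_by_id[i] for i in ids if i in rows_by_id]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO entities VALUES (?, ?)",
        [(i, f"doc {i}") for i in range(1, 11)],
    )
    # Pretend these IDs came back from the full-text search, best hit first.
    hits = [7, 2, 9, 42]
    print(fetch_by_ids(conn, hits, chunk_size=2))
```

This keeps the number of round trips at roughly N / chunk_size instead of N, and avoids both the single oversized parameter and the string-splitting UDF. Whether it is fast enough would depend on the database and on how many hits a search typically returns.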
I am almost tempted to mirror everything into the Lucene index, so that I can periodically regenerate the index from the backing store but only need to access the index for the front end.
Tips?