I'm wondering whether we can achieve some RDBMS-like capabilities in Lucene.
Example:
1) I have 10,000 project documents (PDF files) whose content needs to be indexed so that they are searchable.
2) Each document is associated with ONE PROJECT. A project has details such as project name, number, start date, end date, location, type, etc.
I need to search the contents of the PDF files for a keyword, but when displaying the results, I want to show the project metadata described in point (2).
My idea is to store a field called projectId with every PDF file when indexing. Once a search returns a matching document, we would take its projectId and run a second lookup to fetch the project metadata.
This way we could avoid data duplication. In addition, if we want to update the project metadata, we only have to update it in one place. Otherwise, if we store this metadata with every indexed PDF document, we would have to update all of those documents, which is what I want to avoid.
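To make the idea concrete, here is a rough sketch in plain Java. It is not Lucene code: a HashMap stands in for the content index, and the names ProjectLookupSketch, Project, putProject, indexPdf and search are all made up for illustration. With Lucene I assume projectId would simply be a stored field on each PDF document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Plain-Java model of the two-step lookup described above (hypothetical, not Lucene API).
final class ProjectLookupSketch {

    // Hypothetical project metadata, kept once per project.
    static final class Project {
        final String id;
        final String name;
        final String location;

        Project(String id, String name, String location) {
            this.id = id;
            this.name = name;
            this.location = location;
        }
    }

    // keyword -> projectIds of the PDFs containing it (stand-in for the content index)
    private final Map<String, Set<String>> contentIndex = new HashMap<>();

    // projectId -> metadata; the only place that changes when a project is updated
    private final Map<String, Project> projects = new HashMap<>();

    // Insert or overwrite the metadata of one project.
    void putProject(String id, String name, String location) {
        projects.put(id, new Project(id, name, location));
    }

    // "Index" one PDF: only its text and its projectId are recorded, no metadata.
    void indexPdf(String projectId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                contentIndex.computeIfAbsent(token, k -> new LinkedHashSet<>()).add(projectId);
            }
        }
    }

    // Step 1: find projectIds by keyword. Step 2: resolve each id to its metadata.
    List<Project> search(String keyword) {
        Set<String> ids = contentIndex.getOrDefault(
                keyword.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
        List<Project> result = new ArrayList<>();
        for (String id : ids) {
            Project p = projects.get(id);
            if (p != null) {
                result.add(p);
            }
        }
        return result;
    }
}
```

In this sketch, renaming a project means calling putProject once; none of the indexed PDFs are touched, and later searches return the new name.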
Please advise.