I can think of the following approaches.
Approach 1
Just as you mentioned: determine the part of speech of each term and append it as a tag to the actual term during indexing. Do the same at query time.
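A minimal sketch of the tagging step in plain Java. The class name `PosTermTagger`, the `_` separator, and the Penn Treebank-style tags such as `NN` are my assumptions for illustration, not anything Lucene prescribes; the part-of-speech tagger itself is assumed to run upstream.

```java
/** Hypothetical helper: appends a part-of-speech tag to a term. */
final class PosTermTagger {
    // Assumed separator; pick one that your analyzer will not split on or strip.
    static final String SEPARATOR = "_";

    private PosTermTagger() {}

    /** Builds the tagged form of a term, e.g. ("run", "NN") becomes "run_NN". */
    static String tag(String term, String posTag) {
        return term + SEPARATOR + posTag;
    }
}
```

The same method would have to be called on both the indexing side and the query side, so that the query term `run_NN` matches exactly what was written to the index.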
I would like to discuss the disadvantages of this approach.
Cons:
1) Future requirements may call for results regardless of the part of speech. An index that contains only the modified terms will not support that.
2) You might want to run a BooleanQuery such as "term: noun OR adjective". For that you must write your own query expander.
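Such a query expander could look roughly like the sketch below. It is plain Java and only produces the tagged term strings; `PosQueryExpander` and the `_` separator are hypothetical names and conventions of mine, and they assume the index was built with the same tagging scheme.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical expander: one tagged variant per requested part of speech. */
final class PosQueryExpander {
    private PosQueryExpander() {}

    /** Expands ("run", ["NN", "JJ"]) into ["run_NN", "run_JJ"], keeping the tag order. */
    static List<String> expand(String term, List<String> posTags) {
        List<String> variants = new ArrayList<>();
        for (String posTag : posTags) {
            // Assumes the index stores terms as term + "_" + tag.
            variants.add(term + "_" + posTag);
        }
        return variants;
    }
}
```

Each returned string would then become one optional (SHOULD) clause of the BooleanQuery. Searching regardless of the part of speech would mean expanding over every tag in your tag set, which is exactly why the first drawback hurts.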
Approach 2
Try using Lucene's payload feature.
Here is a quick guide to Lucene Payloads.
Steps to address your use case:
1) Store the part-of-speech tag as a payload.
2) Write a custom Similarity class for each part-of-speech tag.
3) Based on the query, assign the appropriate custom Similarity to the IndexSearcher. For example, assign NounBoostingSimilarity for a noun query.
4) Boost or lower the document score based on the payload. An example is given in the tutorial above.
5) Write a custom Collector to filter out documents whose scores do not pass the boosting logic above.
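To make the flow of these steps concrete, here is a simplified plain-Java model of it. This is not Lucene code: `PayloadBoostSketch`, `Hit`, the boost factor of 2.0 and the threshold are all assumptions of mine. In Lucene the payload would be attached during analysis, the factor would come from the custom Similarity, and the filtering would live in the Collector; the exact classes differ between Lucene versions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Simplified model of payload-based boosting and filtering (not the Lucene API). */
final class PayloadBoostSketch {
    private PayloadBoostSketch() {}

    /** One matching term occurrence in a document. */
    static final class Hit {
        final int docId;
        final float baseScore;
        final byte[] payload;

        Hit(int docId, float baseScore, String posTag) {
            this.docId = docId;
            this.baseScore = baseScore;
            // Step 1: the part-of-speech tag is stored as payload bytes.
            this.payload = posTag.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Steps 2-4: boost hits whose payload matches the wanted tag, zero out the rest. */
    static float payloadFactor(byte[] payload, String wantedTag) {
        String tag = new String(payload, StandardCharsets.UTF_8);
        return tag.equals(wantedTag) ? 2.0f : 0.0f; // assumed boost values
    }

    /** Step 5: keep only documents whose adjusted score is above the threshold. */
    static List<Integer> search(List<Hit> hits, String wantedTag, float minScore) {
        List<Integer> keptDocIds = new ArrayList<>();
        for (Hit hit : hits) {
            float score = hit.baseScore * payloadFactor(hit.payload, wantedTag);
            if (score > minScore) {
                keptDocIds.add(hit.docId);
            }
        }
        return keptDocIds;
    }
}
```

In this model the indexed term stays unmodified, so a search that ignores the payload still sees every hit, while a noun search drops the hits whose payload is not the noun tag.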
The advantage of this approach is that the index remains compatible with any other regular search.
Cons:
1) Maintenance overhead: you need to maintain a separate IndexSearcher for each Similarity.
2) The solution is fairly complex.
To be honest, I am not satisfied with my own solution, but I wanted to point out that there is another way. Which one fits depends on your scenario, for example whether this is a one-off academic project or a commercial product.