I am experimenting with an idea using SQL Server Full Text Indexing. It seems perfect for the task, but what my client wants is a very Google-like summary of results, in which each result displays an excerpt of the text around the search query.
If I search for "house" ...
My house is a very, very, very beautiful house
... thank you for visiting our house today ... you do not like this house ... hey, why are you setting fire to my house? ...
This is not too difficult if the search query matches the text exactly. You can simply do some tedious parsing of the text to produce the excerpt.
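To illustrate the exact-match case, here is a minimal Python sketch of the kind of parsing I mean. The function name `excerpts` and the `width` parameter are made up for this example, and it only handles literal whole-word matches; it does nothing with SQL Server itself.

```python
import re


def excerpts(text, term, width=30):
    """Return '... context ...' snippets around each exact whole-word match of term."""
    # \b keeps "house" from matching inside "houses"; matching ignores case.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the hit.
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append("... " + text[start:end].strip() + " ...")
    return snippets
```

Calling `excerpts(document_text, "house")` would give one snippet per occurrence, which could then be joined into a summary line.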
But what about inflection and stemming? If I search for "walk", the query may hit on "walking", "walked", etc. I would need to know exactly which word inside the result produced the hit, so I would know what to base the excerpt on.
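If the list of word forms that the query expands to were available somehow, the excerpt step itself would be easy. The Python sketch below assumes such a list is simply handed in as `forms`. That input is hypothetical, and getting it out of the full-text engine is exactly the part I do not know how to do. The name `excerpts_for_forms` is also made up for this example.

```python
import re


def excerpts_for_forms(text, forms, width=30):
    """Return (matched_word, snippet) pairs for any of the given word forms."""
    if not forms:
        return []
    # Build one alternation of all forms; \b on both sides forces whole-word hits.
    alternation = "|".join(re.escape(form) for form in forms)
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    results = []
    for match in pattern.finditer(text):
        # Same context window as for an exact match, but keyed on the actual hit.
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippet = "... " + text[start:end].strip() + " ..."
        results.append((match.group(0), snippet))
    return results
```

For a search on "walk", `forms` might be `["walk", "walking", "walked"]`, and each returned pair reports which form actually appeared in the text along with its surrounding excerpt.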
This area seems ripe for some kind of commercial product, or maybe there is an elegant way to do this that I am not considering?
(And yes, we know about GSA and the Google Mini. There are a few subtle reasons why they might not work in this case, so we are trying SQL FTI first.)