Forgiving / Fuzzy Search Using LINQ

I am trying to implement a search against a database that I inherited. The requirement states that the user must be able to search for an object by name. Unfortunately, an object may have several names associated with it. For example:

    ID  Name
    1   John and Jane Doe
    2   Foo McFoo
    3   Boo McBoo

It is easy enough to implement a search when there is one name in each record:

    var objects = from x in db.Foo
                  where x.Name.Contains("Foo McFoo")
                  select x;

However, when multiple names exist, this approach does not work.

Question: Can I write a search method that will return the record (John and Jane Doe) when someone searches for John Doe or Jane Doe?

+7
3 answers

This may hurt performance, but how about this quick approach:

    string[] filters = "John Doe".Split(new[] {' '});
    var objects = from x in db.Foo
                  where filters.All(f => x.Name.Contains(f))
                  select x;

It seems to return what you expect. You would still need to tune how it behaves when there is a record "John Doe" as well as "John and Jane Doe", since a search for John Doe matches both.

Does this work for you?
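Depending on the LINQ provider, `filters.All(...)` over a local array may not translate to SQL (LINQ to SQL, for instance, generally accepts only `Contains` on local sequences). A minimal sketch of an alternative is to chain one `Where` per term, which produces the same "all terms must appear" filter. The `Foo` class, the `NameSearch` helper, and the in-memory list are hypothetical stand-ins for the real `db.Foo`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity behind db.Foo.
public class Foo
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class NameSearch
{
    // Adds one Where clause per search term, so every term must appear in Name.
    public static IQueryable<Foo> WhereNameContainsAll(this IQueryable<Foo> source, string search)
    {
        string[] terms = search.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string term in terms)
        {
            string current = term; // local copy, so each lambda captures its own value
            source = source.Where(x => x.Name.Contains(current));
        }
        return source;
    }
}

public static class Demo
{
    public static void Main()
    {
        // In-memory sample data from the question; a real query would start from db.Foo.
        IQueryable<Foo> foos = new List<Foo>
        {
            new Foo { ID = 1, Name = "John and Jane Doe" },
            new Foo { ID = 2, Name = "Foo McFoo" },
            new Foo { ID = 3, Name = "Boo McBoo" }
        }.AsQueryable();

        foreach (Foo match in foos.WhereNameContainsAll("Jane Doe"))
            Console.WriteLine(match.ID + ": " + match.Name); // prints "1: John and Jane Doe"
    }
}
```

Against a database, each `Where` would typically become an additional `LIKE '%term%'` condition. Whether the comparison is case-sensitive then depends on the database collation, whereas the in-memory demo above is case-sensitive.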

+3

You can create your own extension method called "ContainsFuzzy":

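    using System.Linq;

    public static class StringExtensions
    {
        public static bool ContainsFuzzy(this string target, string text)
        {
            // do the cheap stuff first
            if (target == text) return true;
            if (target.Contains(text)) return true;

            // if the above don't return true, then do the more expensive stuff,
            // such as splitting up the string or using a regex.
            // One possible fallback: every space-separated term must appear in target.
            return text.Split(' ').All(term => target.Contains(term));
        }
    }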

Then your LINQ will at least be easier to read:

    var objects = from x in db.Foo
                  where x.Name.ContainsFuzzy("Foo McFoo")
                  select x;

The obvious downside is that every call to ContainsFuzzy re-creates your split list, etc., so there is some overhead. You can instead create a class called FuzzySearch, which does that work once per search and so is at least more efficient:

    using System;
    using System.Text.RegularExpressions;

    class FuzzySearch
    {
        private string _searchTerm;
        private string[] _searchTerms;
        private Regex _searchPattern;

        public FuzzySearch(string searchTerm)
        {
            _searchTerm = searchTerm;
            _searchTerms = searchTerm.Split(new Char[] { ' ' });
            // escape each term so characters such as '.' or '(' are matched literally
            string[] escaped = Array.ConvertAll(_searchTerms, t => Regex.Escape(t));
            // one case-insensitive lookahead per term: all terms must appear, in any order
            _searchPattern = new Regex("(?i)(?=.*" + String.Join(")(?=.*", escaped) + ")");
        }

        public bool IsMatch(string value)
        {
            // do the cheap stuff first
            if (_searchTerm == value) return true;
            if (value.Contains(_searchTerm)) return true;

            // if the above don't return true, then do the more expensive stuff
            if (_searchPattern.IsMatch(value)) return true;

            // etc.
            return false;
        }
    }

Your LINQ:

    FuzzySearch _fuzz = new FuzzySearch("Foo McFoo");
    var objects = from x in db.Foo
                  where _fuzz.IsMatch(x.Name)
                  select x;
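
One caveat worth hedging: providers such as LINQ to SQL or Entity Framework generally cannot translate a call to a custom method like IsMatch into SQL, so a query like the one above would probably have to run in memory (for example after `db.Foo.AsEnumerable()`). The sketch below isolates the lookahead pattern and runs it over an in-memory array; `LookaheadDemo` and `BuildAllTermsPattern` are hypothetical names, and the sample names come from the question:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Builds "(?i)(?=.*term1)(?=.*term2)...": one lookahead per term,
    // so all terms must appear, in any order, ignoring case.
    public static Regex BuildAllTermsPattern(string searchTerm)
    {
        var lookaheads = searchTerm
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(term => "(?=.*" + Regex.Escape(term) + ")");
        return new Regex("(?i)" + string.Concat(lookaheads));
    }

    public static void Main()
    {
        // Stands in for db.Foo.AsEnumerable().Select(x => x.Name).
        string[] names = { "John and Jane Doe", "Foo McFoo", "Boo McBoo" };

        // Term order and casing differ from the stored name on purpose.
        Regex pattern = BuildAllTermsPattern("doe jane");

        foreach (string name in names.Where(n => pattern.IsMatch(n)))
            Console.WriteLine(name); // prints "John and Jane Doe"
    }
}
```

Pulling every row into memory is only reasonable for small tables. For larger ones, a filter the provider can translate to SQL is likely the better starting point.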
+7

You need to either split the names out into First / LastName columns, or move them into a separate table if an object can have several aliases.

But I really think you should look at something like Lucene if you need a search that is "forgiving" or "fuzzy".

Question: Can I write a search method that will return the record (John and Jane Doe) when someone uses the search terms John Doe or Jane Doe?

To be very specific to your question, you can convert "John Doe" to LIKE '%John%Doe' or "Jane Doe" to LIKE '%Jane%Doe', and this will retrieve that record. However, I can see problems with names such as "Johnathan Poppadoe", which would also match.
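
The conversion described above can be sketched as follows. `ToLikePattern` and `IsLike` are hypothetical helpers: the first builds the LIKE pattern from the search terms, and the second is only a rough in-memory approximation of a case-insensitive LIKE (supporting the `%` wildcard alone) so that the false positive can be seen without a database:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LikePatternDemo
{
    // "John Doe" -> "%John%Doe", the pattern form suggested in the answer.
    public static string ToLikePattern(string search)
    {
        string[] terms = search.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return "%" + string.Join("%", terms);
    }

    // Approximates a case-insensitive LIKE: each % becomes ".*", the rest is literal.
    public static bool IsLike(string value, string likePattern)
    {
        string regex = "^" + string.Join(".*", likePattern.Split('%').Select(Regex.Escape)) + "$";
        return Regex.IsMatch(value, regex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    public static void Main()
    {
        string pattern = ToLikePattern("John Doe");
        Console.WriteLine(pattern);                               // %John%Doe
        Console.WriteLine(IsLike("John and Jane Doe", pattern));  // True
        Console.WriteLine(IsLike("Johnathan Poppadoe", pattern)); // True (the false positive)
        Console.WriteLine(IsLike("Foo McFoo", pattern));          // False
    }
}
```

Note that, unlike the all-terms filters in the other answers, this pattern requires the terms in the order typed and requires the name to end with the last term. Whether a real LIKE ignores case depends on the database collation. With LINQ to SQL, the generated pattern could presumably be passed to `SqlMethods.Like` inside the query.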

0