How to build a search engine in C#

I am building a web application in ASP.NET MVC and need to create a rather complicated search function. When a user enters a search query, I want to search across several data sources: documents, database tables, web page URLs, and some APIs such as Facebook. I would be very grateful for any advice, guidance and tips.

+7
c# search asp.net-mvc
3 answers

Your question suggests that you probably do not plan to implement the whole feature from scratch, so here are some pointers that may come in handy.

  • One option (the simplest) is to use a third-party search engine (for example, Google Custom Search; Bing probably has a similar API). This lets you search (only) your own site through Google and display the results in a custom way. The limitation is that it only searches data that is displayed on some (linked) page.

  • A more complex approach is to use a .NET library that implements indexing for you (based on the data you give it). A popular library is, for example, Lucene.Net. In this case you explicitly give it the data that you want to search (relevant content from web pages, database contents, etc.), so you have more control over what is indexed (but it is a bit more work).
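To make the second option more concrete, here is a minimal sketch of what "indexing the data you give it" means. It is a toy in-memory inverted index, not the Lucene.Net API; the class name `TinyIndex` and the id prefixes (`db:`, `page:`, `doc:`) are made up for illustration. A real library adds ranking, stemming, persistence and much more.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Toy inverted index: maps each lowercase word to the ids of the items containing it.
// Hypothetical illustration only; this is NOT how Lucene.Net is called.
public sealed class TinyIndex
{
    private readonly Dictionary<string, HashSet<string>> _postings =
        new Dictionary<string, HashSet<string>>();

    // Split text into lowercase word tokens.
    private static IEnumerable<string> Tokenize(string text) =>
        Regex.Matches(text.ToLowerInvariant(), @"\w+").Cast<Match>().Select(m => m.Value);

    // Add one item. The id encodes a (made-up) source, e.g. "db:42" or "page:/about".
    public void Add(string id, string text)
    {
        foreach (var token in Tokenize(text))
        {
            if (!_postings.TryGetValue(token, out var ids))
            {
                ids = new HashSet<string>();
                _postings[token] = ids;
            }
            ids.Add(id);
        }
    }

    // Return the ids of items containing every word of the query (AND semantics), sorted.
    public List<string> Search(string query)
    {
        HashSet<string> result = null;
        foreach (var token in Tokenize(query))
        {
            if (!_postings.TryGetValue(token, out var ids))
                return new List<string>(); // one word is unknown, so nothing matches
            if (result == null) result = new HashSet<string>(ids);
            else result.IntersectWith(ids);
        }
        return result == null
            ? new List<string>()
            : result.OrderBy(id => id, StringComparer.Ordinal).ToList();
    }
}
```

Content from each source (database rows, page text, document text) would be passed to `Add` under its own id, and a single `Search` call then covers all of them at once, which is the main attraction of the indexing approach for a multi-source search.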

+14

Building the actual index data structures and search algorithms is no trivial feat. That is why people use Lucene, Sphinx, Solr, etc. Using google.com, as recommended in the comments, gives you no control and poorer matching than you would get from one of these free search engines when it is properly configured.

I recommend taking a look at Solr. It gives you the power of Lucene but is much easier to use, and it adds several handy features such as caching, sharding, faceting, etc.
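Solr is queried over HTTP, and features such as faceting are switched on with request parameters. The sketch below only builds the URL for Solr's standard `/select` handler using the `q`, `rows`, `wt`, `facet` and `facet.field` parameters. The base URL, the core name `articles` and the field name `source` are assumptions for illustration, and the helper `SolrQueryUrl` is hypothetical (it is not part of SolrNet).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper: builds a Solr /select URL with one facet field.
public static class SolrQueryUrl
{
    // baseUrl and core are placeholders for your own Solr installation.
    public static string Build(string baseUrl, string core, string userQuery,
                               string facetField, int rows = 10)
    {
        var parameters = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("q", userQuery),          // the search query
            new KeyValuePair<string, string>("rows", rows.ToString(CultureInfo.InvariantCulture)),
            new KeyValuePair<string, string>("wt", "json"),            // response format
            new KeyValuePair<string, string>("facet", "true"),         // enable faceting
            new KeyValuePair<string, string>("facet.field", facetField),
        };

        // URL-encode every name and value before joining them.
        var queryString = string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));

        return baseUrl.TrimEnd('/') + "/" + Uri.EscapeDataString(core) + "/select?" + queryString;
    }
}
```

For example, `SolrQueryUrl.Build("http://localhost:8983/solr", "articles", "search engine", "source")` would ask for the first ten matches plus a count of matches per value of the assumed `source` field. Note that the value of `q` is interpreted by Solr's query parser, so characters such as `:` or `*` typed by a user carry query-syntax meaning unless you handle them.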

SolrNet is a Solr client for .NET. It comes with a sample ASP.NET MVC application that you can use to see how it works and as a foundation for your project.

Disclaimer: I am the author of SolrNet.

+4

I wrote a custom search engine for my MVC 4 site. It walks the view directories and reads every .cshtml file, matching the supplied search terms with a regular expression. Here is the basic code:

    // Requires: System.Collections.Generic, System.IO, System.Linq, System.Text.RegularExpressions
    // `terms` holds the user's search words and is supplied by the enclosing method.
    List<string> results = new List<string>();
    DirectoryInfo di = new DirectoryInfo(
        System.Configuration.ConfigurationManager.AppSettings["PathToSearchableViews"]);

    //get all view directories except the shared one
    foreach (DirectoryInfo d in di.GetDirectories().Where(dir => dir.Name != "Shared"))
    {
        //get all the .cshtml files, excluding partial pages (names starting with "_")
        foreach (FileInfo fi in d.GetFiles().Where(f => f.Extension == ".cshtml" && !f.Name.StartsWith("_")))
        {
            bool foundMatch = false;
            int matchCount = 0;
            string file;
            using (StreamReader sr = new StreamReader(fi.FullName))
            {
                file = sr.ReadToEnd();
            }
            foreach (string word in terms)
            {
                //escape the term so characters like "+" or "(" are matched literally
                Regex exp = new Regex(Regex.Escape(word.Trim()), RegexOptions.IgnoreCase);
                MatchCollection matches = exp.Matches(file);
                if (matches.Count > 0)
                {
                    foundMatch = true;
                    matchCount += matches.Count; //accumulate across all terms
                }
            }
            //check match count and create links
            //
            //
        }
    }
    return results;
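One detail worth isolating when matching user-supplied terms with a regular expression: a term such as `c++` or `a(b` contains regex metacharacters, so it should be escaped before it is used as a pattern. The following is a small sketch of a counting helper under that assumption. The name `TermMatcher.CountMatches` is hypothetical, not part of the answer's code, and it sums matches over all terms.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: counts case-insensitive literal occurrences of each term.
public static class TermMatcher
{
    public static int CountMatches(string text, IEnumerable<string> terms)
    {
        int total = 0;
        foreach (var raw in terms)
        {
            var term = raw.Trim();
            if (term.Length == 0) continue; // an empty pattern would match everywhere

            // Regex.Escape makes metacharacters such as "+" or "(" match literally.
            var pattern = Regex.Escape(term);
            total += Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
        }
        return total;
    }
}
```

A page could then be ranked by calling `CountMatches(fileContents, terms)` for each file and sorting by the result, though plain match counts are only a crude relevance measure.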
+2
