How do I implement a search feature on a website?

I want to implement a search feature for a website (assume it is similar to SO). I do not want to use Google Search for this.

My question is:

How do I implement this?

There are two methods that I know of:

  • Search all the databases in the application when the user submits a query.
  • Index all the data I have, store the index somewhere else, and query it from there (as Google does, for example).

Can someone tell me which way to go? What are the pros and cons?

Better still, are there better ways to do this?

+54
search
Aug 29 '08 at 10:08
7 answers

Use Lucene:
http://lucene.apache.org/java/docs/

Apache Lucene is a high-performance, full-featured text search library written entirely in Java. This technology is suitable for almost any application that requires full-text search, especially cross-platform.

It is available for Java and .NET. It is also available for PHP as a Zend Framework module.

Lucene does what you want (indexing the items to be searched). You have to maintain the Lucene index, but it performs much better than searching the database. BTW, SO's search is powered by Lucene. :D
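To illustrate why querying an index is cheaper than scanning the database on every request, here is a toy inverted index in Python. It is only a sketch of the general idea, not how Lucene is implemented; the `InvertedIndex` class, the `tokenize` helper and the sample documents are all made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy in-memory index: maps each term to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # The indexing cost is paid once, when a document is added.
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # Return the ids of documents that contain every query term.
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


# Hypothetical sample data.
index = InvertedIndex()
index.add(1, "How to implement search on a website")
index.add(2, "Full-text search with Lucene")
index.add(3, "Deploying a website on IIS")

print(index.search("search website"))
```

A lookup only touches the posting sets of the query terms, so the cost does not grow with the number of unrelated documents. The price is that the index has to be updated whenever the underlying data changes.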

+33
Aug 29 '08 at 10:09

It depends on how comprehensive your website is and how much you want to do yourself.

If you run a small website without the means to add a custom search, let Google do the work (maybe add a sitemap) and use Google Custom Search.

If you run a medium-sized site backed by an SQL engine, use the search functions of your SQL engine.

If you run a heavier software stack, such as J2EE or .NET, use Lucene, an excellent, powerful search engine, or its .NET port, Lucene.Net.

If you want to abstract your search from your application and be able to query it in a language-neutral way through XML/HTTP and JSON APIs, take a look at Solr. Solr runs Lucene in the background but adds a nice web interface to it.
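As one example of what "the search functions of your SQL engine" can look like, here is a minimal sketch using SQLite's FTS5 full-text extension from Python. It assumes your SQLite build was compiled with FTS5 (most are, but not all); the `posts` table, its columns and the sample rows are hypothetical. Other engines have their own full-text features with different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# FTS5 virtual table; this statement fails if the SQLite build lacks FTS5.
conn.execute("CREATE VIRTUAL TABLE posts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO posts (title, body) VALUES (?, ?)",
    [
        ("Search on a website", "How do I implement search without Google?"),
        ("Lucene basics", "Lucene is a full-text search library."),
        ("IIS setup", "Configuring a site on a Microsoft platform."),
    ],
)


def search_posts(query):
    """Return the titles of matching posts, best match first."""
    rows = conn.execute(
        "SELECT title FROM posts WHERE posts MATCH ? ORDER BY rank",
        (query,),
    )
    return [title for (title,) in rows]


print(search_posts("lucene"))
```

The engine maintains the full-text index for you, so there is no separate index to keep in sync with the tables.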
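Because Solr is queried over HTTP, any language can talk to it. The sketch below only builds the URL for Solr's `/select` handler and does not send it anywhere. The host, port and the core name `posts` are assumptions, and the `/solr/<core>/select` path reflects the layout of recent Solr versions.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, query, rows=10):
    """Build a URL for Solr's /select handler that asks for a JSON response."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical local Solr instance with a core named "posts".
url = solr_select_url("http://localhost:8983", "posts", "title:lucene")
print(url)
```

Fetching that URL with any HTTP client would return the matching documents as JSON, which is what keeps the search layer independent of the application's language.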

+30
Aug 29 '08 at 17:42

Perhaps you should take a look at Xapian and Omega. They are essentially a toolkit on which you can build search functionality.

+4
Aug 29 '08 at 10:11

The best way to approach this will depend on how you build your pages.

If they are often composed of many different records (as I believe SO pages are), the indexing approach is likely to give better results, unless you put work into effectively reconstructing the pages on the database side.

The disadvantage of the indexing approach is the turnaround time. There may be workarounds (e.g., Google Sitemap), but they are also hard to get right.

If you go the database route, keep in mind that modern search engines work much better if they have link data to process, so finding a system that can understand the links between the "pages" in the database will have a positive effect.

+1
Aug 29 '08 at 10:16

If you are on a Microsoft platform, you can use the Indexing Service. It integrates very easily with IIS websites.

It has all the basic features, such as full-text search, ranking, and exclusion and inclusion of certain file types, and you can add your own meta information using meta tags in the HTML pages.

Do a Google search and you will find tons!

+1
Aug 29 '08 at 17:30

This is somewhat orthogonal to your question, but I highly recommend the idea of a RESTful search. That is, to perform a search that has never been performed before, the website sends a POST request to /search/. To re-run a search, the website sends a GET request to /search/{some id}.

There are some good documents regarding this, for example here .
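A rough sketch of that idea, with no HTTP server involved: the `SearchResource` class below is a hypothetical in-memory stand-in for the /search/ collection, and it assumes the first request is a POST that creates the search. The substring match is only a placeholder for a real search backend.

```python
import itertools


class SearchResource:
    """In-memory stand-in for a /search/ collection (no HTTP involved)."""

    def __init__(self, documents):
        self.documents = documents      # {doc_id: text}
        self.saved = {}                 # {search_id: query}
        self._ids = itertools.count(1)

    def post(self, query):
        # POST /search/ : store the query and return the new resource's path.
        search_id = next(self._ids)
        self.saved[search_id] = query
        return f"/search/{search_id}"

    def get(self, path):
        # GET /search/{id} : re-run the stored query against the current data.
        search_id = int(path.rstrip("/").rsplit("/", 1)[-1])
        query = self.saved[search_id].lower()
        return sorted(
            doc_id
            for doc_id, text in self.documents.items()
            if query in text.lower()
        )


# Hypothetical sample data.
docs = {1: "Lucene index", 2: "Solr runs Lucene", 3: "Xapian"}
api = SearchResource(docs)
location = api.post("lucene")
first_run = api.get(location)

docs[4] = "Lucene.Net"
second_run = api.get(location)
print(location, first_run, second_run)
```

Each search gets its own URL, so it can be bookmarked, shared, or re-run later against newer data.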

(However, I like indexing where possible, although it is an optimization and therefore may be premature.)

0
Aug 29 '08 at 14:59

If the application uses the Java EE stack and you use Hibernate, you can use the Compass Framework to maintain a searchable index of your database. The Compass Framework uses Lucene under the hood.

The only catch is that you cannot replicate your search index. Therefore, you need to use a clustered database to store the index tables, or use the newer grid-based index storage engines that were added in Compass Framework 2.x.

-1
Aug 29 '08 at 17:23


