How to create a small search engine

I am going to create a small search engine in my application (something like the address search bar in Google Maps). The requirement is quite simple: each element has many keywords. The user types a keyword and gets the matching results; when the user enters another keyword, the results are filtered further.

The first thing that comes to mind is to use MySQL: create a keyword table that stores each keyword together with the element it belongs to, and when the user enters a keyword, look it up in the keyword table to get the results. Is that the right approach? Can you give me some pointers? I am completely new to MySQL (I only learned it in a high school class). Is there an open source platform for this?

3 answers

Note. If you don’t need to keep keyword frequencies, go with Marmik Bhatt's LIKE suggestion.

If you have a large amount of data and you want to search by keywords (that is, you are not going to search for phrases or use concepts such as “nearby”), you can simply create a table of keywords:

 CREATE TABLE address (
     id INT(10) PRIMARY KEY
     /* ... */
 );

 CREATE TABLE keyword (
     word VARCHAR(255),
     address_id INT(10),
     frequency INT(10),
     PRIMARY KEY (word, address_id)
 );

Then you go through the text that you are “indexing” and count every word found in it.

To search for several keywords:
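The indexing step can be sketched as follows. This is only a minimal illustration in Python, with the standard-library SQLite module standing in for MySQL; the `label` column, the sample address and the helper names `count_words` and `index_address` are assumptions, not part of the schema above.

```python
import re
import sqlite3
from collections import Counter

def count_words(text):
    """Lowercase the text, split it into alphanumeric words, count each one."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def index_address(conn, address_id, text):
    """Store one (word, address_id, frequency) row per distinct word."""
    rows = [(word, address_id, freq) for word, freq in count_words(text).items()]
    # INSERT OR REPLACE is SQLite syntax; in MySQL the closest equivalents
    # are REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.
    conn.executemany(
        "INSERT OR REPLACE INTO keyword (word, address_id, frequency) "
        "VALUES (?, ?, ?)",
        rows,
    )

conn = sqlite3.connect(":memory:")
# 'label' is a hypothetical column holding the text to index.
conn.execute("CREATE TABLE address (id INTEGER PRIMARY KEY, label TEXT)")
conn.execute(
    "CREATE TABLE keyword (word TEXT, address_id INTEGER, frequency INTEGER, "
    "PRIMARY KEY (word, address_id))"
)
conn.execute("INSERT INTO address VALUES (1, 'Main Street Cafe, 12 Main Street')")
index_address(conn, 1, "Main Street Cafe, 12 Main Street")
```

Re-indexing the same address simply overwrites its rows, because (word, address_id) is the primary key.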

 SELECT address.*, SUM(frequency) AS frequency_sum
 FROM address
 INNER JOIN keyword ON keyword.address_id = address.id
 WHERE keyword.word IN ('keyword1', 'keyword2', /*...*/)
 GROUP BY address.id;

Here I compute a frequency sum, which is a quick-and-dirty way to rank results by usefulness when many are returned.
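Note that IN matches an address that has any of the listed keywords. If each additional keyword should narrow the results, as the question describes, one option is to require that every keyword matches by adding a HAVING clause. A rough sketch in Python with SQLite standing in for MySQL; the `search` helper and the sample rows are made up for illustration.

```python
import sqlite3

def search(conn, keywords):
    """Return (address_id, frequency_sum) for addresses matching every keyword."""
    words = sorted({w.lower() for w in keywords})
    if not words:
        return []
    placeholders = ", ".join("?" for _ in words)
    sql = (
        "SELECT address_id, SUM(frequency) AS frequency_sum "
        "FROM keyword "
        f"WHERE word IN ({placeholders}) "
        "GROUP BY address_id "
        # Keep only addresses that matched all of the requested words.
        "HAVING COUNT(DISTINCT word) = ? "
        "ORDER BY frequency_sum DESC, address_id"
    )
    return conn.execute(sql, (*words, len(words))).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE keyword (word TEXT, address_id INTEGER, frequency INTEGER, "
    "PRIMARY KEY (word, address_id))"
)
# Hypothetical sample data: two addresses sharing some keywords.
conn.executemany(
    "INSERT INTO keyword VALUES (?, ?, ?)",
    [("main", 1, 2), ("street", 1, 2), ("cafe", 1, 1),
     ("main", 2, 1), ("street", 2, 1), ("bank", 2, 3)],
)
```

With this data, searching for "main" returns both addresses, and adding "cafe" as a second keyword leaves only address 1.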

What to think about:

  • Do you want to insert all keywords into the database, or only those with a frequency above a certain value? If you insert everything, the table can become huge; if you insert only the higher frequencies, you will not find the one article that mentions a specific word but does so only once.
  • Do you want to insert all available keywords for a specific article, or just the "top" ones? With the latter, the danger is that frequent words that add nothing to the meaning will crowd out the others. Consider the word "However": it may appear far more often in your article than "mysql", but it is the latter that characterizes the article, not the former.
  • Do you want to exclude words shorter than a certain length?
  • Do you want to exclude well-known "meaningless" words?
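
These choices can be combined into one filtering step that runs before the keywords are inserted. The sketch below is only an illustration in Python; the stop-word list, the default thresholds and the name `select_keywords` are assumptions to be tuned for your data.

```python
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"however", "the", "and", "a", "of", "in"}

def select_keywords(counts, min_length=3, min_frequency=1, top_n=None):
    """Drop short, rare and stop words; return (word, frequency) pairs,
    most frequent first, optionally cut to the top_n entries."""
    kept = {
        word: freq
        for word, freq in counts.items()
        if len(word) >= min_length
        and freq >= min_frequency
        and word not in STOP_WORDS
    }
    ranked = sorted(kept.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n] if top_n is not None else ranked

# Made-up word counts for one article.
counts = Counter({"however": 9, "mysql": 2, "the": 14, "db": 3, "index": 1})
```

Here "however" and "the" are dropped as stop words and "db" as too short, so "mysql" ends up ranked first even though it is far from the most frequent word.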

For a search engine, I use LIKE to match the search terms. The query looks like this:

 SELECT *
 FROM tbl_keywords
 INNER JOIN tbl_addresses ON tbl_addresses.id = tbl_keywords.address_id
 WHERE tbl_keywords.keywords LIKE "%$keyword%";

$keyword is a variable taken from the GET or POST request sent by the search form. Escape it, or bind it as a query parameter, before it reaches the database.

You can also return the search results as JSON, so that jQuery can display them quickly on the client side.
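
Since the keyword comes straight from the request, it is safer to bind it as a parameter than to paste it into the SQL string. A minimal sketch in Python, with SQLite standing in for MySQL; the `label` column, the sample rows, the helper names and the choice of `!` as the LIKE escape character are all assumptions.

```python
import sqlite3

def escape_like(term, escape_char="!"):
    """Escape %, _ and the escape character so they match literally in LIKE."""
    for ch in (escape_char, "%", "_"):
        term = term.replace(ch, escape_char + ch)
    return term

def search_addresses(conn, keyword):
    """Return (id, label) of addresses whose keywords contain the given text."""
    pattern = "%" + escape_like(keyword) + "%"
    sql = (
        "SELECT tbl_addresses.id, tbl_addresses.label "
        "FROM tbl_keywords "
        "INNER JOIN tbl_addresses ON tbl_addresses.id = tbl_keywords.address_id "
        # The user input is passed as a bound parameter, never concatenated.
        "WHERE tbl_keywords.keywords LIKE ? ESCAPE '!' "
        "ORDER BY tbl_addresses.id"
    )
    return conn.execute(sql, (pattern,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_addresses (id INTEGER PRIMARY KEY, label TEXT)")
conn.execute("CREATE TABLE tbl_keywords (address_id INTEGER, keywords TEXT)")
# Hypothetical sample rows.
conn.executemany("INSERT INTO tbl_addresses VALUES (?, ?)",
                 [(1, "Main Street"), (2, "100% Organic Market")])
conn.executemany("INSERT INTO tbl_keywords VALUES (?, ?)",
                 [(1, "main street cafe"), (2, "100% organic market")])
```

Escaping the wildcards means a search for "%" only finds rows that really contain a percent sign, instead of matching everything.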

Full Text Search

You can also use full-text search to find places and related keywords; see the link: SQL Full Search Tutorial


One thing you can do is split the user's input on spaces and search for each word separately, which should give you more relevant results.

For example, if the user types "Create a search engine", explode the input on spaces, then query the database for each word.

A single REGEXP query may be more efficient, but you will need to test it. For example:

 SELECT * FROM fiberbox WHERE field REGEXP 'Create|search|engine';
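
Building that pattern from the user's input could look roughly like this. It is a sketch in Python, not a drop-in solution: SQLite has no built-in REGEXP, so a Python function is registered to back the operator here, and the escaping done by `re.escape` follows Python's regex syntax, which may not match MySQL's exactly. The sample rows and helper names are made up.

```python
import re
import sqlite3

def build_pattern(user_input):
    """Split the input on whitespace and join the escaped words with '|'."""
    return "|".join(re.escape(word) for word in user_input.split())

def regexp(pattern, value):
    """Back the SQL REGEXP operator with Python's re module (case-insensitive)."""
    if value is None:
        return 0
    return 1 if re.search(pattern, value, re.IGNORECASE) else 0

def search(conn, user_input):
    """Return rows whose field contains at least one of the typed words."""
    pattern = build_pattern(user_input)
    if not pattern:
        return []
    return conn.execute(
        "SELECT id, field FROM fiberbox WHERE field REGEXP ? ORDER BY id",
        (pattern,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
# SQLite evaluates "X REGEXP Y" by calling regexp(Y, X).
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE fiberbox (id INTEGER PRIMARY KEY, field TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO fiberbox (field) VALUES (?)",
    [("How to create a blog",), ("Search tips",), ("Cooking pasta",)],
)
```

Very short words such as "a" match almost every row when used this way, so it is probably worth dropping them before building the pattern.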

Use jQuery Autocomplete to offer search-as-you-type suggestions, as Google does.
