Your search strategy, as you have noticed, is slow. It is slow because
LIKE '%something%'
must scan the whole table to find matches. Leading % signs in LIKE searches are a great way to wreck performance, because an ordinary index cannot be used when the pattern starts with a wildcard.
I do not know how many columns are in your path table. If there are many columns, you can do two quick things to improve performance:
- Get rid of SELECT * and list the names of the columns you want in your result set.
- Create a compound index consisting of the filename column followed by the other columns you need to retrieve.
(This will not help if there are only a few columns in the table.)
You cannot use out-of-the-box FULLTEXT searching for this material, because FULLTEXT is designed for natural-language text.
If I had to quickly make this work for production, I would do the following:
First, create a new table called "searchterm" containing:
filename_id INT          the id number of a row in your path table
searchterm VARCHAR(20)   a fragment of a filename
Secondly, write a program that reads the values of filename_id and filename, and inserts a bunch of different rows for each of them into searchterm. For the item you specified, the values should be:
LG_MARGINCALL_HD2CH_127879834_EN.mov (original)
LG MARGINCALL HD2CH 127879834 EN mov (split on punctuation)
HD 2 CH (split on embedded numerics)
MARGIN CALL (split on an app-specific list of words)
So, you will have many entries in your searchterm table, all with the same filename_id value and many different small pieces of text.
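The splitting rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names and the KNOWN_WORDS list are made up for the example, the word splitter is a simple greedy one, and writing the resulting rows into searchterm is left to whatever database driver you use.

```python
import re

# Hypothetical app-specific vocabulary; you would maintain your own list.
KNOWN_WORDS = ("MARGIN", "CALL")

def split_known_words(token, words=KNOWN_WORDS):
    """Return the known words a token is made of, or [] if it is not
    built entirely from at least two known words (greedy matching)."""
    pieces = []
    rest = token
    while rest:
        for word in words:
            if rest.upper().startswith(word):
                pieces.append(rest[:len(word)])
                rest = rest[len(word):]
                break
        else:
            return []  # some part of the token is not a known word
    return pieces if len(pieces) > 1 else []

def fragments(filename, words=KNOWN_WORDS):
    """Return the distinct search fragments for one filename."""
    found = [filename]                                        # original
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", filename) if t]
    found.extend(tokens)                                      # split on punctuation
    for token in tokens:
        found.extend(re.findall(r"[A-Za-z]+|[0-9]+", token))  # embedded numerics
        found.extend(split_known_words(token, words))         # app-specific words
    return list(dict.fromkeys(found))  # drop duplicates, keep first-seen order

def rows_for(filename_id, filename):
    """Build (filename_id, searchterm) rows ready to insert into searchterm."""
    return [(filename_id, fragment) for fragment in fragments(filename)]
```

Calling rows_for(1, "LG_MARGINCALL_HD2CH_127879834_EN.mov") yields one row per fragment, all sharing filename_id 1.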
Finally, when searching, you can do this:
SELECT path.id, path.filename, path.whatever,
       COUNT(DISTINCT searchterm.searchterm) AS termcount
  FROM path
  JOIN searchterm ON path.id = searchterm.filename_id
 WHERE searchterm.searchterm IN ('margin', 'call', 'hd', 'en', 'mov')
 GROUP BY path.id, path.filename, path.whatever
 ORDER BY termcount DESC, path.filename
This little query finds all the fragments that match what you are searching for. It returns multiple file names and presents them in order of how many terms each one matches, best match first.
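To see the ranking in action without a MySQL server, here is a small sketch that runs the same kind of query against an in-memory SQLite database. The second file name and the fragment rows are invented test data, and SQLite is only a stand-in. MySQL's default collations compare case-insensitively while SQLite's default does not, so this sketch stores fragments in lower case and lower-cases the search terms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE path (id INTEGER PRIMARY KEY, filename TEXT);
    CREATE TABLE searchterm (filename_id INT, searchterm VARCHAR(20));
""")
conn.executemany("INSERT INTO path VALUES (?, ?)", [
    (1, "LG_MARGINCALL_HD2CH_127879834_EN.mov"),
    (2, "LG_OTHERFILM_HD2CH_555_FR.mov"),  # made-up second file
])
# A few fragments per file, stored in lower case (see the note above).
conn.executemany("INSERT INTO searchterm VALUES (?, ?)", [
    (1, "margin"), (1, "call"), (1, "hd"), (1, "en"), (1, "mov"),
    (2, "hd"), (2, "fr"), (2, "mov"),
])

def search(terms):
    """Return (id, filename, termcount) rows, best match first."""
    terms = [t.lower() for t in terms]
    placeholders = ", ".join("?" for _ in terms)
    sql = f"""
        SELECT path.id, path.filename,
               COUNT(DISTINCT st.searchterm) AS termcount
          FROM path
          JOIN searchterm AS st ON st.filename_id = path.id
         WHERE st.searchterm IN ({placeholders})
         GROUP BY path.id, path.filename
         ORDER BY termcount DESC, path.filename
    """
    return conn.execute(sql, terms).fetchall()

results = search(["margin", "call", "hd", "en", "mov"])
for row in results:
    print(row)
```

The file that matches all five terms comes back first, and the file that only shares "hd" and "mov" follows it.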
What I am suggesting is that you build a sort of home-grown full-text search engine of your own. If you really have several million multimedia files, this is definitely worth your effort.