If you have 1M files to grep through, you will (as far as I know) have to go through each of them with a regex.
For all intents and purposes, you will end up doing the same on table rows if you query them in bulk using the LIKE operator or a regular expression.
My own experience with grep is that I rarely searched for something that didn't contain at least one full word, so you can use the database's full-text search to reduce the set of rows you actually need to scan.
MySQL has built-in full-text search functions, but I would recommend against them, because using them means you are not using InnoDB.
You can read about the Postgres ones here:
http://www.postgresql.org/docs/current/static/textsearch.html
After creating an index on the tsvector column, you can do your grep in two steps: one that quickly finds the rows that might vaguely qualify, followed by another that applies your true criteria:
select * from docs where tsvcol @@ :tsquery and (regexp at will);
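As a concrete sketch of that two-step query, assuming a table `docs` with a text column `body` (the column, index, and search terms below are illustrative, not from the original answer), on PostgreSQL 12 or later you could set it up like this:

```sql
-- Maintain a tsvector column automatically from the body column
-- (a generated column; on older versions you would use a trigger).
ALTER TABLE docs ADD COLUMN tsvcol tsvector
    GENERATED ALWAYS AS (to_tsvector('english', body)) STORED;

-- A GIN index makes the @@ full-text match fast.
CREATE INDEX docs_tsvcol_idx ON docs USING gin (tsvcol);

-- Step 1 (indexed): narrow to rows containing the word "error".
-- Step 2 (applied only to the survivors): the exact regex you care about.
SELECT *
FROM docs
WHERE tsvcol @@ to_tsquery('english', 'error')
  AND body ~ 'error [0-9]+';
```

The planner uses the GIN index to cut the candidate set down first, so the expensive regex only runs on a handful of rows instead of the whole table.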
This will be significantly faster than anything grep can do.
Denis de Bernardy