I have seen several database caching mechanisms, all of them pretty dumb (i.e. keep this query cached for X minutes), and all of them require that you manually delete the entire cache repository after an INSERT / UPDATE / DELETE query has been executed.
About 2 or 3 years ago I developed an alternative database caching system for a project I was working on. The idea was basically to use regular expressions to find the tables involved in a particular SQL query:
$query_patterns = array
(
    'INSERT'   => '/INTO\s+(\w+)\s+/i',
    // FROM list, then any number of JOINs with optional modifiers (LEFT, INNER, ...)
    'SELECT'   => '/FROM\s+((?:[\w]|,\s*)+)(?:\s+(?:(?:LEFT|RIGHT|OUTER|INNER|NATURAL|CROSS)\s*)*JOIN\s+((?:[\w]|,\s*)+)\s*)*/i',
    'UPDATE'   => '/UPDATE\s+(\w+)\s+SET/i',
    'DELETE'   => '/FROM\s+((?:[\w]|,\s*)+)/i',
    'REPLACE'  => '/INTO\s+(\w+)\s+/i',
    'TRUNCATE' => '/TRUNCATE\s+(\w+)/i',
    'LOAD'     => '/INTO\s+TABLE\s+(\w+)/i',
);
I know these regular expressions probably have some flaws (my regex skills were pretty green back then) and obviously don't match nested queries, but since I never use those, that isn't a problem for me.
Anyway, after finding the tables involved I would sort them alphabetically and create a new folder in the cache repository with the following naming convention:
+table_a+table_b+table_c+table_...+
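To make the naming step concrete, here is a minimal sketch of how the folder name could be built. The helper names split_table_list() and cache_folder_name() are made up for illustration and are not the original code; it assumes the regex capture is a comma-separated list of table names.

```php
<?php
// Hypothetical helper: turn a captured "table_a, table_b" list into an array.
function split_table_list(string $captured): array
{
    return preg_split('/\s*,\s*/', trim($captured), -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: build the "+a+b+c+" folder name from table names.
function cache_folder_name(array $tables): string
{
    // Strip whitespace and backticks, then drop empties and duplicates.
    $tables = array_map(function ($t) {
        return trim($t, " \t\n\r`");
    }, $tables);
    $tables = array_unique(array_filter($tables, 'strlen'));

    // Alphabetical order makes the name independent of the order in the query.
    sort($tables, SORT_STRING);

    return '+' . implode('+', $tables) . '+';
}

echo cache_folder_name(split_table_list('table_b, table_a')), "\n";
```

Sorting means that "FROM table_b, table_a" and "FROM table_a, table_b" end up in the same folder, and the leading and trailing + make every table name searchable as +name+.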
In the case of a SELECT query, I would fetch the results from the database, serialize() them and store them in the appropriate cache folder. So, for example, the results of the following query:
SELECT `table_a`.`title`, `table_b`.`description` FROM `table_a`, `table_b` WHERE `table_a`.`id` <= 10 ORDER BY `table_a`.`id` ASC;
would be stored in:
/cache/+table_a+table_b+/079138e64d88039ab9cb2eab3b6bdb7b.md5
The file name is the MD5 hash of the query itself. On a subsequent identical SELECT query, the results are trivial to fetch.
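Roughly, the read path looks like the sketch below. This is a reconstruction under assumptions, not the original code: cache_file() and cached_select() are hypothetical names, and $runQuery stands for whatever callable actually executes the SQL and returns the rows as an array.

```php
<?php
// Hypothetical helper: <cache root>/<folder>/<md5 of the query>.md5
function cache_file(string $cacheRoot, string $folder, string $sql): string
{
    return rtrim($cacheRoot, '/') . '/' . $folder . '/' . md5($sql) . '.md5';
}

// Hypothetical helper: return cached rows if present, otherwise query and cache.
function cached_select(string $cacheRoot, string $folder, string $sql, callable $runQuery): array
{
    $file = cache_file($cacheRoot, $folder, $sql);

    if (is_file($file)) {
        $rows = unserialize(file_get_contents($file));
        if (is_array($rows)) {
            return $rows; // cache hit
        }
    }

    $rows = $runQuery($sql); // cache miss: go to the database

    if (!is_dir(dirname($file))) {
        mkdir(dirname($file), 0777, true);
    }
    // LOCK_EX avoids two requests writing the same file at once.
    file_put_contents($file, serialize($rows), LOCK_EX);

    return $rows;
}
```

Because the hash is taken over the raw query string, two queries that differ only in whitespace or letter case get separate cache files.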
In the case of any type of write query (INSERT, REPLACE, UPDATE, DELETE, etc.) I would glob() all folders with +matched_table(s)+ in their name and delete all the files they contain. This way there is no need to delete the entire cache, only the cache used by the affected and related tables.
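The invalidation step can be sketched like this. Again, invalidate_tables() is a hypothetical name used for illustration, and the sketch assumes cache files end in .md5 and that neither the cache root nor the table names contain glob metacharacters.

```php
<?php
// Hypothetical helper: delete cached results for every folder that involves
// at least one of the written tables. Returns the number of files removed.
function invalidate_tables(string $cacheRoot, array $tables): int
{
    $deleted = 0;

    foreach ($tables as $table) {
        // The "+" delimiters stop "table_a" from matching "+table_ab+".
        $pattern = rtrim($cacheRoot, '/') . '/*+' . $table . '+*';
        $folders = glob($pattern, GLOB_ONLYDIR) ?: array();

        foreach ($folders as $folder) {
            foreach (glob($folder . '/*.md5') ?: array() as $file) {
                if (unlink($file)) {
                    $deleted++;
                }
            }
        }
    }

    return $deleted;
}
```

A write to table_a therefore clears +table_a+ and +table_a+table_b+ but leaves +table_c+ untouched.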
The system worked very well and the performance difference was noticeable, although the project had many more read queries than write queries. Since then I have started using transactions and FK CASCADE UPDATES / DELETES, and I never had the time to improve the system to make it work with these features.
I have used MySQL Query Cache in the past, but I have to say the performance doesn't even compare.
I wonder: am I the only one who sees beauty in this system? Are there any bottlenecks that I'm not aware of? Why do popular frameworks such as CodeIgniter and Kohana (I don't know Zend Framework) have such rudimentary database caching systems?
More importantly, do you see this as a feature worth pursuing? If so, is there anything I could do / use to make it even faster (my main concerns are disk I/O and (de)serialization of query results)?
I appreciate all input, thanks.