[EXTENSION] EDbModifiedDependency

Hi everyone,

I would have posted this in Extensions but I don’t have the right to do it, so here we go. This extension is a cache dependency used to avoid doing queries for data that haven’t changed. You’ve probably seen the idea in action in wiki articles or elsewhere. The trick is of course to use the MAX() function to determine when a table was last modified. This extension supports specifying multiple tables as the dependency, and instead of using the common UNION SELECT it uses the GREATEST() function, which is considerably faster than the former and scales much better to multiple tables:


+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+

| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |

+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+

|  1 | PRIMARY     | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | No tables used               |

|  4 | SUBQUERY    | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |

|  3 | SUBQUERY    | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |

|  2 | SUBQUERY    | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |

+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+

4 rows in set (0.00 sec)

Naturally, for this to work your tables must have a column indicating when it was last modified, and you should be certain that the column timestamp is always updated when a row is edited.

The source code is available on Bitbucket. It seems I don’t have the privilege of posting links so you’ll have to search for negge/edbmodified to get to the repository. As it only consists of one file I’ll paste the code here as well. Usage examples can be found in the class’s comment block.

How to use: but the file somewhere, import it and use it like you would use any other cache dependency.


/**

 * @author Sam Stenvall <sam.stenvall@arcada.fi>

 * @copyright Copyright &copy; Sam Stenvall 2012

 * @license can't post links so I can't link to the copyright notice...

 * 

 * Cache dependency based on the last modified row in a set of tables. It uses 

 * the built-on GREATEST() function when querying multiple tables instead of 

 * the widely used (and much slower) UNION SELECT way.

 * 

 * Usage examples:

 * 

 * // Single table

 * $dep = new EDbModifiedDependency('product');

 * 

 * // Single table with non-default column name

 * $dep = new EDbModifiedDependency('category', 'last_modified');

 * 

 * // Multiple tables

 * $dep = new EDbModifiedDependency(array('product', 'category'));

 * 

 * // Multiple tables with non-default column names

 * $dep = new EDbModifiedDependency(array(

 *     'product', 

 *     array(

 * 	       'table'=>'category',

 *         'modifiedColumn'=>'last_modified',

 *     ),

 * ));

 * 

 */

class EDbModifiedDependency extends CDbCacheDependency

{


	/**

	 * Class constructor.

	 * @param mixed $tables either the name of the table as a string, or an 

	 * array containing table names. When passing an array, the value(s) can 

	 * also be an array with the keys "table" (the table name) and 

	 * "modifiedColumn" (the column name for that table).

	 * @param string $modifiedColumn the column containing the last modified 

	 * timestamp

	 */

	public function __construct($tables, $modifiedColumn = 'modified')

	{

		// Use a simple SELECT MAX for single table or SELECT GREATEST(...) for 

		// multiple.

		if (!is_array($tables))

			$sql = 'SELECT MAX(`'.$modifiedColumn.'`) FROM `'.$tables.'`';

		else

		{

			$greatestParams = array();

			foreach ($tables as $table)

			{

				// Determine column and table name

				$col = is_array($table) ? $table['modifiedColumn'] : $modifiedColumn;

				$tbl = is_array($table) ? $table['table'] : $table;


				$greatestParams[] = '(SELECT MAX(`'.$col.'`) FROM `'.$tbl.'`)';

			}


			$sql = 'SELECT GREATEST('.implode(', ', $greatestParams).')';

		}


		parent::__construct($sql);

	}


}

Hopefully someone will find this useful! I know I did.