This seems like a common problem.
I’m into my fifth Yii project, and this has come up two or three times by now, usually when generating complicated reports, exporting data, performing batch processes such as mass e-mail, etc.
I’ve seen this problem mentioned many times on the forum, and it’s something that everyone is going to encounter again and again, for sure…
There are many reasons why using DAO rather than AR could be extremely impractical - in some cases impossible.
For one, your views (particularly partials) are probably written so they expect model objects. You may be relying on other model-specific features as well, such as behaviors, calculated attributes, etc.
Getting to your data probably is really the least of your concerns - by the time you go around the models and hardcode all your queries get to your data, you may realize that you have to duplicate business logic that was embedded in your models.
The bottom line is, the functionality provided by your models is usually very important, and usually, the rest of your application is designed to expect model objects.
For another, you may already be well into your project - having written hundreds or thousands of lines of code, before you realize that AR can’t cope. You may even have a running project that runs fine for a while, and as you accumulate more data, your application slowly outgrows the memory limitations imposed by AR.
What’s really required here, is a way to iterate through large numbers of records without constructing them all in advance - that’s not how a relational database works, so AR is imposing an artificial limitation.
I would propose one of two approaches:
-
An alternative query method that gives you an iterator, rather than an array of objects - you can then (in many cases) simply pass the iterator to your view, where the objects will be loaded, rendered, and destroyed, one at a time, rather than loaded and constructed in advance.
-
Have the existing methods return iterators, rather than actual arrays. Iterators can largely emulate arrays - for example, you can count() them without loading the result set. Random access would be hard to implement, but would probably very rarely be used (can’t think of an example).
Either approach would make Yii considerably less memory intensive, overall - eliminating a lot of unnecessary memory fragmentation and increasing server performance.
Either approach puts a limitation on the ‘index’ feature of AR, where you can specify which column you want to use as the index for the returned array - you can’t do that, because you don’t know what the indexes are going to be.
In the light of this, if the second approach was chosen, maybe it should be implemented so that you have to specifically ask for progressive loading - e.g. an iterator - rather than having it be the default.
So let’s say you have this:
$users = User::model()->findAll(); // preloads and gives you objects, as usual
With delayed loading, it might look like this:
$users = User::model()->delayed()->findAll(); // gives you an iterator instead
When you foreach() through your $users with this iterator, PHP can dispose of each object as it falls out of scope, so the memory overhead is for 1 record only.
The delayed() method would need to be supported by both CActiveRecord and CActiveFinder.
…
On another note, a related issue is the inability to cache model instances in-memory. This is sort of the "opposite" problem of the one described above.
Let’s say you’re going to load and display a page of 50 user records - each of them have a related country record, but there are only, say, five countries in my database.
Every time I access the country relation of user record, it’s going to query over and over again for the same records.
An in-memory caching feature would keep the country records somewhere, statically, so that they only load once each - if you findByPk() repeatedly, it’s going to return the same object.
Again, this feature would need to be something you turn on explicitly, so for example:
$users = User::model()->with(array(
'country'=>array('cache'=>true)
)->findAll();
This should cause the country records to lazy-load by default, since you can’t (or shouldn’t) load them eagerly, generating duplicates, if you expect them to stay in memory. It also would not work with delayed(), since those objects are expected to fall out of scope and be destroyed progressively.
…
In a sense, AR is currently optimized for an “average” number of records - not too many, and not too few. Unfortunately, you’ll find that there are many cases where “average” is not what you’re dealing with, and AR seems to fall short in many of these cases…