How to join Active Record to itself?

I have a table called PlayerHistory which has a history of player's rank for each day. Let's assume that it has only three columns: PlayerID, Date, Rank. I need to get a list of top N players for a particular date and their rank change from the previous date. I use following SQL statement for this purpose:

SELECT p1.PlayerID, p1.Date, p1.Rank, p2.Rank FROM


(SELECT PlayerID, Date, Rank FROM PlayerHistory WHERE Date='2009-01-29' ORDER BY Rank DESC LIMIT 50) p1


LEFT JOIN


(SELECT PlayerID, Date, Rank FROM PlayerHistory WHERE Date='2009-01-28') p2


ON p1.PlayerID=p2.PlayerID

If I create a PlayerHistory model which has these three columns, how can I perform this query using Active Record methods? Is it possible to avoid raw SQL in this case at all? Total number of SQL queries for this task shouldn't be more than two - and single query is better.

Please don't suggest to store previous date rank in the same row - the table is huge, I am already trying to squeeze every byte out of it.

Use CActiveRecord::findAllBySql(). For such kind of queries, it would be better to keep things simple rather than trying to figure out complex AR solutions.

Ok. Could you please also suggest what fields to add to the model? As you see, in this case Rank is returned twice. I can return "(p1.Rank-p2.Rank) AS Delta" instead but I don't want to put that field into the model because it's used only in this query and there are many other different queries for the same model.

You need different column aliases for these two ranks. Assuming one is 'Rank' and the other is 'Rank2', you can declare a public member variable named 'Rank2' in your model class, and that's it.

Hello again, I have another question about AR behavior. Specifically, why does it send two requests to return results from two joined tables and why does it use PKs from first table to get results from second. Here is my AR query:

$this->players = Player::model()->with(array('history'=>array('order'=>'??.Rank DESC')))->together()->findAll(array('limit'=>2));

The intent here is to get two players with the highest rank (from second table) and return their names (from first table). AR issues two SQL queries: one to obtain primary keys from Player table and second to join Player and PlayerHistory tables using those keys:

SELECT `Player`.`ServerID` AS t0_c0, `Player`.`ID` AS t0_c1, `Player`.`Name` AS t0_c2 FROM `Player`  LIMIT 2
SELECT `Player`.`ServerID` AS t0_c0, `Player`.`ID` AS t0_c1, t1.`ServerID` AS t1_c0, t1.`PlayerID` AS t1_c1, t1.`Date` AS t1_c2, t1.`Rank` AS t1_c4 FROM `Player` LEFT OUTER JOIN `PlayerHistory` t1 ON t1.`ServerID`=`Player`.`ServerID` AND t1.`PlayerID`=`Player`.`ID` WHERE (`Player`.`ServerID`, `Player`.`ID`) IN ((1, 14037), (1, 47587)) ORDER BY t1.Rank DESC

Obviously this produces incorrect result. The condition depends on joined query result but AR uses only first table to get PKs.

Such kind of query is what AR is not good at. When there is "limit" on the primary table and there is a HAS_MANY relation, AR has to query the primary table first to get correct limit result. And then it can join the other table to get HAS_MANY objects. I would recommend you use raw SQL in this case.

But I have specified together() method for the query. Why doesn't it just produce a single query which will automatically become correct?

Together() will still use two queries if the primary table has LIMIT and there is HAS_MANY relation. Try to write a plain SQL to solve this problem and you will see why.

Sure, I wrote a plain sql and it works, but this case (two joined tables with limit) seems simple enough to improve AR to support it. What do you think?

Could you show me your SQL?

It's very simple:

SELECT * FROM Player p INNER JOIN PlayerHistory ph ON p.ServerID=ph.ServerID AND p.ID=ph.VillageID WHERE p.ServerID=1 AND ph.Date='2009-01-11' ORDER BY ph.Rank LIMIT 10

If a player has more than 10 history entries, will your query still give correct results? You want to return the top 2 players, right?

Quote

If a player has more than 10 history entries, will your query still give correct results? You want to return the top 2 players, right?

No, the last query I showed will just return top 10 players for 01/11/2009. I sort joined tables, where each row in history table has only one record for each player per date, so by limiting with "date='2009-01-11'" we essentially get one-to-one mapping.

Your assumption is that you know a player has only one history with the specified condition, but AR doesn't know that.

A workaround to your problem is that you define a HAS_ONE relation and use that to bring back the needed data.

I just checked in an enhancement. You can use together(true) to enforce only one SQL is used, regardless of LIMIT. Please let me know if this works for you. Thanks.

Thanks, Qiang, I'll try it when I get a chance. In the meantime I have a question about your suggestion to use findAllBySql() method. Where should I call this method? From the controller? Then what about the idea of not having any model code (SQL query) in the controller classes?

Creating a method in your model class that calls the findAllBySql() seems more appropriate.