I am working through an issue with the yii\db\ActiveRelationTrait::via() function and it’s materialization in yii\db\ActiveQuery::prepare(). I have a many to many join which looks like this:
vehicles >-< vehicle_customers >-< customers
with approx 6500 vehicles, 100 customers and 10,000 relations mapped in vehicle_customers
In my Customer model I have the following relations defined:
/**
* @return ActiveQuery
*/
public function getVehicleCustomers()
{
return $this->hasMany(VehicleCustomer::class, ['customer_id' => 'id']);
}
/**
* @return ActiveQuery
*/
public function getVehicles()
{
return $this->hasMany(Vehicle::class, ['id' => 'vehicle_id'])
->via('vehicleCustomers');
}
This all functions as expected however (as this query is called very frequently) I am dealing with performance issues (IO and CPU). What I have discovered is that yii\db\ActiveQuery::prepare() executes a query to pull and populate all junction table relations regardless of subsequent joins
For example if I execute: $customer->getVehicles()->one();
It first executes SELECT * FROM vehicle_customers
WHERE customer_id
=:cid (returning 100s or 1000s of rows) and then populates the subsequent query with a large in clause e.g.
SELECT * FROM vehicles
WHERE (vehicles
.deleted
IS NULL) AND (id
IN (72395, 72406, 72446, 72465, 72488, 72489, 72497, 72500, 72519, 72522, 72531, 72537, 72538, 72559, 72587, 72593, 72610, 72615, 72627, 72645, 72654, 72658, 72673, 72721, 72731, 72769, 72772, 72782, 72807, 72830, 72835, 72857, 72890, 72893, 72894, 72896, 72901, 72915, 72928, 72936, 72954, 72980, 72989, 72995, 72996, 73007, 73018, 73021, 73026, 73039, 73045, 73055, 73072, 73083, 73094, 73096, 73108, 73109, 73124, 73138, 73140, 73177, 73196, 73206, 73254, 73270, 88691, 90948, 91183, 91186, 91187, 91188, 91190, 91191, 91193, 91195, 91196, 91202, 91203, 91228, 91229, 91255, 91256, 91282, 91284, 91286, 91287, 91288, 91290, 91297, 91341, 91342, 91385, 91386, 91399, 91424, 91426, 91465, 91466, 91481, 91493, 91646, 91647, 91648, 91656, 91665, 91786, 91787, 91788, 91789, 91817, 91964, 91985, 93329, 93793, 93794, 93795, 93796, 93797, 93800, 94026, 94032, 94132, 94144, 94145, 94150, 94157, 94726, 94790, 94892, 94894, 94907, 94908, 94918, 94922) );
It seems like a more efficient way to address this would be with an inner join:
SELECT * FROM vehicles
WHERE (vehicles
.deleted
IS NULL) AND (id
IN (SELECT vehicle_id FROM vehicle_customers
WHERE customer_id
=:cid) );
Which I have done manually (but then I loose all the benefits of the via() relation). Anybody else dealt with this? If I took a swing at refactoring would it be useful to the community? Thanks!