Large query using via() relation

I am working through an issue with the yii\db\ActiveRelationTrait::via() function and it’s materialization in yii\db\ActiveQuery::prepare(). I have a many to many join which looks like this:

vehicles >-< vehicle_customers >-< customers

with approx 6500 vehicles, 100 customers and 10,000 relations mapped in vehicle_customers

In my Customer model I have the following relations defined:

/**
 * @return ActiveQuery
 */
public function getVehicleCustomers()
{
    return $this->hasMany(VehicleCustomer::class, ['customer_id' => 'id']);
}

/**
 * @return ActiveQuery
 */
public function getVehicles()
{
    return $this->hasMany(Vehicle::class, ['id' => 'vehicle_id'])
        ->via('vehicleCustomers');
}

This all functions as expected however (as this query is called very frequently) I am dealing with performance issues (IO and CPU). What I have discovered is that yii\db\ActiveQuery::prepare() executes a query to pull and populate all junction table relations regardless of subsequent joins

For example if I execute: $customer->getVehicles()->one();

It first executes SELECT * FROM vehicle_customers WHERE customer_id=:cid (returning 100s or 1000s of rows) and then populates the subsequent query with a large in clause e.g.

SELECT * FROM vehicles WHERE (vehicles.deleted IS NULL) AND (id IN (72395, 72406, 72446, 72465, 72488, 72489, 72497, 72500, 72519, 72522, 72531, 72537, 72538, 72559, 72587, 72593, 72610, 72615, 72627, 72645, 72654, 72658, 72673, 72721, 72731, 72769, 72772, 72782, 72807, 72830, 72835, 72857, 72890, 72893, 72894, 72896, 72901, 72915, 72928, 72936, 72954, 72980, 72989, 72995, 72996, 73007, 73018, 73021, 73026, 73039, 73045, 73055, 73072, 73083, 73094, 73096, 73108, 73109, 73124, 73138, 73140, 73177, 73196, 73206, 73254, 73270, 88691, 90948, 91183, 91186, 91187, 91188, 91190, 91191, 91193, 91195, 91196, 91202, 91203, 91228, 91229, 91255, 91256, 91282, 91284, 91286, 91287, 91288, 91290, 91297, 91341, 91342, 91385, 91386, 91399, 91424, 91426, 91465, 91466, 91481, 91493, 91646, 91647, 91648, 91656, 91665, 91786, 91787, 91788, 91789, 91817, 91964, 91985, 93329, 93793, 93794, 93795, 93796, 93797, 93800, 94026, 94032, 94132, 94144, 94145, 94150, 94157, 94726, 94790, 94892, 94894, 94907, 94908, 94918, 94922) );

It seems like a more efficient way to address this would be with an inner join:

SELECT * FROM vehicles WHERE (vehicles.deleted IS NULL) AND (id IN (SELECT vehicle_id FROM vehicle_customers WHERE customer_id=:cid) );

Which I have done manually (but then I loose all the benefits of the via() relation). Anybody else dealt with this? If I took a swing at refactoring would it be useful to the community? Thanks!

did you compare performance on these two variants?

The current way it is done in Yii is faster in a lot of cases so we would need some criteria to decide in which cases we want to make a join and in which cases we want to keep current behavior with IN condition.

1 Like

Sorry it has taken me so long to respond but I did want to have a benchmark to test with before replying. I have created a repo that executes the case I am dealing with. It’s located at: https://github.com/darrylkuhn/yii-join-example. The readme should guide you through how to set it up and run the tests. I created two tests as yii comands. One loads a related model using via() and the other using a join. The results look like this on my machine:

./yii load-test/use-via
Memory Usage in bytes: 4519648
Total execution time: 1.5180041790009

./yii load-test/use-join
Memory Usage in bytes: 4423224
Total execution time: 0.10490989685059

As you can see there is a small difference in memory usage and a significant difference in processing time. I did not evaluate db IO or CPU usage mostly because I don’t know a good easy way to do it but I am fairly certain both are adversely affected. Hope this helps clarify the issue. I’m happy to brainstorm solutions…

1 Like

Hi, @darrylkuhn

The test result is interesting, but it measures the performance of “Getting a single main model with its multiple related models in lazy loading mode”. But we also have to consider the performance in a different scenario – Getting multiple main models with their multiple related models in eager loading mode.

I’ve made a PR to your test project, trying to add some tests.

  1. using viaTable instead of via
  2. Eager loading using via
  3. Eager loading using viaTable
  4. Lazy loading using viaJoin
  5. Lazy loading using via
  6. Lazy loading using viaTable

Only the first one is fetching a single customer, but others are fetching multiple customers.

The result is very interesting. TL/DR:

  1. viaJoin is the fastest.
  2. viaTable is considerably faster than via
  3. Lazy loading is faster than Eager loading in this test.
C:\data\yii-join-example>yii load-test/use-via
Vehicles found: 144
Memory Usage in bytes: 5086240
Total execution time: 0.4930739402771

C:\data\yii-join-example>yii load-test/use-join
Vehicles found: 144
Memory Usage in bytes: 4981872
Total execution time: 0.079849004745483

C:\data\yii-join-example>yii load-test/use-via-table
Vehicles found: 144
Memory Usage in bytes: 5037928
Total execution time: 0.18838500976562

C:\data\yii-join-example>yii load-test/with ... Eager loading using via
Customers found: 10
Customer Name: Customer0, Active Vehicles: 6
Customer Name: Customer1, Active Vehicles: 144
Customer Name: Customer2, Active Vehicles: 28
Customer Name: Customer3, Active Vehicles: 0
Customer Name: Customer4, Active Vehicles: 0
Customer Name: Customer5, Active Vehicles: 0
Customer Name: Customer6, Active Vehicles: 12
Customer Name: Customer7, Active Vehicles: 12
Customer Name: Customer8, Active Vehicles: 16
Customer Name: Customer9, Active Vehicles: 13
Memory Usage in bytes: 9981088
Total execution time: 0.95620203018188

C:\data\yii-join-example>yii load-test/with-direct ... Eager loading using viaTable
Customers found: 10
Customer Name: Customer0, Active Vehicles: 6
Customer Name: Customer1, Active Vehicles: 144
Customer Name: Customer2, Active Vehicles: 28
Customer Name: Customer3, Active Vehicles: 0
Customer Name: Customer4, Active Vehicles: 0
Customer Name: Customer5, Active Vehicles: 0
Customer Name: Customer6, Active Vehicles: 12
Customer Name: Customer7, Active Vehicles: 12
Customer Name: Customer8, Active Vehicles: 16
Customer Name: Customer9, Active Vehicles: 13
Memory Usage in bytes: 5184912
Total execution time: 0.35573506355286

C:\data\yii-join-example>yii load-test/lazy-via
Customers found: 10
Customer Name: Customer0, Active Vehicles: 6
Customer Name: Customer1, Active Vehicles: 144
Customer Name: Customer2, Active Vehicles: 28
Customer Name: Customer3, Active Vehicles: 0
Customer Name: Customer4, Active Vehicles: 0
Customer Name: Customer5, Active Vehicles: 0
Customer Name: Customer6, Active Vehicles: 12
Customer Name: Customer7, Active Vehicles: 12
Customer Name: Customer8, Active Vehicles: 16
Customer Name: Customer9, Active Vehicles: 13
Memory Usage in bytes: 10024152
Total execution time: 0.81106114387512

C:\data\yii-join-example>yii load-test/lazy-join
Customers found: 10
Customer Name: Customer0, Active Vehicles: 6
Customer Name: Customer1, Active Vehicles: 144
Customer Name: Customer2, Active Vehicles: 28
Customer Name: Customer3, Active Vehicles: 0
Customer Name: Customer4, Active Vehicles: 0
Customer Name: Customer5, Active Vehicles: 0
Customer Name: Customer6, Active Vehicles: 12
Customer Name: Customer7, Active Vehicles: 12
Customer Name: Customer8, Active Vehicles: 16
Customer Name: Customer9, Active Vehicles: 13
Memory Usage in bytes: 5115456
Total execution time: 0.10398507118225

C:\data\yii-join-example>yii load-test/lazy-via-table
Customers found: 10
Customer Name: Customer0, Active Vehicles: 6
Customer Name: Customer1, Active Vehicles: 144
Customer Name: Customer2, Active Vehicles: 28
Customer Name: Customer3, Active Vehicles: 0
Customer Name: Customer4, Active Vehicles: 0
Customer Name: Customer5, Active Vehicles: 0
Customer Name: Customer6, Active Vehicles: 12
Customer Name: Customer7, Active Vehicles: 12
Customer Name: Customer8, Active Vehicles: 16
Customer Name: Customer9, Active Vehicles: 13
Memory Usage in bytes: 5218904
Total execution time: 0.28959488868713
1 Like