Multiple databases strategy

GSTAR1 · August 7, 2021, 4:07pm

Hi all,

I wanted some advice on the best strategy for implementing multiple databases in my application. Currently I’m developing a system whereby each client’s data will be held in their own database. There will be one CORE database which stores client table and users table (1 to many).

Should I:

A) Define a separate database connection inside web.php for each client? E.g:

'components' => [
    'db' => require __DIR__ . '/db.php',
    'client1_db' => require __DIR__ . '/client1_db.php',
    'client2_db' => require __DIR__ . '/client1_db.php',
    ...
],

B) Or, create the client DB connection on the fly? For this option, I guess I would have to store database connection credentials inside the core database?

What are the advantages/disadvantages of each approach? Is there any other solution?

In addition, I also want to have a set of database migrations applied to each new client database. Currently, you can specify the database connection to use by specifying the --db= parameter on the yii migrate command, however if I went with option B I guess I would not be able to do this?

snathan · August 8, 2021, 12:52pm

As you may already know, the standard approach to this is to store all the data in a single database and separate the client data rows by a column you can filter - in most cases this will be a client user id column. Then all your queries will have to add filter to only show that client or client’s group data (if more than one client user can log in).

Your approach is valid when there is a good requirement where you want to (1) avoid the above, or (2) you want the application to be secured in a way that a logged in user can only get access to his database. I can only assume you have a handful of clients as it may test the limits of the database when you create a large number of databases - although that may be fine too - may have to test it out as to what happens if you create a large number of databases.

Now coming to your strategy, besides the two you mentioned in your question, I do want to propose a 3rd strategy. You may also want to consider using a different table prefix for each customer. In that case, you keep some of the common tables with a common set of prefix and you have the client related tables with client specific prefix also stored in the same database - the client table prefix may simple be CLI followed by client-id and underscore (e.g. cli00001_) as the prefix. This will create a large number of tables within the same database - which can also be an issue depending on the number of clients and I suggest you test it out. After the client login you will have to set the appropriate table prefix dynamically.

As far as migration goes, it may just be easier to loop for each client connection or table prefix setting. You can generate the config files dynamically from the database if you are using the approach to store db connections in the database. The same for table prefix approach as well.

Requirements supersedes the decision of the best approach, but still a lot of work to be going “off track”. When going off-track there will be things you will hit a roadblock on - tinkering with (or inheriting) core classes, hitting bugs that may never get resolved as no one else is encountering them, etc.

I would love to know what you end up doing and implementing and what level of effort it took, issues encountered, etc.

PS: Edit: After I hit save I thought of a 4th option. It got me thinking that perhaps the requirement you have in mind is for performance and you want to separate client data out by each database for that reason. The table prefix option is somewhat the same as creating a database partition scheme for the tables. In this case you have the same database, tables and prefix however each client data resides in a separate database partition that you can dynamically configure based on client login. This would be the best approach in my mind as you don’t have to worry about number of databases, tables or migrating each set of databases/tables. I have not used partition in MySQL but have done the same in Oracle where it gives you a number of options such as creating local partition indexes, global indexes, etc. that you will have to consider.

GSTAR1 · August 8, 2021, 4:49pm

Hi @snathan - thanks for your reply! Very insightful!

I initially went with the single database approach - and I had a client_id column in all my database tables.

However this also meant I had to include a WHERE client_id = $client_id clause in all of my SQL statements. That isn’t necessarily a bad thing - as long as you remember to add that clause in everywhere. I just felt as a multi-tenant application it makes sense to use separate databases.

Your suggestion of using a table prefix for each client is interesting. You’re right - this would create a large number of tables in the same database, so for that reason I probably wouldn’t go with this approach.

The problem I’m trying to solve isn’t strictly related to performance. It’s more so having a good separation of each client’s data. I also found having to include the WHERE client_id = $client_id clause everywhere was getting a bit tedious.