I have a table which stores user credits on my website. It is possible for 2 events which update a user's credit balance to happen at the same time. I would like to lock the row to prevent one update cancelling out the other. Is this possible?
Database updates are atomic, and you can also wrap a bunch of changes in a transaction (transactions are atomic as well), so at that level everything will be fine. The main problem is the UI in a web environment: between the SELECT that fetches the data to display and the subsequent UPDATE query there is a significant amount of time. There is also the possibility that the user who fetched the data has disconnected. And any lock must be at row level, or you will end up with the whole table locked when a user starts the update process and then disconnects (closes the browser).
So there are two main choices:
[*] pessimistic locking - you put locking information in every row (e.g. columns locked_by_user int and locked_at timestamp). Other users can't enter the update process while another user holds the lock. You have to deal with user disconnects (e.g. clear locks older than XXX in a cron job)
[*] optimistic locking - you store version information in each row (version int, or updated_at timestamp, etc.) and allow the UPDATE query only if that version did not change during the update process. If it did, another user changed something in the meantime and you cannot apply your changes; you have to reload the current data and repeat them. This approach is disconnect-safe, but produces errors like "another user changed this data in the meantime. Please try again."
If the chance of concurrent edits is high and your users are reliable (low probability of disconnecting), pessimistic locking may be the better choice, because users are told up front that the record is being modified by someone else.
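A minimal sketch of the optimistic-locking variant, using an in-memory SQLite table as a stand-in (the `credits` table, column names, and `update_amount` helper are all hypothetical, not from the original post):

```python
import sqlite3

# Hypothetical "credits" table with a "version" counter column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credits (id INTEGER PRIMARY KEY, amount REAL, version INTEGER)")
conn.execute("INSERT INTO credits VALUES (1, 100.0, 0)")

def update_amount(conn, row_id, new_amount, expected_version):
    """Apply the change only if nobody else bumped the version in the meantime."""
    cur = conn.execute(
        "UPDATE credits SET amount = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_amount, row_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False -> someone else changed the row; reload and retry

# First writer read version 0 and succeeds.
print(update_amount(conn, 1, 90.0, 0))   # True
# Second writer also read version 0, but the row is now at version 1 -> rejected.
print(update_amount(conn, 1, 80.0, 0))   # False
```

The whole check-and-write is a single UPDATE, so there is no window between reading the version and writing the new value.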
Thanks for the reply. I am thinking of using pessimistic locking as it is absolutely essential that only one update can happen at a time. The updates will all be decrements of a decimal field by a fixed amount. This is how I plan to implement it, using a boolean locking field:
1. Query the row with a condition that the locking field is FALSE. If this fails there is a lock, and the code will sleep for a few seconds then try again.
2. If the row is returned, execute a query to set the locking field value to TRUE, with a condition that the locking field is currently FALSE. If this fails the row has been locked since the first query, so sleep then start again at step 1.
3. Perform the update.
4. Set the locking field value back to FALSE.
Does this seem like the way to go? I guess as a precaution I could add a timestamp field as well and run a cron job to remove old locks in case something goes wrong.
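The steps above can be sketched roughly as follows (in-memory SQLite stand-in; the `credits` table and `decrement_with_lock` helper are hypothetical). Note that steps 1 and 2 collapse naturally into one conditional UPDATE, which is also safer, since the flag can only be flipped by one caller:

```python
import sqlite3
import time

# Hypothetical "credits" table with a boolean "locked" flag.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credits (id INTEGER PRIMARY KEY, amount REAL, locked INTEGER)")
conn.execute("INSERT INTO credits VALUES (1, 100.0, 0)")

def decrement_with_lock(conn, row_id, delta, retries=3, delay=0.01):
    for _ in range(retries):
        # Steps 1+2 combined: try to flip the flag in one conditional UPDATE.
        cur = conn.execute(
            "UPDATE credits SET locked = 1 WHERE id = ? AND locked = 0", (row_id,)
        )
        conn.commit()
        if cur.rowcount == 1:  # lock acquired
            try:
                # Step 3: perform the update.
                conn.execute(
                    "UPDATE credits SET amount = amount - ? WHERE id = ?",
                    (delta, row_id),
                )
            finally:
                # Step 4: release the lock.
                conn.execute("UPDATE credits SET locked = 0 WHERE id = ?", (row_id,))
                conn.commit()
            return True
        time.sleep(delay)  # someone else holds the lock; wait and retry
    return False

print(decrement_with_lock(conn, 1, 25.0))  # True
```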
This looks like you want to do everything during a single HTTP request or inside a CLI application. If so, the only locking you should (and must) use is a database transaction; without a transaction you have no guarantee that your application-level lock will even work correctly.
It also looks like you may not need to lock anything at all - just run: UPDATE table SET amount = amount + XXXX WHERE id = Y; and this query is atomic too, which means parallel updates cannot cause data inconsistency.
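A quick illustration of that relative-update pattern, again with an in-memory SQLite table as a stand-in (table and column names are placeholders):

```python
import sqlite3

# Relative update: the read and the write happen inside one statement, so the
# database serializes concurrent callers and no decrement is lost - unlike a
# read-modify-write done in application code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credits (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO credits VALUES (1, 100.0)")

# Two events arriving "at the same time" each run this one statement; the
# database applies them one after the other, never interleaved.
for delta in (10.0, 25.0):
    conn.execute("UPDATE credits SET amount = amount - ? WHERE id = ?", (delta, 1))
conn.commit()

print(conn.execute("SELECT amount FROM credits WHERE id = 1").fetchone()[0])  # 65.0
```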
Yes, but there is a catch. While each transaction will happen within a single HTTP request, these requests are triggered by random events from the internet. There is no predicting them, and multiple could come through at the same time. However, if I understand correctly, even in that scenario the "UPDATE table SET amount = amount + XXXX WHERE id = Y;" statement should suffice, even if 2 requests come through at exactly the same time?