My application is supposed to process a list of links and, if some of them are not found in the database, add them. What makes a Link unique is its (link, user_id) pair.
The above code is called automatically by a background service (a daemon written in PHP).
The problem is that after a few hours of running, some entries get duplicated.
Yeah, that's right: they have the very same user_id and link attributes. The second one was added 65 seconds after the first… Any ideas how that is even possible?
Yeah, I will put a UNIQUE index on this pair later; first, however, I'd like to know what is wrong with the above code.
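(The code itself is not quoted here, but it presumably follows the classic check-then-insert pattern. A minimal sketch, assuming a PDO connection and a links(user_id, link) table; the connection details, table, and column names are all assumptions:)

```php
<?php
// Sketch of the check-then-insert pattern under discussion.
// Table name, column names, and credentials are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

function addLinks(PDO $pdo, int $userId, array $links): void
{
    $select = $pdo->prepare('SELECT id FROM links WHERE user_id = ? AND link = ?');
    $insert = $pdo->prepare('INSERT INTO links (user_id, link) VALUES (?, ?)');

    foreach ($links as $link) {
        $select->execute([$userId, $link]);
        if ($select->fetchColumn() === false) {
            // Race window: another process can insert the same
            // (user_id, link) pair between this check and the INSERT below.
            $insert->execute([$userId, $link]);
        }
    }
}
```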
There are at least two possible explanations: you are using a DB engine that does not support transactions (such as MyISAM MySQL tables), or there are several processes executing this code at the same time. Transactions do not prevent the situation where two simultaneous processes run this procedure concurrently: both check whether the link exists, both find that it does not, and then both try to add it. Without a UNIQUE constraint, you end up with duplicated entries…
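Concretely, the interleaving that produces the duplicate looks like this (timestamps are illustrative, not from your logs):

```
t0  process A: SELECT ... WHERE user_id = ? AND link = ?   -- no row found
t1  process B: SELECT ... WHERE user_id = ? AND link = ?   -- no row found
t2  process A: INSERT INTO links (user_id, link) ...       -- succeeds
t3  process B: INSERT INTO links (user_id, link) ...       -- succeeds: duplicate
```

Both transactions commit cleanly, which is exactly why transactions alone do not help here.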
Transactions only guarantee that a set of queries either executes completely or, in case of any failure, takes no effect at all.
Thank you for the answer; however, this doesn't seem to solve the problem.
All the tables are InnoDB, and the process runs as a daemon, starting the next "update" only after the previous "update" has finished. Therefore it's not possible for two of them to run simultaneously…
I thought the "serializable" isolation level might prevent this misbehaviour.
I think that can only work when the data already exists: SELECT ... FOR UPDATE locks the matching rows and prevents other transactions from updating or reading them, but when there is no data, there is nothing to lock. You could lock the entire table until the end of the transaction, and that would solve your problem, but a UNIQUE constraint would be better: only one transaction ends with success, and the others fail with a constraint violation.
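In MySQL terms the fix could look roughly like this; a sketch reusing the assumed links table (and $pdo, $userId, $link) from the earlier sketch, where the ON DUPLICATE KEY UPDATE is a deliberate no-op that just swallows the duplicate:

```php
<?php
// One-time schema change: enforce uniqueness of the (user_id, link) pair.
// Run once, e.g. in a migration:
//   ALTER TABLE links ADD UNIQUE KEY uniq_user_link (user_id, link);

// Option 1: let MySQL resolve the race atomically. The no-op update
// leaves the existing row untouched when the pair already exists.
$upsert = $pdo->prepare(
    'INSERT INTO links (user_id, link) VALUES (?, ?)
     ON DUPLICATE KEY UPDATE id = id'
);
$upsert->execute([$userId, $link]);

// Option 2: insert plainly and treat a duplicate-key error as
// "another process got there first", re-throwing everything else.
try {
    $pdo->prepare('INSERT INTO links (user_id, link) VALUES (?, ?)')
        ->execute([$userId, $link]);
} catch (PDOException $e) {
    if (($e->errorInfo[1] ?? null) !== 1062) { // 1062 = ER_DUP_ENTRY
        throw $e;
    }
}
```

One caveat: if link is a long VARCHAR or TEXT column, MySQL's index-length limit may prevent a unique index over the full value, in which case a prefix index or a unique index on a stored hash of the URL is the usual workaround.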