Thread: Online & update races
Hello there, I am thinking how to solve another typical problem of online systems with combination of thin client... Imagine simple case, 2 users are going to edit 'same' datas. Both see on the 'screen' the same, after they started edit them. First one changes datas and submit changes (sucessfully). Database (set tables - inserts/updates/deleting) was changed. At this point, datas which second user is watching are not valid anymore. They are outdated and he should refresh or 'merge' changes. If he will 'submit' his datas, he can delete change of first user. These time races can occurs accross various part of complex system, so, there is no way how to 'lock' function - e.g. second user will see "wait, someone edits right now". What is the best way to solve this general problem - on DB layer as possible ? (using PlSql, triggers, ...) - anyone have experimence, or is there any project, which allready did something like this ? Database server is PgSql, as you probably expect :) I am not too much into PgSql internals but I belive, there is way how to solve these problems better than application layer... Locking records/tables is not possible at all. It is yeasy to add some counter and check version in system itselfs, but something more general (e.g. UPDATE SQL statement will return error, if record 'timestamp/changeid/...' will be mismatched, whatever) is much nicer solution. It makes 'system' itselfs simpler, which is good. Another solution is more 'organisation work flow'. I belive, combination of system (DB) and workflow is the best solution. All these time-related changes are done within one transaction. I will also appreciate any links to web resources, talking about this problem. I didn't find anything usefull around. Thank you again, Best regards, Lada 'Ray' Lostak Unreal64 Develop group http://www.orcave.com http://www.unreal64.net -------------------------------------------------------------------------- In the 1960s you needed the power of two C64s to get a rocket to the moon. Now you need a machine which is a vast number of times more powerful just to run the most popular GUI.
Is a dump and reload required when going from beta1 to beta3? Chris
I used to do it this way : Suppose you have a web form to edit data from a table... you add a field in your table which contains a version identifier for that data, then you UPDATE ... WHERE ... AND version_id = the old version id. The version_id is passed around in a session variable or in hidden form fields. The hidden form fields are better because they will prevent breakage if the same user edits the same data in different windows, or refreshes his browser window at the wrong time. Then, if the UPDATE has a rowcount of 1, you know it's OK, but if it has a rowcount of 0, you know something happens, and can check if the record still exists and its version id was modified, or if the record was deleted. A version id can be a counter, a sequence... it can also be a MD5 of the row contents for instance, its sole purpose being to detect change. Using a sequence might be the easiest. This way works but still looks like band-aid ; moreover, if you do a complex operation which modifies several tables, you have to take care of modification order, and the problem becomes more complex. It would be nice to have a framework for that kind of thing which is common in web apps. One of postgresql's good points is that it does not lock things, thanks to MVCC, unlike MySQL which locks the table on every write. This model is in the same spirit than MVCC, because it will not prevent reads to records which are being updated. However, a recurrent problem in web applications is that there is no "logout", logout can only be implemented with certainty using timeouts, so you can't use locking, because you really don't know when the locks will be released. If you use locking, some information will get locked waiting for a timeout if a user closes his browser without explicitely logging out ; besides you'd have to have a cron to log users out as a disconnected user, by definition makes no action to signal the fact that h's gone away. You could implement this by adding a version_id serial field to the relevant tables, and then an ON UPDATE trigger which would check that the version_id of the updater is the same than the version_id in the updated row, or else raise an exception. You can also have a special value to bypass checks, to be able to update in all cases, and not get stuck if you have a problem. The trigger would then increment the version_id before updating. What do you think ? > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
Hi, > Suppose you have a web form to edit data from a table... you add a field > in your table which contains a version identifier for that data, then you > UPDATE ... WHERE ... AND version_id = the old version id. The version_id > is passed around in a session variable or in hidden form fields. The > hidden form fields are better because they will prevent breakage if the > same user edits the same data in different windows, or refreshes his > browser window at the wrong time. > Then, if the UPDATE has a rowcount of 1, you know it's OK, but if it has > a rowcount of 0, you know something happens, and can check if the record > still exists and its version id was modified, or if the record was deleted. > A version id can be a counter, a sequence... it can also be a MD5 of the > row contents for instance, its sole purpose being to detect change. Using > a sequence might be the easiest. > This way works but still looks like band-aid ; moreover, if you do a > complex operation which modifies several tables, you have to take care of > modification order, and the problem becomes more complex. > It would be nice to have a framework for that kind of thing which is > common in web apps. > One of postgresql's good points is that it does not lock things, thanks > to MVCC, unlike MySQL which locks the table on every write. This model is > in the same spirit than MVCC, because it will not prevent reads to records > which are being updated. > However, a recurrent problem in web applications is that there is no > "logout", logout can only be implemented with certainty using timeouts, so > you can't use locking, because you really don't know when the locks will > be released. If you use locking, some information will get locked waiting > for a timeout if a user closes his browser without explicitely logging out > ; besides you'd have to have a cron to log users out as a disconnected > user, by definition makes no action to signal the fact that h's gone away. > You could implement this by adding a version_id serial field to the > relevant tables, and then an ON UPDATE trigger which would check that the > version_id of the updater is the same than the version_id in the updated > row, or else raise an exception. You can also have a special value to > bypass checks, to be able to update in all cases, and not get stuck if you > have a problem. The trigger would then increment the version_id before > updating. Thank you for reply. I think this is basically the 'only' way how to solve the problem. Have some 'row changes count'. This scheme is easy implementable for smaller systems. But as you wrote above, if you prepare 'user datas' for various tables, joins, whatever, it is very hard to take care of this 'modified serial'. I would like to move this way to 'database' itselfs somehow. I was thinking abotu something like this... But I don't know PgSql internals, so, I don't know if it is possible... I suppose DB engine have to 'hold' some kind of row version' (chnages count, timestamp, combination, whatever) - because it should be needed while transactions. If DB engine can 'collect' within one transaction these ID's which were used while selecting datas, I can keep then in 'POST' data and verify (while transaction) if they are still valid. I have no clue if this is possible with PlSQL or another server-side scripting language or some improvement of PqSql is needed. And if improvement is needed, if it is possible. And if it is possible, if some PG developer will want to do that - and if they will like this kind of improvement. I am open to pay for solving this problem, because the system we are doing will be used commerically. And I think, this is general problem, which every bigger system have to solve. Sooner or later. I fully agree with you, that locking is not a way. But is there any other 'more automatized' way than take care of row versions by ourself ? But before contacting developers, I would like to ask other about opinions... Have a nice day, Best regards, Lada 'Ray' Lostak Unreal64 Develop group http://www.orcave.com http://www.unreal64.net -------------------------------------------------------------------------- In the 1960s you needed the power of two C64s to get a rocket to the moon. Now you need a machine which is a vast number of times more powerful just to run the most popular GUI.
Lada 'Ray' Lostak wrote: > I will also appreciate any links to web resources, talking about this > problem. I didn't find anything usefull around. I'm working with developing a fairly big warehouse management system, and there we see this problem every day. We've settled (many years ago) for the following: loop begin select * from my_table where status = :status (read into record structure) update my_table set a=1, b=2 where and status = :status and another_status = :status2 and lates_updater =:record.latest_updater and latest_date = :record.latest_date and latest_time = :record.latest_time commit exit loop exception when transaction_conflict | no_such_row => rollback delay small time end loop; transaction_conflict | no_such_row should be interpreted as no row matched. The flaw this design has, is that you can't separate 'real transaction conflicts' with situation where other conditions failed, as no row had another_status = status2 However, we find the design good enough to keep using it. It has the advantage that no table is ever looked, we don't use select for update at all. -- /Björn ------------------------------------------------------------------- http://lundin.homelinux.net Registered Linux User No. 267342 <http://counter.li.org>