Thread: Online & update races

Online & update races

From

"Lada 'Ray' Lostak"

Date:

18 October 2004, 16:25:59

Hello there,

    I am thinking how to solve another typical problem of online systems
with combination of thin client... Imagine simple case, 2 users are going to
edit 'same' datas. Both see on the 'screen' the same, after they started
edit them. First one changes datas and submit changes (sucessfully).
Database (set tables - inserts/updates/deleting) was changed. At this point,
datas which second user is watching are not valid anymore. They are outdated
and he should refresh or 'merge' changes. If he will 'submit' his datas, he
can delete change of first user. These time races can occurs accross various
part of complex system, so, there is no way how to 'lock' function - e.g.
second user will see "wait, someone edits right now".

    What is the best way to solve this general problem - on DB layer as
possible ? (using PlSql, triggers, ...) - anyone have experimence, or is
there any project, which allready did something like this ? Database server
is PgSql, as you probably expect :) I am not too much into PgSql internals
but I belive, there is way how to solve these problems better than
application layer...

    Locking records/tables is not possible at all. It is yeasy to add some
counter and check version in system itselfs, but something more general
(e.g. UPDATE SQL statement will return error, if record
'timestamp/changeid/...' will be mismatched, whatever) is much nicer
solution. It makes 'system' itselfs simpler, which is good.

    Another solution is more 'organisation work flow'. I belive, combination
of system (DB) and workflow is the best solution.

    All these time-related changes are done within one transaction.

    I will also appreciate any links to web resources, talking about this
problem. I didn't find anything usefull around.

Thank you again,

Best regards,
Lada 'Ray' Lostak
Unreal64 Develop group
http://www.orcave.com
http://www.unreal64.net


--------------------------------------------------------------------------
In the 1960s you needed the power of two C64s to get a rocket
to the moon. Now you need a machine which is a vast number
of times more powerful just to run the most popular GUI.

Dump requried for beta1 >> beta3?

From

"Chris Ochs"

Date:

18 October 2004, 17:31:15

Is a dump and reload required when going from beta1 to beta3?

Chris

Re: Online & update races

From

Pierre-Frédéric Caillaud

Date:

18 October 2004, 17:44:43

    I used to do it this way :

    Suppose you have a web form to edit data from a table... you add a field
in your table which contains a version identifier for that data, then you
UPDATE ... WHERE ... AND version_id = the old version id. The version_id
is passed around in a session variable or in hidden form fields. The
hidden form fields are better because they will prevent breakage if the
same user edits the same data in different windows, or refreshes his
browser window at the wrong time.
    Then, if the UPDATE has a rowcount of 1, you know it's OK, but if it has
a rowcount of 0, you know something happens, and can check if the record
still exists and its version id was modified, or if the record was deleted.
    A version id can be a counter, a sequence... it can also be a MD5 of the
row contents for instance, its sole purpose being to detect change. Using
a sequence might be the easiest.
    This way works but still looks like band-aid ; moreover, if you do a
complex operation which modifies several tables, you have to take care of
modification order, and the problem becomes more complex.
    It would be nice to have a framework for that kind of thing which is
common in web apps.
    One of postgresql's good points is that it does not lock things, thanks
to MVCC, unlike MySQL which locks the table on every write. This model is
in the same spirit than MVCC, because it will not prevent reads to records
which are being updated.
    However, a recurrent problem in web applications is that there is no
"logout", logout can only be implemented with certainty using timeouts, so
you can't use locking, because you really don't know when the locks will
be released. If you use locking, some information will get locked waiting
for a timeout if a user closes his browser without explicitely logging out
; besides you'd have to have a cron to log users out as a disconnected
user, by definition makes no action to signal the fact that h's gone away.
    You could implement this by adding a version_id serial field to the
relevant tables, and then an ON UPDATE trigger which would check that the
version_id of the updater is the same than the version_id in the updated
row, or else raise an exception. You can also have a special value to
bypass checks, to be able to update in all cases, and not get stuck if you
have a problem. The trigger would then increment the version_id before
updating.

    What do you think ?

>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

Re: Online & update races

From

"Lada 'Ray' Lostak"

Date:

19 October 2004, 12:50:04

Hi,

> Suppose you have a web form to edit data from a table... you add a field
> in your table which contains a version identifier for that data, then you
> UPDATE ... WHERE ... AND version_id = the old version id. The version_id
> is passed around in a session variable or in hidden form fields. The
> hidden form fields are better because they will prevent breakage if the
> same user edits the same data in different windows, or refreshes his
> browser window at the wrong time.
> Then, if the UPDATE has a rowcount of 1, you know it's OK, but if it has
> a rowcount of 0, you know something happens, and can check if the record
> still exists and its version id was modified, or if the record was
deleted.
> A version id can be a counter, a sequence... it can also be a MD5 of the
> row contents for instance, its sole purpose being to detect change. Using
> a sequence might be the easiest.
> This way works but still looks like band-aid ; moreover, if you do a
> complex operation which modifies several tables, you have to take care of
> modification order, and the problem becomes more complex.
> It would be nice to have a framework for that kind of thing which is
> common in web apps.
> One of postgresql's good points is that it does not lock things, thanks
> to MVCC, unlike MySQL which locks the table on every write. This model is
> in the same spirit than MVCC, because it will not prevent reads to records
> which are being updated.
> However, a recurrent problem in web applications is that there is no
> "logout", logout can only be implemented with certainty using timeouts, so
> you can't use locking, because you really don't know when the locks will
> be released. If you use locking, some information will get locked waiting
> for a timeout if a user closes his browser without explicitely logging out
> ; besides you'd have to have a cron to log users out as a disconnected
> user, by definition makes no action to signal the fact that h's gone away.
> You could implement this by adding a version_id serial field to the
> relevant tables, and then an ON UPDATE trigger which would check that the
> version_id of the updater is the same than the version_id in the updated
> row, or else raise an exception. You can also have a special value to
> bypass checks, to be able to update in all cases, and not get stuck if you
> have a problem. The trigger would then increment the version_id before
> updating.

Thank you for reply. I think this is basically the 'only' way how to solve
the problem. Have some 'row changes count'. This scheme is easy
implementable for smaller systems. But as you wrote above, if you prepare
'user datas' for various tables, joins, whatever, it is very hard to take
care of this 'modified serial'. I would like to move this way to 'database'
itselfs somehow.

I was thinking abotu something like this... But I don't know PgSql
internals, so, I don't know if it is possible...

I suppose DB engine have to 'hold' some kind of row version' (chnages count,
timestamp, combination, whatever) - because it should be  needed while
transactions. If DB engine can 'collect' within one transaction these ID's
which were used while selecting datas, I can keep then in 'POST' data and
verify (while transaction) if they are still valid. I have no clue if this
is possible with PlSQL or another server-side scripting language or some
improvement of PqSql is needed. And if improvement is needed, if it is
possible. And if it is possible, if some PG developer will want to do that -
and if they will like this kind of improvement. I am open to pay for solving
this problem, because the system we are doing will be used commerically. And
I think, this is general problem, which every bigger system have to solve.
Sooner or later.

I fully agree with you, that locking is not a way. But is there any other
'more automatized' way than take care of row versions by ourself ?

But before contacting developers, I would like to ask other about
opinions...

Have a nice day,
Best regards,
Lada 'Ray' Lostak
Unreal64 Develop group
http://www.orcave.com
http://www.unreal64.net


--------------------------------------------------------------------------
In the 1960s you needed the power of two C64s to get a rocket
to the moon. Now you need a machine which is a vast number
of times more powerful just to run the most popular GUI.

Re: Online & update races

From

Björn Lundin

Date:

19 October 2004, 22:29:03

Lada 'Ray' Lostak wrote:
>     I will also appreciate any links to web resources, talking about this
> problem. I didn't find anything usefull around.

I'm working with developing a fairly big warehouse management system, and
there we see this problem every day. We've settled (many years ago) for the
following:

loop
  begin
    select * from my_table where status = :status
    (read into record structure)

    update my_table set a=1, b=2 where  and status = :status and
another_status = :status2 and lates_updater =:record.latest_updater and
latest_date = :record.latest_date and latest_time = :record.latest_time

  commit
  exit loop
  exception
    when transaction_conflict | no_such_row =>
      rollback
      delay small time
end loop;

transaction_conflict | no_such_row should be interpreted as no row matched.
The flaw this design has, is that you can't separate 'real transaction
conflicts' with situation where other conditions failed, as no row had
another_status = status2

However, we find the design good enough to keep using it. It has the
advantage that no table is ever looked, we don't use select for update at
all.

--
/Björn
-------------------------------------------------------------------
http://lundin.homelinux.net
Registered Linux User No. 267342 <http://counter.li.org>