Thread: serialization errors

From: Ryan VanderBijl

Hello,

Let us suppose I have a table like this:
    create sequence seq_foo;
    create table foo (
        bar  int primary key default nextval('seq_foo'),
        name varchar(32),
        unique(name)
    );

I need to have multiple clients update this table concurrently.

Below are two "scripts", one for client A, one for client B:

Time:  Client A
1.     BEGIN;
2.     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
3.     SELECT * FROM foo;  -- each gets its own db 'copy'/version
4.     INSERT INTO foo(name) VALUES('test');
5.     [no-op]
6.     COMMIT;
7.     [no-op]

Time:  Client B
1.     BEGIN;
2.     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
3.     SELECT * FROM foo;  -- each gets its own db 'copy'/version
4.     [no-op]
5.     INSERT INTO foo(name) VALUES('test');
6.     [paused, waiting for insert to complete]
7.     ERROR:  Cannot insert a duplicate key into unique index foo_name_key

The documentation about concurrency control / serializable isolation level
indicates that if there is a concurrent conflicting update, then I would
receive the following error:
    ERROR:  Can't serialize access due to concurrent update
( The documentation is found here:
    http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/mvcc.html )

However, as the scripts above demonstrate, one can receive other errors.

So, assuming I misunderstand the documentation, and that multiple errors can
legitimately occur, how can I detect whether a given error is because of a
concurrent update?

There are many reasons that the INSERT statement could fail (e.g. syntax,
database connection dropped, conflicting concurrent update, unique constraint).

In serializable mode, I am supposed to put the database update code inside
a loop.  There are two break conditions for this loop:
   the update code succeeded  or
   a database error not caused by a concurrent update occurred
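
For concreteness, here is a minimal sketch of such a loop.  (Python with
the psycopg2 driver; the driver and the numeric SQLSTATE code for
serialization failures are illustrative assumptions on my part, not
something taken from the documentation above.)

    import psycopg2
    import psycopg2.extensions

    def run_with_retry(dsn, work, max_retries=3):
        """Run work(cursor) in a SERIALIZABLE transaction, retrying
        only on serialization failures."""
        conn = psycopg2.connect(dsn)
        conn.set_isolation_level(
            psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)
        try:
            for _ in range(max_retries):
                try:
                    work(conn.cursor())      # the actual update code
                    conn.commit()
                    return                   # break condition 1: success
                except psycopg2.Error as e:
                    conn.rollback()
                    if e.pgcode != "40001":  # serialization failure?
                        raise                # break condition 2: give up
            raise RuntimeError("gave up after %d serialization failures"
                               % max_retries)
        finally:
            conn.close()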

When the error is 'Can't serialize' it is trivial to know to retry the loop.

If I receive the legitimate "duplicate key" error, how am I supposed to
detect whether that error is due to a concurrent update, or to some
other problem elsewhere?

If I receive a different error, how can I tell if I should retry or
pass the error on to the user?

Thanks for any help!

Ryan
--
Ryan VanderBijl                              rvbijl@vanderbijlfamily.com

Re: serialization errors

From: Stephan Szabo

On Tue, 28 Jan 2003, Ryan VanderBijl wrote:

> Let us suppose I have a table like this:
>     create sequence seq_foo;
>     create table foo (
>         bar  int primary key default nextval('seq_foo'),
>         name varchar(32),
>         unique(name)
>     );
>
> I need to have multiple clients update this table concurrently.
>
> Below are two "scripts", one for client A, one for client B:
>
> Time:  Client A
> 1.     BEGIN;
> 2.     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> 3.     SELECT * FROM foo;  -- each gets its own db 'copy'/version
> 4.     INSERT INTO foo(name) VALUES('test');
> 5.     [no-op]
> 6.     COMMIT;
> 7.     [no-op]
>
> Time:  Client B
> 1.     BEGIN;
> 2.     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> 3.     SELECT * FROM foo;  -- each gets its own db 'copy'/version
> 4.     [no-op]
> 5.     INSERT INTO foo(name) VALUES('test');
> 6.     [paused, waiting for insert to complete]
> 7.     ERROR:  Cannot insert a duplicate key into unique index foo_name_key
>
> The documentation about concurrency control / serializable isolation level
> indicates that if there is a concurrent conflicting update, then I would
> receive the following error:
>     ERROR:  Can't serialize access due to concurrent update
> ( The documentation is found here:
>     http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/mvcc.html )
>
> However, as the scripts above demonstrate, one can receive other errors.
>
> So, assuming I misunderstand the documentation, and that multiple errors can
> legitimately occur, how can I detect whether a given error is because of a
> concurrent update?

Note that the above isn't an update.  It's not a case of finding a target
row that's been concurrently modified and committed (as described by the
paragraph in that document starting with "UPDATE, DELETE, and SELECT FOR
UPDATE"), it's a case of violating a unique constraint.  If the constraint
wasn't there, there wouldn't be an error in doing the above.
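
To see the difference, compare what happens on a genuine concurrent
update of the same row.  (A sketch in Python; the psycopg2 driver and
the "dbname=test" connection string are assumptions for illustration
only.)

    # Two sessions, both SERIALIZABLE, updating the same row of foo.
    # Assumes the foo table from the top of the thread, with a row
    # whose bar = 1 already committed.
    import psycopg2
    import psycopg2.extensions

    a = psycopg2.connect("dbname=test")
    b = psycopg2.connect("dbname=test")
    for conn in (a, b):
        conn.set_isolation_level(
            psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)

    ca, cb = a.cursor(), b.cursor()
    cb.execute("SELECT * FROM foo")  # B takes its snapshot first
    ca.execute("UPDATE foo SET name = 'from A' WHERE bar = 1")
    a.commit()                       # A commits after B's snapshot

    # B now modifies the row A just changed: this raises the
    # "Can't serialize access due to concurrent update" error,
    # because a target row was concurrently modified -- unlike the
    # INSERT case above, which can only hit the unique constraint.
    cb.execute("UPDATE foo SET name = 'from B' WHERE bar = 1")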



Re: serialization errors

From: Tom Lane

Ryan VanderBijl <rvbijl-pgsql@vanderbijlfamily.com> writes:
> There are many reasons that the INSERT statement could fail (e.g. syntax,
> database connection dropped, conflicting concurrent update, unique constraint).

An INSERT statement can *never* fail because of a concurrent conflicting
update, because it isn't an update.  Any INSERT is, logically speaking,
creating a unique new entity (row) in the database; there is no conflict
against other insertions, unless by way of violation of a unique constraint.

> If I receive the legitimate "duplicate key" error, how am I supposed to
> detect whether that error is due to a concurrent update, or to some
> other problem elsewhere?

What difference does it make if the other guy got there ten microseconds
or ten years earlier?  He inserted before you did.  Whether it's
"concurrent" or not shouldn't matter that I can see.  Perhaps more to
the point, there is no reason to expect that a duplicate-key failure
will succeed if you retry the same insertion.

            regards, tom lane

Re: serialization errors

From: Ryan VanderBijl

Hello,

On Tue, Jan 28, 2003 at 12:47:20PM -0500, Tom Lane wrote:
> > If I receive the legitimate "duplicate key" error, how am I supposed to
> > detect whether that error is due to a concurrent update, or to some
> > other problem elsewhere?
>
> What difference does it make if the other guy got there ten microseconds
> or ten years earlier?  He inserted before you did.  Whether it's
> "concurrent" or not shouldn't matter that I can see.  Perhaps more to
> the point, there is no reason to expect that a duplicate-key failure
> will succeed if you retry the same insertion.

My application checks to make sure that the operation it is about to
perform wouldn't violate the constraints.  (This is done right after
the transaction is started.)

If I get a "duplicate record" error, I want to know if it's because
of recently committed data, or because something else messed up.

Let me give a more concrete example, closer to what I'm doing:
   create sequence seq_tree_node;
   create table tree_node (
       node_id    int primary key default nextval('seq_tree_node'),
       parent_id  int references tree_node(node_id),
       node_order int not null,
       unique(parent_id, node_order)
   );

For adding a new node, I basically do this:
   insert into tree_node(parent_id, node_order)
   values(1, (select max(node_order)+1 from tree_node where parent_id = 1) );

Now, if I have two clients which do this simultaneously, then one will get a
duplicate record error, and I know to simply re-run the query.

However, with more complicated functionality (e.g. changing the tree
structure / order), I need to know why I received the error:
    Did my version of the data get corrupted such that I'm running
        completely invalid queries? (end result: force-quit)
        (this also covers any bugs I may have introduced)
    Did someone else update the database at the same time?
       (end result: if the changes are compatible, re-run with the new
       data; otherwise tell the user the operation was aborted because
       of what someone else did)

At the very least I would suggest adding a note to the manual, section 9.2.2
(serializable isolation level).  The note would mention that "serialization"
errors are not generated by inserts that violate unique constraints.

Thanks!

Ryan

--
Ryan VanderBijl                              rvbijl@vanderbijlfamily.com

Re: serialization errors

From: Tom Lane

Ryan VanderBijl <rvbijl@vanderbijlfamily.com> writes:
> For adding a new node, I basically do this:
>    insert into tree_node(parent_id, node_order)
>    values(1, (select max(node_order)+1 from tree_node where parent_id = 1) );

That "select max()+1" makes me itch ;-) ... that's as perfect a recipe
for concurrency problems as I can imagine.

At first glance it seems that all this is doing is assigning sequential
node_order values to the children of any particular parent.  Is it
really necessary that those node_order values be consecutive?  If they
only need to be strictly increasing, you could use a sequence to
generate them.  (Actually, I think you could dispense with node_order
entirely, and use the node_id field instead.)
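
For example, a sketch of the sequence approach (the sequence name and
the Python driver are illustrative assumptions, not a tested
recommendation):

    # nextval() never hands the same value to two sessions, so there
    # is no select max()+1 race.  The values are strictly increasing
    # but not gap-free, since nextval() is not rolled back on abort.
    import psycopg2

    conn = psycopg2.connect("dbname=test")
    cur = conn.cursor()
    cur.execute("CREATE SEQUENCE seq_node_order")
    cur.execute("INSERT INTO tree_node (parent_id, node_order)"
                " VALUES (%s, nextval('seq_node_order'))", (1,))
    conn.commit()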

In any case, I'd suggest some careful thought about what your data
structure really is, and how you could avoid creating serialization
bottlenecks like this one.

            regards, tom lane

Re: serialization errors

From: Ryan VanderBijl

On Wed, Jan 29, 2003 at 12:59:10AM -0500, Tom Lane wrote:
> Ryan VanderBijl <rvbijl@vanderbijlfamily.com> writes:
> > For adding a new node, I basically do this:
> >    insert into tree_node(parent_id, node_order)
> >    values(1, (select max(node_order)+1 from tree_node where parent_id = 1) );
>
> That "select max()+1" makes me itch ;-) ... that's as perfect a recipe
> for concurrency problems as I can imagine.
>
> At first glance it seems that all this is doing is assigning sequential
> node_order values to the children of any particular parent.  Is it
> really necessary that those node_order values be consecutive?  If they
> only need to be strictly increasing, you could use a sequence to
> generate them.  (Actually, I think you could dispense with node_order
> entirely, and use the node_id field instead.)
>
> In any case, I'd suggest some careful thought about what your data
> structure really is, and how you could avoid creating serialization
> bottlenecks like this one.

I'm open to suggestions of a better way to store tree structures in a
database ...

My users really do need to be able to manage the order of the children.
For example, if the tree looks like:
Box 1
   - document A
   - document B
   - document C
Box 2
   - thing Z
   - thing Y
   - thing X

I need to be able to drag 'thing Y' and place it between document B and
document C. (And, at the same time, I need to let separate people add
items under both boxes.) (And within several seconds, all my other
clients need to see the same change!)

Thus I need to keep track of the node_order field.  I've thought
about using non-adjacent numbers (e.g. 10, 20), and then, when dropping
in between, filling in the gaps.  However, I'd have to worry about what
happens when there are no available numbers in between, and it just seems
that if I always keep them adjacent, life will be easier later on.
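
For what it's worth, the renumbering fallback I'm worried about would
look something like this (hypothetical helpers, sketched in Python
against a psycopg2-style cursor; nothing here is code I actually run):

    def renumber_siblings(cur, parent_id, stride=10):
        # Shift everything up first so the non-deferrable unique index
        # on (parent_id, node_order) never sees a transient duplicate;
        # assumes real node_order values stay below 1000000.
        cur.execute("UPDATE tree_node SET node_order = node_order + 1000000"
                    " WHERE parent_id = %s", (parent_id,))
        cur.execute("SELECT node_id FROM tree_node"
                    " WHERE parent_id = %s ORDER BY node_order", (parent_id,))
        for i, (node_id,) in enumerate(cur.fetchall()):
            cur.execute("UPDATE tree_node SET node_order = %s"
                        " WHERE node_id = %s", ((i + 1) * stride, node_id))

    def move_between(cur, node_id, parent_id, lo_order, hi_order):
        # Place node_id between two siblings; returns False when the
        # caller must re-read the neighbouring orders and try again.
        if hi_order - lo_order < 2:      # gap exhausted: spread out
            renumber_siblings(cur, parent_id)
            return False
        cur.execute("UPDATE tree_node SET parent_id = %s, node_order = %s"
                    " WHERE node_id = %s",
                    (parent_id, (lo_order + hi_order) // 2, node_id))
        return True

That renumbering pass is exactly the kind of extra bookkeeping I'd
rather avoid.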


However, back to the serialization issue ... I'm going to take one (er,
actually there are two arguments) more stab at convincing you a unique
constraint error should be flagged as a serialization error (a "serializable
error: unique constraint violated"?)

1.  According to the docs, a serializable transaction is "as if
transactions had been executed one after another, serially, rather
than concurrently."

Thus any problem caused by concurrently committed data is an error
in trying to serialize the transactions.


2. This change shouldn't [love that word] affect anyone.  There will
already be a loop for retrying the transaction [assumption].
If the change is made, existing programs will receive a "serializable
error" instead of the "unique constraint violated" error, retry the
transaction, and *then* get the unique constraint violated error.

What's the benefit of changing it: it makes serializable transactions
simpler to think about and simpler to work with.  At the very least, it
removes one more special case that programmers need to deal with when
doing serializable transactions.


Also, in case I hadn't mentioned it before: at the very least, the web
documentation on serializable transactions should be updated to
mention that unique constraint violations from concurrent updates
are NOT reported as serializable errors. [note: I can't recall
if this applies to both inserts and updates or just to inserts]


Thanks for your time!

Ryan

ps. I've had some problems getting this message out. Sorry if I
    have sent duplicates.
--
Ryan VanderBijl                              rvbijl@vanderbijlfamily.com


Re: serialization errors

From: Alan Gutierrez

On Tuesday 28 January 2003 23:59, Tom Lane wrote:
> Ryan VanderBijl <rvbijl@vanderbijlfamily.com> writes:
> > For adding a new node, I basically do this:
> >    insert into tree_node(parent_id, node_order)
> >    values(1, (select max(node_order)+1 from tree_node where parent_id =
> > 1) );

> That "select max()+1" makes me itch ;-) ... that's as perfect a recipe
> for concurrency problems as I can imagine.

Concurrent transactions will all get the same value from
select max(node_order) + 1.  The first transaction to insert the value
wins; any other concurrent transaction will abort with a duplicate key
error.

Do this instead:

insert into tree_node (parent_id, node_order)
values (1, 0);

update tree_node
   set node_order =
        (select max(tn1.node_order) + 1
           from tree_node as tn1
          where tn1.parent_id = tree_node.parent_id)
 where node_id = currval('seq_tree_node');  -- only the row just inserted

> At first glance it seems that all this is doing is assigning sequential
> node_order values to the children of any particular parent.  Is it
> really necessary that those node_order values be consecutive?  If they
> only need to be strictly increasing, you could use a sequence to
> generate them.  (Actually, I think you could dispense with node_order
> entirely, and use the node_id field instead.)
>
> In any case, I'd suggest some careful thought about what your data
> structure really is, and how you could avoid creating serialization
> bottlenecks like this one.

In my application, I use select max to generate a sequential value for
a concatenated key such as:

create table order_item (
    firm_id integer not null references firm,
    patient_id integer not null,
    prescription_id integer not null,
  ... -- script data
    primary key (firm_id, patient_id, prescription_id),
    foreign key (firm_id, patient_id) references patient
);


Creating a prescription id by select max + 1 in this case does not cause
a bottleneck, since it will only block other transactions that wish to
insert a prescription for this particular patient.  Not common in my
application.
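
Concretely, such an insert might look like this (a sketch; the Python
driver, connection string, and sample values are illustrative
assumptions, and the elided script-data columns are assumed to have
defaults):

    import psycopg2

    conn = psycopg2.connect("dbname=pharmacy")
    cur = conn.cursor()
    firm_id, patient_id = 1, 42   # hypothetical values
    cur.execute(
        "INSERT INTO order_item (firm_id, patient_id, prescription_id)"
        " VALUES (%s, %s,"
        "         (SELECT coalesce(max(prescription_id), 0) + 1"
        "            FROM order_item"
        "           WHERE firm_id = %s AND patient_id = %s))",
        (firm_id, patient_id, firm_id, patient_id))
    conn.commit()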

If you are going to be inserting into trees frequently, you are more
likely to have a bottleneck, especially if different processes want to
insert into the same tree.

Alan Gutierrez - ajglist@izzy.net
http://khtml-win32.sourceforge.net/ - KHTML on Windows


Re: serialization errors

From: Alan Gutierrez

On Wednesday 29 January 2003 22:39, Ryan VanderBijl wrote:
> On Wed, Jan 29, 2003 at 12:59:10AM -0500, Tom Lane wrote:
> > Ryan VanderBijl <rvbijl@vanderbijlfamily.com> writes:
> > > For adding a new node, I basically do this:
> > >    insert into tree_node(parent_id, node_order)
> > >    values(1, (select max(node_order)+1 from tree_node where parent_id =
> > > 1) );
> >
> > That "select max()+1" makes me itch ;-) ... that's as perfect a recipe
> > for concurrency problems as I can imagine.
> >
> > At first glance it seems that all this is doing is assigning sequential
> > node_order values to the children of any particular parent.  Is it
> > really necessary that those node_order values be consecutive?  If they
> > only need to be strictly increasing, you could use a sequence to
> > generate them.  (Actually, I think you could dispense with node_order
> > entirely, and use the node_id field instead.)
> >
> > In any case, I'd suggest some careful thought about what your data
> > structure really is, and how you could avoid creating serialization
> > bottlenecks like this one.

> I'm open to suggestions of a better way to store tree structures in a
> database ...

Not a better way, per se, but a different way:
http://searchdatabase.techtarget.com/tip/1,289483,sid13_gci537290,00.html

If you only have two levels, then the adjacency list model will work very
well. It works well when a maximum depth can be specified.

The order field might be better expressed as a non-primary key column.

> However, back to the serialization issue ... I'm going to take one [snip]
> more stab at convincing you a unique constraint error should be flagged
> as a serialization error (a "serializable error: unique constraint
> violated"?)

No, it shouldn't. You have attempted to insert duplicate primary keys.
When you use select max with concurrent transactions, both transactions
will receive the same value for select max. The second one to insert
will be inserting a duplicate primary key.

PostgreSQL should say "unique constraint violated", which it does.

Alan Gutierrez - ajglist@izzy.net
http://khtml-win32.sourceforge.net/ - KHTML on Windows


Re: serialization errors

From: Ryan VanderBijl

> > I'm open to suggestions of a better way to store tree structures in a
> > database ...
>
> Not a better way, per se, but a different way:
> http://searchdatabase.techtarget.com/tip/1,289483,sid13_gci537290,00.html
>
> If you only have two levels, then the adjacency list model will work very
> well. It works well when a maximum depth can be specified.
>
> The order field might be better expressed as a non-primary key column.

That article looks interesting ... I'll have to take a look at it in more
detail.

BTW, the node_id is the primary key, and the node_order is a unique
constraint.

Also, BTW, the insert command looks more like:

INSERT INTO tree_node(parent_id, node_name, node_order)
VALUES(
    1,
    "document",
    (SELECT COALESCE(MAX(node_order),0)+1 FROM tree_node WHERE parent_id = 1)
);


>
> > However, back to the serialization issue ... I'm going to take one [snip]
> > more stab at convincing you a unique constraint error should be flagged
> > as a serialization error (a "serializable error: unique constraint
> > violated"?)
>
> No, it shouldn't. You have attempted to insert duplicate primary keys.
> When you use select max with concurrent transactions, both transactions
> will receive the same value for select max. The second one to insert
> will be inserting a duplicate primary key.
>
> PostgreSQL should say "unique constraint violated", which it does.

I guess I'm starting to sound like a broken record here, but I'm struggling
to understand why it should say unique constraint violated instead of serial.

   BEGIN;
   i = SELECT MAX(node_order)+1 FROM tree_node WHERE parent_id = 1;
   INSERT INTO tree_node(parent_id, node_name, node_order)
   VALUES(1, 'name', i);

Why does it make more sense to get a unique constraint violation at this point?
As far as that transaction is concerned, this insert is completely and perfectly
correct.  When I use a serializable transaction, I would expect all queries
to be internally consistent.

Now, if because of a concurrently committed transaction this would
violate a unique constraint, and the database can't serialize the transaction,
then I should get a serialization error.  Then I try again, and one of two
things happens:
  1. I'm not smart enough to requery the max id, in which case on try two
     I get the unique constraint violation error.
  2. I requery the max id, and I get no unique constraint violation at all.


The other benefit of having this return a "serializable error", which
I neglected to mention last time, is that the user doesn't strictly
have to put a retry limit in the loop.  Currently, if I receive a unique
constraint violation error, there are two reasons it may have happened:
   a) someone else committed an entry
   b) the query I'm attempting is wrong

So, the way it is currently, I have to special-case the unique constraint
violation.  If I receive that error consistently, I don't know if it is
simply because of extremely high activity, or if I have a bug and am
executing a stupid query.

Anyway, now that I know that I can get a unique constraint violation error
in addition to the serializable error, I've added the special-case code, and
am (un?)happily retrying my queries, with a retry limit.
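
Roughly, the special-case code amounts to something like this (a
sketch; the numeric error codes are modern SQLSTATE values, an
assumption on my part rather than anything available in 7.3):

    import psycopg2

    RETRYABLE = ("40001",   # serialization failure
                 "23505")   # unique constraint violation

    def insert_node(conn, parent_id, name, max_retries=5):
        # Retry on either error, but only a bounded number of times:
        # a unique violation may equally well mean the query is wrong.
        for _ in range(max_retries):
            try:
                cur = conn.cursor()
                cur.execute(
                    "INSERT INTO tree_node (parent_id, node_name, node_order)"
                    " VALUES (%s, %s,"
                    "         (SELECT coalesce(max(node_order), 0) + 1"
                    "            FROM tree_node WHERE parent_id = %s))",
                    (parent_id, name, parent_id))
                conn.commit()
                return
            except psycopg2.Error as e:
                conn.rollback()
                if e.pgcode not in RETRYABLE:
                    raise
        raise RuntimeError("still failing after %d attempts" % max_retries)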

Thanks!

Ryan
--
Ryan VanderBijl                              rvbijl@vanderbijlfamily.com

Re: serialization errors

From: Greg Copeland

On Thu, 2003-01-30 at 13:00, Ryan VanderBijl wrote:
> I guess I'm starting to sound like a broken record here, but I'm struggling
> to understand why it should say unique constraint violated instead of serial.
>

Because the "select max(node_order)+1" will select the identical value
in multiple sessions.  Done concurrently, it results in a unique
constraint violation on your insert, even if the inserts are serialized.

--------------------
session 1; parallel:
    select max(node_order)+1 = 5

session 2; parallel:
    select max(node_order)+1 = 5

--------------------
session 1, session 2; serialized:
    insert into x ( node_order ) values ( 5 )
    insert into x ( node_order ) values ( 5 )

As you can see from this fast and crude example, serializing the inserts
is not going to protect you from attempting to insert the same value
into your table twice.

That help clarify it?  Then again, perhaps I misunderstood what you're
trying to do.... ;)


Regards,


--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting


Re: serialization errors

From: Stephan Szabo

On 30 Jan 2003, Greg Copeland wrote:

> On Thu, 2003-01-30 at 13:00, Ryan VanderBijl wrote:
> > I guess I'm starting to sound like a broken record here, but I'm struggling
> > to understand why it should say unique constraint violated instead of serial.
>
> Because the "select max(node_order)+1" will select the identical value
> in multiple sessions.  Done concurrently, it results in a unique
> constraint violation on your insert, even if the inserts are serialized.

I think his argument is that since the two transactions (as a whole)
should be serialized, he shouldn't get the same max(node_order) in both
since in either order of serialization of the two transactions you can't
get 5 from both selects (one should return 6).

The problem with this is that it's probably pretty unimplementable, since
it would mean forcing a serialization error whenever one serializable
transaction modifies a table in a way that would have changed a past select
of another serializable transaction whose own modifications would have
changed a past select of the first.

So:
T1: select * from a where a=3;
T2: select * from b where b=4;
T1: insert into b(b) values (4);
T2: insert into a(a) values (3);
would be a serialization error (each insert would have changed the other
transaction's past select, so neither serial order works); however:
T1: select * from a where a=3 and b>4;
T2: select * from b where b=4;
T1: insert into b(b) values (4);
T2: insert into a(a,b) values (3,2);
is not (running T2 first and then T1 is a valid serial order, since the
row T2 inserts does not satisfy T1's predicate).


Re: serialization errors

From: Stephan Szabo

On Fri, 31 Jan 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > The problem with this is that it's probably pretty unimplementable,
>
> Yeah.  This shows the difference between what we do and true
> serialization.  An academic would tell you that true serialization
> requires predicate locking --- that is, as soon as transaction T1 has
> done a "SELECT ... WHERE foo", then concurrent transaction T2 must wait
> for T1 if it tries to insert *or remove* a row that would satisfy foo.

Right, or you need to do some kind of thing where, instead of locking, you
keep track of what predicate violations have occurred and prevent cycles
by killing a transaction in the cycle with a serialization error
(basically the optimistic locking version of the above, I'd guess).  The
enforcement would be a nightmare in any case if you wanted perfect results
(no false positives), especially given functions and custom aggregates,
where you might have to re-run the queries after data modifications and
see if the results have changed, since all you get is a black box.


Re: serialization errors

From: Stephan Szabo

On 31 Jan 2003, Greg Copeland wrote:

> On Fri, 2003-01-31 at 00:40, Stephan Szabo wrote:
> > On 30 Jan 2003, Greg Copeland wrote:
> >
> > > On Thu, 2003-01-30 at 13:00, Ryan VanderBijl wrote:
> > > > I guess I'm starting to sound like a broken record here, but I'm struggling
> > > > to understand why it should say unique constraint violated instead of serial.
> > >
> > > Because the "select max(node_order)+1" will select the identical value
> > > in multiple sessions.  Done concurrently, it results in a unique
> > > constraint violation on your insert, even if the inserts are serialized.
> >
> > I think his argument is that since the two transactions (as a whole)
> > should be serialized, he shouldn't get the same max(node_order) in both
> > since in either order of serialization of the two transactions you can't
> > get 5 from both selects (one should return 6).
>
>
> Thank you for the followup.  If you take a look at the section I quoted,
> you'll note that you're not addressing the specific question even though
> you are addressing the greater question.  ;)
>
> He asked why he was getting a constraint violation rather than a serial
> violation.  I simply attempted to help illustrate why it makes sense
> that it is a unique key constraint violation as he's attempting to
> insert the same value twice.  As such, the fact that he's attempting to
> do so within a pair of serialized transactions doesn't change the fact
> that he is still attempting to insert a duplicate value.

It's basically becoming an academic point; the question in my mind was
whether or not the system is allowed to get to the point of diagnosing a
unique constraint violation without diagnosing the serializability
violation.  Using the "run these two transactions in some serial order"
definition of serializability, with a requirement to raise an exception if
this is impossible, the serializability violation occurs first (since
constraints are checked after the row is inserted, and the serializability
violation happens at that moment).  So at the least, both violations would
have to be diagnosed.

However, the wording in SQL99 is interesting in that it uses may, and I'm
not sure what level of requirement that places upon the implementation.
I see in 4.32 "The execution of a <rollback statement> may be initiated
implicitly by an SQL-implementation when it detects the inability to
guarantee the serializability of two or more concurrent SQL-transactions.
When this error occurs, an exception condition is raised: transaction
rollback -- serialization failure."  That seems to imply that it's allowed
for an implementation to not raise an error, which seems to make the
guarantee of serialized execution meaningless.



Re: serialization errors

From: Greg Copeland

On Fri, 2003-01-31 at 00:40, Stephan Szabo wrote:
> On 30 Jan 2003, Greg Copeland wrote:
>
> > On Thu, 2003-01-30 at 13:00, Ryan VanderBijl wrote:
> > > I guess I'm starting to sound like a broken record here, but I'm struggling
> > > to understand why it should say unique constraint violated instead of serial.
> >
> > Because the "select max(node_order)+1" will select the identical value
> > in multiple sessions.  Done concurrently, it results in a unique
> > constraint violation on your insert, even if the inserts are serialized.
>
> I think his argument is that since the two transactions (as a whole)
> should be serialized, he shouldn't get the same max(node_order) in both
> since in either order of serialization of the two transactions you can't
> get 5 from both selects (one should return 6).


Thank you for the followup.  If you take a look at the section I quoted,
you'll note that you're not addressing the specific question even though
you are addressing the greater question.  ;)

He asked why he was getting a constraint violation rather than a serial
violation.  I simply attempted to help illustrate why it makes sense
that it is a unique key constraint violation as he's attempting to
insert the same value twice.  As such, the fact that he's attempting to
do so within a pair of serialized transactions doesn't change the fact
that he is still attempting to insert a duplicate value.


Regards,

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting


Re: serialization errors

From: Tom Lane

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> I think his argument is that since the two transactions (as a whole)
> should be serialized, he shouldn't get the same max(node_order) in both
> since in either order of serialization of the two transactions you can't
> get 5 from both selects (one should return 6).

Right, that's a fair point.

> The problem with this is that it's probably pretty unimplementable,

Yeah.  This shows the difference between what we do and true
serialization.  An academic would tell you that true serialization
requires predicate locking --- that is, as soon as transaction T1 has
done a "SELECT ... WHERE foo", then concurrent transaction T2 must wait
for T1 if it tries to insert *or remove* a row that would satisfy foo.

Unfortunately this cure is much worse than the disease.  Aside from the
sheer overhead of enforcing it, it leads to deadlocks.  For example,

    T1                T2

    SELECT max(node_order)+1...
                    SELECT max(node_order)+1...
    INSERT ...

at this point T1's insert blocks, because it would violate the predicate
of T2's already-performed SELECT.

                    INSERT...

And now T2 is blocked by T1 --- deadlock.  So now you know why predicate
locking is an academic solution and not used by real databases :-(

But probably the shortest answer to Ryan is that what the database sees
is a unique-index violation, so that's what it reports.  Deducing that
this condition can only have arisen because of concurrent behavior would
take an impractical amount of knowledge, both of what your own
transaction is doing and of what other transactions are doing.

            regards, tom lane

Re: serialization errors

From: Ryan VanderBijl

Hello again,

Unfortunately, because English is a language, and the
first one I learned (learnt?), I'm not able to communicate
very well in it.  I don't think my question / suggestion
has come across very clearly.  So, I wrote a summary of what
I've been trying to ask / suggest. It can be found at:

http://www.vanderbijlfamily.com/~rvbijl/transactions.html

Thanks again for your time!

Ryan

Re: serialization errors

From: Stephan Szabo

On Mon, 3 Feb 2003, Ryan VanderBijl wrote:

> Hello again,
>
> Unfortunately, because English is a language, and the
> first one I learned (learnt?), I'm not able to communicate
> very well in it.  I don't think my question / suggestion
> has come across very clearly.  So, I wrote a summary of what
> I've been trying to ask / suggest. It can be found at:
>
> http://www.vanderbijlfamily.com/~rvbijl/transactions.html

Raising a serializability error when a unique constraint is violated may
fix your specific case, but it doesn't fix serializability in general.  I'm
not certain that changing the error returned in this case buys us that
much, given that equivalent serializability errors are not caught when
there isn't a unique index.