Thread: Re: Revisited: Transactions, insert unique.
Hi Lincoln,

I'm not sure I'm understanding your question, but it seems like this is something that ought to be handled programmatically. That is, query the table to see if the row exists, then decide what you are going to do (insert or update) based on the results of your query.

Am I completely missing the point?

David Boerwinkle

-----Original Message-----
From: Lincoln Yeoh <lylyeoh@mecomb.com>
To: pgsql-general@postgresql.org <pgsql-general@postgresql.org>
Date: Monday, April 24, 2000 1:13 AM
Subject: [GENERAL] Revisited: Transactions, insert unique.

>Hi,
>
>Previously I wanted to ensure that I am inserting something unique into a
>table, the answer was to create a unique index on the relevant columns.
>
>But what if I don't want to get an error which would force a rollback? Say
>I want to insert something if it doesn't already exist, but update it if it
>does.
>
>Do I have to lock the whole table?
>
>Would it be a good idea to be able to request a lock on an arbitrary string
>like in MySQL? Then I could perhaps do something like
>
>LOCK HANDLE('max255charstring',TimeoutInSeconds)
>e.g.
>LOCK HANDLE('mytable,field1=x,field2=y',10)
>
>Then I could control access to a row that may not even exist, or do other
>snazzy transaction stuff.
>
>Cheerio,
>Link.
davidb@vectormath.com wrote:
> I'm not sure I'm understanding your question, but it seems like this is
> something that ought to be handled programmatically. That is, query the
> table to see if the row exists, then decide what you are going to do
> (insert or update) based on the results of your query.

Good point. And you can combine the check David suggests with the insert statement, e.g.,

   INSERT INTO mytable (id, ...)
   SELECT 7, ...
   WHERE NOT EXISTS (SELECT * FROM mytable WHERE id = 7)

And then check the return result for the number of rows inserted. '0' means an update is needed. I don't remember if there is a cleaner, more efficient manner of doing that, but probably so...

Regards,
Ed Loehr
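[The update half of Ed's pattern, run when the conditional insert reports 0 rows, might look like the following minimal sketch; the column val and the new value are illustrative, and the rows-inserted check happens in the client:

   UPDATE mytable SET val = 'new value' WHERE id = 7;
]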
Today, in a message to pgsql-general, David Boerwinkle wrote: > > it seems like this is something that ought to be handled > programmatically. That is, query the table to see if the row exists, > then decide what you are going to do (insert or update) based on the > results of your query. It certainly 'can' be handled the way you describe, but to say that it 'ought' to be handled this way is going too far. It is common practice in database programming to simply try the most likely case and fall back to alternatives when an error is encountered. For example, if one expects 99% of inserts to be unique one may simply try the insert and when this fails because of a duplicate key error one can update instead. This is slightly more efficient than doing the extra query in 100% of cases. In any case, if I'm not mistaken the SQL standard permits an automatic rollback only for deadlock errors and equivalent types of errors where the rollback may be required to resolve a lockup situation. Joachim -- private: joachim@kraut.bc.ca (http://www.kraut.bc.ca) work: joachim@mercury.bc.ca (http://www.mercury.bc.ca)
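[A sketch of the try-first pattern Joachim describes, against a hypothetical table mytable with a unique index on id. It assumes a database with statement-level aborts; as discussed later in this thread, current PostgreSQL instead aborts the whole transaction on the duplicate-key error:

   BEGIN;
   INSERT INTO mytable (id, val) VALUES (7, 'x');
   -- if this fails with a duplicate-key error, fall back to:
   UPDATE mytable SET val = 'x' WHERE id = 7;
   COMMIT;
]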
Joachim Achtzehnter wrote: > > Today, in a message to pgsql-general, David Boerwinkle wrote: > > > > it seems like this is something that ought to be handled > > programmatically. That is, query the table to see if the row exists, > > then decide what you are going to do (insert or update) based on the > > results of your query. > > It certainly 'can' be handled the way you describe, but to say that it > 'ought' to be handled this way is going too far. It is common practice in > database programming to simply try the most likely case and fall back to > alternatives when an error is encountered. For example, if one expects 99% > of inserts to be unique one may simply try the insert and when this fails > because of a duplicate key error one can update instead. This is slightly > more efficient than doing the extra query in 100% of cases. More efficient, yes. However, given the lack of "statement-only" aborts and the state of pg error codes (all strings, no numeric codes), the programmatic/query check appears not only compulsory but simpler code-wise. Regards, Ed Loehr
Today, in a message to Joachim Achtzehnter, Ed Loehr wrote: > > More efficient, yes. However, given the lack of "statement-only" > aborts and the state of pg error codes (all strings, no numeric > codes), the programmatic/query check appears not only compulsory but > simpler code-wise. Given current limitations of the postgresql implementation, you are certainly right: there is no other choice. I was merely countering the suggestion that there was something wrong with the approach of simply trying the insert first. It is a perfectly valid approach when used with an SQL92 compliant database. We just have to live without it until postgresql improves on this point. This is certainly not a show stopper for most of us unless perhaps when somebody has to port a ton of code from another database :-( Joachim -- private: joachim@kraut.bc.ca (http://www.kraut.bc.ca) work: joachim@mercury.bc.ca (http://www.mercury.bc.ca)
On Mon, Apr 24, 2000 at 11:01:57AM -0700, Joachim Achtzehnter wrote: > [...] It is a perfectly valid approach when used with an SQL92 compliant > database. We just have to live without it until postgresql improves > on this point. This is certainly not a show stopper for most of us > unless perhaps when somebody has to port a ton of code from another > database :-( I'm going to jump in here, because this is a particular axe I grind: I've bent my brain around the SQL92 standards docs, and there's _no_ requirement for this type of behavior on error. Yes, it's a useful thing to have, and yes, all the bigname commercial RDBMS's work that way, but that doesn't mean postgres isn't SQL92 compliant on that point (it misses on other points, though). So, go ahead and complain, I agree it's a pain for those porting code. But don't say it's a standards issue, until you can point to chapter and verse to defend your position. Ross (touchy about SQL92 standards compliance, for some reason) -- Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> NSBRI Research Scientist/Programmer Computer and Information Technology Institute Rice University, 6100 S. Main St., Houston, TX 77005
"Ross J. Reedstrom" wrote: > > On Mon, Apr 24, 2000 at 11:01:57AM -0700, Joachim Achtzehnter wrote: > > > [...] It is a perfectly valid approach when used with an SQL92 compliant > > database. We just have to live without it until postgresql improves > > on this point. This is certainly not a show stopper for most of us > > unless perhaps when somebody has to port a ton of code from another > > database :-( > > I'm going to jump in here, because this is a particular axe I grind: > > I've bent my brain around the SQL92 standards docs, and there's _no_ > requirement for this type of behavior on error. Yes, it's a useful thing > to have, and yes, all the bigname commercial RDBMS's work that way, but that > doesn't mean postgres isn't SQL92 compliant on that point (it misses on > other points, though). So, go ahead and complain, I agree it's a pain for > those porting code. But don't say it's a standards issue, until you can > point to chapter and verse to defend your position. Fair enough. My "non-std" comment an unfortunate choice of words. I was alluding to pgsql "doing it differently" from everyone else (i.e., in a "non-std manner"), not to a violation of an "official" SQL standard. My recollection of the recent SQL standards discussion on this point was that at least one standards document (which one?) was ambiguous on this point. Regards, Ed Loehr
Today, in a message to pgsql-general, Ross J. Reedstrom wrote:
> I've bent my brain around the SQL92 standards docs, and there's _no_
> requirement for this type of behavior on error.

Don't have access to the actual standard text; perhaps somebody who has it can confirm whether the following quotes from an earlier draft (identified by the code X3H2-92-154/DBL CBR-002) are also in the final text.

In section 4.10.1 (Checking of constraints) I find this:

   When a constraint is checked other than at the end of an SQL-
   transaction, if it is not satisfied, then an exception condition
   is raised and the SQL-statement that caused the constraint to be
   checked has no effect other than entering the exception
   information into the diagnostics area.

An automatic rollback of the whole transaction in response to a violated primary key constraint is hardly consistent with the "no effect" requirement expressed here.

The following passages from section 4.28 (SQL-transactions) also very strongly imply that an automatic rollback should not occur except in circumstances where there is no choice (serialization failure and unrecoverable errors):

   The execution of a <rollback statement> may be initiated implicitly
   by an implementation when it detects the inability to guarantee the
   serializability of two or more concurrent SQL-transactions. When
   this error occurs, an exception condition is raised: transaction
   rollback-serialization failure.

   The execution of a <rollback statement> may be initiated implicitly
   by an implementation when it detects unrecoverable errors. When
   such an error occurs, an exception condition is raised: transaction
   rollback with an implementation-defined subclass code.

   The execution of an SQL-statement within an SQL-transaction has
   no effect on SQL-data or schemas other than the effect stated in
   the General Rules for that SQL-statement, in the General Rules
   for Subclause 11.8, "<referential constraint definition>", and
   in the General Rules for Subclause 12.3, "<procedure>".

Perhaps you can make the argument that an automatic rollback in all error situations is compliant by claiming that all errors are unrecoverable. In my view this is definitely against the spirit of the standard. As you said yourself, all big-name databases behave according to my interpretation, hence it is understandable that the authors of the standard didn't see a need to spell this out more explicitly.

Joachim

--
joachim@kraut.bc.ca (http://www.kraut.bc.ca)
joachim@mercury.bc.ca (http://www.mercury.bc.ca)
Joachim Achtzehnter wrote:
> Today, in a message to pgsql-general, Ross J. Reedstrom wrote:
> >
> > I've bent my brain around the SQL92 standards docs, and there's _no_
> > requirement for this type of behavior on error.
>
> [...] Perhaps you can make the argument that an automatic rollback in
> all error situations is compliant by claiming that all errors are
> unrecoverable. In my view this is definitely against the spirit of the
> standard. As you said yourself, all big-name databases behave according
> to my interpretation, hence it is understandable that the authors of
> the standard didn't see a need to spell this out more explicitly.

I found that pretty informative. I dug up the previous conversation on this, in which Tom Lane cited section 3.3.4.1 (of what std?). Its emphasis on *statements* as opposed to *transactions* suggests to me that aborting the transaction is beyond the scope of what they had in mind, though I admittedly don't fully understand the jargon here...

   The phrase "an exception condition is raised:", followed by the
   name of a condition, is used in General Rules and elsewhere to
   indicate that the execution of a statement is unsuccessful,
   application of General Rules, other than those of Subclause 12.3,
   "<procedure>", and Subclause 20.1, "<direct SQL statement>", may
   be terminated, diagnostic information is to be made available,
   and execution of the statement is to have no effect on SQL-data
   or schemas. The effect on <target specification>s and SQL
   descriptor areas of an SQL-statement that terminates with an
   exception condition, unless explicitly defined by this
   International Standard, is implementation-dependent.

Regards,
Ed Loehr
On Mon, Apr 24, 2000 at 01:10:55PM -0700, Joachim Achtzehnter wrote:
> Perhaps you can make the argument that an automatic rollback in all error
> situations is compliant by claiming that all errors are unrecoverable. In
> my view this is definitely against the spirit of the standard. As you said
> yourself, all big-name databases behave according to my interpretation,
> hence it is understandable that the authors of the standard didn't see a
> need to spell this out more explicitly.

Joachim - I see you haven't done much Language Lawyering, have you? There is no such thing as the 'spirit' of the standard, only the written document. ;-)

This is exactly my argument, with regard to errors and the standard: _which_ errors are considered unrecoverable is not spelled out in the standard, therefore, it is implementation defined. The fact that the definition chosen by postgresql is inconvenient for users of the database is, I agree, unfortunate, but it doesn't stand in the way of us claiming compliance, which is the name of the game for these sorts of standards.

Note that postgres is careful not to _automatically_ rollback: the standard (as you quoted) indicated only certain conditions that allow for an implicit rollback of that sort. Postgres just won't let you do anything else in the current transaction. Yes, it's splitting hairs, but if you dig into any of the 'bigname' DBs, you'll find similar split ends. Often, the end is able to be split, i.e. the language in the standard is ambiguous, _because_ the commercial DBs had representatives on the committee, making sure the standard didn't get too far from their existing implementations.

I might even argue that the original definition was a good, conservative choice for the early days of postgres as a research database: you _know_ people have been messing with the server code, and if something throws an error, bailing out is the safest course. Now that the core developers have done an amazing job at cleaning up and stabilizing the code, a more relaxed attitude toward certain classes of errors is desirable. There's been a fair amount of discussion about cleaning up (defining!) the error codes returned, as well, so a complete overhaul may be in the works. That'd clearly be the time to fix this up. I believe it's already on the TODO list.

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005
To All -

O.K., I give! I'm surprised at the number of people willing to dig into the standards docs and try to come up with justifications. Note that the fact that whole paragraphs must be quoted in order to get at the general tone of the standard supports my argument that the behavior on error is _not_ spelled out in the standard: the exact case we're talking about is almost conspicuous by its absence, given the number of other cases covered, and the depth of the coverage. The rest of the standard is written in such a way that I keep thinking that all these Exception Conditions must be defined somewhere, with appropriate actions, but, to the best of my knowledge, they're not. Makes me think there must have been a big commercial DB without good error recovery with representatives on the committee ;-)

Suffice to say, the _right_ thing to do is make the users happy: now that postgres is more robust, there's little reason not to allow capture of 'recoverable' errors, and allow the DB developer to decide when to rollback.

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005
Today, in a message to pgsql-general, Ross J. Reedstrom wrote: > > I see you haven't done much Language Lawyering, have you? No, and I have no intention of getting into it now :-) > There is no such thing as the 'spirit' of the standard, only the > written document. ;-) The spirit of the standard comes into play when people who are not Language Lawyers try to decide how something should work that is not spelled out explicitly, but where the standard text contains suggestive statements that imply that the authors assumed something without spelling it out, because they thought everybody reading the standard would agree on this as a matter of course. Of course, as soon as somebody comes along who has some motivation to make a contrary assumption, perhaps to claim compliance, the fact that the assumption was not spelled out leads to the kinds of arguments we are having. > This is exactly my argument, with regard to errors and the standard: > _which_ errors are considered unrecoverable is not spelled out in the > standard, therefore, it is implementation defined. The fact the the > definition chosen by postgresql is inconvenient for users of the database > is, I agree, unfortunate, but it doesn't stand in the way of us claiming > compliance, which is the name of the game for these sort of standards. This is precisely NOT the game I'm playing! I don't care whether something is technically 100% compliant or not. I do care a lot about improving a free software database management system that is in the same league as the big-name databases. The reason I entered this discussion was not to discuss whether postgresql is or is not 100% compliant with SQL92. Supporting statement level aborts is a feature that should be supported at some point, and this talk about the current practice somehow being compliant with the letter of the standard doesn't help. > Note that postgres is careful not to _automatically_ rollback: the > standard (as you quoted) indicated only certain conditions that allow for > an implicit rollback of that sort. The standard is very explicit about some types of errors, namely constraint violations, where it says that this must have no effect except an entry in the diagnostics area. It is precisely these errors where one would like to be able to continue the transaction. Joachim -- private: joachim@kraut.bc.ca (http://www.kraut.bc.ca) work: joachim@mercury.bc.ca (http://www.mercury.bc.ca)
On Mon, Apr 24, 2000 at 02:41:46PM -0700, Joachim Achtzehnter wrote:
> The spirit of the standard comes into play when people who are not
> Language Lawyers try to decide how something should work that is not
> spelled out explicitly, but where the standard text contains suggestive
> statements that imply that the authors assumed something without spelling
> it out, because they thought everybody reading the standard would agree
> on this as a matter of course. Of course, as soon as somebody comes along
> who has some motivation to make a contrary assumption, perhaps to claim
> compliance, the fact that the assumption was not spelled out leads to the
> kinds of arguments we are having.

While I agree with you in theory that many more casual standards documents need to be read this way, the SQL standards are highly engineered, passing through multiple national and international committee bodies. Heck, any document that goes to the trouble to define the BNF for <simple Latin letter>, let alone <digit> (section 5.1), clearly aspires to being complete in and of itself. If that is so, omissions are as significant as inclusions. As to other motives, the complete hash that these same bodies have made of the SQL3 spec leads me to believe that every possible contrary assumption is already present.

> > is, I agree, unfortunate, but it doesn't stand in the way of us claiming
> > compliance, which is the name of the game for these sorts of standards.
>
> This is precisely NOT the game I'm playing! I don't care whether something
> is technically 100% compliant or not. I do care a lot about improving a
> free software database management system that is in the same league as the
> big-name databases.
>
> The reason I entered this discussion was not to discuss whether postgresql
> is or is not 100% compliant with SQL92. Supporting statement level aborts
> is a feature that should be supported at some point, and this talk about
> the current practice somehow being compliant with the letter of the
> standard doesn't help.

But it doesn't hurt (much). This is why we're having this discussion on GENERAL, and not HACKERS: the developers have already agreed that the error system needs an overhaul, mostly to provide the interface writers with consistent error numbers, rather than the current text strings. Inclusion of the ability to ignore some errors will happen. I would not have started this branch of the discussion if the original complaint had not ventured from 'other DBMSs' to 'SQL92 compliant DBMSs'. I was _very_ specific that the _only_ thing I disagree with in this is providing the enemy with ammunition, as it were, by over-interpreting the standard to PostgreSQL's detriment.

This is why _not_ having this discussion can hurt. In order to aid acceptance of PostgreSQL into many environments, being able to play the 'technically SQL92 compliant' card, without having to cross your fingers behind your back, is very important. Heck, I'd be wrestling with Oracle right now, and had a lot less grant money to put into the hardware for my server, if I hadn't been able to play the 'mostly SQL92 compliant, and climbing' card.

> The standard is very explicit about some types of errors, namely
> constraint violations, where it says that this must have no effect except
> an entry in the diagnostics area. It is precisely these errors where one
> would like to be able to continue the transaction.

And this interpretation will guide the developers in _extending_ the standard in a consistent way.
I know, because the developers that implemented the constraints for 7.0 used this (and the SQL3 spec) as guides. How's that? Ross P.S. I think we're in (quasi) violent agreement, don't you? -- Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> NSBRI Research Scientist/Programmer Computer and Information Technology Institute Rice University, 6100 S. Main St., Houston, TX 77005
> -----Original Message-----
> From: pgsql-general-owner@hub.org [mailto:pgsql-general-owner@hub.org]On
> Behalf Of Ross J. Reedstrom
>
> And this interpretation will guide the developers in _extending_
> the standard in a consistent way. I know, because the developers that
> implemented the constraints for 7.0 used this (and the SQL3 spec) as
> guides. How's that?

I don't know what is standard. However, as far as I see, few people prefer entire rollback on abort. The problem is that PostgreSQL lacks per-statement rollback functionality and unfortunately it isn't easy to implement. Vadim has already planned the implementation. AFAIK, one of the purposes of WAL is to implement savepoint functionality. Savepoint functionality would enable per-statement rollback functionality easily.

The following is an extract of Vadim's posting about 10 months ago.

   Re: [HACKERS] RE: [GENERAL] Transaction logging

   Well, I'm thinking about WAL last two weeks. Hiroshi pointed me
   problems in my approach to savepoints (when a tuple was marked
   for update and updated after it) and solution would require new
   tid field in header and both t_cmin/t_cmax => bigger header.
   I don't like it and so I switched my mind -:).

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
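[The savepoint functionality mentioned here corresponds to what SQL:1999 standardized as SAVEPOINT; a sketch of the per-statement recovery it would enable, not available in the PostgreSQL of this thread (mytable and its columns are illustrative):

   BEGIN;
   SAVEPOINT before_insert;
   INSERT INTO mytable (id, val) VALUES (7, 'x');
   -- on a duplicate-key error, undo only the failed statement and go on:
   ROLLBACK TO SAVEPOINT before_insert;
   UPDATE mytable SET val = 'x' WHERE id = 7;
   COMMIT;
]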
Hi David,

That can work if there's only one session using the database. But it's not 100% if there are multiple users. There's a small chance that a row may not exist during the select, but exist by the time of the insert. If I'm wrong please correct me - then I'll go optimize some code :).

By having the unorthodox locking mechanism suggested I can ensure at the application level that no one else is going to insert stuff before my select, update/insert, without having to lock the whole table.

So it will be

   LOCK arbitrary
   select
   if exist update
   else insert
   UNLOCK arbitrary

Which would be faster - doing the lock arbitrary method, or doing an insert with unique indexes and recovering if necessary (assuming postgresql does what other databases do)? I suspect unique indexes could slow inserts and updates down significantly.

If we don't want to do all that, how about we have a select for insert (and update), which locks things? But I figured that it would be problematic to implement in a number of scenarios tho.

Cheerio,
Link.
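[The arbitrary-string lock proposed here exists in MySQL as the GET_LOCK()/RELEASE_LOCK() built-ins; a sketch of the flow in those terms, with illustrative table and column names:

   SELECT GET_LOCK('mytable,field1=x,field2=y', 10);  -- 1 = acquired, 0 = timed out
   SELECT * FROM mytable WHERE field1 = 'x' AND field2 = 'y';
   -- the application updates if a row came back, inserts otherwise
   SELECT RELEASE_LOCK('mytable,field1=x,field2=y');
]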
Lincoln Yeoh wrote:
> By having the unorthodox locking mechanism suggested I can ensure at the
> application level that no one else is going to insert stuff before my
> select, update/insert, without having to lock the whole table.
> [...]
> If we don't want to do all that, how about we have a select for insert
> (and update), which locks things? But I figured that it would be
> problematic to implement in a number of scenarios tho.

PostgreSQL implements SELECT...FOR UPDATE to allow for the sequence you've described:

Session 1:

   BEGIN;
   SELECT webuser FROM webusers WHERE webuser = 'webuser1' FOR UPDATE;

Session 2:

   BEGIN;
   UPDATE webusers SET webuser = 'webuser2' WHERE webuser = 'webuser1';*

*At this point Session 2 blocks, waiting for Session 1 to complete. This prevents the race condition you've described and only locks those rows which were selected by Session 1's SELECT. With MVCC, table-level locking is largely a thing of the past. The MVCC documentation describes this in detail:

http://www.postgresql.org/docs/postgres/mvcc4496.htm

Hope that helps,

Mike Mascari
Maybe it's late, maybe it'll all be clear to me in the morning, but I seem to have lost the destroydb command after installing PostgreSQL 7.0 RC1. I'm still a newbie to PostgreSQL, but I believe destroydb should be executed at the command line, meaning there should be a destroydb command somewhere on my HD. I swear it isn't there. Appreciate any help. -=michael=- ***************************************************** * Michael S. Kelly * 4800 SW Griffith Dr., Ste. 202 * Beaverton, OR 97005 USA * voice: (503)644-6106 x122 fax: (503)643-8425 * <michaelk@axian.com> * http://www.axian.com/ ***************************************************** * Axian: Software Consulting and Training *****************************************************
Today, in a message to pgsql-general, Ross J. Reedstrom wrote:
> the developers have already agreed that the error system needs an
> overhaul, ... Inclusion of the ability to ignore some errors will
> happen.

This is good to hear. I suspect, though, that to support statement-level aborts more is needed than an overhaul of the error system. Statement level aborts are essentially a form of nested transaction, i.e. the ability to rollback a precisely defined subset of the top-level transaction.

> I was _very_ specific that the _only_ thing I disagree with in this is
> providing the enemy with ammunition, as it were, by over-interpreting
> the standard to PostgreSQL's detriment.

The fact that some vendors of proprietary systems think in these terms is no reason for the postgresql community to do the same. Unlike them we don't HAVE to sell the system at all cost to be successful.

> This is why _not_ having this discussion can hurt.

Don't agree.

> being able to play the 'technically SQL92 compliant' card, without
> having to cross your fingers behind your back, is very important.

But you are essentially doing that, crossing your fingers I mean, by relying on a shaky interpretation solely to be able to claim compliance, while you know full well that the alternative interpretation is the better one.

> Heck, I'd be wrestling with Oracle right now,

Well, since you mention Oracle: A few years ago I had a similar argument with them about their peculiar interpretation of what SERIALIZABLE in SQL92 means. The early versions of Oracle's multi-version concurrency mechanism didn't provide truly serializable transactions, yet they claimed it did by resorting to rather twisted interpretations of the standard. I didn't think this was acceptable then, and I certainly don't condone similar window dressing today. Postgresql doesn't need this kind of PR.

> P.S. I think we're in (quasi) violent agreement, don't you?

Well, we do seem to agree on some important points, like the fact that statement-level aborts are important to have. We seem to be talking past each other in other respects.

Let me get back to my contention that statement-level aborts, or atomicity on the level of SQL statements, are considered an obvious requirement in the database community. Because everybody assumes this to be the case, it is hard to find it explicitly mentioned. Let me quote from Gray and Reuter's "Transaction Processing", considered the bible on the subject by some:

   Even current database systems ... do not rely on nested transactions
   for their own implementation. This is quite surprising, because
   nesting the scope of commitment and backout is commonplace even in
   today's SQL systems (although the user cannot influence it). This is
   particularly true for SQL update statements that are executed as the
   subtransactions. Think of this transaction as the root transaction
   and the SQL statements as the subtransactions. Obviously, an SQL
   update statement commits its modifications to the top-level
   transaction. If it fails (assume an INSERT being rejected because a
   unique value condition is violated), it is implicitly undone and so
   appears to be atomic even in the case of a failure. In other words,
   update SQL statements have all the properties of subtransactions,
   but since the implementation techniques are typically ad hoc rather
   than guided by the idea of transaction nesting, the general
   mechanism is not available for the application programmer.
Joachim -- private: joachim@kraut.bc.ca (http://www.kraut.bc.ca) work: joachim@mercury.bc.ca (http://www.mercury.bc.ca)
>I'm still a newbie to PostgreSQL, but I believe destroydb should be executed >at the command line, meaning there should be a destroydb command somewhere >on my HD. I swear it isn't there. > >Appreciate any help. I do not want to insult you, but did you try: > find / -name destroydb Regards, Stephan -- Stephan Richter - (901) 573-3308 - srichter@cbu.edu CBU - Physics & Chemistry; Framework Web - Web Design & Development PGP Key: 735E C61E 5C64 F430 4F9C 798E DCA2 07E3 E42B 5391
At 09:16 PM 4/24/00 -0700, Michael S. Kelly wrote: >Maybe it's late, maybe it'll all be clear to me in the morning, but I seem >to have lost the destroydb command after installing PostgreSQL 7.0 RC1. I just checked. It is called dropdb now, to be consistent with other commands. Regards, Stephan -- Stephan Richter - (901) 573-3308 - srichter@cbu.edu CBU - Physics & Chemistry; Framework Web - Web Design & Development PGP Key: 735E C61E 5C64 F430 4F9C 798E DCA2 07E3 E42B 5391
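[For example, assuming a database named mydb (the name is illustrative):

   $ dropdb mydb
]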
At 10:17 PM 24-04-2000 -0400, Mike Mascari wrote:
>PostgreSQL implements SELECT...FOR UPDATE to allow for the
>sequence you've described:
> [...]

The 6.5.3 running on my site does not block for the case I described, which has an INSERT after the select.

   begin;
   select * from mytable where field=value for update;
   if rows=0 {
      insert into mytable (field,field2) values (value,value2);
   } else {
      update mytable set field=value, field2=value2;
   }
   commit;

Tested on 6.5.3. Did a select for update in two concurrent transactions; they did not block, and both inserts went through (if there was a unique index, one of the transactions would have had to be rolled back and redone from scratch).

If the behaviour is different for 7.0, I'm interested!

Cheerio,
Link.
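[One way to close this window in the PostgreSQL of this thread, at the price of serializing all writers of the table, is an explicit table lock; a sketch on the hypothetical table above (SHARE ROW EXCLUSIVE blocks other writers while plain reads proceed):

   BEGIN;
   LOCK TABLE mytable IN SHARE ROW EXCLUSIVE MODE;
   SELECT * FROM mytable WHERE field = 'value';
   -- insert if no row came back, update otherwise; no competing writer
   -- can slip in between, since it must first wait for the table lock
   COMMIT;
]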
On Mon, Apr 24, 2000 at 09:23:48PM -0700, Joachim Achtzehnter wrote:
> This is good to hear. I suspect, though, that to support statement-level
> aborts more is needed than an overhaul of the error system. Statement
> level aborts are essentially a form of nested transaction, i.e. the
> ability to rollback a precisely defined subset of the top-level
> transaction.

Hmm, sounds like you're moving the goal posts: now you want checkpoints, no? The problem under discussion is allowing the transaction to proceed when an error occurs that both the programmer and the DBMS know leaves the system in a consistent state: a simple SQL parse error, for example, or an attempt to violate a 'unique' constraint on an INSERT. Peter Eisentraut already offered a very high-level, potentially dangerous one-line patch to deal with the 'table not found' case. Digging into the code a little, I think this will also handle the SQL parse error case correctly, but may have a problem with the constraint case: I'm recompiling code right now to test it. ;-)

(Later) Hmm, not good: the bad tuple ends up going live, not immediately, but after the next select. I think I better find that old message and follow up with this info, so it's in the mailing list archives, at least.

> > being able to play the 'technically SQL92 compliant' card, without
> > having to cross your fingers behind your back, is very important.
>
> But you are essentially doing that, crossing your fingers I mean, by
> relying on a shaky interpretation solely to be able to claim compliance,
> while you know full well that the alternative interpretation is the
> better one.

Right, but often one is in a situation where technical requirements are being enforced by bureaucracies for non-technical reasons: if I _know_ that the product is good enough for the task at hand, and there exists no mechanism to get an exception to the requirement, 'technical compliance' serves my purposes. Basically, if management is going to play games, I'm going to outplay them, if I have to.

> > Heck, I'd be wrestling with Oracle right now,
>
> Well, since you mention Oracle: A few years ago I had a similar argument
> with them about their peculiar interpretation of what SERIALIZABLE in
> SQL92 means. The early versions of Oracle's multi-version concurrency
> mechanism didn't provide truly serializable transactions, yet they
> claimed it did by resorting to rather twisted interpretations of the
> standard.
>
> I didn't think this was acceptable then, and I certainly don't condone
> similar window dressing today. Postgresql doesn't need this kind of PR.

The difference between the Oracle case and the Postgresql case is that the corporation tried to give you that twisting of the standard as the _only_ answer. What I'm saying is: well, technically, this _is_ compliant, but yes, it's not the right way to do it, and we're working on it, and by the way, how do you want it to work, and here's the code, if we can't get to it fast enough, maybe you can? I'm just asking you not to make Postgresql jump a higher _political_ bar than its competitors.

Hmm, I find it odd to be on the 'practical, business' side of one of these arguments: I'm usually the one pushing the 'theoretical, academic' side.

One thing I should mention: PostgreSQL's overall transaction semantics _are not_ SQL92 standard compliant, in that the client must explicitly start the first transaction. The 'autocommit' mode is a non-SQL92 extension, which is also commonly available in other systems.
Having it be the default is non-standard, however.

> Let me get back to my contention that statement-level aborts, or
> atomicity on the level of SQL statements, are considered an obvious
> requirement in the database community. Because everybody assumes this to
> be the case, it is hard to find it explicitly mentioned. Let me quote
> from Gray and Reuter's "Transaction Processing", considered the bible on
> the subject by some:
>
>    Even current database systems ... do not rely on nested transactions
>    for their own implementation. This is quite surprising, because
>    nesting the scope of commitment and backout is commonplace even in
>    today's SQL systems (although the user cannot influence it). This is
>    particularly true for SQL update statements that are executed as the
>    subtransactions. Think of this transaction as the root transaction
>    and the SQL statements as the subtransactions. Obviously, an SQL
>    update statement commits its modifications to the top-level
>    transaction. If it fails (assume an INSERT being rejected because a
>    unique value condition is violated), it is implicitly undone and so
>    appears to be atomic even in the case of a failure. In other words,
>    update SQL statements have all the properties of subtransactions,
>    but since the implementation techniques are typically ad hoc rather
>    than guided by the idea of transaction nesting, the general
>    mechanism is not available for the application programmer.

Doesn't this quote merely bemoan the fact that nesting of transactions is _not_ a general requirement of "even today's SQL systems"? I can't follow the example, which seems to move back and forth between an UPDATE and an INSERT statement. Are they suggesting that the UPDATE could be implemented as a nested transaction containing a DELETE and an INSERT, and if the INSERT fails, the DELETE must rollback?

More likely, I suppose, is that the context is of an overall transaction, and by 'SQL update statements' they mean any SQL statement that alters the data in the tables. With that interpretation, I agree it supports your argument that "even today's SQL systems" (I can hear their academic disdain in that, can't you?) support an "ad hoc" form of transaction nesting, which postgresql does not.

Hmm, on further reflection, I'm going to come over to your side of this one: while I reserve the right to twist the standards for those who play standards compliance games, I will agree that much of the verbiage in the standard makes the _most_ sense when statement level aborts are assumed to be present.

Ross

P.S. Thanks for having the patience to bring me around on this one.
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005
I have not researched this issue thoroughly, or even superficially, but I understand that PostgreSQL has something called multi-version concurrency control (MVCC). This is, I'm told, a GOOD THING, and I believe it. But the documentation and tutorials on MVCC are far too terse for me to feel comfortable with my understanding. If both transactions (yours and your competitor's) are serializable, then you cannot see his insertion, and his transaction will be aborted (I think? maybe?). This is not standard SQL, but it has the look and feel of a better way. I wish there were more information available on MVCC, and how to use it effectively.
Thanks Stephan, dropdb worked just fine. Also, I was not at all offended by your previous suggestion. I'm still pretty new to Linux.

This issue surfaces an opportunity for me to help the cause. As I go through the process of setting up PostgreSQL 7.0 and developing a new database in it, I could provide some QA on the documentation. Two items have already surfaced. The first, of course, is that destroydb is now dropdb in 7.0. The second is in the INSTALL document. In the section that describes how to make PostgreSQL start at boot-time there is the following paragraph:

   In RedHat Linux add a file /etc/rc.d/init.d/postgres.init which is
   based on the example in contrib/linux/. Then make a softlink to this
   file from /etc/rc.d/rc5.d/S98postgres.init.

There's a much simpler way to accomplish this. Simply execute the following:

   contrib/linux/postgres.init.sh install

Who would I pass these suggestions on to?

-=michael=-
At 10:58 AM 24-04-2000 -0500, Ed Loehr wrote:
>Good point. And you can combine the check David suggests with the insert
>statement, e.g.,
>
>   INSERT INTO mytable (id, ...)
>   SELECT 7, ...
>   WHERE NOT EXISTS (SELECT * FROM mytable WHERE id = 7)
>
>And then check the return result for the number of rows inserted. '0'
>means an update is needed. I don't remember if there is a cleaner, more
>efficient manner of doing that, but probably so...

At first it looked like it might work, but it doesn't ensure uniqueness when I checked. And after that I realised the situation was even more difficult.

e.g.

   create table test (a int, b text);

Transaction A

   begin;
   insert into test (a,b) select 4,'four'
   where not exists (select * from test where a=4);

Transaction B

   begin;
   insert into test (a,b) select 4,'four'
   where not exists (select * from test where a=4);

Then you do a commit on both, and you end up with two rows.

The reason this happens is because inserted stuff in one transaction is not visible to other transactions until a commit. So even David's suggestion won't work, nor will a "select for update", since with the current implementation there is nothing to lock on till the commit. This is correct behaviour, but that means we have to try something else to ensure unique rows.

If we resort to using UNIQUE then we'll have to redo whole transactions till the error point. If we use the proposed arbitrary lock, it's a lot more work for the developers, plus it shifts so much more responsibility to the application compared to stuff like select for update. I kind of like the arbitrary lock thingy, but I'm not in a "large" environment.

For large environments maybe there should be a select for insert? Or the behaviour for select for update should change? That is, the statement blocks if there's any other prior select for insert with a potentially interfering WHERE clause (ouch!), even if there are NO relevant ROWS to return (ouch!). It does not even begin to fetch rows till after the other transactions are completed. This doesn't look easy to do. But it's one of the ways to maintain the Postgresql "no error transaction" behaviour. To maintain this behaviour there need to be a few more ways to check for stuff (and lock the state) before attempting to change or create stuff. Stuff like create table and so on.

Aside: would an arbitrary lock mechanism be better for performance than using a UNIQUE on a table? In most cases you're inserting to an indexed table, so having a UNIQUE isn't going to cost much more, right?

Cheerio,
Link.
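[For comparison, the UNIQUE route mentioned above, sketched on Lincoln's test table; the index name is illustrative, and in the PostgreSQL of this thread the losing transaction must be redone from the start:

   create unique index test_a_key on test (a);

   begin;
   insert into test (a,b) values (4,'four');
   -- two concurrent inserts of a=4 can no longer both succeed: the second
   -- gets a duplicate-key error and its whole transaction must be retried,
   -- typically with an update in place of the insert
   commit;
]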
Today, in a message to pgsql-general, Ross J. Reedstrom wrote:
> On Mon, Apr 24, 2000 at 09:23:48PM -0700, Joachim Achtzehnter wrote:
> >
> > I suspect, though, that to support statement-level aborts
> > more is needed than an overhaul of the error system. Statement
> > level aborts are essentially a form of nested transaction,...
>
> Hmm, sounds like you're moving the goal posts: now you want checkpoints,
> no?

I didn't say that. Was merely pointing out that supporting statement level aborts may be more difficult than it seems. This is not just an error reporting issue. What it amounts to is a special case of nested transactions, i.e. where every statement is an implicit nested transaction. Keep in mind that a statement can be very complex, and it can update, directly or indirectly via triggers, many rows in many tables. To precisely undo the effect of such a statement may not be trivial, in general.

[quote from Gray and Reuter omitted]

> Doesn't this quote merely bemoan the fact that nesting of transactions
> is _not_ a general requirement of "even today's SQL systems"? I can't
> follow the example,

The quote essentially said this: Most existing SQL systems don't support user-controlled nested transactions. The authors find this surprising because these same systems DO support a special form of nested transactions, namely statement-level abort. What they are complaining about is the fact that vendors choose ad hoc approaches to implement the special case, instead of implementing the general case. The general mechanism could then be used to achieve statement-level aborts as well as supporting user-controlled nested transactions.

Joachim

--
private: joachim@kraut.bc.ca (http://www.kraut.bc.ca)
work: joachim@mercury.bc.ca (http://www.mercury.bc.ca)
In a message to Ed Loehr and pgsql-general, Lincoln Yeoh wrote:
>
> Transaction A
>    begin;
>    insert into test (a,b) select 4,'four'
>    where not exists (select * from test where a=4);
>
> Transaction B
>    begin;
>    insert into test (a,b) select 4,'four'
>    where not exists (select * from test where a=4);
>
> Then you do a commit on both, and you end up with two rows.

This is disappointing indeed! What this means is that Postgresql transactions are, in fact, not truly serializable. The concurrent execution of these two transactions is not equivalent to some serial execution.

Have to read up on the documentation, which I thought was claiming serializability.

Joachim

--
private: joachim@kraut.bc.ca (http://www.kraut.bc.ca)
work: joachim@mercury.bc.ca (http://www.mercury.bc.ca)
> -----Original Message-----
> From: pgsql-general-owner@hub.org [mailto:pgsql-general-owner@hub.org]On
> Behalf Of Joachim Achtzehnter
>
> In a message to Ed Loehr and pgsql-general, Lincoln Yeoh wrote:
> >
> > Transaction A
> >    begin;
> >    insert into test (a,b) select 4,'four'
> >    where not exists (select * from test where a=4);
> >
> > Transaction B
> >    begin;
> >    insert into test (a,b) select 4,'four'
> >    where not exists (select * from test where a=4);
> >
> > Then you do a commit on both, and you end up with two rows.
>
> This is disappointing indeed! What this means is that Postgresql
> transactions are, in fact, not truly serializable. The concurrent
> execution of these two transactions is not equivalent to some serial
> execution.

Strictly speaking, the SERIALIZABLE isolation level of PostgreSQL isn't SERIALIZABLE. It's the same as Oracle. AFAIK, there's no way to block a (logically) non-existent row (key) except a unique constraint.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

P.S. Note that the default isolation level of PostgreSQL is READ COMMITTED.
In a message to pgsql-general, Hiroshi Inoue wrote:
> Strictly speaking, the SERIALIZABLE isolation level of PostgreSQL isn't
> SERIALIZABLE.

Thanks for clarifying this.

> It's the same as Oracle.

So, even the latest versions of Oracle still have this problem?

> AFAIK, there's no way to block a (logically) non-existent row (key)
> except a unique constraint.

A couple of comments:

There is, of course, a way to achieve this with a concurrency mechanism that is based on locking, rather than a multi-version system. Locking systems in serializable mode would use shared locks for reads and hold the locks until the end of the transaction. The trick with the non-existent rows is that the locks must be placed on the access path rather than just individual rows.

For example, if the select query is executed using a hash index, it would place a shared lock on the hash bucket where the non-existing row would have been. If the second transaction does its select before the first one upgrades its lock to exclusive mode the two transactions will deadlock, otherwise the second transaction's select blocks. Either way, the problem is avoided.

Clearly, if a table has no index the lock would have to be placed on the table because the access method is a full table scan.

The 1,000,000 dollar question is whether there is some way to achieve the same with a multi-version concurrency system without giving up its advantages?

My second comment is about the remark "except a unique constraint": It is true that a unique constraint would have stopped the incorrect second insert in the example under discussion. Furthermore, a unique constraint or primary key is definitely the way to go in such a case because performance would be really bad with large tables. However, a unique constraint does not prevent all serialization violations caused by non-existent row effects.

> P.S. Note that the default isolation level of PostgreSQL is READ
> COMMITTED.

I see. This is another important point to keep in mind. Thanks for pointing this out (another minor SQL92 non-compliance :-)

Joachim

--
private: joachim@kraut.bc.ca (http://www.kraut.bc.ca)
work: joachim@mercury.bc.ca (http://www.mercury.bc.ca)
> -----Original Message-----
> From: pgsql-general-owner@hub.org [mailto:pgsql-general-owner@hub.org]On
> Behalf Of Joachim Achtzehnter
>
> In a message to pgsql-general, Hiroshi Inoue wrote:
> >
> > Strictly speaking, the SERIALIZABLE isolation level of PostgreSQL isn't
> > SERIALIZABLE.
>
> Thanks for clarifying this.
>
> > It's the same as Oracle.
>
> So, even the latest versions of Oracle still have this problem?

I don't know recent versions of Oracle. However, a truly SERIALIZABLE isolation level seems very hard to implement if read-only queries acquire no lock.

> > AFAIK, there's no way to block a (logically) non-existent row (key)
> > except a unique constraint.

Sorry, my above comment lacks the words *in current PostgreSQL*.

> A couple of comments:
>
> There is, of course, a way to achieve this with a concurrency mechanism
> that is based on locking, rather than a multi-version system. [...]
>
> The 1,000,000 dollar question is whether there is some way to achieve the
> same with a multi-version concurrency system without giving up its
> advantages?

There might be some discussion about dirty reads, though I don't remember well. The unique constraint stuff has to read dirty data to check uniqueness. I don't know of other code using *dirty read* in current PostgreSQL. It's what I meant by "except a unique constraint".

> My second comment is about the remark "except a unique constraint": It is
> true that a unique constraint would have stopped the incorrect second
> insert in the example under discussion. Furthermore, a unique constraint
> or primary key is definitely the way to go in such a case because
> performance would be really bad with large tables. However, a unique
> constraint does not prevent all serialization violations caused by
> non-existent row effects.

You are right. That seems to be why Oracle 8.1.5's manual refers to FOR UPDATE, referential integrity, etc. with regard to serializability.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
Joachim Achtzehnter wrote:
>
> In a message to Ed Loehr and pgsql-general, Lincoln Yeoh wrote:
> >
> > Transaction A
> > begin;
> > insert into test (a,b) select 4,'four' from test
> >   where not exists (select * from test where a=4);
> >
> > Transaction B
> > begin;
> > insert into test (a,b) select 4,'four' from test
> >   where not exists (select * from test where a=4);
> >
> > Then you do a commit on both, and you end up with two rows.
>
> This is disappointing indeed! What this means is that PostgreSQL
> transactions are, in fact, not truly serializable. The concurrent
> execution of these two transactions is not equivalent to some serial
> execution.
>
> Have to read up on the documentation, which I thought was claiming
> serializability.

Hmmm. This surprised me, too. The serialization doesn't work as I
expected it to. Maybe someone can illuminate. Here's what confused me...

The documentation says:

    Serializable provides the highest transaction isolation.
    When a transaction is on the serializable level, a query
    sees only data committed before the transaction began and
    never see either dirty data or concurrent transaction
    changes committed during transaction execution. So, this
    level emulates serial transaction execution, as if
    transactions would be executed one after another, serially,
    rather than concurrently.
    - http://www.postgresql.org/docs/postgres/mvcc4508.htm (4/25/2000)

But the following sequence seems to contradict this:

-- Transaction A
DROP TABLE foo;
CREATE TABLE foo (id INTEGER);
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;

-- Transaction B
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
INSERT INTO foo (id)
  SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
COMMIT;
SELECT * FROM foo;

-- Transaction A
SELECT * FROM foo;
INSERT INTO foo (id)
  SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
SELECT * FROM foo;
COMMIT;

This sequence allows B's inserts, AFTER A began, to be seen by A (at
least, in pgsql 7.0beta3).

Anyone understand what's going on here? Bug?

Regards,
Ed Loehr
Doesn't

    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

have to come within the transaction - i.e.

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

In this order your test seems to behave correctly - only 1 row inserted.

--
Rex McMaster                                 rex@mcmaster.wattle.id.au
http://www.compsoft.com.au/~rmcm/pgp-pk

Ed Loehr writes:
>
> But the following sequence seems to contradict this:
>
> -- Transaction A
> DROP TABLE foo;
> CREATE TABLE foo (id INTEGER);
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> BEGIN;
>
> -- Transaction B
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> BEGIN;
> INSERT INTO foo (id)
>   SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
> COMMIT;
> SELECT * FROM foo;
>
> -- Transaction A
> SELECT * FROM foo;
> INSERT INTO foo (id)
>   SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
> SELECT * FROM foo;
> COMMIT;
>
> This sequence allows B's inserts, AFTER A began, to be seen by A (at
> least, in pgsql 7.0beta3).
>
> Anyone understand what's going on here? Bug?
>
> Regards,
> Ed Loehr
rmcm@compsoft.com.au wrote:
>
> Doesn't
>
>     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
>
> have to come within the transaction - i.e.
>
>     BEGIN;
>     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
>
> In this order your test seems to behave correctly - only 1 row inserted.

Yes, my initial ordering was in error. But even after reordering, the
point is not that only 1 row was inserted, but rather that Transaction A
was able to see the effects of transaction B when it clearly should not.

Regards,
Ed Loehr
Sorry, mistake in my previous email -

> - only 1 row inserted

this was before the second commit. After both commits, 2 rows are
visible. Neither transaction can see the effects of the other till both
are committed.

A:
    CREATE TABLE foo (id INTEGER);
        ===> CREATE
    BEGIN;
        ===> BEGIN
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
        ===> SET VARIABLE
    INSERT INTO foo (id)
      SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
        ===> INSERT 959179 1

B:
    BEGIN;
        ===> BEGIN
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
        ===> SET VARIABLE
    SELECT * FROM foo;
        ===> 0 rows

A:
    SELECT * FROM foo;
        ===> 1 row    <== the row inserted in A

B:
    INSERT INTO foo (id)
      SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
        ===> INSERT 959155 1
    SELECT * FROM foo;
        ===> 1 row    <== the row inserted in B

A:
    SELECT * FROM foo;
        ===> 1 row

B:
    COMMIT;
        ===> END
    SELECT * FROM foo;
        ===> 1 row

A:
    SELECT * FROM foo;
        ===> 1 row
    COMMIT;
        ===> END
    SELECT * FROM foo;
        ===> 2 rows

B:
    SELECT * FROM foo;
        ===> 2 rows

Ed Loehr writes:
> rmcm@compsoft.com.au wrote:
> >
> > Doesn't
> >
> >     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> >
> > have to come within the transaction - i.e.
> >
> >     BEGIN;
> >     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> >
> > In this order your test seems to behave correctly - only 1 row
> > inserted.
>
> Yes, my initial ordering was in error. But even after reordering, the
> point is not that only 1 row was inserted, but rather that Transaction
> A was able to see the effects of transaction B when it clearly should
> not.
>
> Regards,
> Ed Loehr

--
Rex McMaster                                 rex@mcmaster.wattle.id.au
http://www.compsoft.com.au/~rmcm/pgp-pk
Rex McMaster wrote in an email addressed to Ed Loehr and pgsql-general:
>
> After both commits, 2 rows are visible. Neither transaction can see the
> effects of the other till both are committed.
>
> A:
>     BEGIN;
>     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
>     INSERT INTO foo (id)
>       SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
> B:
>     BEGIN;
>     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
>     INSERT INTO foo (id)
>       SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
>     COMMIT;
> A:
>     COMMIT;

While it is true that neither transaction sees the effect of the other,
the sequence of execution in PostgreSQL depicted above is not
serializable. There is no serial execution of the two transactions,
neither A followed by B, nor B followed by A, that would produce the
same results as the interleaved execution of the two transactions.

The fact that NEITHER transaction sees ANY effect of the other is
precisely the problem! One SHOULD see the effects of the other. The
outcome should be equivalent to one occurring entirely before the other.
With a concurrency control implementation based on locks, one
transaction would see the effect of the other by being blocked at a
suitable point, or by being terminated with a deadlock exception.

Joachim

--
joachim@kraut.bc.ca (http://www.kraut.bc.ca)
joachim@mercury.bc.ca (http://www.mercury.bc.ca)
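Spelling that argument out against the two possible serial orders
(nothing new here, just the bookkeeping):

    -- Serial order A then B: A's NOT EXISTS finds no row, so A inserts;
    --   B's NOT EXISTS then sees A's row, so B inserts nothing. Result: 1 row.
    -- Serial order B then A: symmetric.                         Result: 1 row.
    -- Interleaved run above: each NOT EXISTS runs against a snapshot
    --   without the other's row, so both inserts succeed.       Result: 2 rows.
    -- Two rows matches neither serial order, hence the execution is
    -- not serializable.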
> -----Original Message-----
> From: pgsql-general-owner@hub.org [mailto:pgsql-general-owner@hub.org]
> On Behalf Of Ed Loehr
>
> > Have to read up on the documentation, which I thought was claiming
> > serializability.
>
> Hmmm. This surprised me, too. The serialization doesn't work as I
> expected it to. Maybe someone can illuminate. Here's what confused
> me...
>
> The documentation says:
>
>     Serializable provides the highest transaction isolation.
>     When a transaction is on the serializable level, a query
>     sees only data committed before the transaction began and
>     never see either dirty data or concurrent transaction
>     changes committed during transaction execution. So, this
>     level emulates serial transaction execution, as if
>     transactions would be executed one after another, serially,
>     rather than concurrently.
>     - http://www.postgresql.org/docs/postgres/mvcc4508.htm (4/25/2000)
>
> But the following sequence seems to contradict this:
>
> -- Transaction A
> DROP TABLE foo;
> CREATE TABLE foo (id INTEGER);
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> BEGIN;
>
> -- Transaction B
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> BEGIN;
> INSERT INTO foo (id)
>   SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
> COMMIT;
> SELECT * FROM foo;
>
> -- Transaction A
> SELECT * FROM foo;
> INSERT INTO foo (id)
>   SELECT 1 WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
> SELECT * FROM foo;
> COMMIT;
>
> This sequence allows B's inserts, AFTER A began, to be seen by A (at
> least, in pgsql 7.0beta3).
>
> Anyone understand what's going on here? Bug?

You should call SET TRANSACTION immediately after BEGIN. Note that SET
TRANSACTION .. is a per-transaction command.

PostgreSQL's SERIALIZABLE isolation level would allow both inserts. The
READ COMMITTED isolation level wouldn't allow A's insert.

As I mentioned in another posting, PostgreSQL's SERIALIZABLE isolation
level isn't completely serializable, and it's the same as Oracle.
Probably Vadim refers to this incomplete serializability somewhere in
the documentation. It seems almost impossible to implement complete
serializability under MVCC without using table-level locking. I love
MVCC much more than theoretically beautiful complete serializability.

Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
Hiroshi Inoue wrote:
>
> You should call SET TRANSACTION immediately after BEGIN. Note that SET
> TRANSACTION .. is a per-transaction command.
>
> PostgreSQL's SERIALIZABLE isolation level would allow both inserts. The
> READ COMMITTED isolation level wouldn't allow A's insert.

Even if I call SET after BEGIN, it is not consistent with the docs or
the standard (see the pghackers discussion under the same subject), as
the two scripts below seem to demonstrate.

[Rex: There is a subtle difference between your sequence and mine.
Insert from B first, and don't do any select from A before you attempt
the insert, as below.]

> As I mentioned in another posting, PostgreSQL's SERIALIZABLE isolation
> level isn't completely serializable, and it's the same as Oracle.
> Probably Vadim refers to this incomplete serializability somewhere in
> the documentation. It seems almost impossible to implement complete
> serializability under MVCC without using table-level locking. I love
> MVCC much more than theoretically beautiful complete serializability.

Interesting, thanks. I certainly don't have any big gripes about PG
concurrency, as it has worked flawlessly for me so far at the READ
COMMITTED level. All this has been strictly in the interest of
clarifying a discrepancy between the docs/std and 7.0. I could see it
mattering to some, but not for my app.

Regards,
Ed Loehr

===

This sequence, AFAICT, appears to fail the SERIALIZABLE standard,
allowing A to see effects from B that permute the serializability...

--------------------- START -----------------------------
-- Within transaction A --------------------------
DROP TABLE foo;
CREATE TABLE foo (id INTEGER, msg VARCHAR);
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Within transaction B --------------------------
BEGIN;
--SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO foo (id, msg)
  SELECT 1, 'From B'
  WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
COMMIT;
SELECT * FROM foo;

-- Within transaction A --------------------------
SELECT * FROM foo;
INSERT INTO foo (id, msg)
  SELECT 1, 'From A'
  WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
SELECT * FROM foo;
COMMIT;
--------------------- END -----------------------------

...while this sequence, which I would've thought to be functionally
identical, is compliant...

--------------------- START -----------------------------
-- Within transaction A ----------------------------
DROP TABLE foo;
CREATE TABLE foo (id INTEGER, msg VARCHAR);
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Within transaction B ----------------------------
BEGIN;
INSERT INTO foo (id, msg)
  SELECT 1, 'From B'
  WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);

-- Within transaction A ----------------------------
SELECT * FROM foo;  -- The magic query.

-- Within transaction B ----------------------------
COMMIT;
SELECT * FROM foo;

-- Within transaction A ----------------------------
SELECT * FROM foo;
INSERT INTO foo (id, msg)
  SELECT 1, 'From A'
  WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
SELECT * FROM foo;
COMMIT;
--------------------- END -----------------------------
At 11:59 AM 26-04-2000 -0500, Ed Loehr wrote:
>Joachim Achtzehnter wrote:
>>
>> In a message to Ed Loehr and pgsql-general, Lincoln Yeoh wrote:
>> > Then you do a commit on both, and you end up with two rows.
>>
>> This is disappointing indeed! What this means is that PostgreSQL
>> transactions are, in fact, not truly serializable. The concurrent

Well, it's not that disappointing, as it's not easily avoidable given
that transactions usually consist of multiple consecutive, non-zero-time,
order-sensitive statements. :)

So the database has to figure out how much information to share amongst
partially completed transactions in order to achieve maximum
serialisation/atomicity.

There will also be transactions which, to be serialised, must actually
occur serially. That is, a whole transaction has to go through before
the other one (or every other one), or at least a group of statements
within the transaction.

Stuff like select for update gives the database engine some hints, but
in the case I mentioned it is not enough (unless modified). Basically
the database needs to get a need-to-know summary of the transaction in a
single atomic statement; then it can better decide what to block. The
application is in the best position to give this summary.

So in the end, for maximum serialisation it may be a good idea to have
an arbitrary lock mechanism to cover the troublesome cases. e.g.

    getlock('arbitrary string'[, timeout]);
    getlock('all temporary product tables, created or about to be')
    getlock('all rows potential or real on all tables where key1=X or key2=Y')

If another transaction tries to get the same lock, it will wait for the
timeout, then abort with an error (possible timeout values: -1 = never,
0 or greater = time to wait in seconds).

Ok, so I'm biased toward how MySQL does it (it's simple and has a good
chance of working well). Yes, it shifts a lot to the application. But if
people have to do things like issue their multiple select-for-updates in
the right order (to prevent deadlocks), they might as well start using
something like this instead (or fix their architecture if possible ;) ).

And it's likely to be faster! Anyone else here like this arbitrary lock
thingy?

I'm very interested to know of other ways to achieve good serialisation,
especially database-centric methods.

Cheerio,
Link.

p.s. Would anyone actually need timeouts of a day (86400) or greater?
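For reference, the MySQL facility Link is describing is its
GET_LOCK/RELEASE_LOCK pair; roughly (the lock string is arbitrary and
chosen by the application):

    SELECT GET_LOCK('mytable,field1=x,field2=y', 10);
    -- returns 1 if the lock was obtained, 0 if the 10-second timeout
    -- expired while another connection held it, NULL on error

    -- ... perform the existence check and insert/update here ...

    SELECT RELEASE_LOCK('mytable,field1=x,field2=y');
    -- the lock is also released if the connection terminates

Because the lock lives on a name rather than a row, it can guard a row
that does not exist yet, which is exactly the non-existent-row case the
unique-index discussion keeps running into.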
> -----Original Message-----
> From: ed [mailto:ed] On Behalf Of Ed Loehr
>
> Hiroshi Inoue wrote:
> >
> > You should call SET TRANSACTION immediately after BEGIN. Note that
> > SET TRANSACTION .. is a per-transaction command.
> >
> > PostgreSQL's SERIALIZABLE isolation level would allow both inserts.
> > The READ COMMITTED isolation level wouldn't allow A's insert.
>
> Even if I call SET after BEGIN, it is not consistent with the docs or
> the standard (see the pghackers discussion under the same subject), as
> the two scripts below seem to demonstrate.

Hmm, the magic query ??? The snapshot may be created when the first
query is issued, though I don't remember well.

> This sequence, AFAICT, appears to fail the SERIALIZABLE standard,
> allowing A to see effects from B that permute the serializability...

AFAIK, serializability doesn't imply permutability.

Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
Ed,

I think the lack of a SELECT may be the critical item here, with respect
to reconciling pg's behaviour with (some of) its docs. The documentation
for SET indicates that the snapshot is taken from the first DML
statement (I cannot locate a similar reference in the MVCC docs). This
appears to be consistent with your observation below.

Rex

Ed Loehr writes:
> Even if I call SET after BEGIN, it is not consistent with the docs or
> the standard (see the pghackers discussion under the same subject), as
> the two scripts below seem to demonstrate.
>
> [Rex: There is a subtle difference between your sequence and mine.
> Insert from B first, and don't do any select from A before you attempt
> the insert, as below.]

--
Rex McMaster                                 rex@mcmaster.wattle.id.au
http://www.compsoft.com.au/~rmcm/pgp-pk
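If that reading is right, the practical rule would be: when a
transaction's view must be frozen as of BEGIN, issue a query right away.
A sketch, based only on the behaviour observed in this thread on
7.0beta3:

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT * FROM foo;  -- the "magic query": the snapshot is taken at
                        -- the first DML statement, so run one now rather
                        -- than after other sessions have committed
    -- ... subsequent statements see the database as of this point ...
    COMMIT;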
In a message to Hiroshi Inoue and pgsql-general, Ed Loehr wrote:
>
> -- Within transaction A --------------------------
> BEGIN;
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
>
> -- Within transaction B --------------------------
> BEGIN;
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> INSERT INTO foo (id, msg)
>   SELECT 1, 'From B'
>   WHERE NOT EXISTS (SELECT * FROM foo WHERE id = 1);
> COMMIT;
>
> -- Within transaction A --------------------------
> SELECT * FROM foo;
> ...

In this case, it is actually OK for A to see the committed results of B,
because the overall outcome is then equivalent to B occurring entirely
before A.

In general, it is important to remember what SERIALIZABLE means: A
particular concurrent execution of several transactions must have an
observable outcome that is equivalent to running the same transactions
one after the other (serialized). It is NOT required that the outcome be
equivalent to the result that would be observed by running the
transactions in a particular order, such as the order in which they were
actually started. The outcome is only required to be equivalent to some
(arbitrary) order.

A concurrency mechanism supports the SERIALIZABLE isolation level if it
guarantees that every concurrent execution of transactions is
serializable.

Joachim

--
private: joachim@kraut.bc.ca (http://www.kraut.bc.ca)
work:    joachim@mercury.bc.ca (http://www.mercury.bc.ca)
On Thu, 27 Apr 2000, Hiroshi Inoue wrote:

> PostgreSQL's SERIALIZABLE isolation level would allow both inserts. The
> READ COMMITTED isolation level wouldn't allow A's insert.
>
> As I mentioned in another posting, PostgreSQL's SERIALIZABLE isolation
> level isn't completely serializable, and it's the same as Oracle.
> Probably Vadim refers to this incomplete serializability somewhere in
> the documentation. It seems almost impossible to implement complete
> serializability under MVCC without using table-level locking. I love
> MVCC much more than theoretically beautiful complete serializability.

Given that Postgres correctly recognizes concurrent updates and aborts
one of the transactions, how difficult would it be to do the same for
inserts?

--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
> -----Original Message-----
> From: Peter Eisentraut [mailto:e99re41@DoCS.UU.SE]
> Sent: Thursday, April 27, 2000 4:46 PM
> To: Hiroshi Inoue
>
> On Thu, 27 Apr 2000, Hiroshi Inoue wrote:
>
> > PostgreSQL's SERIALIZABLE isolation level would allow both inserts.
> > The READ COMMITTED isolation level wouldn't allow A's insert.
> >
> > As I mentioned in another posting, PostgreSQL's SERIALIZABLE
> > isolation level isn't completely serializable, and it's the same as
> > Oracle. Probably Vadim refers to this incomplete serializability
> > somewhere in the documentation. It seems almost impossible to
> > implement complete serializability under MVCC without using
> > table-level locking. I love MVCC much more than theoretically
> > beautiful complete serializability.
>
> Given that Postgres correctly recognizes concurrent updates and aborts
> one of the transactions,

Is this what you mean: when a transaction is about to update a row which
has been updated by another transaction under the SERIALIZABLE isolation
level, the update is rejected with the message

    ERROR: Can't serialize access due to concurrent update.

> how difficult would it be to do the same for inserts?

Should INSERT/UPDATE/SELECT .. FOR UPDATE statements wait until
being-inserted rows are committed/aborted? This would mean INSERT
operations block all update operations on the same table.

Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
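The update case Peter refers to can be reproduced with two sessions. A
sketch, reusing the foo(id, msg) table from earlier in the thread (the
error text is the one Hiroshi quotes above):

Session A:
    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    UPDATE foo SET msg = 'from A' WHERE id = 1;

Session B:
    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    UPDATE foo SET msg = 'from B' WHERE id = 1;
    -- blocks, waiting for A's outcome

Session A:
    COMMIT;

Session B:
    -- aborts: ERROR: Can't serialize access due to concurrent update
    -- (had A rolled back instead, B's update would have proceeded)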
> Ok, so I'm biased toward how MySQL does it (it's simple and has a good
> chance of working well). Yes, it shifts a lot to the application. But
> if people have to do things like issue their multiple select-for-updates
> in the right order (to prevent deadlocks), they might as well start
> using something like this instead (or fix their architecture if
> possible ;) ).
>
> And it's likely to be faster! Anyone else here like this arbitrary lock
> thingy?
>
> I'm very interested to know of other ways to achieve good
> serialisation, especially database-centric methods.
>
> Cheerio,
>
> Link.
>
> p.s. Would anyone actually need timeouts of a day (86400) or greater?

Are you asking for sub-second timeout values? If so, we could add that.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
rmcm@compsoft.com.au wrote:
>
> ...the snapshot is taken from the first DML statement...

That explains it. I missed that in the docs, and was misled by the
SERIALIZABLE doc sections.

Regards,
Ed Loehr
Joachim Achtzehnter wrote:
>
> ...It is NOT required that the outcome be equivalent to the result that
> would be observed by running the transactions in a particular order,
> such as the order in which they were actually started. The outcome is
> only required to be equivalent to some (arbitrary) order.

That was another fundamental misunderstanding of mine about
SERIALIZABLE. I appreciate the explanation, and have a deeper
appreciation for Vadim's MVCC work!!

Regards,
Ed Loehr
On Thu, 27 Apr 2000, Bruce Momjian wrote:

> Are you asking for sub-second timeout values? If so, we could add that.

I already have millisecond resolution for deadlock timeouts implemented.
Is that okay?

--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
At 07:35 AM 27-04-2000 -0400, Bruce Momjian wrote:
>> p.s. Would anyone actually need timeouts of a day (86400) or greater?
>
>Are you asking for sub-second timeout values? If so, we could add that.

No. I was wondering about the other end - timeouts of a day or more.
That would affect how many bytes are needed to store timeout values.

Cheerio,
Link.
> On Thu, 27 Apr 2000, Bruce Momjian wrote:
>
> > Are you asking for sub-second timeout values? If so, we could add
> > that.
>
> I already have millisecond resolution for deadlock timeouts
> implemented. Is that okay?

Sure, that should work.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026