Thread: SAVEPOINT SQL conformance

SAVEPOINT SQL conformance

From
"Michael Paesold"
Date:
Developer docs have this (in SAVEPOINT command reference):
"SQL requires a savepoint to be destroyed automatically when another
savepoint with the same name is established. In PostgreSQL, the old
savepoint is kept, though only the more recent one will be used when rolling
back or releasing. (Releasing the newer savepoint will cause the older one
to again become accessible to ROLLBACK TO SAVEPOINT and RELEASE SAVEPOINT.)"

I read through the code in transam/xact.c recently. Now thinking about it
again, I am wondering, if this non-standard behaviour is really good. I have
found at least one case against it:

Imagine a program that wants to insert some (hundret/thousand) rows into a
table. The program expects some rows to be duplicates, but does not know
which, so it will just try...

BEGIN;
SAVEPOINT a;
INSERT INTO ...
SAVEPOINT a;
INSERT INTO ...
SAVEPOINT a;
...
(encountering an error it would just ROLLBACK TO a;)

According to the standard this is exactly the same as:

BEGIN;
SAVEPOINT a;
INSERT INTO ...
RELEASE SAVEPOINT a;
SAVEPOINT a;
INSERT INTO ...

If the first example code is used (which I would use if I did not think
about postgresql's exception), the subxact state stack in xact.c will grow
and grow and grow... whereas in the case of compliance with the standard, it
will not.
(or if you use the second example).

I have found some discussion in the archives that could explain, why it's
reasonable that postgres does not conform to the standard (although it's
probably not).

Bruce Momjian wrote:
> And consider this case:
>
> BEGIN;
> ...
> SAVEPOINT x;
> SELECT func_call();
> SELECT func_call();
> COMMIT;
>
> Now if func_call has a savepoint, it is really nested because it can't
> know whether the savepoint X will be used to roll back, so its status is
> dependent on the status of X.  Now, if we used savepoints in func_call,
> what happens in the second function call when we define a savepoint with
> the same name?  I assume we overwrite the original, but using nested
> transaction syntax seems much clearer.

Weird things can happen if savepoints have the same name accidentially.
Nevertheless, this is true in any case, wether a savepoint is overwritten by
a savepoint with the same name or not -- the other part will not know of the
first savepoint -- which will cause problems eventually.

If nobody can give a really good reason for the current behaviour, I would
really suggest to change to standard compliance.

Best Regards,
Michael Paesold

P.S. I know that there is still the problem of shared memory growth because
of the transaction id locks, but lets focus on one problem at a time :-).



Re: SAVEPOINT SQL conformance

From
Tom Lane
Date:
"Michael Paesold" <mpaesold@gmx.at> writes:
> If the first example code is used (which I would use if I did not think
> about postgresql's exception), the subxact state stack in xact.c will grow
> and grow and grow... whereas in the case of compliance with the standard, it
> will not.

This is fairly irrelevant though, as the state stack entry is only a
small part of the resources consumed by an uncommitted subtransaction.
I don't really think it outweighs the argument you quoted about
accidental collisions of savepoint names causing problems.

On the other hand, we do have provisions in the code for savepoint
naming levels, and so maybe a better answer to the collision issue
is to support savepoint levels more completely.  (But that's not
standard either.)
        regards, tom lane


Re: SAVEPOINT SQL conformance

From
Oliver Jowett
Date:
Michael Paesold wrote:

> BEGIN;
> SAVEPOINT a;
> INSERT INTO ...
> SAVEPOINT a;
> INSERT INTO ...
> SAVEPOINT a;
> ...
> (encountering an error it would just ROLLBACK TO a;)
> 
> According to the standard this is exactly the same as:
> 
> BEGIN;
> SAVEPOINT a;
> INSERT INTO ...
> RELEASE SAVEPOINT a;
> SAVEPOINT a;
> INSERT INTO ...

While that's true in this particular case, you can't do that 
transformation in the general case. Consider:

BEGIN
SAVEPOINT a -- work
SAVEPOINT b -- work
SAVEPOINT a -- work
ROLLBACK TO b -- work

This is valid: the standard says that the second "SAVEPOINT a" destroys 
and recreates the savepoint "a", but doesn't say that it destroys 
intervening savepoints. In contrast, RELEASE SAVEPOINT explicitly says 
that it destroys the specified savepoint and all savepoints established 
since the specified savepoint.

If you converted the second "SAVEPOINT a" into "RELEASE SAVEPOINT a; 
SAVEPOINT a" then savepoint "b" would be incorrectly destroyed.

It'd work for the (common?) case where there are no intervening 
savepoints, though.

-O


Re: SAVEPOINT SQL conformance

From
"Michael Paesold"
Date:
Tom Lane wrote:

> This is fairly irrelevant though, as the state stack entry is only a
> small part of the resources consumed by an uncommitted subtransaction.
> I don't really think it outweighs the argument you quoted about
> accidental collisions of savepoint names causing problems.

Perhaps I am wrong, but I think the problem of name collision exists anyway,
at least to some extent.

The current behaviour will help in this case:

BEGIN;
...
SAVEPOINT a;
SELECT func();
...
COMMIT;

where func does:
SAVEPOINT a;
....
RELEASE <or> ROLLBACK TO a;

But it will not help, if func only does:
SAVEPOINT a;

on error ROLLBACK TO a; (but no release path)

Then, if an error occurs after the function call, an the programm executes
ROLLBACK TO a; it will rollback to a state that existed inside the
function... rather bad again.

And... in PL/pgSQL you will use EXCEPTION blocks rather than SAVEPOINT
directly... will there are still the other languages.

I just wanted to show that it is still not _that_ save to use colliding
savepoint names.

Regards,
Michael Paesold



Re: SAVEPOINT SQL conformance

From
"Michael Paesold"
Date:
Oliver Jowett wrote:
> BEGIN
> SAVEPOINT a
>   -- work
> SAVEPOINT b
>   -- work
> SAVEPOINT a
>   -- work
> ROLLBACK TO b
>   -- work
>
> This is valid: the standard says that the second "SAVEPOINT a" destroys
> and recreates the savepoint "a", but doesn't say that it destroys
> intervening savepoints. In contrast, RELEASE SAVEPOINT explicitly says
> that it destroys the specified savepoint and all savepoints established
> since the specified savepoint.
>
> If you converted the second "SAVEPOINT a" into "RELEASE SAVEPOINT a;
> SAVEPOINT a" then savepoint "b" would be incorrectly destroyed.

You are right, that proves my proposal to be incorrect, because an implicit
RELEASE SAVEPOINT a; has side effects that are definitively against the
standard or what you would expect.

Best Regards,
Michael Paesold