Thread: [HACKERS] Transaction control in procedures

[HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

31 October 2017, 22:38:05

Here is a patch that implements transaction control in PL/Python
procedures.  (This patch goes on top of "SQL procedures" patch v1.)

So you can do this:

CREATE PROCEDURE transaction_test1()
LANGUAGE plpythonu
AS $$
for i in range(0, 10):
    plpy.execute("INSERT INTO test1 (a) VALUES (%d)" % i)
    if i % 2 == 0:
         plpy.commit()
    else:
         plpy.rollback()
$$;

CALL transaction_test1();

I started with PL/Python because the internal structures there are more
manageable.  Obviously, people will want this for PL/pgSQL as well, and
I have that in the works.  It's not in a usable state, but I have found
that the work needed is essentially the same as in PL/Python for example.

I have discovered three groups of obstacles that needed addressing to
make this work.  At this point, the patch is more of a demo of what
these issues are, and if we come to satisfactory solutions for each of
them, things should fall into place more easily afterwards.

1) While calling CommitTransactionCommand() in the middle of a utility
command works just fine (several utility commands do this, of course),
calling AbortCurrentTransaction() in a similar way does not.  There are
a few pieces of code that think that a transaction abort will always
result in a return to the main control loop, and so they will just clean
away everything.  This is what the changes in portalmem.c are about.
Some comments there already hint about the issue.  No doubt this will
need further refinement.  I think it would be desirable in general to
separate the control flow concerns from the transaction management
concerns more cleanly.

2) SPI needs some work.  It thinks that it can clean everything away at
transaction end.  I have found that instead of TopTransactionContext one
can use PortalContext and get a more suitable life cycle for the memory.
 I have played with some variants to make this configurable (e.g.,
argument to SPI_connect()), but that didn't seem very useful.  There are
some comments indicating that there might not always be a PortalContext,
but the existing tests don't seem to mind.  (There was a thread recently
about making a fake PortalContext for autovacuum, so maybe the current
idea is that we make sure there always is a PortalContext.)  Maybe we
need another context like StatementContext or ProcedureContext.

There also needs to be a way via SPI to end transactions and allowing
*some* cleanup to happen but leaving the memory around.  I have done
that via additional SPI API functions like SPI_commit(), which are then
available to PL implementations.  I also tried making it possible
calling transaction statements directly via SPI_exec() or similar, but
that ended up a total disaster.  So from the API perspective, I like the
current implementation, but the details will no doubt need refinement.

3) The PL implementations themselves allocate memory in
transaction-bound contexts for convenience as well.  This is usually
easy to fix by switching to PortalContext as well.  As you see, the
PL/Python code part of the patch is actually very small.  Changes in
other PLs would be similar.

Two issues have not been addressed yet:

A) It would be desirable to be able to run commands such as VACUUM and
CREATE INDEX CONCURRENTLY in a procedure.  This needs a bit of thinking
and standards-lawyering about the semantics, like where exactly do
transactions begin and end in various combinations.  It will probably
also need a different entry point into SPI, because SPI_exec cannot
handle statements ending transactions.  But so far my assessment is that
this can be added in a mostly independent way later on.

B) There needs to be some kind of call stack for procedure and function
invocations, so that we can track when we are allowed to make
transaction controlling calls.  The key term in the SQL standard is
"non-atomic execution context", which seems specifically devised to
cover this scenario.  So for example, if you have CALL -> CALL -> CALL,
the third call can issue a transaction statement.  But if you have CALL
-> SELECT -> CALL, then the last call cannot, because the SELECT
introduces an atomic execution context.  I don't know if we have such a
thing yet or something that we could easily latch on to.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

v1-0001-Transaction-control-in-PL-Python-procedures.patch

Re: [HACKERS] Transaction control in procedures

From

Simon Riggs

Date:

09 November 2017, 02:48:38

On 31 October 2017 at 15:38, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> Here is a patch that implements transaction control in PL/Python
> procedures.  (This patch goes on top of "SQL procedures" patch v1.)

The patch is incredibly short for such a feature, which is probably a
good indication that it is feasible.

Amazing!

> So you can do this:
>
> CREATE PROCEDURE transaction_test1()
> LANGUAGE plpythonu
> AS $$
> for i in range(0, 10):
>     plpy.execute("INSERT INTO test1 (a) VALUES (%d)" % i)
>     if i % 2 == 0:
>          plpy.commit()
>     else:
>          plpy.rollback()
> $$;
>
> CALL transaction_test1();
>
> I started with PL/Python because the internal structures there are more
> manageable.

AFAICS we execute 10 INSERTs and each one runs in a new top-level xid.
The INSERTs succeed in all cases but we then abort odd-numbered ones
and commit even numbered ones.

What would happen if some of the INSERTs failed? Where would control
go to? (Maybe this is just "no change" in this particular proc)

What happens if the procedure is updated during execution? Presumably
it keeps executing the original version as seen in the initial
snapshot?

Does the xmin of this session advance after each transaction, or do we
hold the snapshot used for the procedure body open, causing us to hold
back xmin and prevent vacuuming from being effective?

What would happen if a procedure recursively called itself? And yet it
was updated half-way through? Would that throw an error (I think it
should).

> Obviously, people will want this for PL/pgSQL as well, and
> I have that in the works.  It's not in a usable state, but I have found
> that the work needed is essentially the same as in PL/Python for example.
>
> I have discovered three groups of obstacles that needed addressing to
> make this work.  At this point, the patch is more of a demo of what
> these issues are, and if we come to satisfactory solutions for each of
> them, things should fall into place more easily afterwards.
>
> 1) While calling CommitTransactionCommand() in the middle of a utility
> command works just fine (several utility commands do this, of course),
> calling AbortCurrentTransaction() in a similar way does not.  There are
> a few pieces of code that think that a transaction abort will always
> result in a return to the main control loop, and so they will just clean
> away everything.  This is what the changes in portalmem.c are about.
> Some comments there already hint about the issue.  No doubt this will
> need further refinement.  I think it would be desirable in general to
> separate the control flow concerns from the transaction management
> concerns more cleanly.

+1

> 2) SPI needs some work.  It thinks that it can clean everything away at
> transaction end.  I have found that instead of TopTransactionContext one
> can use PortalContext and get a more suitable life cycle for the memory.
>  I have played with some variants to make this configurable (e.g.,
> argument to SPI_connect()), but that didn't seem very useful.  There are
> some comments indicating that there might not always be a PortalContext,
> but the existing tests don't seem to mind.  (There was a thread recently
> about making a fake PortalContext for autovacuum, so maybe the current
> idea is that we make sure there always is a PortalContext.)  Maybe we
> need another context like StatementContext or ProcedureContext.
>
> There also needs to be a way via SPI to end transactions and allowing
> *some* cleanup to happen but leaving the memory around.  I have done
> that via additional SPI API functions like SPI_commit(), which are then
> available to PL implementations.  I also tried making it possible
> calling transaction statements directly via SPI_exec() or similar, but
> that ended up a total disaster.  So from the API perspective, I like the
> current implementation, but the details will no doubt need refinement.
>
> 3) The PL implementations themselves allocate memory in
> transaction-bound contexts for convenience as well.  This is usually
> easy to fix by switching to PortalContext as well.  As you see, the
> PL/Python code part of the patch is actually very small.  Changes in
> other PLs would be similar.

Is there some kind of interlock to prevent dropping the portal half way thru?

Will the SPI transaction control functions fail if called from a
normal function?

> Two issues have not been addressed yet:
>
> A) It would be desirable to be able to run commands such as VACUUM and
> CREATE INDEX CONCURRENTLY in a procedure.  This needs a bit of thinking
> and standards-lawyering about the semantics, like where exactly do
> transactions begin and end in various combinations.  It will probably
> also need a different entry point into SPI, because SPI_exec cannot
> handle statements ending transactions.  But so far my assessment is that
> this can be added in a mostly independent way later on.

Sounds like a separate commit, but perhaps it influences the design?

> B) There needs to be some kind of call stack for procedure and function
> invocations, so that we can track when we are allowed to make
> transaction controlling calls.  The key term in the SQL standard is
> "non-atomic execution context", which seems specifically devised to
> cover this scenario.  So for example, if you have CALL -> CALL -> CALL,
> the third call can issue a transaction statement.  But if you have CALL
> -> SELECT -> CALL, then the last call cannot, because the SELECT
> introduces an atomic execution context.  I don't know if we have such a
> thing yet or something that we could easily latch on to.

Yeh. The internal_xact flag is only set at EoXact, so its not really
set as described in the .h
It would certainly be useful to have some state that allows sanity
checking on weird state transitions.

What we would benefit from is a README that gives the theory of
operation, so everyone can read and agree.

Presumably we would need to contact authors of main PL languages to
get them to comment on the API requirements for their languages.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Transaction control in procedures

From

Merlin Moncure

Date:

14 November 2017, 17:27:11

On Wed, Nov 8, 2017 at 5:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 31 October 2017 at 15:38, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> Here is a patch that implements transaction control in PL/Python
>> procedures.  (This patch goes on top of "SQL procedures" patch v1.)
>
> The patch is incredibly short for such a feature, which is probably a
> good indication that it is feasible.
>
> Amazing!

I have to agree with that.  I'm really excited about this...

Some questions:
*) Will it be possible to do operations like this in pl/pgsql?

BEGIN SELECT INTO r * FROM foo;
 START TRANSACTION;  -- perhaps we ought to have a special function
for this instead (BEGIN is reserved, etc). SET transaction_isololation TO serializable;
...
*) Will there be any negative consequences to a procedure running
with an unbounded run time?  For example, something like:

LOOP SELECT check_for_stuff_to_do();
 IF stuff_to_do THEN   do_stuff(); ELSE   PERFORM pg_sleep(1); END IF;
END LOOP;

*) Will pg_cancel_backend() cancel the currently executing statement
or the procedure? (I guess probably the procedure but I'm curious)

*) Will long running procedures be subject to statement timeout (and
does it apply to the entire procedure)?  Will they be able to control
statement_timeout from within the procedure itself?

*) Will pg_stat_activity show the invoking CALL or the currently
executing statement?  I see a strong argument for showing both of
these things. although I understand that's out of scope here.

If these questions (especially the first two) come down the correct
way, then it will mean that I can stop coding in other languages
(primarily bash) for a fairly large number of cases that I really
think belong in the database itself.  This would really simplify
coding, some things in bash are really awkward to get right such as a
mutex to guarantee single script invocation.  My only real dependency
on the operation system environment at that point would be cron to
step in to the backround daemon process (which would immediately set
an advisory lock).

I'm somewhat surprised that SPI is the point of attack for this
functionality, but if it works that's really the best case scenario
(the only downside I can see is that the various out of core pl/s have
to implement the interface individually).

merlin

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

14 November 2017, 21:09:26

On 11/14/17 09:27, Merlin Moncure wrote:
> *) Will it be possible to do operations like this in pl/pgsql?
> 
> BEGIN
>   SELECT INTO r * FROM foo;
> 
>   START TRANSACTION;  -- perhaps we ought to have a special function
> for this instead (BEGIN is reserved, etc).
>   SET transaction_isololation TO serializable;
> ...

Eventually, I don't see why not.  Currently, it's not complete.

One detail in your example is that when you enter the procedure, you are
already in a transaction, so you would have to run either COMMIT or
ROLLBACK before the START TRANSACTION.

Also, you can't run SET TRANSACTION ISOLATION through SPI, so one would
have to implement a separate code path for that, but that would just be
a bit of leg work.

>  *) Will there be any negative consequences to a procedure running
> with an unbounded run time?  For example, something like:
> 
> LOOP
>   SELECT check_for_stuff_to_do();
> 
>   IF stuff_to_do
>   THEN
>     do_stuff();
>   ELSE
>     PERFORM pg_sleep(1);
>   END IF;
> END LOOP;

That procedure doesn't do anything with transactions, so it's just like
a long-running function.  Otherwise, it'd just be like long-running
client code managing transactions.

> *) Will pg_cancel_backend() cancel the currently executing statement
> or the procedure? (I guess probably the procedure but I'm curious)

Same as the way it currently works.  It will raise an exception, which
will travel up the stack and eventually issue an error or be caught.  If
someone knows more specific concerns here I could look into it, but I
don't see any problem.

> *) Will long running procedures be subject to statement timeout (and
> does it apply to the entire procedure)?

See previous item.

> Will they be able to control
> statement_timeout from within the procedure itself?

The statement timeout alarm is set by the top-level execution loop, so
you can't change a statement timeout that is already in progress.  But
you could change the GUC and commit it for the next top-level statement.

> *) Will pg_stat_activity show the invoking CALL or the currently
> executing statement?  I see a strong argument for showing both of
> these things. although I understand that's out of scope here.

Not different from a function execution, i.e., top-level statement.

> If these questions (especially the first two) come down the correct
> way, then it will mean that I can stop coding in other languages
> (primarily bash) for a fairly large number of cases that I really
> think belong in the database itself.  This would really simplify
> coding, some things in bash are really awkward to get right such as a
> mutex to guarantee single script invocation.  My only real dependency
> on the operation system environment at that point would be cron to
> step in to the backround daemon process (which would immediately set
> an advisory lock).

Well, some kind of scheduler around procedures would be pretty cool at
some point.

> I'm somewhat surprised that SPI is the point of attack for this
> functionality, but if it works that's really the best case scenario
> (the only downside I can see is that the various out of core pl/s have
> to implement the interface individually).

So I tried different things here, and I'll list them here to explain how
I got there.

Option zero is to not use SPI at all and implement a whole new internal
command execution system.  But that would obviously be a large amount of
work, and it would look 85% like SPI, and as it turns out it's not
necessary.

The first thing I tried out what to run transaction control statements
through SPI.  That turned out to be very complicated and confusing and
fragile, mainly because of the dichotomy between the internal
subtransaction management that the PLs do and the explicit transaction
control statements on the other hand.  It was just a giant unworkable mess.

The next thing I tried was to shut down (SPI_finish) SPI before a
transaction boundary command and restart it (SPI_connect) it afterwards.That would work in principle, but it would
requirea fair amount of

work in each PL, because they implicitly rely on SPI (or perhaps are
tangled up with SPI) for memory management.

The setup I finally arrived at was to implement the transaction boundary
commands as SPI API calls and let them internally make sure that only
the appropriate stuff is cleared away at transaction boundaries.  This
turned out to be the easiest and cleanest.  I have since the last patch
implemented the transaction control capabilities in PL/pgSQL, PL/Perl,
and PL/Tcl, and it was entirely trivial once the details were worked out
as I had shown in PL/Python.  I will post an updated patch with this soon.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Merlin Moncure

Date:

15 November 2017, 00:33:24

On Tue, Nov 14, 2017 at 12:09 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/14/17 09:27, Merlin Moncure wrote:
>> *) Will it be possible to do operations like this in pl/pgsql?
>>
>> BEGIN
>>   SELECT INTO r * FROM foo;
>>
>>   START TRANSACTION;  -- perhaps we ought to have a special function
>> for this instead (BEGIN is reserved, etc).
>>   SET transaction_isololation TO serializable;
>> ...
>
> Eventually, I don't see why not.  Currently, it's not complete.
>
> One detail in your example is that when you enter the procedure, you are
> already in a transaction, so you would have to run either COMMIT or
> ROLLBACK before the START TRANSACTION.

Ok, that's good, but it seems a little wonky to me to have to issue
COMMIT first.  Shouldn't that be the default?  Meaning you would not
be *in* a transaction unless you specified to be in one.

> Also, you can't run SET TRANSACTION ISOLATION through SPI, so one would
> have to implement a separate code path for that, but that would just be
> a bit of leg work.

Roger -- I'm more interested in if your design generally supports this
being able to this (either now or in the future...).  I'm hammering on
this point for basically two reasons:

1) Trying to understand if the MVCC snapshot creation can be
meaningfully controlled (I think so, but I'll verify)).
2) This is an important case for databases that want to run in a mode
(typically serializeable) but lower the isolation for specific cases;
for example to loop on a special flag being set in a table.  It's
annoying to only be able to specify this on the client side; I tend to
like to abstract arcane database considerations into the database
whenever possible.

>>  *) Will there be any negative consequences to a procedure running
>> with an unbounded run time?  For example, something like:
>>
>> LOOP
>>   SELECT check_for_stuff_to_do();
>>
>>   IF stuff_to_do
>>   THEN
>>     do_stuff();
>>   ELSE
>>     PERFORM pg_sleep(1);
>>   END IF;
>> END LOOP;
>
> That procedure doesn't do anything with transactions, so it's just like
> a long-running function.  Otherwise, it'd just be like long-running
> client code managing transactions.

Can we zero in on this?  The question implied, 'can you do this
without being in a transaction'?  PERFORM do_stuff() is a implicit
transaction, so it ought to end when the function returns right?
Meaning, assuming I was not already in a transaction when hitting this
block, I would not be subject to an endless transaction duration?

>>> *) Will pg_cancel_backend() cancel the currently executing statement
>> or the procedure? (I guess probably the procedure but I'm curious)
>
> Same as the way it currently works.  It will raise an exception, which
> will travel up the stack and eventually issue an error or be caught.  If
> someone knows more specific concerns here I could look into it, but I
> don't see any problem.

Yeah, that works.

>> I'm somewhat surprised that SPI is the point of attack for this
>> functionality, but if it works that's really the best case scenario
>> (the only downside I can see is that the various out of core pl/s have
>> to implement the interface individually).
>
> So I tried different things here, and I'll list them here to explain how
> I got there.
>
> Option zero is to not use SPI at all and implement a whole new internal
> command execution system.  But that would obviously be a large amount of
> work, and it would look 85% like SPI, and as it turns out it's not
> necessary.
>
> The first thing I tried out what to run transaction control statements
> through SPI.  That turned out to be very complicated and confusing and
> fragile, mainly because of the dichotomy between the internal
> subtransaction management that the PLs do and the explicit transaction
> control statements on the other hand.  It was just a giant unworkable mess.
>
> The next thing I tried was to shut down (SPI_finish) SPI before a
> transaction boundary command and restart it (SPI_connect) it afterwards.
>  That would work in principle, but it would require a fair amount of
> work in each PL, because they implicitly rely on SPI (or perhaps are
> tangled up with SPI) for memory management.
>
> The setup I finally arrived at was to implement the transaction boundary
> commands as SPI API calls and let them internally make sure that only
> the appropriate stuff is cleared away at transaction boundaries.  This
> turned out to be the easiest and cleanest.  I have since the last patch
> implemented the transaction control capabilities in PL/pgSQL, PL/Perl,
> and PL/Tcl, and it was entirely trivial once the details were worked out
> as I had shown in PL/Python.  I will post an updated patch with this soon.

well, you've convinced me.  now that you've got pl/pgsql implemented
I'll fire it up and see if I can make qualitative assessments...

merlin

Re: Transaction control in procedures

From

legrand legrand

Date:

15 November 2017, 01:40:56

will that kind of statement (that is permitted with Oracle but gives errors
ora-1555 snapshot too old) be permitted ?

beginfor c in (select id from tab where cond='blabla')loop   update tab set x=1 where id=c.id;   commit;end loop;
end;






--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

15 November 2017, 02:27:59

On 11/14/17 16:33, Merlin Moncure wrote:
>> One detail in your example is that when you enter the procedure, you are
>> already in a transaction, so you would have to run either COMMIT or
>> ROLLBACK before the START TRANSACTION.
> 
> Ok, that's good, but it seems a little wonky to me to have to issue
> COMMIT first.  Shouldn't that be the default?  Meaning you would not
> be *in* a transaction unless you specified to be in one.

But that's not how this feature is defined in the SQL standard and AFAIK
other implementations.  When you enter the procedure call, you are in a
transaction.  For one thing, a procedure does not *have* to do
transaction control.  So if it's a plain old procedure like a function
that just runs a few statements, there needs to be a transaction.  We
can't know ahead of time whether the procedure will execute a
transaction control statement and then retroactively change when the
transaction should have started.  Something like an autonomous
transaction procedure might work more like that, but not this feature.

>> Also, you can't run SET TRANSACTION ISOLATION through SPI, so one would
>> have to implement a separate code path for that, but that would just be
>> a bit of leg work.
> 
> Roger -- I'm more interested in if your design generally supports this
> being able to this (either now or in the future...).

Nothing in this patch really changes anything about how transactions
themselves work.  So I don't see why any of this shouldn't work.  As I
fill in the gaps in the code, I'll make sure to come back around to
this, but for the time being I can't say anything more.

> Can we zero in on this?  The question implied, 'can you do this
> without being in a transaction'?  PERFORM do_stuff() is a implicit
> transaction, so it ought to end when the function returns right?
> Meaning, assuming I was not already in a transaction when hitting this
> block, I would not be subject to an endless transaction duration?

In the server, you are always in a transaction, so that's not how this
works.  I think this also ties into my first response above.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

15 November 2017, 02:38:38

On 10/31/17 15:38, Peter Eisentraut wrote:
> Here is a patch that implements transaction control in PL/Python
> procedures.  (This patch goes on top of "SQL procedures" patch v1.)

Here is an updated patch, now on top of "SQL procedures" v2.

Relative to the previous patch v1 I added transaction control to
PL/pgSQL, PL/Perl, and PL/Tcl with relative ease.  (There is a weird
crash in one PL/Perl test that is currently commented out.  And I can't
get a proper backtrace.  Maybe something to do with recursive Perl
interpreters?)

I crash-coursed myself in PL/Perl and PL/Tcl (and Tcl).  If anyone has
more of a feel for those languages and wants to comment on the proposed
interfaces and internals, please chime in.

I also added tracking so that transaction control commands can only be
made in the proper context, currently meaning only top-level procedure
calls, not functions or other procedure calls.  This should be extended
to also allow nested CALLs without anything in between, but I need more
time to code that.

I'll spend a bit more time on tidying up a few things, and a bunch of
documentation is missing, but I currently don't see any more major
issues here.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v2-0001-Transaction-control-in-PL-procedures.patch

Re: [HACKERS] Transaction control in procedures

From

Merlin Moncure

Date:

15 November 2017, 16:38:15

On Tue, Nov 14, 2017 at 5:27 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/14/17 16:33, Merlin Moncure wrote:
>>> One detail in your example is that when you enter the procedure, you are
>>> already in a transaction, so you would have to run either COMMIT or
>>> ROLLBACK before the START TRANSACTION.
>>
>> Ok, that's good, but it seems a little wonky to me to have to issue
>> COMMIT first.  Shouldn't that be the default?  Meaning you would not
>> be *in* a transaction unless you specified to be in one.
>
> But that's not how this feature is defined in the SQL standard and AFAIK
> other implementations.  When you enter the procedure call, you are in a
> transaction.  For one thing, a procedure does not *have* to do
> transaction control.  So if it's a plain old procedure like a function
> that just runs a few statements, there needs to be a transaction.

Hm, OK.   Well, SQL Server (which is pretty far from the SQL standard)
works that way.  See here:
http://www.4guysfromrolla.com/webtech/080305-1.shtml.  DB2, which is
very close to the SQL standard, only supports COMMIT/ROLLBACK (not
begin/start etc)

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.0/com.ibm.nz.sproc.doc/c_sproc_transaction_commits_and_rollbacks.html.
Either approach is ok I guess, and always being in a transaction
probably has some advantages. performance being an obvious one.  With
DB2, the COMMIT statement acts as kind of a flush, or a paired
'commit;begin;'.

>> Can we zero in on this?  The question implied, 'can you do this
>> without being in a transaction'?  PERFORM do_stuff() is a implicit
>> transaction, so it ought to end when the function returns right?
>> Meaning, assuming I was not already in a transaction when hitting this
>> block, I would not be subject to an endless transaction duration?
>
> In the server, you are always in a transaction, so that's not how this
> works.  I think this also ties into my first response above.

I'll try this out myself, but as long as we can have a *bounded*
transaction lifetime (basically the time to do stuff + 1 second) via
something like:
LOOP <do stuff> COMMIT; PERFORM pg_sleep(1);
END LOOP;

... I'm good. I'll try your patch out ASAP.  Thanks for answering all
my questions.

merlin

Re: [HACKERS] Transaction control in procedures

From

Pavel Stehule

Date:

15 November 2017, 16:44:45

2017-11-15 14:38 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:

On Tue, Nov 14, 2017 at 5:27 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/14/17 16:33, Merlin Moncure wrote:
>>> One detail in your example is that when you enter the procedure, you are
>>> already in a transaction, so you would have to run either COMMIT or
>>> ROLLBACK before the START TRANSACTION.
>>
>> Ok, that's good, but it seems a little wonky to me to have to issue
>> COMMIT first. Shouldn't that be the default? Meaning you would not
>> be *in* a transaction unless you specified to be in one.
>
> But that's not how this feature is defined in the SQL standard and AFAIK
> other implementations. When you enter the procedure call, you are in a
> transaction. For one thing, a procedure does not *have* to do
> transaction control. So if it's a plain old procedure like a function
> that just runs a few statements, there needs to be a transaction.

Hm, OK. Well, SQL Server (which is pretty far from the SQL standard)
works that way. See here:
http://www.4guysfromrolla.com/webtech/080305-1.shtml. DB2, which is
very close to the SQL standard, only supports COMMIT/ROLLBACK (not
begin/start etc)
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.0/com.ibm.nz.sproc.doc/c_sproc_transaction_commits_and_rollbacks.html.
Either approach is ok I guess, and always being in a transaction
probably has some advantages. performance being an obvious one. With
DB2, the COMMIT statement acts as kind of a flush, or a paired
'commit;begin;'.

same in Oracle PL/SQL

>> Can we zero in on this? The question implied, 'can you do this
>> without being in a transaction'? PERFORM do_stuff() is a implicit
>> transaction, so it ought to end when the function returns right?
>> Meaning, assuming I was not already in a transaction when hitting this
>> block, I would not be subject to an endless transaction duration?
>
> In the server, you are always in a transaction, so that's not how this
> works. I think this also ties into my first response above.

I'll try this out myself, but as long as we can have a *bounded*
transaction lifetime (basically the time to do stuff + 1 second) via
something like:
LOOP
<do stuff>
COMMIT;
PERFORM pg_sleep(1);
END LOOP;

... I'm good. I'll try your patch out ASAP. Thanks for answering all
my questions.

merlin

Re: [HACKERS] Transaction control in procedures

From

Merlin Moncure

Date:

15 November 2017, 17:54:43

On Wed, Nov 15, 2017 at 7:38 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Tue, Nov 14, 2017 at 5:27 PM, Peter Eisentraut
>>> Can we zero in on this?  The question implied, 'can you do this
>>> without being in a transaction'?  PERFORM do_stuff() is a implicit
>>> transaction, so it ought to end when the function returns right?
>>> Meaning, assuming I was not already in a transaction when hitting this
>>> block, I would not be subject to an endless transaction duration?
>>
>> In the server, you are always in a transaction, so that's not how this
>> works.  I think this also ties into my first response above.
>
> I'll try this out myself, but as long as we can have a *bounded*
> transaction lifetime (basically the time to do stuff + 1 second) via
> something like:
> LOOP
>   <do stuff>
>   COMMIT;
>   PERFORM pg_sleep(1);
> END LOOP;
>
> ... I'm good. I'll try your patch out ASAP.  Thanks for answering all
> my questions.


Trying this out (v2 both patches,  compiled clean, thank you!),
postgres=# CREATE OR REPLACE PROCEDURE foo() AS
$$
BEGIN LOOP   PERFORM 1;   COMMIT;   RAISE NOTICE '%', now();   PERFORM pg_sleep(1); END LOOP;
END;
$$ LANGUAGE PLPGSQL;
CREATE PROCEDURE
Time: 0.996 ms
postgres=# call foo();
NOTICE:  2017-11-15 08:52:08.936025-06
NOTICE:  2017-11-15 08:52:08.936025-06

... I noticed that:
*) now() did not advance with commit and,
*) xact_start via pg_stat_activity did not advance

Shouldn't both of those advance with the in-loop COMMIT?

merlin

Re: Transaction control in procedures

From

Peter Eisentraut

Date:

16 November 2017, 00:21:56

On 11/14/17 17:40, legrand legrand wrote:
> will that kind of statement (that is permitted with Oracle but gives errors
> ora-1555 snapshot too old) be permitted ?
> 
> begin
>  for c in (select id from tab where cond='blabla')
>  loop
>     update tab set x=1 where id=c.id;
>     commit;
>  end loop;
> end;

Hmm, that currently results in

ERROR:  cannot commit while a portal is pinned

You say this fails in Oracle too.  Are we supposed to make this work
somehow?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

16 November 2017, 00:36:54

On 11/8/17 18:48, Simon Riggs wrote:
> What would happen if some of the INSERTs failed? Where would control
> go to? (Maybe this is just "no change" in this particular proc)

An exception is raised and unless the exception is caught (depending on
the PL), control leaves the procedure.  What is already committed stays.

> What happens if the procedure is updated during execution? Presumably
> it keeps executing the original version as seen in the initial
> snapshot?

correct

> Does the xmin of this session advance after each transaction, or do we
> hold the snapshot used for the procedure body open, causing us to hold
> back xmin and prevent vacuuming from being effective?
> 
> What would happen if a procedure recursively called itself? And yet it
> was updated half-way through? Would that throw an error (I think it
> should).

I don't think anything special happens here.  The snapshot that is used
to read the procedure source code and other meta information is released
at a transaction boundary.

>> 3) The PL implementations themselves allocate memory in
>> transaction-bound contexts for convenience as well.  This is usually
>> easy to fix by switching to PortalContext as well.  As you see, the
>> PL/Python code part of the patch is actually very small.  Changes in
>> other PLs would be similar.
> 
> Is there some kind of interlock to prevent dropping the portal half way thru?

I should probably look this up, but I don't think this is fundamentally
different from how VACUUM and CREATE INDEX CONCURRENTLY run inside a
portal and issue multiple transactions in sequence.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

16 November 2017, 00:42:35

On 11/15/17 09:54, Merlin Moncure wrote:
> ... I noticed that:
> *) now() did not advance with commit and,
> *) xact_start via pg_stat_activity did not advance
> 
> Shouldn't both of those advance with the in-loop COMMIT?

I think you are correct.  I'll include that in the next patch version.
It shouldn't be difficult.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

RE: Transaction control in procedures

From

legrand legrand

Date:

16 November 2017, 15:04:59

We are just opening the "close cursors on/at commit" specification ;o)

- MS SQL server: cursor_close_on_commit

- Firebird: close_cursors_at_commit

- DB2: "with hold" syntax

- ...

I think it a plus to support keeping opened cursors at commit time,

but impacts have to be checked in details ...

Oracle Ora-1555 error comes in the extreme situation where rows used inside the cursor are modified, commited, before to be fetched.

Re: [HACKERS] Transaction control in procedures

From

Merlin Moncure

Date:

16 November 2017, 17:00:14

On Wed, Nov 15, 2017 at 3:42 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/15/17 09:54, Merlin Moncure wrote:
>> ... I noticed that:
>> *) now() did not advance with commit and,
>> *) xact_start via pg_stat_activity did not advance
>>
>> Shouldn't both of those advance with the in-loop COMMIT?
>
> I think you are correct.  I'll include that in the next patch version.
> It shouldn't be difficult.

Thanks.  A couple of more things.

*) This error message is incorrect now:
postgres=# CREATE OR REPLACE PROCEDURE foo() AS
$$
BEGIN LOOP   SAVEPOINT x; END LOOP;
END;
$$ LANGUAGE PLPGSQL;
CREATE PROCEDURE
Time: 0.912 ms
postgres=# call foo();
ERROR:  cannot begin/end transactions in PL/pgSQL
HINT:  Use a BEGIN block with an EXCEPTION clause instead.
CONTEXT:  PL/pgSQL function foo() line 5 at SQL statement

I guess there are a few places that assume pl/pgsql is always run from
a in-transaction function.

*) Exception handlers seem to override COMMITs.  The the following
procedure will not insert any rows.  I wonder if this is the correct
behavior.  I think there's a pretty good case to be made to raise an
error if a COMMIT is issued if you're in an exception block.

CREATE OR REPLACE PROCEDURE foo() AS
$$
BEGIN LOOP   INSERT INTO foo DEFAULT VALUES;   COMMIT;   RAISE EXCEPTION 'test'; END LOOP;
EXCEPTION WHEN OTHERS THEN RAISE NOTICE '%', SQLERRM;
END;
$$ LANGUAGE PLPGSQL;

*) The documentation could use some work.  Would you like some help?

merlin

Re: Transaction control in procedures

From

Peter Eisentraut

Date:

16 November 2017, 21:36:09

On 11/16/17 07:04, legrand legrand wrote:
> We are just opening the  "close cursors on/at commit" specification ;o)
> 
> - MS SQL server: cursor_close_on_commit
> - Firebird: close_cursors_at_commit
> - DB2: "with hold" syntax
> - ...
> 
> I think it a plus to support keeping opened cursors at commit time,
> but impacts have to be checked in details ...

I think the facilities to support this in PostgreSQL are already there.
We'd just have to tweak PL/pgSQL to make some of its internal portals
"held" and then clean them up manually at some later point.  So I think
this is a localized detail, not a fundamental problem.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Transaction control in procedures

From

Merlin Moncure

Date:

17 November 2017, 02:00:38

On Thu, Nov 16, 2017 at 12:36 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/16/17 07:04, legrand legrand wrote:
>> We are just opening the  "close cursors on/at commit" specification ;o)
>>
>> - MS SQL server: cursor_close_on_commit
>> - Firebird: close_cursors_at_commit
>> - DB2: "with hold" syntax
>> - ...
>>
>> I think it a plus to support keeping opened cursors at commit time,
>> but impacts have to be checked in details ...
>
> I think the facilities to support this in PostgreSQL are already there.
> We'd just have to tweak PL/pgSQL to make some of its internal portals
> "held" and then clean them up manually at some later point.  So I think
> this is a localized detail, not a fundamental problem.

Automatically persisting cursors (WITH HOLD) can have some very
surprising performance considerations, except when the current code
execution depends on that particular cursor, in which case the current
behavior of raising a (hopefully better worded-) error seems
appropriate.  Cursors based on temporary tables could be exempt from
having to be closed or checked on COMMIT.

plpgsql does not have the facility to create held cursors
FWICT...automatic promotion seems pretty dubious.  It could certainly
be added, and cursors so held could be exempt from being force
closed/errored as well. In lieu of that, having users materialize data
in to temp tables for such cases seems reasonable.

merlin

Re: [HACKERS] Transaction control in procedures

From

Simon Riggs

Date:

17 November 2017, 02:27:03

On 15 November 2017 at 16:36, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/8/17 18:48, Simon Riggs wrote:
>> What would happen if some of the INSERTs failed? Where would control
>> go to? (Maybe this is just "no change" in this particular proc)
>
> An exception is raised and unless the exception is caught (depending on
> the PL), control leaves the procedure.  What is already committed stays.
>
>> What happens if the procedure is updated during execution? Presumably
>> it keeps executing the original version as seen in the initial
>> snapshot?
>
> correct
>
>> Does the xmin of this session advance after each transaction, or do we
>> hold the snapshot used for the procedure body open, causing us to hold
>> back xmin and prevent vacuuming from being effective?
>>
>> What would happen if a procedure recursively called itself? And yet it
>> was updated half-way through? Would that throw an error (I think it
>> should).
>
> I don't think anything special happens here.  The snapshot that is used
> to read the procedure source code and other meta information is released
> at a transaction boundary.

I think we need to document that, or at least note in README

It's quite important for VACUUM.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Simon Riggs

Date:

17 November 2017, 02:35:17

On 14 November 2017 at 13:09, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

>> *) Will pg_cancel_backend() cancel the currently executing statement
>> or the procedure? (I guess probably the procedure but I'm curious)
>
> Same as the way it currently works.  It will raise an exception, which
> will travel up the stack and eventually issue an error or be caught.  If
> someone knows more specific concerns here I could look into it, but I
> don't see any problem.
>
>> *) Will long running procedures be subject to statement timeout (and
>> does it apply to the entire procedure)?
>
> See previous item.
>
>> Will they be able to control
>> statement_timeout from within the procedure itself?
>
> The statement timeout alarm is set by the top-level execution loop, so
> you can't change a statement timeout that is already in progress.  But
> you could change the GUC and commit it for the next top-level statement.
>
>> *) Will pg_stat_activity show the invoking CALL or the currently
>> executing statement?  I see a strong argument for showing both of
>> these things. although I understand that's out of scope here.
>
> Not different from a function execution, i.e., top-level statement.

Which is the "top-level statement"? The CALL or the currently
executing statement within the proc? I think you mean former.

For the first two answers above the answer was "currently executing
statement", yet the third answer seems to be the procedure. So that is
a slight discrepancy.

ISTM we would like

1) a way to cancel execution of a procedure
2) a way to set a timeout to cancel execution of a procedure

as well as

1) a way to cancel execution of a statement that is running within a procedure
2) a way to set a timeout to cancel execution of a statement in a procedure

Visibility of what a routine is currently executing is the role of a
debugger utility/API.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Merlin Moncure

Date:

17 November 2017, 02:46:14

On Thu, Nov 16, 2017 at 5:35 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 14 November 2017 at 13:09, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>
>>> *) Will pg_cancel_backend() cancel the currently executing statement
>>> or the procedure? (I guess probably the procedure but I'm curious)
>>
>> Same as the way it currently works.  It will raise an exception, which
>> will travel up the stack and eventually issue an error or be caught.  If
>> someone knows more specific concerns here I could look into it, but I
>> don't see any problem.
>>
>>> *) Will long running procedures be subject to statement timeout (and
>>> does it apply to the entire procedure)?
>>
>> See previous item.
>>
>>> Will they be able to control
>>> statement_timeout from within the procedure itself?
>>
>> The statement timeout alarm is set by the top-level execution loop, so
>> you can't change a statement timeout that is already in progress.  But
>> you could change the GUC and commit it for the next top-level statement.
>>
>>> *) Will pg_stat_activity show the invoking CALL or the currently
>>> executing statement?  I see a strong argument for showing both of
>>> these things. although I understand that's out of scope here.
>>
>> Not different from a function execution, i.e., top-level statement.
>
> Which is the "top-level statement"? The CALL or the currently
> executing statement within the proc? I think you mean former.
>
> For the first two answers above the answer was "currently executing
> statement", yet the third answer seems to be the procedure. So that is
> a slight discrepancy.
>
> ISTM we would like
>
> 1) a way to cancel execution of a procedure
> 2) a way to set a timeout to cancel execution of a procedure
>
> as well as
>
> 1) a way to cancel execution of a statement that is running within a procedure
> 2) a way to set a timeout to cancel execution of a statement in a procedure
>
> Visibility of what a routine is currently executing is the role of a
> debugger utility/API.

How could you cancel a statement but not the procedure itself?
Cancelling (either by timeout or administrative) type errors
untrappable by design for very good reasons and untrapped errors ought
to return the database all the way to 'ready for query'.

merlin

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

17 November 2017, 18:16:39

On 11/16/17 18:35, Simon Riggs wrote:
> For the first two answers above the answer was "currently executing
> statement", yet the third answer seems to be the procedure. So that is
> a slight discrepancy.

That's the way function execution, or really any nested execution,
currently works.

> ISTM we would like
> 
> 1) a way to cancel execution of a procedure
> 2) a way to set a timeout to cancel execution of a procedure
> 
> as well as
> 
> 1) a way to cancel execution of a statement that is running within a procedure
> 2) a way to set a timeout to cancel execution of a statement in a procedure
> 
> Visibility of what a routine is currently executing is the role of a
> debugger utility/API.

That would probably be nice, but it would be an entirely separate
undertaking.  In particular, getting insight into some kind of execution
stack would necessarily be language specific.  We do have some of that
for PL/pgSQL of course.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Simon Riggs

Date:

23 November 2017, 02:19:45

On 18 November 2017 at 02:16, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/16/17 18:35, Simon Riggs wrote:
>> For the first two answers above the answer was "currently executing
>> statement", yet the third answer seems to be the procedure. So that is
>> a slight discrepancy.
>
> That's the way function execution, or really any nested execution,
> currently works.

I'm impressed that these features are so clean and simple. I wasn't
expecting that. I have very few review comments.

I vote in favour of applying these patches at the end of this CF, end of 11/17.
* Procedures
* Transaction control in PL/pgSQL (only)

That will give us 3 months to discuss problems and find solutions,
then later we can commit PL/Python, PL/perl and PL/tcl once we know
where the dragons are hiding.

If we delay, we will end up with some weird gotcha that needs changing
in the next release.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

28 November 2017, 21:33:59

On 10/31/17 15:38, Peter Eisentraut wrote:
> 2) SPI needs some work.  It thinks that it can clean everything away at
> transaction end.  I have found that instead of TopTransactionContext one
> can use PortalContext and get a more suitable life cycle for the memory.
>  I have played with some variants to make this configurable (e.g.,
> argument to SPI_connect()), but that didn't seem very useful.  There are
> some comments indicating that there might not always be a PortalContext,
> but the existing tests don't seem to mind.  (There was a thread recently
> about making a fake PortalContext for autovacuum, so maybe the current
> idea is that we make sure there always is a PortalContext.)  Maybe we
> need another context like StatementContext or ProcedureContext.

This could use more specific discussion, as it is a critical point.

One general theme in this patch is to use PortalContext instead of
TopTransactionContext (or similar).  In SPI_connect(), we have
   /*    * Create memory contexts for this procedure    *    * XXX it would be better to use PortalContext as the
parent
context, but    * we may not be inside a portal (consider deferred-trigger execution).    * Perhaps
CurTransactionContextwould do?  For now it doesn't matter    * because we clean up explicitly in AtEOSubXact_SPI().
*/  _SPI_current->procCxt = AllocSetContextCreate(TopTransactionContext,
"SPI Proc",

and my patch changes that to PortalContext in defiance of that comment.

So either the comment is incorrect or we have insufficient test coverage
or something in between.

ISTM that in the normal case, at the time deferred triggers are
executed, we are in the portal that executes the COMMIT command, so that
should work.  There are some cases that call CommitTransactionCommand()
internally, but they don't run in cases when triggers are pending (e.g.,
VACUUM).  Although logical apply workers might be a problem, but they
clearly postdate that comment.

In any case, the precedent in autovacuum appears to be to make a fake
PortalContext if needed, so we could do that.  Can we think of other
cases where that might be necessary, so I can construct some test cases?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Michael Paquier

Date:

30 November 2017, 06:14:10

On Wed, Nov 29, 2017 at 3:33 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> [snip]

Moved to next CF as the discussion is still hot.
-- 
Michael

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

01 December 2017, 22:48:51

On 11/14/17 18:38, Peter Eisentraut wrote:
> On 10/31/17 15:38, Peter Eisentraut wrote:
>> Here is a patch that implements transaction control in PL/Python
>> procedures.  (This patch goes on top of "SQL procedures" patch v1.)
> 
> Here is an updated patch, now on top of "SQL procedures" v2.

Here is a new patch, now on top of master.  The main changes are that a
lot of documentation has been added.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v3-0001-Transaction-control-in-PL-procedures.patch

Re: [HACKERS] Transaction control in procedures

From

Robert Haas

Date:

01 December 2017, 23:28:00

On Fri, Dec 1, 2017 at 2:48 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> Here is a new patch, now on top of master.  The main changes are that a
> lot of documentation has been added.

This feature doesn't have many tests.  I think it should have a lot
more of them.  It's tinkering with the transaction control machinery
of the system in a fairly fundamental way, and that could break
things.

I suggest, in particular, testing how it interactions with resources
such as cursors and prepared statements.  For example, what happens if
you commit or roll back inside a cursor-for loop (given that the
cursor is not holdable)?    There are several kinds of cursor loops in
PostgreSQL, plus there are cursors, prepared statements, and portals
that can be created using SQL commands or protocol messages.  I
suspect that there are quite a few tricky interactions there.

Other things to think about:

- COMMIT or ROLLBACK inside a PLpgsql block with an attached EXCEPTION
block, or when an SQL SAVEPOINT has been established previously.

- COMMIT or ROLLBACK inside a procedure with a SET clause attached,
and/or while SET LOCAL is in effect either at the inner or outer
level.

- COMMIT or ROLLBACK with open large objects.

- COMMIT inside a procedure fails because of a serialization failure,
deferred constraint, etc.

In some cases, there are not only questions of correctness (it
shouldn't crash/give wrong answers) but also definitional questions
(what exactly should happen in that case, anyway?).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

05 December 2017, 21:25:25

On 12/1/17 15:28, Robert Haas wrote:
> This feature doesn't have many tests.  I think it should have a lot
> more of them.  It's tinkering with the transaction control machinery
> of the system in a fairly fundamental way, and that could break
> things.

Thank you, these are great ideas.

> I suggest, in particular, testing how it interactions with resources
> such as cursors and prepared statements.  For example, what happens if
> you commit or roll back inside a cursor-for loop (given that the
> cursor is not holdable)?

This was discussed briefly earlier in the thread.  The mid-term fix is
to convert pinned cursors to holdable ones before a COMMIT in PL/pgSQL
and then clean them up separately later.  I have that mostly working,
but I'd like to hold it for a separate patch submission.  The short-term
fix is to prohibit COMMIT and ROLLBACK while a portal is pinned.

I think ROLLBACK in a cursor loop might not make sense, because the
cursor query itself could have side effects, so a rollback would have to
roll back the entire loop.  That might need more refined analysis before
it could be allowed.

> - COMMIT or ROLLBACK inside a PLpgsql block with an attached EXCEPTION
> block, or when an SQL SAVEPOINT has been established previously.

I think that needs to be prohibited because if you end transactions in
an exception-handled block, you can no longer actually roll back that
block when an exception occurs, which was the entire point.

> - COMMIT or ROLLBACK inside a procedure with a SET clause attached,

That also needs to be prohibited because of the way the GUC nesting
currently works.  It's probably possible to fix it, but it would be a
separate effort.

> and/or while SET LOCAL is in effect either at the inner or outer
> level.

That seems to work fine.

> - COMMIT or ROLLBACK with open large objects.

I haven't been able to reproduce any problems with that, but maybe I
haven't tried hard enough.

> - COMMIT inside a procedure fails because of a serialization failure,
> deferred constraint, etc.

That works fine.  The COMMIT fails and control exits the procedure using
the normal exception propagation.

I'll submit an updated patch with some fixes for the above and more
documentation.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Robert Haas

Date:

05 December 2017, 21:33:55

On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> I think ROLLBACK in a cursor loop might not make sense, because the
> cursor query itself could have side effects, so a rollback would have to
> roll back the entire loop.  That might need more refined analysis before
> it could be allowed.

COMMIT really has the same problem; if the cursor query has side
effects, you can't commit those side effects piecemeal as the loop
executed and have things behave sanely.

>> - COMMIT or ROLLBACK inside a procedure with a SET clause attached,
>
> That also needs to be prohibited because of the way the GUC nesting
> currently works.  It's probably possible to fix it, but it would be a
> separate effort.
>
>> and/or while SET LOCAL is in effect either at the inner or outer
>> level.
>
> That seems to work fine.

These two are related -- if you don't permit anything that makes
temporary changes to GUCs at all, like SET clauses attached to
functions, then SET LOCAL won't cause any problems.  The problem is if
you do a transaction operation when something set locally is in the
stack of values, but not at the top.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

06 December 2017, 17:41:57

On 12/5/17 13:33, Robert Haas wrote:
> On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> I think ROLLBACK in a cursor loop might not make sense, because the
>> cursor query itself could have side effects, so a rollback would have to
>> roll back the entire loop.  That might need more refined analysis before
>> it could be allowed.
> 
> COMMIT really has the same problem; if the cursor query has side
> effects, you can't commit those side effects piecemeal as the loop
> executed and have things behave sanely.

The first COMMIT inside the loop would commit the cursor query.  This
isn't all that different from what you'd get now if you coded this
manually using holdable cursors or just plain client code.  Clearly, you
can create a mess if the loop body interacts with the loop expression,
but that's already the case.

But if you coded something like this yourself now and ran a ROLLBACK
inside the loop, the holdable cursor would disappear (unless previously
committed), so you couldn't proceed with the loop.

The SQL standard for persistent stored modules explicitly prohibits
COMMIT and ROLLBACK in cursor loop bodies.  But I think people will
eventually want it.

>>> - COMMIT or ROLLBACK inside a procedure with a SET clause attached,
>>
>> That also needs to be prohibited because of the way the GUC nesting
>> currently works.  It's probably possible to fix it, but it would be a
>> separate effort.
>>
>>> and/or while SET LOCAL is in effect either at the inner or outer
>>> level.
>>
>> That seems to work fine.
> 
> These two are related -- if you don't permit anything that makes
> temporary changes to GUCs at all, like SET clauses attached to
> functions, then SET LOCAL won't cause any problems.  The problem is if
> you do a transaction operation when something set locally is in the
> stack of values, but not at the top.

Yes, that's exactly the problem.  So right now I'm just preventing the
problematic scenario.  So fix that, one would possibly have to replace
the stack by something not quite a stack.

New patch attached.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v4-0001-Transaction-control-in-PL-procedures.patch

Re: [HACKERS] Transaction control in procedures

From

Merlin Moncure

Date:

07 December 2017, 01:34:39

On Wed, Dec 6, 2017 at 8:41 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 12/5/17 13:33, Robert Haas wrote:
>> On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut
>> <peter.eisentraut@2ndquadrant.com> wrote:
>>> I think ROLLBACK in a cursor loop might not make sense, because the
>>> cursor query itself could have side effects, so a rollback would have to
>>> roll back the entire loop.  That might need more refined analysis before
>>> it could be allowed.
>>
>> COMMIT really has the same problem; if the cursor query has side
>> effects, you can't commit those side effects piecemeal as the loop
>> executed and have things behave sanely.
>
> The first COMMIT inside the loop would commit the cursor query.  This
> isn't all that different from what you'd get now if you coded this
> manually using holdable cursors or just plain client code.  Clearly, you
> can create a mess if the loop body interacts with the loop expression,
> but that's already the case.
>
> But if you coded something like this yourself now and ran a ROLLBACK
> inside the loop, the holdable cursor would disappear (unless previously
> committed), so you couldn't proceed with the loop.
>
> The SQL standard for persistent stored modules explicitly prohibits
> COMMIT and ROLLBACK in cursor loop bodies.  But I think people will
> eventually want it.

The may want it, but silently promoting all cursors to held ones is
not the way to give it to them, unless we narrow it down the the
'for-loop derived cursor' only.  Even then we should consider syntax
decoration:

FOR x IN SELECT .... WITH HOLD
LOOP

END LOOP;

merlin

Re: [HACKERS] Transaction control in procedures

From

Andrew Dunstan

Date:

08 December 2017, 02:47:51

On 12/06/2017 09:41 AM, Peter Eisentraut wrote:
> On 12/5/17 13:33, Robert Haas wrote:
>> On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut
>> <peter.eisentraut@2ndquadrant.com> wrote:
>>> I think ROLLBACK in a cursor loop might not make sense, because the
>>> cursor query itself could have side effects, so a rollback would have to
>>> roll back the entire loop.  That might need more refined analysis before
>>> it could be allowed.
>> COMMIT really has the same problem; if the cursor query has side
>> effects, you can't commit those side effects piecemeal as the loop
>> executed and have things behave sanely.
> The first COMMIT inside the loop would commit the cursor query.  This
> isn't all that different from what you'd get now if you coded this
> manually using holdable cursors or just plain client code.  Clearly, you
> can create a mess if the loop body interacts with the loop expression,
> but that's already the case.
>
> But if you coded something like this yourself now and ran a ROLLBACK
> inside the loop, the holdable cursor would disappear (unless previously
> committed), so you couldn't proceed with the loop.
>
> The SQL standard for persistent stored modules explicitly prohibits
> COMMIT and ROLLBACK in cursor loop bodies.  But I think people will
> eventually want it.
>
>>>> - COMMIT or ROLLBACK inside a procedure with a SET clause attached,
>>> That also needs to be prohibited because of the way the GUC nesting
>>> currently works.  It's probably possible to fix it, but it would be a
>>> separate effort.
>>>
>>>> and/or while SET LOCAL is in effect either at the inner or outer
>>>> level.
>>> That seems to work fine.
>> These two are related -- if you don't permit anything that makes
>> temporary changes to GUCs at all, like SET clauses attached to
>> functions, then SET LOCAL won't cause any problems.  The problem is if
>> you do a transaction operation when something set locally is in the
>> stack of values, but not at the top.
> Yes, that's exactly the problem.  So right now I'm just preventing the
> problematic scenario.  So fix that, one would possibly have to replace
> the stack by something not quite a stack.
>
>
> New patch attached.
>

In general this looks good. However, I'm getting runtime errors in
plperl_elog.c on two different Linux platforms (Fedora 25, Ubuntu
16.04). There seems to be something funky going on. And we do need to
work out why the commented out plperl test isn't working.

Detailed comments:

Referring to anonymous blocks might be a bit mystifying for some
readers, unless we note that they are invoked via DO.

I think this sentence should probably be worded a bit differently:

    (And of course, BEGIN and END have different meanings in PL/pgSQL.)

Perhaps "Note that" instead of "And of course,".

Why aren't the SPI functions that error out or return 0 just void
instead of returning an int?

The docs say SPI_set_non_atomic() returns 0 on success, but the code
says otherwise.

This sentence in the comment in SPI_Commit() is slightly odd:

    This restriction is particular to PLs implemented on top of SPI.

Perhaps "required by" rather than "particular to" would make it read better.

The empty free_commit() and free_rollback() functions in pl_funcs.c look
slightly odd. I realize that the compiler should optimize the calls
away, but it seems an odd style.

One other thing I wondered about was what if a PL function (say plperl)
used SPI to open an explicit cursor and then looped over it? If there
were a commit or rollback inside that loop we wouldn't have the same
protections we have in plpgsql, ISTM. I haven't tried this yet, so I'm
just speculating about what might happen.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

11 December 2017, 23:07:20

On 12/7/17 18:47, Andrew Dunstan wrote:
> Referring to anonymous blocks might be a bit mystifying for some
> readers, unless we note that they are invoked via DO.

Added parenthetical comments.

> I think this sentence should probably be worded a bit differently:
> 
>     (And of course, BEGIN and END have different meanings in PL/pgSQL.)
> 
> Perhaps "Note that" instead of "And of course,".

fixed

> Why aren't the SPI functions that error out or return 0 just void
> instead of returning an int?

Tried to align then with existing functions, but I agree it seems weird.
 Changed to return void.

> The docs say SPI_set_non_atomic() returns 0 on success, but the code
> says otherwise.

Fixed the documentation.

> This sentence in the comment in SPI_Commit() is slightly odd:
> 
>     This restriction is particular to PLs implemented on top of SPI.
> 
> Perhaps "required by" rather than "particular to" would make it read better.

fixed

> The empty free_commit() and free_rollback() functions in pl_funcs.c look
> slightly odd. I realize that the compiler should optimize the calls
> away, but it seems an odd style.

That's how the existing code for other statements looks as well.

> One other thing I wondered about was what if a PL function (say plperl)
> used SPI to open an explicit cursor and then looped over it? If there
> were a commit or rollback inside that loop we wouldn't have the same
> protections we have in plpgsql, ISTM. I haven't tried this yet, so I'm
> just speculating about what might happen.

Good point.  I added test cases similar to the plpgsql tests to the
other three languages, which not-so-amusingly gave three different
outcomes.  In PL/Perl in particular, the commit clears away the portal,
and the next call to spi_fetchrow() will then not find the cursor and
just return undefined.  So that is not so nice.  I'm thinking about
extending the portal pinning mechanism to the other languages as well,
which seems mildly useful independent of transaction management.  I will
propose a patch for that soon.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

20 December 2017, 18:08:26

Updated patch attached.

I have addressed the most recent review comments I believe.

The question about what happens to cursor loops in PL/Perl and PL/Python
would be addressed by the separate thread "portal pinning".  The test
cases in this patch are currently marked by FIXMEs.

I have changed the SPI API a bit.  I got rid of SPI_set_nonatomic() and
instead introduced SPI_connect_ext() that you can pass flags to.  The
advantage of that is that in the normal case we can continue to use the
existing memory contexts, so nothing changes for existing uses, which
seems desirable.  (This also appears to address some sporadic test
failures in PL/Perl.)

I have also cleaned up the changes in portalmem.c further, so the
changes are now even smaller.

The commit message in this patch contains more details about some of
these changes.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v5-0001-Transaction-control-in-PL-procedures.patch

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

06 January 2018, 00:30:33

A merge conflict has arisen, so for simplicity, here is an updated patch.

On 12/20/17 10:08, Peter Eisentraut wrote:
> Updated patch attached.
> 
> I have addressed the most recent review comments I believe.
> 
> The question about what happens to cursor loops in PL/Perl and PL/Python
> would be addressed by the separate thread "portal pinning".  The test
> cases in this patch are currently marked by FIXMEs.
> 
> I have changed the SPI API a bit.  I got rid of SPI_set_nonatomic() and
> instead introduced SPI_connect_ext() that you can pass flags to.  The
> advantage of that is that in the normal case we can continue to use the
> existing memory contexts, so nothing changes for existing uses, which
> seems desirable.  (This also appears to address some sporadic test
> failures in PL/Perl.)
> 
> I have also cleaned up the changes in portalmem.c further, so the
> changes are now even smaller.
> 
> The commit message in this patch contains more details about some of
> these changes.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v6-0001-Transaction-control-in-PL-procedures.patch

Re: [HACKERS] Transaction control in procedures

From

Andrew Dunstan

Date:

15 January 2018, 20:57:43


On 01/05/2018 04:30 PM, Peter Eisentraut wrote:
> A merge conflict has arisen, so for simplicity, here is an updated patch.
>
> On 12/20/17 10:08, Peter Eisentraut wrote:
>> Updated patch attached.
>>
>> I have addressed the most recent review comments I believe.
>>
>> The question about what happens to cursor loops in PL/Perl and PL/Python
>> would be addressed by the separate thread "portal pinning".  The test
>> cases in this patch are currently marked by FIXMEs.
>>
>> I have changed the SPI API a bit.  I got rid of SPI_set_nonatomic() and
>> instead introduced SPI_connect_ext() that you can pass flags to.  The
>> advantage of that is that in the normal case we can continue to use the
>> existing memory contexts, so nothing changes for existing uses, which
>> seems desirable.  (This also appears to address some sporadic test
>> failures in PL/Perl.)
>>
>> I have also cleaned up the changes in portalmem.c further, so the
>> changes are now even smaller.
>>
>> The commit message in this patch contains more details about some of
>> these changes.


Generally looks good.

This confused me slightly:

    +    Transactions cannot be ended inside loops through query results
    or inside
    +    blocks with exception handlers.

I suggest: "A transaction cannot be ended inside a loop over query
results, nor inside a block with exception handlers."

The patch has bitrotted slightly in src/backend/commands/portalcmds.c

The plperl expected file needs updating. Also, why does spi_commit() in
a loop result in an error message but not spi_rollback()?


cheers

andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

16 January 2018, 18:16:15

On 1/15/18 12:57, Andrew Dunstan wrote:
> This confused me slightly:
> 
>     +    Transactions cannot be ended inside loops through query results
>     or inside
>     +    blocks with exception handlers.
> 
> I suggest: "A transaction cannot be ended inside a loop over query
> results, nor inside a block with exception handlers."

fixed

> The patch has bitrotted slightly in src/backend/commands/portalcmds.c

merged

> The plperl expected file needs updating. Also, why does spi_commit() in
> a loop result in an error message but not spi_rollback()?

This is all changed now after the patch for portal pinning in PL/Perl
and PL/Python has been committed.  The attached patch behaves better.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v7-0001-Transaction-control-in-PL-procedures.patch

Re: [HACKERS] Transaction control in procedures

From

Andrew Dunstan

Date:

16 January 2018, 23:24:47


On 01/16/2018 10:16 AM, Peter Eisentraut wrote:
> On 1/15/18 12:57, Andrew Dunstan wrote:
>> This confused me slightly:
>>
>>     +    Transactions cannot be ended inside loops through query results
>>     or inside
>>     +    blocks with exception handlers.
>>
>> I suggest: "A transaction cannot be ended inside a loop over query
>> results, nor inside a block with exception handlers."
> fixed
>
>> The patch has bitrotted slightly in src/backend/commands/portalcmds.c
> merged
>
>> The plperl expected file needs updating. Also, why does spi_commit() in
>> a loop result in an error message but not spi_rollback()?
> This is all changed now after the patch for portal pinning in PL/Perl
> and PL/Python has been committed.  The attached patch behaves better.
>


Looks good. Marking ready for committer.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Simon Riggs

Date:

19 January 2018, 16:31:30

On 6 December 2017 at 22:34, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Dec 6, 2017 at 8:41 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> On 12/5/17 13:33, Robert Haas wrote:
>>> On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut
>>> <peter.eisentraut@2ndquadrant.com> wrote:
>>>> I think ROLLBACK in a cursor loop might not make sense, because the
>>>> cursor query itself could have side effects, so a rollback would have to
>>>> roll back the entire loop.  That might need more refined analysis before
>>>> it could be allowed.
>>>
>>> COMMIT really has the same problem; if the cursor query has side
>>> effects, you can't commit those side effects piecemeal as the loop
>>> executed and have things behave sanely.
>>
>> The first COMMIT inside the loop would commit the cursor query.  This
>> isn't all that different from what you'd get now if you coded this
>> manually using holdable cursors or just plain client code.  Clearly, you
>> can create a mess if the loop body interacts with the loop expression,
>> but that's already the case.
>>
>> But if you coded something like this yourself now and ran a ROLLBACK
>> inside the loop, the holdable cursor would disappear (unless previously
>> committed), so you couldn't proceed with the loop.
>>
>> The SQL standard for persistent stored modules explicitly prohibits
>> COMMIT and ROLLBACK in cursor loop bodies.  But I think people will
>> eventually want it.
>
> The may want it, but silently promoting all cursors to held ones is
> not the way to give it to them, unless we narrow it down the the
> 'for-loop derived cursor' only.

I don't think we should do that automatically for all cursors, but it
seems clear that we would want that iff the loop contains COMMIT or
ROLLBACK.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Simon Riggs

Date:

19 January 2018, 16:42:56

On 16 January 2018 at 20:24, Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

> Looks good. Marking ready for committer.

Few questions/points for the docs.

Docs say: "A new transaction is started automatically after a
transaction is ended using these commands"
Presumably this would have exactly the same isolation level and other
transaction characteristics?
(Is it somehow possible to vary that. OK if not, no problem)

The error "cannot commit while a subtransaction is active"
is commented as intending to prevent COMMIT/ROLLBACK inside an EXCEPTION block.
That makes sense. It seems it will also prevent SAVEPOINTs, though
that seems not to be intended.
The two cases are dissimilar and it would be possible to block the
former but allow the latter.

It's not documented or tested that SET LOCAL would work or not work.
Does it work?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Transaction control in procedures

From

Peter Eisentraut

Date:

22 January 2018, 16:54:14

On 1/16/18 15:24, Andrew Dunstan wrote:
> Looks good. Marking ready for committer.

committed

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services