Thread: [HACKERS] Transaction control in procedures
Here is a patch that implements transaction control in PL/Python procedures. (This patch goes on top of "SQL procedures" patch v1.) So you can do this: CREATE PROCEDURE transaction_test1() LANGUAGE plpythonu AS $$ for i in range(0, 10): plpy.execute("INSERT INTO test1 (a) VALUES (%d)" % i) if i % 2 == 0: plpy.commit() else: plpy.rollback() $$; CALL transaction_test1(); I started with PL/Python because the internal structures there are more manageable. Obviously, people will want this for PL/pgSQL as well, and I have that in the works. It's not in a usable state, but I have found that the work needed is essentially the same as in PL/Python for example. I have discovered three groups of obstacles that needed addressing to make this work. At this point, the patch is more of a demo of what these issues are, and if we come to satisfactory solutions for each of them, things should fall into place more easily afterwards. 1) While calling CommitTransactionCommand() in the middle of a utility command works just fine (several utility commands do this, of course), calling AbortCurrentTransaction() in a similar way does not. There are a few pieces of code that think that a transaction abort will always result in a return to the main control loop, and so they will just clean away everything. This is what the changes in portalmem.c are about. Some comments there already hint about the issue. No doubt this will need further refinement. I think it would be desirable in general to separate the control flow concerns from the transaction management concerns more cleanly. 2) SPI needs some work. It thinks that it can clean everything away at transaction end. I have found that instead of TopTransactionContext one can use PortalContext and get a more suitable life cycle for the memory. I have played with some variants to make this configurable (e.g., argument to SPI_connect()), but that didn't seem very useful. There are some comments indicating that there might not always be a PortalContext, but the existing tests don't seem to mind. (There was a thread recently about making a fake PortalContext for autovacuum, so maybe the current idea is that we make sure there always is a PortalContext.) Maybe we need another context like StatementContext or ProcedureContext. There also needs to be a way via SPI to end transactions and allowing *some* cleanup to happen but leaving the memory around. I have done that via additional SPI API functions like SPI_commit(), which are then available to PL implementations. I also tried making it possible calling transaction statements directly via SPI_exec() or similar, but that ended up a total disaster. So from the API perspective, I like the current implementation, but the details will no doubt need refinement. 3) The PL implementations themselves allocate memory in transaction-bound contexts for convenience as well. This is usually easy to fix by switching to PortalContext as well. As you see, the PL/Python code part of the patch is actually very small. Changes in other PLs would be similar. Two issues have not been addressed yet: A) It would be desirable to be able to run commands such as VACUUM and CREATE INDEX CONCURRENTLY in a procedure. This needs a bit of thinking and standards-lawyering about the semantics, like where exactly do transactions begin and end in various combinations. It will probably also need a different entry point into SPI, because SPI_exec cannot handle statements ending transactions. But so far my assessment is that this can be added in a mostly independent way later on. B) There needs to be some kind of call stack for procedure and function invocations, so that we can track when we are allowed to make transaction controlling calls. The key term in the SQL standard is "non-atomic execution context", which seems specifically devised to cover this scenario. So for example, if you have CALL -> CALL -> CALL, the third call can issue a transaction statement. But if you have CALL -> SELECT -> CALL, then the last call cannot, because the SELECT introduces an atomic execution context. I don't know if we have such a thing yet or something that we could easily latch on to. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 31 October 2017 at 15:38, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > Here is a patch that implements transaction control in PL/Python > procedures. (This patch goes on top of "SQL procedures" patch v1.) The patch is incredibly short for such a feature, which is probably a good indication that it is feasible. Amazing! > So you can do this: > > CREATE PROCEDURE transaction_test1() > LANGUAGE plpythonu > AS $$ > for i in range(0, 10): > plpy.execute("INSERT INTO test1 (a) VALUES (%d)" % i) > if i % 2 == 0: > plpy.commit() > else: > plpy.rollback() > $$; > > CALL transaction_test1(); > > I started with PL/Python because the internal structures there are more > manageable. AFAICS we execute 10 INSERTs and each one runs in a new top-level xid. The INSERTs succeed in all cases but we then abort odd-numbered ones and commit even numbered ones. What would happen if some of the INSERTs failed? Where would control go to? (Maybe this is just "no change" in this particular proc) What happens if the procedure is updated during execution? Presumably it keeps executing the original version as seen in the initial snapshot? Does the xmin of this session advance after each transaction, or do we hold the snapshot used for the procedure body open, causing us to hold back xmin and prevent vacuuming from being effective? What would happen if a procedure recursively called itself? And yet it was updated half-way through? Would that throw an error (I think it should). > Obviously, people will want this for PL/pgSQL as well, and > I have that in the works. It's not in a usable state, but I have found > that the work needed is essentially the same as in PL/Python for example. > > I have discovered three groups of obstacles that needed addressing to > make this work. At this point, the patch is more of a demo of what > these issues are, and if we come to satisfactory solutions for each of > them, things should fall into place more easily afterwards. > > 1) While calling CommitTransactionCommand() in the middle of a utility > command works just fine (several utility commands do this, of course), > calling AbortCurrentTransaction() in a similar way does not. There are > a few pieces of code that think that a transaction abort will always > result in a return to the main control loop, and so they will just clean > away everything. This is what the changes in portalmem.c are about. > Some comments there already hint about the issue. No doubt this will > need further refinement. I think it would be desirable in general to > separate the control flow concerns from the transaction management > concerns more cleanly. +1 > 2) SPI needs some work. It thinks that it can clean everything away at > transaction end. I have found that instead of TopTransactionContext one > can use PortalContext and get a more suitable life cycle for the memory. > I have played with some variants to make this configurable (e.g., > argument to SPI_connect()), but that didn't seem very useful. There are > some comments indicating that there might not always be a PortalContext, > but the existing tests don't seem to mind. (There was a thread recently > about making a fake PortalContext for autovacuum, so maybe the current > idea is that we make sure there always is a PortalContext.) Maybe we > need another context like StatementContext or ProcedureContext. > > There also needs to be a way via SPI to end transactions and allowing > *some* cleanup to happen but leaving the memory around. I have done > that via additional SPI API functions like SPI_commit(), which are then > available to PL implementations. I also tried making it possible > calling transaction statements directly via SPI_exec() or similar, but > that ended up a total disaster. So from the API perspective, I like the > current implementation, but the details will no doubt need refinement. > > 3) The PL implementations themselves allocate memory in > transaction-bound contexts for convenience as well. This is usually > easy to fix by switching to PortalContext as well. As you see, the > PL/Python code part of the patch is actually very small. Changes in > other PLs would be similar. Is there some kind of interlock to prevent dropping the portal half way thru? Will the SPI transaction control functions fail if called from a normal function? > Two issues have not been addressed yet: > > A) It would be desirable to be able to run commands such as VACUUM and > CREATE INDEX CONCURRENTLY in a procedure. This needs a bit of thinking > and standards-lawyering about the semantics, like where exactly do > transactions begin and end in various combinations. It will probably > also need a different entry point into SPI, because SPI_exec cannot > handle statements ending transactions. But so far my assessment is that > this can be added in a mostly independent way later on. Sounds like a separate commit, but perhaps it influences the design? > B) There needs to be some kind of call stack for procedure and function > invocations, so that we can track when we are allowed to make > transaction controlling calls. The key term in the SQL standard is > "non-atomic execution context", which seems specifically devised to > cover this scenario. So for example, if you have CALL -> CALL -> CALL, > the third call can issue a transaction statement. But if you have CALL > -> SELECT -> CALL, then the last call cannot, because the SELECT > introduces an atomic execution context. I don't know if we have such a > thing yet or something that we could easily latch on to. Yeh. The internal_xact flag is only set at EoXact, so its not really set as described in the .h It would certainly be useful to have some state that allows sanity checking on weird state transitions. What we would benefit from is a README that gives the theory of operation, so everyone can read and agree. Presumably we would need to contact authors of main PL languages to get them to comment on the API requirements for their languages. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Nov 8, 2017 at 5:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 31 October 2017 at 15:38, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> Here is a patch that implements transaction control in PL/Python >> procedures. (This patch goes on top of "SQL procedures" patch v1.) > > The patch is incredibly short for such a feature, which is probably a > good indication that it is feasible. > > Amazing! I have to agree with that. I'm really excited about this... Some questions: *) Will it be possible to do operations like this in pl/pgsql? BEGIN SELECT INTO r * FROM foo; START TRANSACTION; -- perhaps we ought to have a special function for this instead (BEGIN is reserved, etc). SET transaction_isololation TO serializable; ... *) Will there be any negative consequences to a procedure running with an unbounded run time? For example, something like: LOOP SELECT check_for_stuff_to_do(); IF stuff_to_do THEN do_stuff(); ELSE PERFORM pg_sleep(1); END IF; END LOOP; *) Will pg_cancel_backend() cancel the currently executing statement or the procedure? (I guess probably the procedure but I'm curious) *) Will long running procedures be subject to statement timeout (and does it apply to the entire procedure)? Will they be able to control statement_timeout from within the procedure itself? *) Will pg_stat_activity show the invoking CALL or the currently executing statement? I see a strong argument for showing both of these things. although I understand that's out of scope here. If these questions (especially the first two) come down the correct way, then it will mean that I can stop coding in other languages (primarily bash) for a fairly large number of cases that I really think belong in the database itself. This would really simplify coding, some things in bash are really awkward to get right such as a mutex to guarantee single script invocation. My only real dependency on the operation system environment at that point would be cron to step in to the backround daemon process (which would immediately set an advisory lock). I'm somewhat surprised that SPI is the point of attack for this functionality, but if it works that's really the best case scenario (the only downside I can see is that the various out of core pl/s have to implement the interface individually). merlin
On 11/14/17 09:27, Merlin Moncure wrote: > *) Will it be possible to do operations like this in pl/pgsql? > > BEGIN > SELECT INTO r * FROM foo; > > START TRANSACTION; -- perhaps we ought to have a special function > for this instead (BEGIN is reserved, etc). > SET transaction_isololation TO serializable; > ... Eventually, I don't see why not. Currently, it's not complete. One detail in your example is that when you enter the procedure, you are already in a transaction, so you would have to run either COMMIT or ROLLBACK before the START TRANSACTION. Also, you can't run SET TRANSACTION ISOLATION through SPI, so one would have to implement a separate code path for that, but that would just be a bit of leg work. > *) Will there be any negative consequences to a procedure running > with an unbounded run time? For example, something like: > > LOOP > SELECT check_for_stuff_to_do(); > > IF stuff_to_do > THEN > do_stuff(); > ELSE > PERFORM pg_sleep(1); > END IF; > END LOOP; That procedure doesn't do anything with transactions, so it's just like a long-running function. Otherwise, it'd just be like long-running client code managing transactions. > *) Will pg_cancel_backend() cancel the currently executing statement > or the procedure? (I guess probably the procedure but I'm curious) Same as the way it currently works. It will raise an exception, which will travel up the stack and eventually issue an error or be caught. If someone knows more specific concerns here I could look into it, but I don't see any problem. > *) Will long running procedures be subject to statement timeout (and > does it apply to the entire procedure)? See previous item. > Will they be able to control > statement_timeout from within the procedure itself? The statement timeout alarm is set by the top-level execution loop, so you can't change a statement timeout that is already in progress. But you could change the GUC and commit it for the next top-level statement. > *) Will pg_stat_activity show the invoking CALL or the currently > executing statement? I see a strong argument for showing both of > these things. although I understand that's out of scope here. Not different from a function execution, i.e., top-level statement. > If these questions (especially the first two) come down the correct > way, then it will mean that I can stop coding in other languages > (primarily bash) for a fairly large number of cases that I really > think belong in the database itself. This would really simplify > coding, some things in bash are really awkward to get right such as a > mutex to guarantee single script invocation. My only real dependency > on the operation system environment at that point would be cron to > step in to the backround daemon process (which would immediately set > an advisory lock). Well, some kind of scheduler around procedures would be pretty cool at some point. > I'm somewhat surprised that SPI is the point of attack for this > functionality, but if it works that's really the best case scenario > (the only downside I can see is that the various out of core pl/s have > to implement the interface individually). So I tried different things here, and I'll list them here to explain how I got there. Option zero is to not use SPI at all and implement a whole new internal command execution system. But that would obviously be a large amount of work, and it would look 85% like SPI, and as it turns out it's not necessary. The first thing I tried out what to run transaction control statements through SPI. That turned out to be very complicated and confusing and fragile, mainly because of the dichotomy between the internal subtransaction management that the PLs do and the explicit transaction control statements on the other hand. It was just a giant unworkable mess. The next thing I tried was to shut down (SPI_finish) SPI before a transaction boundary command and restart it (SPI_connect) it afterwards.That would work in principle, but it would requirea fair amount of work in each PL, because they implicitly rely on SPI (or perhaps are tangled up with SPI) for memory management. The setup I finally arrived at was to implement the transaction boundary commands as SPI API calls and let them internally make sure that only the appropriate stuff is cleared away at transaction boundaries. This turned out to be the easiest and cleanest. I have since the last patch implemented the transaction control capabilities in PL/pgSQL, PL/Perl, and PL/Tcl, and it was entirely trivial once the details were worked out as I had shown in PL/Python. I will post an updated patch with this soon. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Nov 14, 2017 at 12:09 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 11/14/17 09:27, Merlin Moncure wrote: >> *) Will it be possible to do operations like this in pl/pgsql? >> >> BEGIN >> SELECT INTO r * FROM foo; >> >> START TRANSACTION; -- perhaps we ought to have a special function >> for this instead (BEGIN is reserved, etc). >> SET transaction_isololation TO serializable; >> ... > > Eventually, I don't see why not. Currently, it's not complete. > > One detail in your example is that when you enter the procedure, you are > already in a transaction, so you would have to run either COMMIT or > ROLLBACK before the START TRANSACTION. Ok, that's good, but it seems a little wonky to me to have to issue COMMIT first. Shouldn't that be the default? Meaning you would not be *in* a transaction unless you specified to be in one. > Also, you can't run SET TRANSACTION ISOLATION through SPI, so one would > have to implement a separate code path for that, but that would just be > a bit of leg work. Roger -- I'm more interested in if your design generally supports this being able to this (either now or in the future...). I'm hammering on this point for basically two reasons: 1) Trying to understand if the MVCC snapshot creation can be meaningfully controlled (I think so, but I'll verify)). 2) This is an important case for databases that want to run in a mode (typically serializeable) but lower the isolation for specific cases; for example to loop on a special flag being set in a table. It's annoying to only be able to specify this on the client side; I tend to like to abstract arcane database considerations into the database whenever possible. >> *) Will there be any negative consequences to a procedure running >> with an unbounded run time? For example, something like: >> >> LOOP >> SELECT check_for_stuff_to_do(); >> >> IF stuff_to_do >> THEN >> do_stuff(); >> ELSE >> PERFORM pg_sleep(1); >> END IF; >> END LOOP; > > That procedure doesn't do anything with transactions, so it's just like > a long-running function. Otherwise, it'd just be like long-running > client code managing transactions. Can we zero in on this? The question implied, 'can you do this without being in a transaction'? PERFORM do_stuff() is a implicit transaction, so it ought to end when the function returns right? Meaning, assuming I was not already in a transaction when hitting this block, I would not be subject to an endless transaction duration? >>> *) Will pg_cancel_backend() cancel the currently executing statement >> or the procedure? (I guess probably the procedure but I'm curious) > > Same as the way it currently works. It will raise an exception, which > will travel up the stack and eventually issue an error or be caught. If > someone knows more specific concerns here I could look into it, but I > don't see any problem. Yeah, that works. >> I'm somewhat surprised that SPI is the point of attack for this >> functionality, but if it works that's really the best case scenario >> (the only downside I can see is that the various out of core pl/s have >> to implement the interface individually). > > So I tried different things here, and I'll list them here to explain how > I got there. > > Option zero is to not use SPI at all and implement a whole new internal > command execution system. But that would obviously be a large amount of > work, and it would look 85% like SPI, and as it turns out it's not > necessary. > > The first thing I tried out what to run transaction control statements > through SPI. That turned out to be very complicated and confusing and > fragile, mainly because of the dichotomy between the internal > subtransaction management that the PLs do and the explicit transaction > control statements on the other hand. It was just a giant unworkable mess. > > The next thing I tried was to shut down (SPI_finish) SPI before a > transaction boundary command and restart it (SPI_connect) it afterwards. > That would work in principle, but it would require a fair amount of > work in each PL, because they implicitly rely on SPI (or perhaps are > tangled up with SPI) for memory management. > > The setup I finally arrived at was to implement the transaction boundary > commands as SPI API calls and let them internally make sure that only > the appropriate stuff is cleared away at transaction boundaries. This > turned out to be the easiest and cleanest. I have since the last patch > implemented the transaction control capabilities in PL/pgSQL, PL/Perl, > and PL/Tcl, and it was entirely trivial once the details were worked out > as I had shown in PL/Python. I will post an updated patch with this soon. well, you've convinced me. now that you've got pl/pgsql implemented I'll fire it up and see if I can make qualitative assessments... merlin
will that kind of statement (that is permitted with Oracle but gives errors ora-1555 snapshot too old) be permitted ? beginfor c in (select id from tab where cond='blabla')loop update tab set x=1 where id=c.id; commit;end loop; end; -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On 11/14/17 16:33, Merlin Moncure wrote: >> One detail in your example is that when you enter the procedure, you are >> already in a transaction, so you would have to run either COMMIT or >> ROLLBACK before the START TRANSACTION. > > Ok, that's good, but it seems a little wonky to me to have to issue > COMMIT first. Shouldn't that be the default? Meaning you would not > be *in* a transaction unless you specified to be in one. But that's not how this feature is defined in the SQL standard and AFAIK other implementations. When you enter the procedure call, you are in a transaction. For one thing, a procedure does not *have* to do transaction control. So if it's a plain old procedure like a function that just runs a few statements, there needs to be a transaction. We can't know ahead of time whether the procedure will execute a transaction control statement and then retroactively change when the transaction should have started. Something like an autonomous transaction procedure might work more like that, but not this feature. >> Also, you can't run SET TRANSACTION ISOLATION through SPI, so one would >> have to implement a separate code path for that, but that would just be >> a bit of leg work. > > Roger -- I'm more interested in if your design generally supports this > being able to this (either now or in the future...). Nothing in this patch really changes anything about how transactions themselves work. So I don't see why any of this shouldn't work. As I fill in the gaps in the code, I'll make sure to come back around to this, but for the time being I can't say anything more. > Can we zero in on this? The question implied, 'can you do this > without being in a transaction'? PERFORM do_stuff() is a implicit > transaction, so it ought to end when the function returns right? > Meaning, assuming I was not already in a transaction when hitting this > block, I would not be subject to an endless transaction duration? In the server, you are always in a transaction, so that's not how this works. I think this also ties into my first response above. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/31/17 15:38, Peter Eisentraut wrote: > Here is a patch that implements transaction control in PL/Python > procedures. (This patch goes on top of "SQL procedures" patch v1.) Here is an updated patch, now on top of "SQL procedures" v2. Relative to the previous patch v1 I added transaction control to PL/pgSQL, PL/Perl, and PL/Tcl with relative ease. (There is a weird crash in one PL/Perl test that is currently commented out. And I can't get a proper backtrace. Maybe something to do with recursive Perl interpreters?) I crash-coursed myself in PL/Perl and PL/Tcl (and Tcl). If anyone has more of a feel for those languages and wants to comment on the proposed interfaces and internals, please chime in. I also added tracking so that transaction control commands can only be made in the proper context, currently meaning only top-level procedure calls, not functions or other procedure calls. This should be extended to also allow nested CALLs without anything in between, but I need more time to code that. I'll spend a bit more time on tidying up a few things, and a bunch of documentation is missing, but I currently don't see any more major issues here. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Tue, Nov 14, 2017 at 5:27 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 11/14/17 16:33, Merlin Moncure wrote: >>> One detail in your example is that when you enter the procedure, you are >>> already in a transaction, so you would have to run either COMMIT or >>> ROLLBACK before the START TRANSACTION. >> >> Ok, that's good, but it seems a little wonky to me to have to issue >> COMMIT first. Shouldn't that be the default? Meaning you would not >> be *in* a transaction unless you specified to be in one. > > But that's not how this feature is defined in the SQL standard and AFAIK > other implementations. When you enter the procedure call, you are in a > transaction. For one thing, a procedure does not *have* to do > transaction control. So if it's a plain old procedure like a function > that just runs a few statements, there needs to be a transaction. Hm, OK. Well, SQL Server (which is pretty far from the SQL standard) works that way. See here: http://www.4guysfromrolla.com/webtech/080305-1.shtml. DB2, which is very close to the SQL standard, only supports COMMIT/ROLLBACK (not begin/start etc) https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.0/com.ibm.nz.sproc.doc/c_sproc_transaction_commits_and_rollbacks.html. Either approach is ok I guess, and always being in a transaction probably has some advantages. performance being an obvious one. With DB2, the COMMIT statement acts as kind of a flush, or a paired 'commit;begin;'. >> Can we zero in on this? The question implied, 'can you do this >> without being in a transaction'? PERFORM do_stuff() is a implicit >> transaction, so it ought to end when the function returns right? >> Meaning, assuming I was not already in a transaction when hitting this >> block, I would not be subject to an endless transaction duration? > > In the server, you are always in a transaction, so that's not how this > works. I think this also ties into my first response above. I'll try this out myself, but as long as we can have a *bounded* transaction lifetime (basically the time to do stuff + 1 second) via something like: LOOP <do stuff> COMMIT; PERFORM pg_sleep(1); END LOOP; ... I'm good. I'll try your patch out ASAP. Thanks for answering all my questions. merlin
2017-11-15 14:38 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Tue, Nov 14, 2017 at 5:27 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/14/17 16:33, Merlin Moncure wrote:
>>> One detail in your example is that when you enter the procedure, you are
>>> already in a transaction, so you would have to run either COMMIT or
>>> ROLLBACK before the START TRANSACTION.
>>
>> Ok, that's good, but it seems a little wonky to me to have to issue
>> COMMIT first. Shouldn't that be the default? Meaning you would not
>> be *in* a transaction unless you specified to be in one.
>
> But that's not how this feature is defined in the SQL standard and AFAIK
> other implementations. When you enter the procedure call, you are in a
> transaction. For one thing, a procedure does not *have* to do
> transaction control. So if it's a plain old procedure like a function
> that just runs a few statements, there needs to be a transaction.
Hm, OK. Well, SQL Server (which is pretty far from the SQL standard)
works that way. See here:
http://www.4guysfromrolla.com/webtech/080305-1.shtml. DB2, which is
very close to the SQL standard, only supports COMMIT/ROLLBACK (not
begin/start etc)
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2. 0/com.ibm.nz.sproc.doc/c_ sproc_transaction_commits_and_ rollbacks.html.
Either approach is ok I guess, and always being in a transaction
probably has some advantages. performance being an obvious one. With
DB2, the COMMIT statement acts as kind of a flush, or a paired
'commit;begin;'.
same in Oracle PL/SQL
>> Can we zero in on this? The question implied, 'can you do this
>> without being in a transaction'? PERFORM do_stuff() is a implicit
>> transaction, so it ought to end when the function returns right?
>> Meaning, assuming I was not already in a transaction when hitting this
>> block, I would not be subject to an endless transaction duration?
>
> In the server, you are always in a transaction, so that's not how this
> works. I think this also ties into my first response above.
I'll try this out myself, but as long as we can have a *bounded*
transaction lifetime (basically the time to do stuff + 1 second) via
something like:
LOOP
<do stuff>
COMMIT;
PERFORM pg_sleep(1);
END LOOP;
... I'm good. I'll try your patch out ASAP. Thanks for answering all
my questions.
merlin
On Wed, Nov 15, 2017 at 7:38 AM, Merlin Moncure <mmoncure@gmail.com> wrote: > On Tue, Nov 14, 2017 at 5:27 PM, Peter Eisentraut >>> Can we zero in on this? The question implied, 'can you do this >>> without being in a transaction'? PERFORM do_stuff() is a implicit >>> transaction, so it ought to end when the function returns right? >>> Meaning, assuming I was not already in a transaction when hitting this >>> block, I would not be subject to an endless transaction duration? >> >> In the server, you are always in a transaction, so that's not how this >> works. I think this also ties into my first response above. > > I'll try this out myself, but as long as we can have a *bounded* > transaction lifetime (basically the time to do stuff + 1 second) via > something like: > LOOP > <do stuff> > COMMIT; > PERFORM pg_sleep(1); > END LOOP; > > ... I'm good. I'll try your patch out ASAP. Thanks for answering all > my questions. Trying this out (v2 both patches, compiled clean, thank you!), postgres=# CREATE OR REPLACE PROCEDURE foo() AS $$ BEGIN LOOP PERFORM 1; COMMIT; RAISE NOTICE '%', now(); PERFORM pg_sleep(1); END LOOP; END; $$ LANGUAGE PLPGSQL; CREATE PROCEDURE Time: 0.996 ms postgres=# call foo(); NOTICE: 2017-11-15 08:52:08.936025-06 NOTICE: 2017-11-15 08:52:08.936025-06 ... I noticed that: *) now() did not advance with commit and, *) xact_start via pg_stat_activity did not advance Shouldn't both of those advance with the in-loop COMMIT? merlin
On 11/14/17 17:40, legrand legrand wrote: > will that kind of statement (that is permitted with Oracle but gives errors > ora-1555 snapshot too old) be permitted ? > > begin > for c in (select id from tab where cond='blabla') > loop > update tab set x=1 where id=c.id; > commit; > end loop; > end; Hmm, that currently results in ERROR: cannot commit while a portal is pinned You say this fails in Oracle too. Are we supposed to make this work somehow? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/8/17 18:48, Simon Riggs wrote: > What would happen if some of the INSERTs failed? Where would control > go to? (Maybe this is just "no change" in this particular proc) An exception is raised and unless the exception is caught (depending on the PL), control leaves the procedure. What is already committed stays. > What happens if the procedure is updated during execution? Presumably > it keeps executing the original version as seen in the initial > snapshot? correct > Does the xmin of this session advance after each transaction, or do we > hold the snapshot used for the procedure body open, causing us to hold > back xmin and prevent vacuuming from being effective? > > What would happen if a procedure recursively called itself? And yet it > was updated half-way through? Would that throw an error (I think it > should). I don't think anything special happens here. The snapshot that is used to read the procedure source code and other meta information is released at a transaction boundary. >> 3) The PL implementations themselves allocate memory in >> transaction-bound contexts for convenience as well. This is usually >> easy to fix by switching to PortalContext as well. As you see, the >> PL/Python code part of the patch is actually very small. Changes in >> other PLs would be similar. > > Is there some kind of interlock to prevent dropping the portal half way thru? I should probably look this up, but I don't think this is fundamentally different from how VACUUM and CREATE INDEX CONCURRENTLY run inside a portal and issue multiple transactions in sequence. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/15/17 09:54, Merlin Moncure wrote: > ... I noticed that: > *) now() did not advance with commit and, > *) xact_start via pg_stat_activity did not advance > > Shouldn't both of those advance with the in-loop COMMIT? I think you are correct. I'll include that in the next patch version. It shouldn't be difficult. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
We are just opening the "close cursors on/at commit" specification ;o)
- MS SQL server: cursor_close_on_commit
- Firebird: close_cursors_at_commit
- DB2: "with hold" syntax
- ...
I think it a plus to support keeping opened cursors at commit time,
but impacts have to be checked in details ...
Oracle Ora-1555 error comes in the extreme situation where rows used inside the cursor are modified, commited, before to be fetched.
On Wed, Nov 15, 2017 at 3:42 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 11/15/17 09:54, Merlin Moncure wrote: >> ... I noticed that: >> *) now() did not advance with commit and, >> *) xact_start via pg_stat_activity did not advance >> >> Shouldn't both of those advance with the in-loop COMMIT? > > I think you are correct. I'll include that in the next patch version. > It shouldn't be difficult. Thanks. A couple of more things. *) This error message is incorrect now: postgres=# CREATE OR REPLACE PROCEDURE foo() AS $$ BEGIN LOOP SAVEPOINT x; END LOOP; END; $$ LANGUAGE PLPGSQL; CREATE PROCEDURE Time: 0.912 ms postgres=# call foo(); ERROR: cannot begin/end transactions in PL/pgSQL HINT: Use a BEGIN block with an EXCEPTION clause instead. CONTEXT: PL/pgSQL function foo() line 5 at SQL statement I guess there are a few places that assume pl/pgsql is always run from a in-transaction function. *) Exception handlers seem to override COMMITs. The the following procedure will not insert any rows. I wonder if this is the correct behavior. I think there's a pretty good case to be made to raise an error if a COMMIT is issued if you're in an exception block. CREATE OR REPLACE PROCEDURE foo() AS $$ BEGIN LOOP INSERT INTO foo DEFAULT VALUES; COMMIT; RAISE EXCEPTION 'test'; END LOOP; EXCEPTION WHEN OTHERS THEN RAISE NOTICE '%', SQLERRM; END; $$ LANGUAGE PLPGSQL; *) The documentation could use some work. Would you like some help? merlin
On 11/16/17 07:04, legrand legrand wrote: > We are just opening the "close cursors on/at commit" specification ;o) > > - MS SQL server: cursor_close_on_commit > - Firebird: close_cursors_at_commit > - DB2: "with hold" syntax > - ... > > I think it a plus to support keeping opened cursors at commit time, > but impacts have to be checked in details ... I think the facilities to support this in PostgreSQL are already there. We'd just have to tweak PL/pgSQL to make some of its internal portals "held" and then clean them up manually at some later point. So I think this is a localized detail, not a fundamental problem. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 16, 2017 at 12:36 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 11/16/17 07:04, legrand legrand wrote: >> We are just opening the "close cursors on/at commit" specification ;o) >> >> - MS SQL server: cursor_close_on_commit >> - Firebird: close_cursors_at_commit >> - DB2: "with hold" syntax >> - ... >> >> I think it a plus to support keeping opened cursors at commit time, >> but impacts have to be checked in details ... > > I think the facilities to support this in PostgreSQL are already there. > We'd just have to tweak PL/pgSQL to make some of its internal portals > "held" and then clean them up manually at some later point. So I think > this is a localized detail, not a fundamental problem. Automatically persisting cursors (WITH HOLD) can have some very surprising performance considerations, except when the current code execution depends on that particular cursor, in which case the current behavior of raising a (hopefully better worded-) error seems appropriate. Cursors based on temporary tables could be exempt from having to be closed or checked on COMMIT. plpgsql does not have the facility to create held cursors FWICT...automatic promotion seems pretty dubious. It could certainly be added, and cursors so held could be exempt from being force closed/errored as well. In lieu of that, having users materialize data in to temp tables for such cases seems reasonable. merlin
On 15 November 2017 at 16:36, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 11/8/17 18:48, Simon Riggs wrote: >> What would happen if some of the INSERTs failed? Where would control >> go to? (Maybe this is just "no change" in this particular proc) > > An exception is raised and unless the exception is caught (depending on > the PL), control leaves the procedure. What is already committed stays. > >> What happens if the procedure is updated during execution? Presumably >> it keeps executing the original version as seen in the initial >> snapshot? > > correct > >> Does the xmin of this session advance after each transaction, or do we >> hold the snapshot used for the procedure body open, causing us to hold >> back xmin and prevent vacuuming from being effective? >> >> What would happen if a procedure recursively called itself? And yet it >> was updated half-way through? Would that throw an error (I think it >> should). > > I don't think anything special happens here. The snapshot that is used > to read the procedure source code and other meta information is released > at a transaction boundary. I think we need to document that, or at least note in README It's quite important for VACUUM. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 14 November 2017 at 13:09, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: >> *) Will pg_cancel_backend() cancel the currently executing statement >> or the procedure? (I guess probably the procedure but I'm curious) > > Same as the way it currently works. It will raise an exception, which > will travel up the stack and eventually issue an error or be caught. If > someone knows more specific concerns here I could look into it, but I > don't see any problem. > >> *) Will long running procedures be subject to statement timeout (and >> does it apply to the entire procedure)? > > See previous item. > >> Will they be able to control >> statement_timeout from within the procedure itself? > > The statement timeout alarm is set by the top-level execution loop, so > you can't change a statement timeout that is already in progress. But > you could change the GUC and commit it for the next top-level statement. > >> *) Will pg_stat_activity show the invoking CALL or the currently >> executing statement? I see a strong argument for showing both of >> these things. although I understand that's out of scope here. > > Not different from a function execution, i.e., top-level statement. Which is the "top-level statement"? The CALL or the currently executing statement within the proc? I think you mean former. For the first two answers above the answer was "currently executing statement", yet the third answer seems to be the procedure. So that is a slight discrepancy. ISTM we would like 1) a way to cancel execution of a procedure 2) a way to set a timeout to cancel execution of a procedure as well as 1) a way to cancel execution of a statement that is running within a procedure 2) a way to set a timeout to cancel execution of a statement in a procedure Visibility of what a routine is currently executing is the role of a debugger utility/API. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 16, 2017 at 5:35 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 14 November 2017 at 13:09, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: > >>> *) Will pg_cancel_backend() cancel the currently executing statement >>> or the procedure? (I guess probably the procedure but I'm curious) >> >> Same as the way it currently works. It will raise an exception, which >> will travel up the stack and eventually issue an error or be caught. If >> someone knows more specific concerns here I could look into it, but I >> don't see any problem. >> >>> *) Will long running procedures be subject to statement timeout (and >>> does it apply to the entire procedure)? >> >> See previous item. >> >>> Will they be able to control >>> statement_timeout from within the procedure itself? >> >> The statement timeout alarm is set by the top-level execution loop, so >> you can't change a statement timeout that is already in progress. But >> you could change the GUC and commit it for the next top-level statement. >> >>> *) Will pg_stat_activity show the invoking CALL or the currently >>> executing statement? I see a strong argument for showing both of >>> these things. although I understand that's out of scope here. >> >> Not different from a function execution, i.e., top-level statement. > > Which is the "top-level statement"? The CALL or the currently > executing statement within the proc? I think you mean former. > > For the first two answers above the answer was "currently executing > statement", yet the third answer seems to be the procedure. So that is > a slight discrepancy. > > ISTM we would like > > 1) a way to cancel execution of a procedure > 2) a way to set a timeout to cancel execution of a procedure > > as well as > > 1) a way to cancel execution of a statement that is running within a procedure > 2) a way to set a timeout to cancel execution of a statement in a procedure > > Visibility of what a routine is currently executing is the role of a > debugger utility/API. How could you cancel a statement but not the procedure itself? Cancelling (either by timeout or administrative) type errors untrappable by design for very good reasons and untrapped errors ought to return the database all the way to 'ready for query'. merlin
On 11/16/17 18:35, Simon Riggs wrote: > For the first two answers above the answer was "currently executing > statement", yet the third answer seems to be the procedure. So that is > a slight discrepancy. That's the way function execution, or really any nested execution, currently works. > ISTM we would like > > 1) a way to cancel execution of a procedure > 2) a way to set a timeout to cancel execution of a procedure > > as well as > > 1) a way to cancel execution of a statement that is running within a procedure > 2) a way to set a timeout to cancel execution of a statement in a procedure > > Visibility of what a routine is currently executing is the role of a > debugger utility/API. That would probably be nice, but it would be an entirely separate undertaking. In particular, getting insight into some kind of execution stack would necessarily be language specific. We do have some of that for PL/pgSQL of course. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 18 November 2017 at 02:16, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 11/16/17 18:35, Simon Riggs wrote: >> For the first two answers above the answer was "currently executing >> statement", yet the third answer seems to be the procedure. So that is >> a slight discrepancy. > > That's the way function execution, or really any nested execution, > currently works. I'm impressed that these features are so clean and simple. I wasn't expecting that. I have very few review comments. I vote in favour of applying these patches at the end of this CF, end of 11/17. * Procedures * Transaction control in PL/pgSQL (only) That will give us 3 months to discuss problems and find solutions, then later we can commit PL/Python, PL/perl and PL/tcl once we know where the dragons are hiding. If we delay, we will end up with some weird gotcha that needs changing in the next release. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/31/17 15:38, Peter Eisentraut wrote: > 2) SPI needs some work. It thinks that it can clean everything away at > transaction end. I have found that instead of TopTransactionContext one > can use PortalContext and get a more suitable life cycle for the memory. > I have played with some variants to make this configurable (e.g., > argument to SPI_connect()), but that didn't seem very useful. There are > some comments indicating that there might not always be a PortalContext, > but the existing tests don't seem to mind. (There was a thread recently > about making a fake PortalContext for autovacuum, so maybe the current > idea is that we make sure there always is a PortalContext.) Maybe we > need another context like StatementContext or ProcedureContext. This could use more specific discussion, as it is a critical point. One general theme in this patch is to use PortalContext instead of TopTransactionContext (or similar). In SPI_connect(), we have /* * Create memory contexts for this procedure * * XXX it would be better to use PortalContext as the parent context, but * we may not be inside a portal (consider deferred-trigger execution). * Perhaps CurTransactionContextwould do? For now it doesn't matter * because we clean up explicitly in AtEOSubXact_SPI(). */ _SPI_current->procCxt = AllocSetContextCreate(TopTransactionContext, "SPI Proc", and my patch changes that to PortalContext in defiance of that comment. So either the comment is incorrect or we have insufficient test coverage or something in between. ISTM that in the normal case, at the time deferred triggers are executed, we are in the portal that executes the COMMIT command, so that should work. There are some cases that call CommitTransactionCommand() internally, but they don't run in cases when triggers are pending (e.g., VACUUM). Although logical apply workers might be a problem, but they clearly postdate that comment. In any case, the precedent in autovacuum appears to be to make a fake PortalContext if needed, so we could do that. Can we think of other cases where that might be necessary, so I can construct some test cases? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Nov 29, 2017 at 3:33 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > [snip] Moved to next CF as the discussion is still hot. -- Michael
On 11/14/17 18:38, Peter Eisentraut wrote: > On 10/31/17 15:38, Peter Eisentraut wrote: >> Here is a patch that implements transaction control in PL/Python >> procedures. (This patch goes on top of "SQL procedures" patch v1.) > > Here is an updated patch, now on top of "SQL procedures" v2. Here is a new patch, now on top of master. The main changes are that a lot of documentation has been added. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Fri, Dec 1, 2017 at 2:48 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > Here is a new patch, now on top of master. The main changes are that a > lot of documentation has been added. This feature doesn't have many tests. I think it should have a lot more of them. It's tinkering with the transaction control machinery of the system in a fairly fundamental way, and that could break things. I suggest, in particular, testing how it interactions with resources such as cursors and prepared statements. For example, what happens if you commit or roll back inside a cursor-for loop (given that the cursor is not holdable)? There are several kinds of cursor loops in PostgreSQL, plus there are cursors, prepared statements, and portals that can be created using SQL commands or protocol messages. I suspect that there are quite a few tricky interactions there. Other things to think about: - COMMIT or ROLLBACK inside a PLpgsql block with an attached EXCEPTION block, or when an SQL SAVEPOINT has been established previously. - COMMIT or ROLLBACK inside a procedure with a SET clause attached, and/or while SET LOCAL is in effect either at the inner or outer level. - COMMIT or ROLLBACK with open large objects. - COMMIT inside a procedure fails because of a serialization failure, deferred constraint, etc. In some cases, there are not only questions of correctness (it shouldn't crash/give wrong answers) but also definitional questions (what exactly should happen in that case, anyway?). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/1/17 15:28, Robert Haas wrote: > This feature doesn't have many tests. I think it should have a lot > more of them. It's tinkering with the transaction control machinery > of the system in a fairly fundamental way, and that could break > things. Thank you, these are great ideas. > I suggest, in particular, testing how it interactions with resources > such as cursors and prepared statements. For example, what happens if > you commit or roll back inside a cursor-for loop (given that the > cursor is not holdable)? This was discussed briefly earlier in the thread. The mid-term fix is to convert pinned cursors to holdable ones before a COMMIT in PL/pgSQL and then clean them up separately later. I have that mostly working, but I'd like to hold it for a separate patch submission. The short-term fix is to prohibit COMMIT and ROLLBACK while a portal is pinned. I think ROLLBACK in a cursor loop might not make sense, because the cursor query itself could have side effects, so a rollback would have to roll back the entire loop. That might need more refined analysis before it could be allowed. > - COMMIT or ROLLBACK inside a PLpgsql block with an attached EXCEPTION > block, or when an SQL SAVEPOINT has been established previously. I think that needs to be prohibited because if you end transactions in an exception-handled block, you can no longer actually roll back that block when an exception occurs, which was the entire point. > - COMMIT or ROLLBACK inside a procedure with a SET clause attached, That also needs to be prohibited because of the way the GUC nesting currently works. It's probably possible to fix it, but it would be a separate effort. > and/or while SET LOCAL is in effect either at the inner or outer > level. That seems to work fine. > - COMMIT or ROLLBACK with open large objects. I haven't been able to reproduce any problems with that, but maybe I haven't tried hard enough. > - COMMIT inside a procedure fails because of a serialization failure, > deferred constraint, etc. That works fine. The COMMIT fails and control exits the procedure using the normal exception propagation. I'll submit an updated patch with some fixes for the above and more documentation. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > I think ROLLBACK in a cursor loop might not make sense, because the > cursor query itself could have side effects, so a rollback would have to > roll back the entire loop. That might need more refined analysis before > it could be allowed. COMMIT really has the same problem; if the cursor query has side effects, you can't commit those side effects piecemeal as the loop executed and have things behave sanely. >> - COMMIT or ROLLBACK inside a procedure with a SET clause attached, > > That also needs to be prohibited because of the way the GUC nesting > currently works. It's probably possible to fix it, but it would be a > separate effort. > >> and/or while SET LOCAL is in effect either at the inner or outer >> level. > > That seems to work fine. These two are related -- if you don't permit anything that makes temporary changes to GUCs at all, like SET clauses attached to functions, then SET LOCAL won't cause any problems. The problem is if you do a transaction operation when something set locally is in the stack of values, but not at the top. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/5/17 13:33, Robert Haas wrote: > On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> I think ROLLBACK in a cursor loop might not make sense, because the >> cursor query itself could have side effects, so a rollback would have to >> roll back the entire loop. That might need more refined analysis before >> it could be allowed. > > COMMIT really has the same problem; if the cursor query has side > effects, you can't commit those side effects piecemeal as the loop > executed and have things behave sanely. The first COMMIT inside the loop would commit the cursor query. This isn't all that different from what you'd get now if you coded this manually using holdable cursors or just plain client code. Clearly, you can create a mess if the loop body interacts with the loop expression, but that's already the case. But if you coded something like this yourself now and ran a ROLLBACK inside the loop, the holdable cursor would disappear (unless previously committed), so you couldn't proceed with the loop. The SQL standard for persistent stored modules explicitly prohibits COMMIT and ROLLBACK in cursor loop bodies. But I think people will eventually want it. >>> - COMMIT or ROLLBACK inside a procedure with a SET clause attached, >> >> That also needs to be prohibited because of the way the GUC nesting >> currently works. It's probably possible to fix it, but it would be a >> separate effort. >> >>> and/or while SET LOCAL is in effect either at the inner or outer >>> level. >> >> That seems to work fine. > > These two are related -- if you don't permit anything that makes > temporary changes to GUCs at all, like SET clauses attached to > functions, then SET LOCAL won't cause any problems. The problem is if > you do a transaction operation when something set locally is in the > stack of values, but not at the top. Yes, that's exactly the problem. So right now I'm just preventing the problematic scenario. So fix that, one would possibly have to replace the stack by something not quite a stack. New patch attached. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Wed, Dec 6, 2017 at 8:41 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 12/5/17 13:33, Robert Haas wrote: >> On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut >> <peter.eisentraut@2ndquadrant.com> wrote: >>> I think ROLLBACK in a cursor loop might not make sense, because the >>> cursor query itself could have side effects, so a rollback would have to >>> roll back the entire loop. That might need more refined analysis before >>> it could be allowed. >> >> COMMIT really has the same problem; if the cursor query has side >> effects, you can't commit those side effects piecemeal as the loop >> executed and have things behave sanely. > > The first COMMIT inside the loop would commit the cursor query. This > isn't all that different from what you'd get now if you coded this > manually using holdable cursors or just plain client code. Clearly, you > can create a mess if the loop body interacts with the loop expression, > but that's already the case. > > But if you coded something like this yourself now and ran a ROLLBACK > inside the loop, the holdable cursor would disappear (unless previously > committed), so you couldn't proceed with the loop. > > The SQL standard for persistent stored modules explicitly prohibits > COMMIT and ROLLBACK in cursor loop bodies. But I think people will > eventually want it. The may want it, but silently promoting all cursors to held ones is not the way to give it to them, unless we narrow it down the the 'for-loop derived cursor' only. Even then we should consider syntax decoration: FOR x IN SELECT .... WITH HOLD LOOP END LOOP; merlin
On 12/06/2017 09:41 AM, Peter Eisentraut wrote: > On 12/5/17 13:33, Robert Haas wrote: >> On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut >> <peter.eisentraut@2ndquadrant.com> wrote: >>> I think ROLLBACK in a cursor loop might not make sense, because the >>> cursor query itself could have side effects, so a rollback would have to >>> roll back the entire loop. That might need more refined analysis before >>> it could be allowed. >> COMMIT really has the same problem; if the cursor query has side >> effects, you can't commit those side effects piecemeal as the loop >> executed and have things behave sanely. > The first COMMIT inside the loop would commit the cursor query. This > isn't all that different from what you'd get now if you coded this > manually using holdable cursors or just plain client code. Clearly, you > can create a mess if the loop body interacts with the loop expression, > but that's already the case. > > But if you coded something like this yourself now and ran a ROLLBACK > inside the loop, the holdable cursor would disappear (unless previously > committed), so you couldn't proceed with the loop. > > The SQL standard for persistent stored modules explicitly prohibits > COMMIT and ROLLBACK in cursor loop bodies. But I think people will > eventually want it. > >>>> - COMMIT or ROLLBACK inside a procedure with a SET clause attached, >>> That also needs to be prohibited because of the way the GUC nesting >>> currently works. It's probably possible to fix it, but it would be a >>> separate effort. >>> >>>> and/or while SET LOCAL is in effect either at the inner or outer >>>> level. >>> That seems to work fine. >> These two are related -- if you don't permit anything that makes >> temporary changes to GUCs at all, like SET clauses attached to >> functions, then SET LOCAL won't cause any problems. The problem is if >> you do a transaction operation when something set locally is in the >> stack of values, but not at the top. > Yes, that's exactly the problem. So right now I'm just preventing the > problematic scenario. So fix that, one would possibly have to replace > the stack by something not quite a stack. > > > New patch attached. > In general this looks good. However, I'm getting runtime errors in plperl_elog.c on two different Linux platforms (Fedora 25, Ubuntu 16.04). There seems to be something funky going on. And we do need to work out why the commented out plperl test isn't working. Detailed comments: Referring to anonymous blocks might be a bit mystifying for some readers, unless we note that they are invoked via DO. I think this sentence should probably be worded a bit differently: (And of course, BEGIN and END have different meanings in PL/pgSQL.) Perhaps "Note that" instead of "And of course,". Why aren't the SPI functions that error out or return 0 just void instead of returning an int? The docs say SPI_set_non_atomic() returns 0 on success, but the code says otherwise. This sentence in the comment in SPI_Commit() is slightly odd: This restriction is particular to PLs implemented on top of SPI. Perhaps "required by" rather than "particular to" would make it read better. The empty free_commit() and free_rollback() functions in pl_funcs.c look slightly odd. I realize that the compiler should optimize the calls away, but it seems an odd style. One other thing I wondered about was what if a PL function (say plperl) used SPI to open an explicit cursor and then looped over it? If there were a commit or rollback inside that loop we wouldn't have the same protections we have in plpgsql, ISTM. I haven't tried this yet, so I'm just speculating about what might happen. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 12/7/17 18:47, Andrew Dunstan wrote: > Referring to anonymous blocks might be a bit mystifying for some > readers, unless we note that they are invoked via DO. Added parenthetical comments. > I think this sentence should probably be worded a bit differently: > > (And of course, BEGIN and END have different meanings in PL/pgSQL.) > > Perhaps "Note that" instead of "And of course,". fixed > Why aren't the SPI functions that error out or return 0 just void > instead of returning an int? Tried to align then with existing functions, but I agree it seems weird. Changed to return void. > The docs say SPI_set_non_atomic() returns 0 on success, but the code > says otherwise. Fixed the documentation. > This sentence in the comment in SPI_Commit() is slightly odd: > > This restriction is particular to PLs implemented on top of SPI. > > Perhaps "required by" rather than "particular to" would make it read better. fixed > The empty free_commit() and free_rollback() functions in pl_funcs.c look > slightly odd. I realize that the compiler should optimize the calls > away, but it seems an odd style. That's how the existing code for other statements looks as well. > One other thing I wondered about was what if a PL function (say plperl) > used SPI to open an explicit cursor and then looped over it? If there > were a commit or rollback inside that loop we wouldn't have the same > protections we have in plpgsql, ISTM. I haven't tried this yet, so I'm > just speculating about what might happen. Good point. I added test cases similar to the plpgsql tests to the other three languages, which not-so-amusingly gave three different outcomes. In PL/Perl in particular, the commit clears away the portal, and the next call to spi_fetchrow() will then not find the cursor and just return undefined. So that is not so nice. I'm thinking about extending the portal pinning mechanism to the other languages as well, which seems mildly useful independent of transaction management. I will propose a patch for that soon. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Updated patch attached. I have addressed the most recent review comments I believe. The question about what happens to cursor loops in PL/Perl and PL/Python would be addressed by the separate thread "portal pinning". The test cases in this patch are currently marked by FIXMEs. I have changed the SPI API a bit. I got rid of SPI_set_nonatomic() and instead introduced SPI_connect_ext() that you can pass flags to. The advantage of that is that in the normal case we can continue to use the existing memory contexts, so nothing changes for existing uses, which seems desirable. (This also appears to address some sporadic test failures in PL/Perl.) I have also cleaned up the changes in portalmem.c further, so the changes are now even smaller. The commit message in this patch contains more details about some of these changes. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
A merge conflict has arisen, so for simplicity, here is an updated patch. On 12/20/17 10:08, Peter Eisentraut wrote: > Updated patch attached. > > I have addressed the most recent review comments I believe. > > The question about what happens to cursor loops in PL/Perl and PL/Python > would be addressed by the separate thread "portal pinning". The test > cases in this patch are currently marked by FIXMEs. > > I have changed the SPI API a bit. I got rid of SPI_set_nonatomic() and > instead introduced SPI_connect_ext() that you can pass flags to. The > advantage of that is that in the normal case we can continue to use the > existing memory contexts, so nothing changes for existing uses, which > seems desirable. (This also appears to address some sporadic test > failures in PL/Perl.) > > I have also cleaned up the changes in portalmem.c further, so the > changes are now even smaller. > > The commit message in this patch contains more details about some of > these changes. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 01/05/2018 04:30 PM, Peter Eisentraut wrote: > A merge conflict has arisen, so for simplicity, here is an updated patch. > > On 12/20/17 10:08, Peter Eisentraut wrote: >> Updated patch attached. >> >> I have addressed the most recent review comments I believe. >> >> The question about what happens to cursor loops in PL/Perl and PL/Python >> would be addressed by the separate thread "portal pinning". The test >> cases in this patch are currently marked by FIXMEs. >> >> I have changed the SPI API a bit. I got rid of SPI_set_nonatomic() and >> instead introduced SPI_connect_ext() that you can pass flags to. The >> advantage of that is that in the normal case we can continue to use the >> existing memory contexts, so nothing changes for existing uses, which >> seems desirable. (This also appears to address some sporadic test >> failures in PL/Perl.) >> >> I have also cleaned up the changes in portalmem.c further, so the >> changes are now even smaller. >> >> The commit message in this patch contains more details about some of >> these changes. Generally looks good. This confused me slightly: + Transactions cannot be ended inside loops through query results or inside + blocks with exception handlers. I suggest: "A transaction cannot be ended inside a loop over query results, nor inside a block with exception handlers." The patch has bitrotted slightly in src/backend/commands/portalcmds.c The plperl expected file needs updating. Also, why does spi_commit() in a loop result in an error message but not spi_rollback()? cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/15/18 12:57, Andrew Dunstan wrote: > This confused me slightly: > > + Transactions cannot be ended inside loops through query results > or inside > + blocks with exception handlers. > > I suggest: "A transaction cannot be ended inside a loop over query > results, nor inside a block with exception handlers." fixed > The patch has bitrotted slightly in src/backend/commands/portalcmds.c merged > The plperl expected file needs updating. Also, why does spi_commit() in > a loop result in an error message but not spi_rollback()? This is all changed now after the patch for portal pinning in PL/Perl and PL/Python has been committed. The attached patch behaves better. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 01/16/2018 10:16 AM, Peter Eisentraut wrote: > On 1/15/18 12:57, Andrew Dunstan wrote: >> This confused me slightly: >> >> + Transactions cannot be ended inside loops through query results >> or inside >> + blocks with exception handlers. >> >> I suggest: "A transaction cannot be ended inside a loop over query >> results, nor inside a block with exception handlers." > fixed > >> The patch has bitrotted slightly in src/backend/commands/portalcmds.c > merged > >> The plperl expected file needs updating. Also, why does spi_commit() in >> a loop result in an error message but not spi_rollback()? > This is all changed now after the patch for portal pinning in PL/Perl > and PL/Python has been committed. The attached patch behaves better. > Looks good. Marking ready for committer. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 6 December 2017 at 22:34, Merlin Moncure <mmoncure@gmail.com> wrote: > On Wed, Dec 6, 2017 at 8:41 AM, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> On 12/5/17 13:33, Robert Haas wrote: >>> On Tue, Dec 5, 2017 at 1:25 PM, Peter Eisentraut >>> <peter.eisentraut@2ndquadrant.com> wrote: >>>> I think ROLLBACK in a cursor loop might not make sense, because the >>>> cursor query itself could have side effects, so a rollback would have to >>>> roll back the entire loop. That might need more refined analysis before >>>> it could be allowed. >>> >>> COMMIT really has the same problem; if the cursor query has side >>> effects, you can't commit those side effects piecemeal as the loop >>> executed and have things behave sanely. >> >> The first COMMIT inside the loop would commit the cursor query. This >> isn't all that different from what you'd get now if you coded this >> manually using holdable cursors or just plain client code. Clearly, you >> can create a mess if the loop body interacts with the loop expression, >> but that's already the case. >> >> But if you coded something like this yourself now and ran a ROLLBACK >> inside the loop, the holdable cursor would disappear (unless previously >> committed), so you couldn't proceed with the loop. >> >> The SQL standard for persistent stored modules explicitly prohibits >> COMMIT and ROLLBACK in cursor loop bodies. But I think people will >> eventually want it. > > The may want it, but silently promoting all cursors to held ones is > not the way to give it to them, unless we narrow it down the the > 'for-loop derived cursor' only. I don't think we should do that automatically for all cursors, but it seems clear that we would want that iff the loop contains COMMIT or ROLLBACK. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 16 January 2018 at 20:24, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote: > Looks good. Marking ready for committer. Few questions/points for the docs. Docs say: "A new transaction is started automatically after a transaction is ended using these commands" Presumably this would have exactly the same isolation level and other transaction characteristics? (Is it somehow possible to vary that. OK if not, no problem) The error "cannot commit while a subtransaction is active" is commented as intending to prevent COMMIT/ROLLBACK inside an EXCEPTION block. That makes sense. It seems it will also prevent SAVEPOINTs, though that seems not to be intended. The two cases are dissimilar and it would be possible to block the former but allow the latter. It's not documented or tested that SET LOCAL would work or not work. Does it work? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/16/18 15:24, Andrew Dunstan wrote: > Looks good. Marking ready for committer. committed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services