Thread: [HACKERS] Statement-level rollback
Hello, As I stated here and at the PGConf.ASIA developer meeting last year, I'd like to propose statement-level rollback feature. To repeat myself, this is requested for users to migrate from other DBMSs to PostgreSQL. They expect that a failureof one SQL statement should not abort the entire transaction and their apps (client programs and stored procedures)can continue the transaction with a different SQL statement. SPECIFICATION ================================================== START TRANSACTION ROLLBACK SCOPE { TRANSACTION | STATEMENT }; This syntax controls the behavior of the transaction when an SQL statement fails. TRANSACTION (default) is the traditionalbehavior (i.e. rolls back the entire transaction or subtransaction). STATEMENT rolls back the failed SQL statement. Just like the isolation level and access mode, default_transaction_rollback_scope GUC variable is also available. DESIGN ================================================== Nothing much to talk about... it merely creates a savepoint before each statement execution and destroys it after the statementfinishes. This is done in postgres.c for top-level SQL statements. The stored function hasn't been handled yet; I'll submit the revised patch soon. CONSIDERATIONS AND REQUESTS ================================================== The code for stored functions is not written yet, but I'd like your feedback for the specification and design based on thecurrent patch. I'll add this patch to CommitFest 2017-3. The patch creates and destroys a savepoint for each message of the extended query protocol (Parse, Bind, Execute and Describe). I'm afraid this will add significant overhead, but I don't find a better way, because those messages could besend arbitrarily for different statements, e.g. Parse stmt1, Parse stmt2, Bind stmt1, Execute stmt1, Bind stmt2, Executestmt2. Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
"Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> writes: > As I stated here and at the PGConf.ASIA developer meeting last year, I'd > like to propose statement-level rollback feature. I do not really see how this would ever get past the compatibility problems that forced us to give up on server-side autocommit years ago. If you want to provide a client-side facility for this, perhaps that could fly. regards, tom lane
From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane > "Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> writes: > > As I stated here and at the PGConf.ASIA developer meeting last year, > > I'd like to propose statement-level rollback feature. > > I do not really see how this would ever get past the compatibility problems > that forced us to give up on server-side autocommit years ago. Could you tell me more about that problem? What kind of incompatibility would this feature introduce? > If you want to provide a client-side facility for this, perhaps that could > fly. Do you mean a feature of psqlODBC that implicitly issues SAVEPOINT and RELEASE SAVEPOINT for each SQL statement? One reasonI want to implement the feature is to avoid eliminate those round-trips for performance. Or, do you mean a client-sideconnection parameter like "rollback_scope={transaction | statement}?" Yes, I'll implement it for major clientdrivers so that the driver issues "SET SESSION CHARACTERISTICS FOR TRANSACTION ROLLBACK SCOPE {TRANSACTION | STATEMENT}"upon connection. psqlODBC has already a connection parameter, Protocol, for that purpose. Regards Takayuki Tsunakawa
On 2/28/17 08:17, Tom Lane wrote: > I do not really see how this would ever get past the compatibility > problems that forced us to give up on server-side autocommit years ago. I think it's different because it's not a global setting, it's only a behavior you select explicitly when you start a transaction block. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/28/17 02:39, Tsunakawa, Takayuki wrote: > I'd like to propose statement-level rollback feature. To repeat myself, this is requested for users to migrate from otherDBMSs to PostgreSQL. They expect that a failure of one SQL statement should not abort the entire transaction and theirapps (client programs and stored procedures) can continue the transaction with a different SQL statement. Can you provide some references on how other systems provide this feature? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 2/28/17 08:17, Tom Lane wrote: >> I do not really see how this would ever get past the compatibility >> problems that forced us to give up on server-side autocommit years ago. > I think it's different because it's not a global setting, it's only a > behavior you select explicitly when you start a transaction block. Yeah, that's the same it-won't-affect-you-if-you-don't-use-it argument that we heard for server-side autocommit-off. I don't buy it. I can think of two reasons even without any caffeine: 1. The argument for this is mostly, if not entirely, "application compatibility". But it won't succeed at providing that if every BEGIN has to be spelled differently than it would be on other DBMSes. Therefore there is going to be enormous pressure to allow enabling the feature through a GUC, or some other environment-level way, and as soon as we do that we've lost. 2. The proposed feature would affect the internal operation of PL functions, so that those would need to become bulletproof against being invoked in either operating environment. Likewise, all sorts of intermediate tools like connection poolers would no doubt be broken if they don't know about this and support both modes. (We would have to start by fixing postgres_fdw and dblink, for instance.) In short, you can't make fundamental changes in transactional behavior without enormous breakage. That was the lesson we learned from the autocommit fiasco and I do not believe that it's inapplicable here. regards, tom lane
From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Peter Eisentraut > On 2/28/17 02:39, Tsunakawa, Takayuki wrote: > > I'd like to propose statement-level rollback feature. To repeat myself, > this is requested for users to migrate from other DBMSs to PostgreSQL. They > expect that a failure of one SQL statement should not abort the entire > transaction and their apps (client programs and stored procedures) can > continue the transaction with a different SQL statement. > > Can you provide some references on how other systems provide this feature? Oracle doesn't. SQL Server provides like this: SET XACT_ABORT https://msdn.microsoft.com/en-us/library/ms188792.aspx MySQL doesn't. BTW, MySQL enables changing autocommit mode with SET statement: 16.5.2.2 autocommit, Commit, and Rollback https://dev.mysql.com/doc/refman/8.0/en/innodb-autocommit-commit-rollback.html And above all, I've found EnterpriseDB supports statement-level rollback with GUC! So PostgreSQL should be able to do. https://www.edbpostgres.com/docs/en/9.6/asguide/EDB_Postgres_Advanced_Server_Guide.1.17.html#pID0E0QUD0HA ---------------------------------------- edb_stmt_level_tx is set to TRUE, then an exception will not automatically roll back prior uncommitted database updates.If edb_stmt_level_tx is set to FALSE, then an exception will roll back uncommitted database updates. Note: Use edb_stmt_level_tx set to TRUE only when absolutely necessary, as this may cause a negative performance impact. ---------------------------------------- Regards Takayuki Tsunakawa
From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > 1. The argument for this is mostly, if not entirely, "application > compatibility". But it won't succeed at providing that if every BEGIN has > to be spelled differently than it would be on other DBMSes. > Therefore there is going to be enormous pressure to allow enabling the > feature through a GUC, or some other environment-level way, and as soon > as we do that we've lost. I thought so, too. I believe people who want to migrate from other DBMSs would set the GUC in postgresql.conf, or with ALTERDATABASE/USER just for applications which are difficult to modify. > 2. The proposed feature would affect the internal operation of PL functions, > so that those would need to become bulletproof against being invoked in > either operating environment. Likewise, all sorts of intermediate tools > like connection poolers would no doubt be broken if they don't know about > this and support both modes. (We would have to start by fixing postgres_fdw > and dblink, for instance.) Yes, I'm going to modify the PL's behavior. I'll also check the dblink and postgres_fdw as well. In addition, I'll havea quick look at the code of pgpool-II and pgBouncer to see how they depend on the transaction state. I'll run the regressiontests of contribs, pgpool-II and pgBouncer with default_transaction_rollback_scope set to 'statement'. But I don't see how badly the statement-level rollback affects those features other than PL. I think the only relevant thingto those client-side programs is whether the transaction is still running, which is returned with ReadyForQuery. Bothof statement-level rollback and the traditional behavior leave the transaction running when an SQL statement fails. Server-side autocommit differs in that respect. Regards Takayuki Tsunakawa
On 3/3/17 2:43 AM, Tsunakawa, Takayuki wrote: > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] >> 1. The argument for this is mostly, if not entirely, "application >> compatibility". But it won't succeed at providing that if every BEGIN has >> to be spelled differently than it would be on other DBMSes. >> Therefore there is going to be enormous pressure to allow enabling the >> feature through a GUC, or some other environment-level way, and as soon >> as we do that we've lost. > > I thought so, too. I believe people who want to migrate from other DBMSs would set the GUC in postgresql.conf, or withALTER DATABASE/USER just for applications which are difficult to modify. > >> 2. The proposed feature would affect the internal operation of PL functions, >> so that those would need to become bulletproof against being invoked in >> either operating environment. Likewise, all sorts of intermediate tools >> like connection poolers would no doubt be broken if they don't know about >> this and support both modes. (We would have to start by fixing postgres_fdw >> and dblink, for instance.) > > Yes, I'm going to modify the PL's behavior. I'll also check the dblink and postgres_fdw as well. In addition, I'll havea quick look at the code of pgpool-II and pgBouncer to see how they depend on the transaction state. I'll run the regressiontests of contribs, pgpool-II and pgBouncer with default_transaction_rollback_scope set to 'statement'. > > But I don't see how badly the statement-level rollback affects those features other than PL. I think the only relevantthing to those client-side programs is whether the transaction is still running, which is returned with ReadyForQuery. Both of statement-level rollback and the traditional behavior leave the transaction running when an SQL statementfails. Server-side autocommit differs in that respect. Whatever the merits of this patch, it's a pretty major behavioral change with a large potential impact. Even if what is enumerated here is the full list (which I doubt), it's pretty big. Given that this landed on March 28 with no discussion beforehand, I recommend that we immediately move this patch to the 2017-07 CF. -- -David david@pgmasters.net
On 2017-03-03 11:54:06 -0500, David Steele wrote: > Given that this landed on March 28 with no discussion beforehand, I > recommend that we immediately move this patch to the 2017-07 CF. Seconded.
On 3/3/17 12:01 PM, Andres Freund wrote: > On 2017-03-03 11:54:06 -0500, David Steele wrote: >> Given that this landed on March 28 with no discussion beforehand, I >> recommend that we immediately move this patch to the 2017-07 CF. > > Seconded. And of course I meant Feb 28. -- -David david@pgmasters.net
On Fri, Mar 3, 2017 at 9:01 AM, Andres Freund <andres@anarazel.de> wrote: > On 2017-03-03 11:54:06 -0500, David Steele wrote: >> Given that this landed on March 28 with no discussion beforehand, I >> recommend that we immediately move this patch to the 2017-07 CF. > > Seconded. +1 -- Peter Geoghegan
From: David Steele [mailto:david@pgmasters.net] > Whatever the merits of this patch, it's a pretty major behavioral change > with a large potential impact. Even if what is enumerated here is the full > list (which I doubt), it's pretty big. > > Given that this landed on March 28 with no discussion beforehand, I recommend > that we immediately move this patch to the 2017-07 CF. OK, I moved it to 2017-7. I will participate in the review of existing patches. In parallel with that, I'll keep developingthis feature and sometimes submit revised patches and new findings. I'd be happy if anyone could give feedbackthen. Regards Takayuki Tsunakawa
On Fri, Mar 3, 2017 at 2:15 AM, Tsunakawa, Takayuki <tsunakawa.takay@jp.fujitsu.com> wrote: > From: pgsql-hackers-owner@postgresql.org >> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Peter Eisentraut >> On 2/28/17 02:39, Tsunakawa, Takayuki wrote: >> > I'd like to propose statement-level rollback feature. To repeat myself, >> this is requested for users to migrate from other DBMSs to PostgreSQL. They >> expect that a failure of one SQL statement should not abort the entire >> transaction and their apps (client programs and stored procedures) can >> continue the transaction with a different SQL statement. >> >> Can you provide some references on how other systems provide this feature? > > Oracle doesn't. Really? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas > >> Can you provide some references on how other systems provide this feature? > > > > Oracle doesn't. > > Really? Sorry, my sentence was misleading. I meant by "Oracle/MySQL doesn't" that they do not provide a configuration parameter or START TRANSACTION mode to choosebetween statement rollback and transaction rollback. They just rolls back the failed statement. I wish Postgres couldbehave the same way. Regards Takayuki Tsunakawa
Hello, EDB Oracle compatibility proposes edb_stmt_level_tx parameter, psql uses ON_ERROR_ROLLBACK = 'on', ODBC has a parameter for this JDBC has nothing and developers has to play with savepoint as described http://blog.endpoint.com/2015/02/postgres-onerrorrollback-explained.html This feature (as a GUC at server level) would be very helpfull for Oracle applications migration. Regards PAscal -- View this message in context: http://www.postgresql-archive.org/Statement-level-rollback-tp5946725p5948032.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On Tue, Mar 07, 2017 at 01:49:29PM -0700, legrand legrand wrote: > JDBC has nothing and developers has to play with savepoint as described > http://blog.endpoint.com/2015/02/postgres-onerrorrollback-explained.html JDBC has it since 9.4.1210 (2016-09-07), unless I am mistaken: https://github.com/pgjdbc/pgjdbc/commit/adc08d57d2a9726309ea80d574b1db835396c1c8 Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.banck@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
On 7 March 2017 at 16:18, Michael Banck <michael.banck@credativ.de> wrote:
On Tue, Mar 07, 2017 at 01:49:29PM -0700, legrand legrand wrote:
> JDBC has nothing and developers has to play with savepoint as described
> http://blog.endpoint.com/2015/02/postgres-onerrorrollback- explained.html
JDBC has it since 9.4.1210 (2016-09-07), unless I am mistaken:
https://github.com/pgjdbc/pgjdbc/commit/ adc08d57d2a9726309ea80d574b1db 835396c1c8
I thought he meant we have to play with savepoints.
Yes, we do it for you now
Thanks !
that's a very good new !
I'm still receiving the famous
"current transaction is aborted" error when usingversion 42.0.0 with
jdbc:postgresql://localhost:5432/postgres?autosave=always
But I will see that with pgjdbc team ;o)
PAscal
View this message in context: RE: Statement-level rollback
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
You have to turn it on using the autosave parameter. it's not on by default, and apparently not documented
On 7 March 2017 at 17:15, legrand legrand <legrand_legrand@hotmail.com> wrote:
RegardsThanks !
that's a very good new !
I'm still receiving the famous
"current transaction is aborted" error when usingversion 42.0.0 with
jdbc:postgresql://localhost:
5432/postgres?autosave=always
But I will see that with pgjdbc team ;o)
PAscal
View this message in context: RE: Statement-level rollback
legrand>when usingversion 42.0.0 with
legrand> jdbc:postgresql://localhost:5432/postgres?autosave=always
The pitfall there is the value should be written with upper case like autosave=ALWAYS.
The pitfall there is the value should be written with upper case like autosave=ALWAYS.
I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at some point.
Vladimir
Please disregard my previous message.
pgjdbc is already doing upcase conversion, so I would like to see a test case that reproduces the error.
Alternatively, could you please capture and share TRACE log? ( https://jdbc.postgresql.org/documentation/head/logging.html#configuration )
Vladimir
ср, 8 мар. 2017 г. в 1:26, Vladimir Sitnikov <sitnikov.vladimir@gmail.com>:
legrand>when usingversion 42.0.0 withlegrand> jdbc:postgresql://localhost:5432/postgres?autosave=always
The pitfall there is the value should be written with upper case like autosave=ALWAYS.I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at some point.Vladimir
Please disregard my previous message.
pgjdbc is already doing upcase conversion, so I would like to see a test case that reproduces the error.
Alternatively, could you please capture and share TRACE log? ( https://jdbc.postgresql.org/documentation/head/logging.html#configuration )
Vladimir
ср, 8 мар. 2017 г. в 1:26, Vladimir Sitnikov <sitnikov.vladimir@gmail.com>:
legrand>when usingversion 42.0.0 withlegrand> jdbc:postgresql://localhost:5432/postgres?autosave=always
The pitfall there is the value should be written with upper case like autosave=ALWAYS.I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at some point.Vladimir
There was a mistake in my driver definition,
this works fine with autosave=always (but not with autoSave ...)
Thanks Again
De : Vladimir Sitnikov [via PostgreSQL] <ml-node+[hidden email]>
Envoyé : mardi 7 mars 2017 22:32:27
À : legrand legrand
Objet : Re: Statement-level rollback
Envoyé : mardi 7 mars 2017 22:32:27
À : legrand legrand
Objet : Re: Statement-level rollback
Please disregard my previous message.
pgjdbc is already doing upcase conversion, so I would like to see a test case that reproduces the error.
Alternatively, could you please capture and share TRACE log? ( https://jdbc.postgresql.org/documentation/head/logging.html#configuration )
Vladimir
ср, 8 мар. 2017 г. в 1:26, Vladimir Sitnikov <[hidden email]>:
legrand>when usingversion 42.0.0 with
legrand> jdbc:postgresql://localhost:5432/postgres?autosave=always
The pitfall there is the value should be written with upper case like autosave=ALWAYS.
The pitfall there is the value should be written with upper case like autosave=ALWAYS.
I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at some point.
Vladimir
If you reply to this email, your message will be added to the discussion below:
http://www.postgresql-archive.org/Statement-level-rollback-tp5946725p5948059.htmlView this message in context: RE: Statement-level rollback
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Hello, This feature hasn't been updated for a long time, but I've just been interested in this feature and looking into the mailing list. From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us] > In short, you can't make fundamental changes in transactional behavior without > enormous breakage. That was the lesson we learned from the autocommit fiasco > and I do not believe that it's inapplicable here. I've just wanted to confirm what "autocommit fiasco" points out. Are the below threads and git-log relevant discussion? (If there are any other threads, could you please tell me the link?) https://www.postgresql.org/message-id/flat/3E54526A.121EBEE5%40tpf.co.jp#3E54526A.121EBEE5@tpf.co.jp https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f85f43dfb5b9043ea6b01d8b824c195cd7f9ed3c Regards, Takeshi Ideriha
On 1 March 2017 at 16:05, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> On 2/28/17 08:17, Tom Lane wrote: >>> I do not really see how this would ever get past the compatibility >>> problems that forced us to give up on server-side autocommit years ago. > >> I think it's different because it's not a global setting, it's only a >> behavior you select explicitly when you start a transaction block. > > Yeah, that's the same it-won't-affect-you-if-you-don't-use-it argument > that we heard for server-side autocommit-off. This is a frequently requested feature and I think we should push ahead with it in the next cycle. We're the World's Most Advanced Open Source Database, so a new transaction feature fits in with our goals. > I don't buy it. > I can think of two reasons even without any caffeine: > > 1. The argument for this is mostly, if not entirely, "application > compatibility". But it won't succeed at providing that if every > BEGIN has to be spelled differently than it would be on other DBMSes. > Therefore there is going to be enormous pressure to allow enabling > the feature through a GUC, or some other environment-level way, > and as soon as we do that we've lost. We already use GUCs for various other transaction level features and they work just fine. What we need is a feature that works the way other DBMS do, as an option. If it should be desirable. I do accept there are problems and we do have some experience of those problems. > 2. The proposed feature would affect the internal operation of PL > functions, so that those would need to become bulletproof against > being invoked in either operating environment. Likewise, all sorts > of intermediate tools like connection poolers would no doubt be broken > if they don't know about this and support both modes. (We would have > to start by fixing postgres_fdw and dblink, for instance.) > > In short, you can't make fundamental changes in transactional behavior > without enormous breakage. That was the lesson we learned from the > autocommit fiasco and I do not believe that it's inapplicable here. I think the point we should take from Tom's comments is... a) This feature won't be a replacement for PostgreSQL's default behaviour, at least not in any short/medium term. b) If we get this feature, about 80% of the work will be fixing all the small breakages that happen with other tools, plugins etc.. So it is no small task. If accepted this would be a major feature and will take much work. If we want this in Postgres11 then we must have a fully working patch by start of Sept 2017, plus some analysis of all of the various breakage points we are expecting to see. So lets do the analysis, so we know how deep the mud is before we decide to walk through it. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/28/17 02:39, Tsunakawa, Takayuki wrote: > The code for stored functions is not written yet, but I'd like your feedback for the specification and design based onthe current patch. I'll add this patch to CommitFest 2017-3. This patch needs to be rebased for the upcoming commit fest. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 14 August 2017 at 23:58, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 2/28/17 02:39, Tsunakawa, Takayuki wrote: >> The code for stored functions is not written yet, but I'd like your feedback for the specification and design based onthe current patch. I'll add this patch to CommitFest 2017-3. > > This patch needs to be rebased for the upcoming commit fest. I'm willing to review this if the patch is going to be actively worked on. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On 01 Sep 2017, at 13:44, Simon Riggs <simon@2ndquadrant.com> wrote: > > On 14 August 2017 at 23:58, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> On 2/28/17 02:39, Tsunakawa, Takayuki wrote: >>> The code for stored functions is not written yet, but I'd like your feedback for the specification and design based onthe current patch. I'll add this patch to CommitFest 2017-3. >> >> This patch needs to be rebased for the upcoming commit fest. > > I'm willing to review this if the patch is going to be actively worked on. This sounds like a too good offer to pass up on, can we expect a rebased patch for the commitfest? cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
> On 15 Sep 2017, at 16:19, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 01 Sep 2017, at 13:44, Simon Riggs <simon@2ndquadrant.com> wrote: >> >> On 14 August 2017 at 23:58, Peter Eisentraut >> <peter.eisentraut@2ndquadrant.com> wrote: >>> On 2/28/17 02:39, Tsunakawa, Takayuki wrote: >>>> The code for stored functions is not written yet, but I'd like your feedback for the specification and design basedon the current patch. I'll add this patch to CommitFest 2017-3. >>> >>> This patch needs to be rebased for the upcoming commit fest. >> >> I'm willing to review this if the patch is going to be actively worked on. > > This sounds like a too good offer to pass up on, can we expect a rebased patch > for the commitfest? Since this patch was Waiting for author during the entire commitfest without updates, I’m marking it Returned with Feedback. When a new version is ready it can be re-submitted to the then open commitfest. cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Simon Riggs On 14 August 2017 at 23:58, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 2/28/17 02:39, Tsunakawa, Takayuki wrote: >> The code for stored functions is not written yet, but I'd like your feedback for the specification and design based on the current patch. I'll add this patch to CommitFest 2017-3. > > This patch needs to be rebased for the upcoming commit fest. I'm willing to review this if the patch is going to be actively worked on. I'm very sorry I couldn't reply to your kind offer. I rebased the patch and will add it to CF 2017/11. I hope I will complete the patch in this CF. Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 10/31/17 13:47, MauMau wrote: > I'm very sorry I couldn't reply to your kind offer. I rebased the > patch and will add it to CF 2017/11. I hope I will complete the patch > in this CF. I've been thinking about this a little bit. Many are worried about repeating the mistakes of the autocommit feature, so it's worth comparing that. The problem with the autocommit setting, or at least the one I remember, is that code is currently written expecting that connect exec SQL statement disconnect will succeed in executing and committing the SQL statement, unless an error is reported. If you turned the autocommit setting off, then this code would effectively silently do nothing, and that is obviously quite bad. So the autocommit setting would break a large proportion of all code out there, and was thus not really usable, and hence it was removed. The proposed statement-level rollback feature works in a slightly different context. It does not change when or how a transaction or transaction block begins and ends. It only changes what happens inside explicit transaction blocks. Considering code like START TRANSACTION; SQL1; SQL2; SQL3; COMMIT; currently an error would cause all subsequent commands to fail. Under statement-level rollback, a failed command would effectively be ignored and the transaction would continue until COMMIT. Therefore, a successful transaction block would always work the same way under either setting. The difference is how error recovery works. So this will necessarily be tied to how the client code or other surrounding code is structured or what the driver or framework is doing in the background to manage transactions. It would also be bad if client code was not prepared for this new behavior, reported the transaction as complete while some commands in the middle were omitted. Drivers can already achieve this behavior and do do that by issuing savepoint commands internally. The point raised in this thread was that that creates too much network overhead, so a backend-based solution would be preferable. We haven't seen any numbers or other evidence to quantify that claim, so maybe it's worth looking into that some more. In principle, a backend-based solution that drivers just have to opt into would save a lot of duplication. But the drivers that care or require it according to their standards presumably already implement this behavior in some other way, so it comes back to whether there is a performance or other efficiency gain here. Another argument was that other SQL implementations have this behavior. This appears to be the case. But as far as I can tell, it is also tied to their particular interfaces and the structure and flow control they provide. So a client-side solution like psql already provides or something in the various drivers would work just fine here. So my summary for the moment is that a GUC or similar run-time setting might be fine, with appropriate explanation and warnings. But it's not clear whether it's worth it given the existing alternatives. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 2 November 2017 at 09:33, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > If you turned the autocommit setting off, then this code would > effectively silently do nothing, and that is obviously quite bad. Right. The example often cited is some variant of BEGIN; CREATTE TABLE t2 AS SELECT * FROM t1; DROP TABLE t1; ALTER TABLE t2 RENAME TO t1; COMMIT; Right now, we do the right thing here. With default statement level rollback, you just dropped t1 and all your data. oops. On a related note, psql's -v ON_ERROR_STOP=1 is horrible and hard to discover UI, and one of the top FAQs on Stack Overflow is some variant of "I'm getting random and incomprehensible errors restoring a dump, wtf?". So I'd really love to make it the default, but we'd face similar issues where a SQL script that's currently correct instead produces dangerously wrong results with ON_ERROR_STOP=1 . > In principle, a backend-based solution that drivers just have to opt > into would save a lot of duplication. But the drivers that care or > require it according to their standards presumably already implement > this behavior in some other way, so it comes back to whether there is a > performance or other efficiency gain here. There definitely would be over SQL-level savepoints. They're horrible for performance, especially since libpq can't yet pipeline work so you need three round-trips for each successful statement: SAVEPOINT, statement, RELEASE SAVEPOINT. It produces massive log spam too. What about if we add protocol-level savepoint support? Two new messages: BeginUnnamedSavepoint and EndUnnamedSavepoint where the latter does a rollback-to-last-unnamed-savepoint if the txn state is bad, or a release-last-unnamed-savepoint if the txn state is ok. That means the driver doesn't have to wait for the result of the statement. It knows the conn state and query outcome from our prior messages, and knows that as a result of this message any failed state has been rolled back. This would, with appropriate libpq support, give people who want statement level error handling pretty much what they want. And we could expose it in psql too. No GUCs needed, no fun surprises for apps. psqlODBC could adopt it to replace its current slow and super-log-spammy statement rollback model. Because we'd know it was a special savepoint used for statement level rollback we might still have some optimisation opportunities. Downside is that it needs support in each client driver. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Peter Eisentraut [mailto:peter.eisentraut@2ndquadrant.com] > The difference is how error recovery works. So this will necessarily be > tied to how the client code or other surrounding code is structured or what > the driver or framework is doing in the background to manage transactions. > It would also be bad if client code was not prepared for this new behavior, > reported the transaction as complete while some commands in the middle were > omitted. > > Drivers can already achieve this behavior and do do that by issuing savepoint > commands internally. The point raised in this thread was that that creates > too much network overhead, so a backend-based solution would be preferable. > We haven't seen any numbers or other evidence to quantify that claim, so > maybe it's worth looking into that some more. > > In principle, a backend-based solution that drivers just have to opt into > would save a lot of duplication. But the drivers that care or require it > according to their standards presumably already implement this behavior > in some other way, so it comes back to whether there is a performance or > other efficiency gain here. > > Another argument was that other SQL implementations have this behavior. > This appears to be the case. But as far as I can tell, it is also tied > to their particular interfaces and the structure and flow control they > provide. So a client-side solution like psql already provides or something > in the various drivers would work just fine here. > > So my summary for the moment is that a GUC or similar run-time setting might > be fine, with appropriate explanation and warnings. But it's not clear > whether it's worth it given the existing alternatives. I can think of four reasons why the server-side support is necessary or desirable. First, the server log could be filled with SAVEPOINT and RELEASE lines when you need to investigate performance or auditactivity. Second, the ease of use for those who migrate from other DBMSs. With the server-side support, only the DBA needs to be awareof the configuration in postgresql.conf. Other people don't need to be aware of the client-side parameter when theydeploy applications. Third, lack of server-side support causes trouble to driver developers. In a recent discussion with the psqlODBC committer,he had some trouble improving or fixing the statement-rollback support. Npgsql doesn't have the statement-rollbackyet. PgJDBC has supported the feature with autosave parameter only recently. Do the drivers for otherlanguages like Python, Go, JavaScript have the feature? We should reduce the burdon on the driver developers. Fourth, the runtime performance. In a performance benchmark of one of our customers, where a batch application ran 1.5 or5 million small SELECTs with primary key access, the execution time of the whole batch became shorter by more than 30%(IIRC) when the local connection was used instead of the remote TCP/IP one. The communication overhead is not small. Also, in the PostgreSQL documentation, the communication overhead is treated seriously as follows: https://www.postgresql.org/docs/devel/static/plpgsql-overview.html#plpgsql-advantages [Excerpt] -------------------------------------------------- That means that your client application must send each query to the database server, wait for it to be processed, receiveand process the results, do some computation, then send further queries to the server. All this incurs interprocesscommunication and will also incur network overhead if your client is on a different machine than the databaseserver. With PL/pgSQL you can group a block of computation and a series of queries inside the database server, thus having the powerof a procedural language and the ease of use of SQL, but with considerable savings of client/server communication overhead. •Extra round trips between client and server are eliminated •Intermediate results that the client does not need do not have to be marshaled or transferred between server and client •Multiple rounds of query parsing can be avoided This can result in a considerable performance increase as compared to an application that does not use stored functions. -------------------------------------------------- Craig reports the big communication overhead: PATCH: Batch/pipelining support for libpq https://www.postgresql.org/message-id/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com#CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com Re: foreign table batch insert https://www.postgresql.org/message-id/CAMsr+YFgDUiJ37DEfPRk8WDBuZ58psdAYJd8iNFSaGxtw=wU3g@mail.gmail.com [Excerpt] -------------------------------------------------- The time difference for 10k inserts on the local host over a unix socket shows a solid improvement: batch insert elapsed: 0.244293s sequential insert elapsed: 0.375402s ... but over, say, a connection to a random AWS RDS instance fired up for the purpose that lives about 320ms away the difference is huge: batch insert elapsed: 9.029995s sequential insert elapsed: (I got bored after 10 minutes; it should take a bit less then an hour based on the latency numbers) With 500 rows on the remote AWS RDS instance, once the I/O quota is already saturated: batch insert elapsed: 1.229024s sequential insert elapsed: 156.962180s which is an improvement by a factor of over 120 -------------------------------------------------- Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Craig Ringer [mailto:craig@2ndquadrant.com] > The example often cited is some variant of > > BEGIN; > CREATTE TABLE t2 AS SELECT * FROM t1; > DROP TABLE t1; > ALTER TABLE t2 RENAME TO t1; > COMMIT; > > Right now, we do the right thing here. With default statement level rollback, > you just dropped t1 and all your data. oops. That's a horrible example. So I think the default behavior should be what it is now for existing PostgreSQL users. > On a related note, psql's -v ON_ERROR_STOP=1 is horrible and hard to discover > UI, and one of the top FAQs on Stack Overflow is some variant of "I'm getting > random and incomprehensible errors restoring a dump, wtf?". So I'd really > love to make it the default, but we'd face similar issues where a SQL script > that's currently correct instead produces dangerously wrong results with > ON_ERROR_STOP=1 . Yes. And although unrelated, psql's FETCH_SIZE is also often invisible to users. They report out-of-memory trouble whenthey do SELECT on a large table with psql. > What about if we add protocol-level savepoint support? Two new messages: > > BeginUnnamedSavepoint > > and > > EndUnnamedSavepoint > > where the latter does a rollback-to-last-unnamed-savepoint if the txn state > is bad, or a release-last-unnamed-savepoint if the txn state is ok. That > means the driver doesn't have to wait for the result of the statement. It > knows the conn state and query outcome from our prior messages, and knows > that as a result of this message any failed state has been rolled back. > > This would, with appropriate libpq support, give people who want statement > level error handling pretty much what they want. And we could expose it > in psql too. No GUCs needed, no fun surprises for apps. psqlODBC could adopt > it to replace its current slow and super-log-spammy statement rollback > model. > > Downside is that it needs support in each client driver. Yes, I believe we should avoid the downside. It's tough to develop and maintain a client driver, so we should minimize theburdon with server-side support. Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Tsunakawa> PgJDBC has supported the feature with autosave parameter only recently
PgJDBC has the implementation for more than a year (REL9.4.1210, 2016-09-07, see https://github.com/pgjdbc/pgjdbc/pull/477 )
PgJDBC has the implementation for more than a year (REL9.4.1210, 2016-09-07, see https://github.com/pgjdbc/pgjdbc/pull/477 )
Tsunakawa> The point raised in this thread was that that creates
Tsunakawa> too much network overhead, so a backend-based solution would be preferable.
Tsunakawa> We haven't seen any numbers or other evidence to quantify that claim, so
Tsunakawa> maybe it's worth looking into that some more
Tsunakawa> too much network overhead, so a backend-based solution would be preferable.
Tsunakawa> We haven't seen any numbers or other evidence to quantify that claim, so
Tsunakawa> maybe it's worth looking into that some more
The performance overhead for "SELECT" statement (no columns, just select) statement over localhost is 36±4 us vs 38±3 us (savepoint is pipelined along with user-provided query). That is network overhead is close to negligible.
As far as I understand, the main problem with savepoints is they would consume memory even in case the same savepoint is reassigned again and again.
In other words, "savepoint; insert;savepoint; insert;savepoint; insert;savepoint; insert;savepoint; insert;" would allocate xids and might blow up backend's memory.
I see no way driver can workaround that, so it would be great if backend could release memory or provide a way to do so.
Adding protocol messages would blow pgbouncer, etc things, so it makes sense to refrain from new messages unless it is absolutely required.
Vladimir
On 2 November 2017 at 13:59, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote: > The performance overhead for "SELECT" statement (no columns, just select) > statement over localhost is 36±4 us vs 38±3 us (savepoint is pipelined along > with user-provided query). That is network overhead is close to negligible. Yep. Not for psqlODBC or other libpq-based drives that can't pipeline queries though. > In other words, "savepoint; insert;savepoint; insert;savepoint; > insert;savepoint; insert;savepoint; insert;" would allocate xids and might > blow up backend's memory. RELEASE SAVEPOINT, like psqlODBC does. > Adding protocol messages would blow pgbouncer, etc things, so it makes sense > to refrain from new messages unless it is absolutely required. Yeah, it'd affect proxies, true. But it'd let us get rid of a lot of very ugly log spam too. And unlike some of the prior protocol tweaks I've been interested in, it'd be client-initiated so it should be pretty safe. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Vladimir > Sitnikov > Tsunakawa> PgJDBC has supported the feature with autosave parameter only > Tsunakawa> recently > > PgJDBC has the implementation for more than a year (REL9.4.1210, 2016-09-07, > see https://github.com/pgjdbc/pgjdbc/pull/477 ) And I heard from someone in PgJDBC community that the autosave parameter was not documented in the manual for a while, whichI confirmed. So the statement-level rollback is newer to users, isn't it? > The performance overhead for "SELECT" statement (no columns, just select) > statement over localhost is 36±4 us vs 38±3 us (savepoint is pipelined along > with user-provided query). That is network overhead is close to negligible. That's good news, because it also means that the overhead of creating a savepoint is negligible. > As far as I understand, the main problem with savepoints is they would > consume memory even in case the same savepoint is reassigned again and again. > In other words, "savepoint; insert;savepoint; insert;savepoint; > insert;savepoint; insert;savepoint; insert;" would allocate xids and might > blow up backend's memory. > I see no way driver can workaround that, so it would be great if backend > could release memory or provide a way to do so. Doesn't PgJDBC execute RELEASE after each SQL statement? That said, even with RELEASE, the server memory bloat is not solved. The current server implementation allocates a memory chunk of 8KB called CurTranContext for each subtransaction,and retains them until the end of top-level transaction. That's another (separate) issue to address. Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Tsunakawa>So the statement-level rollback is newer to users, isn't it?
Technically speaking, the feature was listed in the changelog.
Tsunakawa>Doesn't PgJDBC execute RELEASE after each SQL statement?
Tsunakawa>Doesn't PgJDBC execute RELEASE after each SQL statement?
It does not.
Tsunakawa>That said, even with RELEASE, the server memory bloat is not solved.
That is what I mean.
Vladimir
On 2 November 2017 at 01:33, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > The proposed statement-level rollback feature works in a slightly > different context. It does not change when or how a transaction or > transaction block begins and ends. It only changes what happens inside > explicit transaction blocks. Yes, this is not the same thing as autocommit. There should be no concerns there. > The difference is how error recovery works. Yes > So this will necessarily be > tied to how the client code or other surrounding code is structured or > what the driver or framework is doing in the background to manage > transactions. It would also be bad if client code was not prepared for > this new behavior, reported the transaction as complete while some > commands in the middle were omitted. This new feature allows a simplified development style because earlier statements don't need to be re-executed, nor do we have to manually wrap everything in savepoints. It changes the assumptions of error recovery, so this will break code already written for PostgreSQL. The purpose is to allow new code to be written using the easier style. Compare this with SERIALIZABLE mode - no need for time consuming additional coding. > Drivers can already achieve this behavior and do do that by issuing > savepoint commands internally. The point raised in this thread was that > that creates too much network overhead, so a backend-based solution > would be preferable. We haven't seen any numbers or other evidence to > quantify that claim, so maybe it's worth looking into that some more. > > In principle, a backend-based solution that drivers just have to opt > into would save a lot of duplication. But the drivers that care or > require it according to their standards presumably already implement > this behavior in some other way, so it comes back to whether there is a > performance or other efficiency gain here. > > Another argument was that other SQL implementations have this behavior. > This appears to be the case. But as far as I can tell, it is also tied > to their particular interfaces and the structure and flow control they > provide. So a client-side solution like psql already provides or > something in the various drivers would work just fine here. > > So my summary for the moment is that a GUC or similar run-time setting > might be fine, with appropriate explanation and warnings. But it's not > clear whether it's worth it given the existing alternatives. This is about simplicity for the developer, not so much about performance. A backend-based solution is required for PL procedures and functions. We could put this as an option into PL/pgSQL, but it seems like it is a function of the transaction manager rather than the driver. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Simon Riggs > A backend-based solution is required for PL procedures and functions. > > We could put this as an option into PL/pgSQL, but it seems like it is > a function of the transaction manager rather than the driver. Exactly. Thanks. Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Nov 1, 2017 at 6:47 AM, MauMau <maumau307@gmail.com> wrote: > From: Simon Riggs > On 14 August 2017 at 23:58, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> On 2/28/17 02:39, Tsunakawa, Takayuki wrote: >>> The code for stored functions is not written yet, but I'd like your > feedback for the specification and design based on the current patch. > I'll add this patch to CommitFest 2017-3. >> >> This patch needs to be rebased for the upcoming commit fest. > > I'm willing to review this if the patch is going to be actively worked > on. > > > I'm very sorry I couldn't reply to your kind offer. I rebased the > patch and will add it to CF 2017/11. I hope I will complete the patch > in this CF. Hi Tsunakawa-san, With your v2 patch "make docs" fails. Here is a small patch to apply on top of yours to fix that and some small copy/paste errors, if I understood correctly. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
From: Thomas Munro With your v2 patch "make docs" fails. Here is a small patch to apply on top of yours to fix that and some small copy/paste errors, if I understood correctly. Ouch, thanks. I'd like to merge your fix when I submit the next revision of my patch. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Nov 6, 2017 at 9:36 PM, MauMau <maumau307@gmail.com> wrote: > From: Thomas Munro > With your v2 patch "make docs" fails. Here is a small patch to apply > on top of yours to fix that and some small copy/paste errors, if I > understood correctly. > > Ouch, thanks. I'd like to merge your fix when I submit the next > revision of my patch. This thread is waiting at least for a new version, which has not happened in three weeks. so I am marking it as returned with feedback for now. -- Michael
On 6 November 2017 at 12:36, MauMau <maumau307@gmail.com> wrote: > when I submit the next revision of my patch. When will the next version be posted? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: Simon Riggs [mailto:simon@2ndquadrant.com] > When will the next version be posted? I'm very sorry I haven't submitted anything. I'd like to address this during this CF. Thanks for remembering this. Regards Takayuki Tsunakawa
Hi, On 2018-01-09 08:21:33 +0000, Tsunakawa, Takayuki wrote: > From: Simon Riggs [mailto:simon@2ndquadrant.com] > > When will the next version be posted? > > I'm very sorry I haven't submitted anything. I'd like to address this during this CF. Thanks for remembering this. Given that no new version has been submitted since, that this is the last CF, and that we are far from agreeing on a design, I'm marking this as returned with feedback. Greetings, Andres Freund
On 2017-Nov-06, Tsunakawa, Takayuki wrote: > From: Simon Riggs > > A backend-based solution is required for PL procedures and functions. > > > > We could put this as an option into PL/pgSQL, but it seems like it is > > a function of the transaction manager rather than the driver. > > Exactly. Thanks. I've been looking at re-implementing this feature recently, using Tsunakawa's proposed UI of a GUC transaction_rollback_scope that can take values "transaction" (default, current behavior) and "statement". I didn't take other parts of his patch though; see below. I think the main objectionable point is that of making servers behave in a way that could lose data, if applications assume that transactions behave in the way they do today. I propose that we solve this by allowing this feature to be enabled only via one of: * a PGOPTIONS connection-time option * ALTER USER SET (transaction_rollback_scope) but it can be *disabled* normally via SET. In other words, changing the scope from transaction to statement in a running session is forbidden, but changing it the other way around is allowed (if app is unsure whether env is unsafe, it can set the scope to "transaction" to ensure it's safe from that point onwards). Changing the scope in postgresql.conf is forbidden, so a server is never unsafe as a whole. Drivers such as JDBC can easily use this mode, for example a connection option such as "AUTOSAVE=SERVER" can automatically add the transaction_rollback_scope option. (Naturally, if the server does not support transaction_rollback_scope and the user gave that option, this causes an exception to be raised -- NOT to fallback to the standard transaction behavior!) Tsunakawa's implementation puts the feature in postgres.c's client loop. I think a better way to implement this is to change xact.c to have a new TBLOCK state which indicates when to start a new internal subtransaction; StartTransactionCommand pushes a new element into the transaction stack and puts it in the new state; a subsequent operation actually starts the new subtransaction. (This design decision allows things like SAVEPOINT to work correctly by having the subtrasaction-for-savepoint appear *before* the internal subtransaction, so a subsequent "SELECT 0/0" doesn't remove the user declared savepoint.) I have a PoC implementation that's slightly different: it adds more code to a few xact.c low-level routines (StartTransactionCommand creates the internal savepoint itself). It's not as nice because SAVEPOINT has to close the internal subtransaction, then create the savepoint, then create the internal subtransaction again. And it doesn't handle RELEASE. But as a PoC it's quite nice. I measured the performance overhead to be about 2% - 3% loss of the current mode, which seems acceptable. I would like to hear opinions on whether the protections I propose are sufficient to appease the objections. In my opinion they are, and we should press forward with this, which seems to be one of the frequently requested features from people porting from other DBMSs. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: Alvaro Herrera > I've been looking at re-implementing this feature recently, using > Tsunakawa's proposed UI of a GUC transaction_rollback_scope that can > take values "transaction" (default, current behavior) and "statement". > I didn't take other parts of his patch though; see below. Thank you so much for reviving this thread! > I propose that we solve this by > allowing this feature to be enabled only via one of: > > * a PGOPTIONS connection-time option > * ALTER USER SET (transaction_rollback_scope) Why don't we also allow ALTER DATABASE SET for a database exclusively for data migrated from another DBMS? > but it can be *disabled* normally via SET. In other words, changing the > scope from transaction to statement in a running session is forbidden, > but changing it the other way around is allowed (if app is unsure > whether env is unsafe, it can set the scope to "transaction" to ensure > it's safe from that point onwards). Changing the scope in > postgresql.conf is forbidden, so a server is never unsafe as a whole. Would it be dangerous to allow both enabling and disabling the statement-level rollback only outside a transaction block? I thought it was considered dangerous to change the setting inside a transaction block. > Drivers such as JDBC can easily use this mode, for example a connection > option such as "AUTOSAVE=SERVER" can automatically add the > transaction_rollback_scope option. (Naturally, if the server does not > support transaction_rollback_scope and the user gave that option, this > causes an exception to be raised -- NOT to fallback to the standard > transaction behavior!) How do the drivers know, from the server error response to connection request, that transaction_rollback_scope is unsupported? > Tsunakawa's implementation puts the feature in postgres.c's client loop. > I think a better way to implement this is to change xact.c to have a new > TBLOCK state which indicates when to start a new internal > subtransaction; StartTransactionCommand pushes a new element into the > transaction stack and puts it in the new state; a subsequent operation > actually starts the new subtransaction. (This design decision allows > things like SAVEPOINT to work correctly by having the > subtrasaction-for-savepoint appear *before* the internal subtransaction, > so a subsequent "SELECT 0/0" doesn't remove the user declared > savepoint.) That sounds interesting. * How can PLs like PL/pgSQL utilize this to continue upon an SQL failure? They don't call StartTransactionCommand. * How can psql make use of this feature for its ON_ERROR_ROLLBACK? Regards MauMau
On Fri, Jun 15, 2018 at 4:23 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I've been looking at re-implementing this feature recently, using > Tsunakawa's proposed UI of a GUC transaction_rollback_scope that can > take values "transaction" (default, current behavior) and "statement". > I didn't take other parts of his patch though; see below. > > I think the main objectionable point is that of making servers behave in > a way that could lose data, if applications assume that transactions > behave in the way they do today. I propose that we solve this by > allowing this feature to be enabled only via one of: > > * a PGOPTIONS connection-time option > * ALTER USER SET (transaction_rollback_scope) > > but it can be *disabled* normally via SET. In other words, changing the > scope from transaction to statement in a running session is forbidden, > but changing it the other way around is allowed (if app is unsure > whether env is unsafe, it can set the scope to "transaction" to ensure > it's safe from that point onwards). Changing the scope in > postgresql.conf is forbidden, so a server is never unsafe as a whole. I'm not sure that really solves the problem, because changing the GUC in either direction causes the system to behave differently. I don't see any particular reason to believe that changing the behavior from A to B is any more or less likely to break applications than a change from B to A. I put this feature, which - in this interest of full disclosure - is already implemented in Advanced Server and has been for many years, more or less in the same category as a hypothetical GUC that changes case-folding from lower case to upper case. That is, it's really nice for compatibility, but you can't get around the fact that every bit of software that is supposed to run on all PostgreSQL instances has to be tested in all possible modes, because otherwise you might find that it doesn't work in all of those modes, or doesn't work as expected. It is a behavior-changing GUC par excellence. Some people are going to love that, and some people are going to hate it, but I don't think adding some funny GUC mode that only allows it to be changed in one direction actually makes any difference. Other people may, of course, disagree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 15 June 2018 at 21:23, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I think the main objectionable point is that of making servers behave in > a way that could lose data, if applications assume that transactions > behave in the way they do today. I propose that we solve this by > allowing this feature to be enabled only via one of: > > * a PGOPTIONS connection-time option > * ALTER USER SET (transaction_rollback_scope) > > but it can be *disabled* normally via SET. In other words, changing the > scope from transaction to statement in a running session is forbidden, > but changing it the other way around is allowed (if app is unsure > whether env is unsafe, it can set the scope to "transaction" to ensure > it's safe from that point onwards). If that allows us to annotate a function with SET transaction_rollback_scope = whatever then it works. We probably need to be able to identify code that will/will not work in the new mode. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Jun-16, Robert Haas wrote: > I'm not sure that really solves the problem, because changing the GUC > in either direction causes the system to behave differently. I don't > see any particular reason to believe that changing the behavior from A > to B is any more or less likely to break applications than a change > from B to A. I suppose the other option is to just disallow a change during a running session completely. That is, whatever value is has when you connect is final. Maybe a better idea is to make write-once: the application connects, establishes its desired behavior, and then it cannot be changed anymore. > I put this feature, which - in this interest of full disclosure - is > already implemented in Advanced Server and has been for many years, > more or less in the same category as a hypothetical GUC that changes > case-folding from lower case to upper case. That is, it's really nice > for compatibility, but you can't get around the fact that every bit of > software that is supposed to run on all PostgreSQL instances has to be > tested in all possible modes, because otherwise you might find that it > doesn't work in all of those modes, or doesn't work as expected. It > is a behavior-changing GUC par excellence. Thanks for bringing this up. While I agree that both your example and the feature being proposed are behavior-changing, I don't think the parallel is very good, because the level of effort in order to port from one behavior to the other is much higher with statement-scoped rollback than with case-folding. In the case-folding example, I don't think you need to audit/rewrite all your applications in order to make them work: most of the time just rename a few tables, or if not just add a few quotes (and you can probably do it programatically.) With statement-scope rollback what you face is a more thorough review .. probably adding a savepoint call every other line. I'm not sure that for a large codebase this is even reasonable to start talking about. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jun 18, 2018 at 4:51 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > On 2018-Jun-16, Robert Haas wrote: >> I'm not sure that really solves the problem, because changing the GUC >> in either direction causes the system to behave differently. I don't >> see any particular reason to believe that changing the behavior from A >> to B is any more or less likely to break applications than a change >> from B to A. > > I suppose the other option is to just disallow a change during a running > session completely. That is, whatever value is has when you connect is > final. > > Maybe a better idea is to make write-once: the application connects, > establishes its desired behavior, and then it cannot be changed anymore. That sounds even worse. I think if we're going to have this behavior at all, it should be possible to change the setting. >> I put this feature, which - in this interest of full disclosure - is >> already implemented in Advanced Server and has been for many years, >> more or less in the same category as a hypothetical GUC that changes >> case-folding from lower case to upper case. That is, it's really nice >> for compatibility, but you can't get around the fact that every bit of >> software that is supposed to run on all PostgreSQL instances has to be >> tested in all possible modes, because otherwise you might find that it >> doesn't work in all of those modes, or doesn't work as expected. It >> is a behavior-changing GUC par excellence. > > Thanks for bringing this up. > > While I agree that both your example and the feature being proposed are > behavior-changing, I don't think the parallel is very good, because the > level of effort in order to port from one behavior to the other is much > higher with statement-scoped rollback than with case-folding. In the > case-folding example, I don't think you need to audit/rewrite all your > applications in order to make them work: most of the time just rename a > few tables, or if not just add a few quotes (and you can probably do it > programatically.) > > With statement-scope rollback what you face is a more thorough review .. > probably adding a savepoint call every other line. I'm not sure that > for a large codebase this is even reasonable to start talking about. Yeah. The real problem is what happens when two code bases collide. For example, suppose you have a large code base that is using this, and then you install some extensions that weren't tested with it enabled. Or, you install a tool like pgAdmin or pgpool or whatever that turns out not to understand the new mode, and stuff breaks. It's a distributed burden on the ecosystem. I don't think we can avoid that. It's just a matter of whether it is worth it or not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company