Thread: BUG #17810: Update from 13.09 to 13.10 breaks SQLs with VACUUM
The following bug has been logged on the website: Bug reference: 17810 Logged by: Yuri Cherio Email address: cherio@gmail.com PostgreSQL version: 13.10 Operating system: Ubuntu Linux 20.04.5 LTS Description: My system has numerous SQL scripts (each script contains multiple SQL statements). Many of them look like UPDATE ... INSERT ... ... VACUUM ... These scripts are being run from a Java program, with AutoCommit=true, via JDBC as: Statement statement = connection.createStatement(); connection.setAutoCommit(true); statement.execute(sql); ... statement.getMoreResults(); ... This worked well for years until the server (not the client program) was upgraded to 13.10. After the upgrade all compound SQL statements with VACUUM at the end fail with this error: 09:48:56.585 SEVERE Script processing failed: org.postgresql.util.PSQLException: ERROR: VACUUM cannot be executed within a pipeline at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2365) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:355) at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:490) at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:408) at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:329) at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:315) at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:291) at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:286) JDBC driver version is 42.3.3 (i tried the latest 42.5.4 and it doesn't make a difference). The same SQL scripts are still working with the latest postgresql 14.7
One more detail. Adding COMMIT; just before VACUUM makes it work again.
It looks like the AutoCommit flag is being ignored after the latest point release.
SQL statements that precede VACUUM include simple INSERT/UPDATE statements, CTEs and some include anonymous Pg/PlSql blocks. None of them start explicit transactions.
I do not have version 15 server to test the behavior.
On Mon, Feb 27, 2023 at 10:28 AM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 17810
Logged by: Yuri Cherio
Email address: cherio@gmail.com
PostgreSQL version: 13.10
Operating system: Ubuntu Linux 20.04.5 LTS
Description:
My system has numerous SQL scripts (each script contains multiple SQL
statements). Many of them look like
UPDATE ...
INSERT ...
...
VACUUM ...
These scripts are being run from a Java program, with AutoCommit=true, via
JDBC as:
Statement statement = connection.createStatement();
connection.setAutoCommit(true);
statement.execute(sql);
...
statement.getMoreResults();
...
This worked well for years until the server (not the client program) was
upgraded to 13.10. After the upgrade all compound SQL statements with VACUUM
at the end fail with this error:
09:48:56.585 SEVERE Script processing failed:
org.postgresql.util.PSQLException: ERROR: VACUUM cannot be executed within a
pipeline
at
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675)
at
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2365)
at
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:355)
at
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:490)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:408)
at
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:329)
at
org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:315)
at
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:291)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:286)
JDBC driver version is 42.3.3 (i tried the latest 42.5.4 and it doesn't make
a difference).
The same SQL scripts are still working with the latest postgresql 14.7
PG Bug reporting form <noreply@postgresql.org> writes: > My system has numerous SQL scripts (each script contains multiple SQL > statements). Many of them look like > UPDATE ... > INSERT ... > ... > VACUUM ... > These scripts are being run from a Java program, with AutoCommit=true, via > JDBC as: Sorry, you'll have to fix your scripts to do that separately. (It might be sufficient to add COMMIT before the VACUUM.) It was a bug that we didn't enforce this requirement all along. > The same SQL scripts are still working with the latest postgresql 14.7 I'm quite skeptical of that claim; I see the restriction being enforced the same way in all supported branches. regards, tom lane
Thank you for the clarification.
I completely understand the need to make things right. At the same time it baffles me that a breaking change would be introduced somewhere in a point release. The release notes don't seem to even mention this or give a warning to watch out for behavior change: https://www.postgresql.org/docs/13/release-13-10.html
While adding explicit COMMIT seems to be a working solution (at least for the moment) it would be great to understand the nature of the change, the specifics of what was allowed before and what the behaviour is like after the change, the cases being affected, etc. I (and anyone who stumbles upon this after the upgrade) would appreciate it if you could point me in the right direction of where to look.
My claim about 14.7 was invalid indeed. The same change occurred when upgrading from 14.6.
On Mon, Feb 27, 2023 at 11:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> My system has numerous SQL scripts (each script contains multiple SQL
> statements). Many of them look like
> UPDATE ...
> INSERT ...
> ...
> VACUUM ...
> These scripts are being run from a Java program, with AutoCommit=true, via
> JDBC as:
Sorry, you'll have to fix your scripts to do that separately.
(It might be sufficient to add COMMIT before the VACUUM.)
It was a bug that we didn't enforce this requirement all along.
> The same SQL scripts are still working with the latest postgresql 14.7
I'm quite skeptical of that claim; I see the restriction being
enforced the same way in all supported branches.
regards, tom lane
I just realized that the ramifications of this change go further than just VACUUM related statements in the scripts. Assume 2 scripts
UPDATE tableA
UPDATE tableB
and
UPDATE tableA
UPDATE tableB
Before the change they could run in parallel without issues. After the change this will cause one of the queries to fail due to transaction locks. This is a conceptual change and the consequences of it are significant for everyone who came to rely upon the fact that such scripts do not run in one transaction.
While the visible effect of the change, like the one I observe with having VACUUM at the end, are easy to spot, the other adverse effects may prove to be more insidious.
IMHO it is a big conceptual change and I would expect it to come with not only a warning but also with a way to gradually transition to, allowing the legacy behavior at first by default, then on opt-in basis and eventually removing it altogether.
On Mon, Feb 27, 2023 at 12:31 PM Cherio <cherio@gmail.com> wrote:
Thank you for the clarification.I completely understand the need to make things right. At the same time it baffles me that a breaking change would be introduced somewhere in a point release. The release notes don't seem to even mention this or give a warning to watch out for behavior change: https://www.postgresql.org/docs/13/release-13-10.htmlWhile adding explicit COMMIT seems to be a working solution (at least for the moment) it would be great to understand the nature of the change, the specifics of what was allowed before and what the behaviour is like after the change, the cases being affected, etc. I (and anyone who stumbles upon this after the upgrade) would appreciate it if you could point me in the right direction of where to look.My claim about 14.7 was invalid indeed. The same change occurred when upgrading from 14.6.On Mon, Feb 27, 2023 at 11:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:PG Bug reporting form <noreply@postgresql.org> writes:
> My system has numerous SQL scripts (each script contains multiple SQL
> statements). Many of them look like
> UPDATE ...
> INSERT ...
> ...
> VACUUM ...
> These scripts are being run from a Java program, with AutoCommit=true, via
> JDBC as:
Sorry, you'll have to fix your scripts to do that separately.
(It might be sufficient to add COMMIT before the VACUUM.)
It was a bug that we didn't enforce this requirement all along.
> The same SQL scripts are still working with the latest postgresql 14.7
I'm quite skeptical of that claim; I see the restriction being
enforced the same way in all supported branches.
regards, tom lane
Cherio <cherio@gmail.com> writes: > I just realized that the ramifications of this change go further than just > VACUUM related statements in the scripts. Assume 2 scripts > UPDATE tableA > UPDATE tableB > and > UPDATE tableA > UPDATE tableB > Before the change they could run in parallel without issues. After the > change this will cause one of the queries to fail due to transaction locks. Uh ... really? Please provide evidence. AFAIK this set of changes only affects commands that are meant to not run inside tranaction blocks. regards, tom lane
The second script should have tables used in the reversed order:
UPDATE tableA
UPDATE tableB
UPDATE tableB
and
UPDATE tableB
UPDATE tableA
UPDATE tableA
Since they will run in individual transactions tableA gets locked by the 1st script and tableB by the 2nd; as execution flow proceeds to the next update in each script, those tables would be locked in separate transactions.
On Mon, Feb 27, 2023 at 1:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Cherio <cherio@gmail.com> writes:
> I just realized that the ramifications of this change go further than just
> VACUUM related statements in the scripts. Assume 2 scripts
> UPDATE tableA
> UPDATE tableB
> and
> UPDATE tableA
> UPDATE tableB
> Before the change they could run in parallel without issues. After the
> change this will cause one of the queries to fail due to transaction locks.
Uh ... really? Please provide evidence. AFAIK this set of changes
only affects commands that are meant to not run inside tranaction blocks.
regards, tom lane
Cherio <cherio@gmail.com> writes: > The second script should have tables used in the reversed order: > UPDATE tableA > UPDATE tableB > and > UPDATE tableB > UPDATE tableA > Since they will run in individual transactions tableA gets locked by the > 1st script and tableB by the 2nd; as execution flow proceeds to the next > update in each script, those tables would be locked in separate > transactions. I think you are working with a completely wrong mental model of what this change did. It will not affect a pipeline that doesn't contain any VACUUM, ANALYZE, or similar maintenance commands. regards, tom lane
> I think you are working with a completely wrong mental model of what this change did.
You are right. My mental model of the change is a speculation, but this is only because the specifics of the change are not clarified anywhere.
From what you are saying I am deriving that transaction model is not affected, and the individual statements still run within their own atomic transactions when AutoCommit is enabled, it's just a limited set of commands (at the moment I am aware of VACUUM and .... ANALYZE???) can't be bundled with any other statements, because doing that blows up the new pipeline :)
BTW, why ANALYZE? I just tested ANALYZE and it seems to work without the need to be preceded with COMMIT.
Are VACUUM and ANALYZE the only commands that must be executed separately?
Again, is there at least a brief description of the scope of what was affected?
I believe I understand it better now but the picture is still made of bits and pieces glued with trial and error.
On Mon, Feb 27, 2023 at 1:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Cherio <cherio@gmail.com> writes:
> The second script should have tables used in the reversed order:
> UPDATE tableA
> UPDATE tableB
> and
> UPDATE tableB
> UPDATE tableA
> Since they will run in individual transactions tableA gets locked by the
> 1st script and tableB by the 2nd; as execution flow proceeds to the next
> update in each script, those tables would be locked in separate
> transactions.
I think you are working with a completely wrong mental model of what
this change did. It will not affect a pipeline that doesn't contain
any VACUUM, ANALYZE, or similar maintenance commands.
regards, tom lane
On Mon, Feb 27, 2023 at 11:42 AM Cherio <cherio@gmail.com> wrote:
> I think you are working with a completely wrong mental model of what this change did.You are right. My mental model of the change is a speculation, but this is only because the specifics of the change are not clarified anywhere.From what you are saying I am deriving that transaction model is not affected, and the individual statements still run within their own atomic transactions when AutoCommit is enabled, it's just a limited set of commands (at the moment I am aware of VACUUM and .... ANALYZE???) can't be bundled with any other statements, because doing that blows up the new pipeline :)BTW, why ANALYZE? I just tested ANALYZE and it seems to work without the need to be preceded with COMMIT.Are VACUUM and ANALYZE the only commands that must be executed separately?Again, is there at least a brief description of the scope of what was affected?I believe I understand it better now but the picture is still made of bits and pieces glued with trial and error.
The commit that implemented this fix is here, it links to discussion:
As for your usage of "conn.setAutocommit(true)" - IIUC that is irrelevant to this entire discussion. You've chosen to bundle up multiple statements into a single Statement.execute(string) call which obeys the rules of the simple query protocol - multiple statements:
https://www.postgresql.org/docs/current/protocol-flow.html#id-1.10.6.7.4 (subsection 55.2.2.1)
Namely:
"When a simple Query message contains more than one SQL statement (separated by semicolons), those statements are executed as a single transaction, unless explicit transaction control commands are included to force a different behavior."
Now, per the documentation for Vacuum:
"VACUUM cannot be executed inside a transaction block."
And so per the documentation your script has always been invalid as written.
Is there room for improved communication on the minor-release change in behavior? Probably.
As for discussing reverting this in the back-branches given new evidence and scenarios, that is possible and I've yet to go back and fully review that discussion thread in light of this new information.
David J.
Cherio <cherio@gmail.com> writes: > You are right. My mental model of the change is a speculation, but this is > only because the specifics of the change are not clarified anywhere. The problem is that statements that aren't allowed to execute in a transaction block (usually because they need an immediate commit) weren't checking that properly in the context of pipelined queries. This should never have been allowed, but we weren't enforcing that properly. The net effect in some cases is that such commands unexpectedly committed the effects of prior commands in the pipeline, and in other cases that a rollback occurring later in the pipeline would leave you with corrupted on-disk state. We judged that to be a bad enough bug that the fix should be backpatched. The original fix for that went in six months ago (13.8 et al) but we later discovered that there was still a hole in it. The fact that the 13.10 release notes only mention ANALYZE is a result of my inadequate summarization of the commit log entry :-(. > BTW, why ANALYZE? ANALYZE is a slightly different case. It can work inside a transaction block or not, but it has two different strategies depending on that: one uses internal commits and the other doesn't. It was applying the wrong one in this context, which again had the effect of prematurely committing previous commands in the pipeline. > Are VACUUM and ANALYZE the only commands that must be executed > separately? There's a couple dozen such commands --- easiest way to find them is to grep the source code for PreventInTransactionBlock calls. I believe "can't run inside a transaction block" is mentioned in all their manual pages, but we don't have a central list. regards, tom lane
"David G. Johnston" <david.g.johnston@gmail.com> writes: > As for your usage of "conn.setAutocommit(true)" - IIUC that is irrelevant > to this entire discussion. You've chosen to bundle up multiple statements > into a single Statement.execute(string) call which obeys the rules of the > simple query protocol - multiple statements: I doubt it. We closed the not-in-transaction-block loophole decades ago for simple query protocol. What's at stake here is what happens when a series of extended-protocol commands are given without Sync between them, which we interpret as a request to run them all in the same transaction. I'm a bit surprised that the JDBC driver is choosing to issue them that way, because it implies (at least) that it's parsing the string enough to break it down into separate SQL commands. But we'd not be having this conversation if that weren't happening. Anyway, that scenario *should* be subject to the same rules as multiple statements in one simple-query message, only it wasn't up till now. > Is there room for improved communication on the minor-release change in > behavior? Probably. Yeah, I wish I could have that release note back :-(. We don't really have any mechanism for updating release notes after the fact. I could go and fix it in the git repo, but nobody would see it until after the next quarterly releases which seems a bit too late. > As for discussing reverting this in the back-branches given new evidence > and scenarios, that is possible and I've yet to go back and fully review > that discussion thread in light of this new information. I think there's zero chance we'd revert the bug fix at this point. If we'd realized that the JDBC driver behaves this way, we might have debated a little harder about whether this really had to be fixed in the back branches --- but it's done now, and it is a bug fix. regards, tom lane
On Mon, Feb 27, 2023 at 1:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> As for your usage of "conn.setAutocommit(true)" - IIUC that is irrelevant
> to this entire discussion. You've chosen to bundle up multiple statements
> into a single Statement.execute(string) call which obeys the rules of the
> simple query protocol - multiple statements:
I doubt it. We closed the not-in-transaction-block loophole decades
ago for simple query protocol. What's at stake here is what happens
when a series of extended-protocol commands are given without Sync
between them, which we interpret as a request to run them all in the
same transaction. I'm a bit surprised that the JDBC driver is choosing
to issue them that way, because it implies (at least) that it's parsing
the string enough to break it down into separate SQL commands. But
we'd not be having this conversation if that weren't happening.
Yeah, I realized a bit after I wrote my comments that it is similar but not exactly this issue. And yes, the JDBC driver does indeed go to the trouble of parsing out statements.
David J.
Tom, David thank you very much for the explanation and the links!!!!!