Thread: BUG #17840: Failing to execute auto_explain for logging leads to transaction rollback.

BUG #17840: Failing to execute auto_explain for logging leads to transaction rollback.

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17840
Logged by:          Sebastien Caunes
Email address:      bokanist@gmail.com
PostgreSQL version: 13.2
Operating system:   Linux
Description:

I had this problem when importing data from oracle with oracle_fdw and
auto_explain option set on the postgres server : If the query took too long,
it triggers the auto_explain wich queries things on oracle side where there
was a permission problem. The error bubble up outside of logging scope and
make the transaction rollback after having properly transferred dozen
million records (that then stood as dead tuples).

Logging is one thing with its own scope and it should never interfere with
the behavior of the software. Beside whatever reason triggering the error in
logging scope, the transaction that had no problem itself should not be
rollbacked. It makes no sense.

Isn't there something to refactor in Postgres ?


PG Bug reporting form <noreply@postgresql.org> writes:
> I had this problem when importing data from oracle with oracle_fdw and
> auto_explain option set on the postgres server : If the query took too long,
> it triggers the auto_explain wich queries things on oracle side where there
> was a permission problem. The error bubble up outside of logging scope and
> make the transaction rollback after having properly transferred dozen
> million records (that then stood as dead tuples).

> Logging is one thing with its own scope and it should never interfere with
> the behavior of the software. Beside whatever reason triggering the error in
> logging scope, the transaction that had no problem itself should not be
> rollbacked. It makes no sense.

> Isn't there something to refactor in Postgres ?

This bug report is not actionable due to complete lack of specifics.
Please read

https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

Having said that, my guess is that if there's anything to be done
it would need to be done in oracle_fdw.  So you need to take this
up with the oracle_fdw authors, not here which is for core-server
problem reports.

            regards, tom lane



What more can I say ? It's clear that the auto_explain feature, which is a logging process, can make perfectly fine transactions being rolled back. That makes no sense, the transaction should be committed before any logging event is triggered, for me it's a bug in the core of postgres. It has nothing to do with the plugin used.

----
Postgres opens a transaction containing a query that import data. Everything works fine until just before the commit (I guess).
Postgres triggers the auto_explain logging function which has nothing to do with the transaction itself and the integrity of data.
The explain query fails, it is allowed to, that's not a bug of oracle_fdw.
Postgres rollbacks the transaction because of the failure in its own logging process.



Le mar. 14 mars 2023 à 15:10, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
PG Bug reporting form <noreply@postgresql.org> writes:
> I had this problem when importing data from oracle with oracle_fdw and
> auto_explain option set on the postgres server : If the query took too long,
> it triggers the auto_explain wich queries things on oracle side where there
> was a permission problem. The error bubble up outside of logging scope and
> make the transaction rollback after having properly transferred dozen
> million records (that then stood as dead tuples).

> Logging is one thing with its own scope and it should never interfere with
> the behavior of the software. Beside whatever reason triggering the error in
> logging scope, the transaction that had no problem itself should not be
> rollbacked. It makes no sense.

> Isn't there something to refactor in Postgres ?

This bug report is not actionable due to complete lack of specifics.
Please read

https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

Having said that, my guess is that if there's anything to be done
it would need to be done in oracle_fdw.  So you need to take this
up with the oracle_fdw authors, not here which is for core-server
problem reports.

                        regards, tom lane


--
Sébastien Caunes
+33 6 7 229 229 7
On Wed, Mar 15, 2023 at 11:24 AM Sébastien <bokanist@gmail.com> wrote:
What more can I say ? It's clear that the auto_explain feature, which is a logging process, can make perfectly fine transactions being rolled back. That makes no sense, the transaction should be committed before any logging event is triggered, for me it's a bug in the core of postgres. It has nothing to do with the plugin used.


It may be an undesired behavior, and I do sympathize with the position, but the current design is quite intentional and long-standing and is not going to be treated as a bug to be fixed immediately and back-patched.

A query you can execute should be incapable of failing if EXPLAIN for the same query is issued.  If some interplay with oracle_fdw causes this fundamental assumption to be falsified then auto_explain, which works under that assumption, and oracle_fdw are incompatible and you will need to decide which one (possibly on a per-transaction basis) you wish to use.

David J.

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> A query you can execute should be incapable of failing if EXPLAIN for the
> same query is issued.

Indeed.  This should be especially true for auto_explain, which isn't
even doing a re-parse or re-plan, but just dumping data from the
executor state tree for the just-finished query.  Barring edge cases
like out-of-memory, it really shouldn't fail; so I see no reason why
we should consider major structural changes to make it (perhaps) less
likely to fail.

I continue to think that the most likely explanation is oracle_fdw
doing something it probably shouldn't be doing.  I have no interest
in poking into that code myself, though.

            regards, tom lane



Thank you for your answers.  

Now I understand the problem have been understood.

Oracle_fdw author is working on this.

Best regards 

Le mer. 15 mars 2023, 23:06, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> A query you can execute should be incapable of failing if EXPLAIN for the
> same query is issued.

Indeed.  This should be especially true for auto_explain, which isn't
even doing a re-parse or re-plan, but just dumping data from the
executor state tree for the just-finished query.  Barring edge cases
like out-of-memory, it really shouldn't fail; so I see no reason why
we should consider major structural changes to make it (perhaps) less
likely to fail.

I continue to think that the most likely explanation is oracle_fdw
doing something it probably shouldn't be doing.  I have no interest
in poking into that code myself, though.

                        regards, tom lane