Re: [PATCH] Hooks at XactCommand level - Mailing list pgsql-hackers

From Gilles Darold
Subject Re: [PATCH] Hooks at XactCommand level
Date
Msg-id 9875dd5b-ce95-3348-468d-cd4cfadcddaf@darold.net
Whole thread Raw
In response to Re: [PATCH] Hooks at XactCommand level  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCH] Hooks at XactCommand level  (Gilles Darold <gilles@darold.net>)
List pgsql-hackers
Le 14/07/2021 à 21:26, Tom Lane a écrit :
Gilles Darold <gilles@darold.net> writes:
I have renamed the patch and the title of this proposal registered in 
the commitfest "Xact/SubXact event callback at command start" to reflect 
the last changes that do not include new hooks anymore.
Hmm, it doesn't seem like this has addressed my concern at all.
The callbacks are still effectively defined as "run during
start_xact_command", so they're not any less squishy semantically
than they were before.  The point of my criticism was that you
should move the call site to someplace that's more organically
connected to execution of commands.

I would like to move it closer to the command execution but the only place I see would be in BeginCommand() which actually is waiting for code to execute, for the moment this function does nothing. I don't see another possible place after start_xact_command() call. All my attempt to inject the callback after start_xact_command() result in a failure. If you see an other place I will be please to give it a test.


Another thing I'm not too pleased with in this formulation is that it's
very unclear what the distinction is between XACT_EVENT_COMMAND_START
and SUBXACT_EVENT_COMMAND_START.  AFAICS, *every* potential use-case
for this would have to hook into both callback chains, and most likely
would treat the two events alike.  Plus, as you note, the goalposts
have suddenly been moved for the amount of overhead it's okay to have
in an XactCallback or SubXactCallback function.  So that might cause
problems for unrelated code.  It's probably better to not try to
re-use that infrastructure.

Actually XACT_EVENT_COMMAND_START occurs only after the call BEGIN, when a transaction starts, whereas SUBXACT_EVENT_COMMAND_START occurs in all subsequent statement execution of this transaction. This helps to perform different actions following the event. In the example extension only SUBXACT_EVENT_COMMAND_START is used but for example I could use event XACT_EVENT_COMMAND_START to not send a RELEASE savepoint as there is none. I detect this case differently but this could be an improvement in the extension.



<digression>

The objective of this new callback is to be able to call user-defined 
code before any new statement is executed. For example it can call a 
rollback to savepoint if there was an error in the previous transaction 
statement, which allow to implements Rollback at Statement Level at 
server side using a PostgreSQL extension, see [1] .
Urgh.  Isn't this re-making the same mistake we made years ago, namely
trying to implement autocommit on the server side?  I fear this will
be a disaster even larger than that was, because if it's an extension
then pretty much no applications will be prepared for the new semantics.
I strongly urge you to read the discussions that led up to f85f43dfb,
and to search the commit history before that for mentions of
"autocommit", to see just how extensive the mess was.

I spent a little time trying to locate said discussions; it's harder
than it should be because we didn't have the practice of citing email
threads in the commit log at the time.  I did find

https://www.postgresql.org/message-id/flat/Pine.LNX.4.44.0303172059170.1975-100000%40peter.localdomain#7ae26ed5c1bfbf9b22a420dfd8b8e69f

which seems to have been the proximate decision, and here are
a few threads talking about all the messes that were created
for JDBC etc:

https://www.postgresql.org/message-id/flat/3D793A93.7030000%40xythos.com#4a2e2d9bdf2967906a6e0a75815d6636
https://www.postgresql.org/message-id/flat/3383060E-272E-11D7-BA14-000502E740BA%40wellsgaming.com
https://www.postgresql.org/message-id/flat/Law14-F37PIje6n0ssr00000bc1%40hotmail.com

Basically, changing transactional semantics underneath clients is
a horrid idea.  Having such behavior in an extension rather than
the core doesn't make it less horrid.  If we'd designed it to act
that way from day one, maybe it'd have been fine.  But as things
stand, we are quite locked into the position that this has to be
managed on the client side.


Yes I have suffered of this implementation for server side autocommit, it was reverted in PG 7.4 if I remember well. I'm old enough to remember that :-). I'm also against restoring this feature inside PG core but the fact that the subject comes again almost every 2 years mean that there is a need on this feature. This is why I'm proposing to be able to use an extension for those who really need the feature, with all the associated warning.


For example in my case the first time I was needing this feature was to emulate the behavior of DB2 that allows rollback at statement level. This is not exactly autocommit because the transaction still need to be validated or rolledback at end, this is just that an error will not invalidate the full transaction but just the failing statement. I think that this is different. Actually I have an extension that is doing that for most of the work but we still have to send the ROLLBACK TO savepoint at client side which is really a performances killer and especially painful to implement with JDBC Exception blocks.


Recently I was working on an Oracle to PostgreSQL migration and want to implement an other Oracle feature like that is heavily used when importing data from different sources into a data warehouse. It's very common in the Oracle world to batch data import inside a transaction and log silently the errors into a dedicated table to be processed later. "Whatever" (this concern only certain errors) happens you continue to import the data and DBAs will check what to fix and will re-import the records in error. Again, I have an extension that is doing that but we still have to generate the ROLLBACK TO at client side. This can be avoided with this proposal and will greatly simplify the code at client side.


We all know the problems of such server side implementation but once you have implemented it at client side and you are looking for better performances it's obvious that this kind of extension could help. The other solution is to move to a proprietary PostgreSQL fork which is surely not what we want.

 

</digression>

That point doesn't necessarily invalidate the value of having
some sort of hook in this general area.  But I would kind of like
to see another use-case, because I don't believe in this one.


I have sited two use-case, they are both based on the rollback at statement level feature. I'm pretty sure that there is several other use-cases that escape my poor imagination. IMHO the possibility to offer the rollback at statement level feature through an extension should be enough but if anyone have other use-case I will be pleased to create an extension to test it :-)


-- 
Gilles Darold
http://www.darold.net/

pgsql-hackers by date:

Previous
From: Denis Hirn
Date:
Subject: Re: [PATCH] Allow multiple recursive self-references
Next
From: gkokolatos@pm.me
Date:
Subject: Re: Introduce pg_receivewal gzip compression tests