Thread: Issue with server side statement-level rollback
Hi, We're facing some issue in a new extension we use at LzLabs to emulate server side rollback at statement level in PostgreSQL, see for full detail https://github.com/lzlabs/pg_statement_rollback/ The problem we are encountering is when PostgreSQL is compiled in debug mode with --enable-cassert. At line 1327 of src/backend/tcop/pquery.c the following assert fail: /* * Clear subsidiary contexts to recover temporary memory. */ Assert(portal->portalContext == CurrentMemoryContext); MemoryContextDeleteChildren(portal->portalContext); This extension, although it is a risky implementation, works extremely well when used in a fully controlled environment. It avoid the latency of the extra communication for the RELEASE+SAVEPOINT usually controlled at client side. The client is only responsible to issue the "ROLLBACK TO autosavepoint" when needed. The extension allow a high performances gain for this feature that helps customers using Oracle or DB2 to migrate to PostgreSQL. Actually with the extension the memory context is not CurrentMemoryContext as expected by the assert. (gdb) b pquery.c:1327 Breakpoint 1 at 0x55792fd7a04d: file pquery.c, line 1327. (gdb) c Continuing. Breakpoint 1, PortalRunMulti (portal=portal@entry=0x5579316e3e10, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x557931755ce8, altdest=altdest@entry=0x557931755ce8, qc=qc@entry=0x7ffc4aa1f8a0) at pquery.c:1327 1327 Assert(portal->portalContext == CurrentMemoryContext); (gdb) p portal->sourceText $1 = 0x557931679c80 "INSERT INTO savepoint_test SELECT 1;" (gdb) p MemoryContextStats(portal->portalContext) $2 = void (gdb) The memory context dump output from log: PortalContext: 1024 total in 1 blocks; 704 free (1 chunks); 320 used: <unnamed> Grand total: 1024 bytes in 1 blocks; 704 free (1 chunks); 320 used If I naively remove the assert on pquery.c everything works without any new assert error. As I said I am aware that this is clearly not a standard PostgreSQL use but we have no choice. We are emulating DB2 statement-level rollback behavior and we have chosen to not create a new fork of PostgreSQL and only work with extensions. As there is no hook or API that could allow a perfect server side integration of this feature we have done what is possible to do in the extension. So my question is should we allow such use through an extension and in this case what is the change to PostgreSQL code that could avoid the assert crash? Or perhaps we have missed something in this extension to be able to make the assert happy but I don't think so. Cheers -- Gilles Darold
Hi, On 2020-11-12 11:40:22 +0100, Gilles Darold wrote: > The problem we are encountering is when PostgreSQL is compiled in debug > mode with --enable-cassert. At line 1327 of src/backend/tcop/pquery.c > the following assert fail: > > /* > * Clear subsidiary contexts to recover temporary memory. > */ > Assert(portal->portalContext == CurrentMemoryContext); > > MemoryContextDeleteChildren(portal->portalContext); > > This extension, although it is a risky implementation, works extremely > well when used in a fully controlled environment. It avoid the latency > of the extra communication for the RELEASE+SAVEPOINT usually controlled at > client side. The client is only responsible to issue the "ROLLBACK TO > autosavepoint" > when needed. The extension allow a high performances gain for this feature > that helps customers using Oracle or DB2 to migrate to PostgreSQL. > > > Actually with the extension the memory context is not CurrentMemoryContext > as expected by the assert. What is it instead? I don't think you really can safely be in a different context at this point. There's risks of CurrentMemoryContext pointing to a deleted context, and risks of memory leaks, depending on the situation. > As there is no hook or API that could allow a perfect server side > integration of this feature we have done what is possible to do in the > extension. > So my question is should we allow such use through an extension and in > this case what is the change to PostgreSQL code that could avoid the > assert crash? Or perhaps we have missed something in this extension to > be able to make the assert happy but I don't think so. Without more detail of what you actually are precisely doing, and what the hooks / integration you'd like would look like, it's hard to comment usefully here. Greetings, Andres Freund
Hi, Le 19/11/2020 à 21:43, Andres Freund a écrit : > Hi, > > On 2020-11-12 11:40:22 +0100, Gilles Darold wrote: >> The problem we are encountering is when PostgreSQL is compiled in debug >> mode with --enable-cassert. At line 1327 of src/backend/tcop/pquery.c >> the following assert fail: >> >> /* >> * Clear subsidiary contexts to recover temporary memory. >> */ >> Assert(portal->portalContext == CurrentMemoryContext); >> >> MemoryContextDeleteChildren(portal->portalContext); >> >> This extension, although it is a risky implementation, works extremely >> well when used in a fully controlled environment. It avoid the latency >> of the extra communication for the RELEASE+SAVEPOINT usually controlled at >> client side. The client is only responsible to issue the "ROLLBACK TO >> autosavepoint" >> when needed. The extension allow a high performances gain for this feature >> that helps customers using Oracle or DB2 to migrate to PostgreSQL. >> >> >> Actually with the extension the memory context is not CurrentMemoryContext >> as expected by the assert. > What is it instead? I don't think you really can safely be in a > different context at this point. There's risks of CurrentMemoryContext > pointing to a deleted context, and risks of memory leaks, depending on > the situation. This is a PortalContext. Yes this implementation has some risks but until now I have not met any problem because its use and the environment are fully controlled. > >> So my question is should we allow such use through an extension and in >> this case what is the change to PostgreSQL code that could avoid the >> assert crash? Or perhaps we have missed something in this extension to >> be able to make the assert happy but I don't think so. > Without more detail of what you actually are precisely doing, and what > the hooks / integration you'd like would look like, it's hard to comment > usefully here. We have implemented an extension to allow server side "statement-level rollback" with what is possible to do now with PG but the objective was to do the same thing that what was proposed as a core patch submitted by Takayuki Tsunakawa [1] . This patch will not be included into core and what I'm trying to do now is to have some facilities to allow this feature through an extension that does not suffer from the same limitation of pg_statement_rollback. Looking that this patch for example, if we have a hook on finish_xact_command(), finish_xact_command() and AbortCurrentTransaction() I think we could probably be able to implement the feature through an extension in a more "safe" way. A hook on start_xact_command() seems useless as it looks it is executed before the UtilityProcess and Executor* hooks. See attached patch for an example of what could be useful for this kind of extension. Unfortunately my knowledge doesn't allow me to see further and especially if there is drawbacks. I hope this is more clear, I will work later on a POC to demonstrate the use case I want to implement. [1] https://www.postgresql.org/message-id/flat/0A3221C70F24FB45833433255569204D1F6A9286%40G01JPEXMBYT05 -- Gilles Darold http://www.darold.net/
Attachment
I will work later on a POC to demonstrate the use case I want to implement.
Hi Andres,
I have created a new version of the pg_statement_rollback extension [1] to demonstrate the use of the hooks on start_xact_command(), finish_xact_command() and AbortCurrentTransaction() to implement the statement-level rollback feature entirely driven at serverside. It require that the patch [2] I've provided be applied on PostgreSQL source first.
Here is what can be achieved with this patch:
LOAD 'pg_statement_rollback.so';
LOAD
SET pg_statement_rollback.enabled TO on;
SET
CREATE SCHEMA testrsl;
CREATE SCHEMA
SET search_path TO testrsl,public;
SET
BEGIN;
BEGIN
CREATE TABLE tbl_rsl(id integer, val varchar(256));
CREATE TABLE
INSERT INTO tbl_rsl VALUES (1, 'one');
INSERT 0 1
WITH write AS (INSERT INTO tbl_rsl VALUES (2, 'two') RETURNING id, val) SELECT * FROM write;
id | val
----+-----
2 | two
(1 row)
UPDATE tbl_rsl SET id = 'two', val = 2 WHERE id = 1; -- >>>>> will fail
psql:simple.sql:14: ERROR: invalid input syntax for type integer: "two"
LINE 1: UPDATE tbl_rsl SET id = 'two', val = 2 WHERE id = 1;
^
SELECT * FROM tbl_rsl; -- Should show records id 1 + 2
id | val
----+-----
1 | one
2 | two
(2 rows)
COMMIT;
COMMIT
Actually unlike I've though this is the hook on finish_xact_command() that is useless. In the extension I'm executing the RELEASE/SAVEPOINT in the start_xact_command() hook before executing the next statement. The hook on AbortCurrentTransaction() is used to signal that a ROLLOBACK TO/SAVEPOINT need to be executed into the start_xact_command() hook instead of a RELEASE/SAVEPOINT.
This works perfectly and do not crash PG anymore when compiled with assert. Advanced tests (with triggers, client savepoint, CTE, etc.) are available in the test/sql/ directory. Use of "make installcheck" allow to run the regression tests.
Based on this result I really think that these hooks should be included to be able to extend PostgreSQL for such feature although I have not though about an other use that this one.
Regards,
I've attached all code for archiving but the current version can be found here too:
--
Gilles Darold http://www.darold.net/