Re: logical decoding of two-phase transactions - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: logical decoding of two-phase transactions |
Date | |
Msg-id | CAD21AoDDC7USpUyUim=7BwuFByauRGfLzkE7hhiV=jjDckc+KA@mail.gmail.com Whole thread Raw |
In response to | Re: logical decoding of two-phase transactions (Stas Kelvich <s.kelvich@postgrespro.ru>) |
List | pgsql-hackers |
On Thu, Mar 30, 2017 at 12:55 AM, Stas Kelvich <s.kelvich@postgrespro.ru> wrote: > >> On 28 Mar 2017, at 18:08, Andres Freund <andres@anarazel.de> wrote: >> >> On 2017-03-28 15:55:15 +0100, Simon Riggs wrote: >>> >>> >>> That assertion is obviously false... the plugin can resolve this in >>> various ways, if we allow it. >> >> Handling it by breaking replication isn't handling it (e.g. timeouts in >> decoding etc). Handling it by rolling back *prepared* transactions >> (which are supposed to be guaranteed to succeed!), isn't either. >> >> >>> You can say that in your opinion you prefer to see this handled in >>> some higher level way, though it would be good to hear why and how. >> >> It's pretty obvious why: A bit of DDL by the user shouldn't lead to the >> issues mentioned above. >> >> >>> Bottom line here is we shouldn't reject this patch on this point, >> >> I think it definitely has to be rejected because of that. And I didn't >> bring this up at the last minute, I repeatedly brought it up before. >> Both to Craig and Stas. > > Okay. In order to find more realistic cases that blocks replication > i’ve created following setup: > > * in backend: tests_decoding plugins hooks on xact events and utility > statement hooks and transform each commit into prepare, then sleeps > on latch. If transaction contains DDL that whole statement pushed in > wal as transactional message. If DDL can not be prepared or disallows > execution in transaction block than it goes as nontransactional logical > message without prepare/decode injection. If transaction didn’t issued any > DDL and didn’t write anything to wal, then it skips 2pc too. > > * after prepare is decoded, output plugin in walsender unlocks backend > allowing to proceed with commit prepared. So in case when decoding > tries to access blocked catalog everything should stop. > > * small python script that consumes decoded wal from walsender (thanks > Craig and Petr) > > After small acrobatics with that hooks I’ve managed to run whole > regression suite in parallel mode through such setup of test_decoding > without any deadlocks. I’ve added two xact_events to postgres and > allowedn prepare of transactions that touched temp tables since > they are heavily used in tests and creates a lot of noise in diffs. > > So it boils down to 3 failed regression tests out of 177, namely: > > * transactions.sql — here commit of tx stucks with obtaining SafeSnapshot(). > I didn’t look what is happening there specifically, but just checked that > walsender isn’t blocked. I’m going to look more closely at this. > > * prepared_xacts.sql — here select prepared_xacts() sees our prepared > tx. It is possible to filter them out, but obviously it works as expected. > > * guc.sql — here pendingActions arrives on 'DISCARD ALL’ preventing tx > from being prepared. I didn’t found the way to check presence of > pendingActions outside of async.c so decided to leave it as is. > > It seems that at least in regression tests nothing can block twophase > logical decoding. Is that strong enough argument to hypothesis that current > approach doesn’t creates deadlock except locks on catalog which should be > disallowed anyway? > > Patches attached. logical_twophase_v5 is slightly modified version of previous > patch merged with Craig’s changes. Second file is set of patches over previous > one, that implements logic i’ve just described. There is runtest.sh script that > setups postgres, runs python logical consumer in background and starts > regression test. > > I reviewed this patch but when I tried to build contrib/test_decoding I got the following error. $ make gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -g -fpic -I. -I. -I../../src/include -D_GNU_SOURCE -c -o test_decoding.o test_decoding.c -MMD -MP -MF .deps/test_decoding.Po test_decoding.c: In function '_PG_init': test_decoding.c:126: warning: assignment from incompatible pointer type test_decoding.c: In function 'test_decoding_process_utility': test_decoding.c:271: warning: passing argument 5 of 'PreviousProcessUtilityHook' from incompatible pointer type test_decoding.c:271: note: expected 'struct QueryEnvironment *' but argument is of type 'struct DestReceiver *' test_decoding.c:271: warning: passing argument 6 of 'PreviousProcessUtilityHook' from incompatible pointer type test_decoding.c:271: note: expected 'struct DestReceiver *' but argument is of type 'char *' test_decoding.c:271: error: too few arguments to function 'PreviousProcessUtilityHook' test_decoding.c:276: warning: passing argument 5 of 'standard_ProcessUtility' from incompatible pointer type ../../src/include/tcop/utility.h:38: note: expected 'struct QueryEnvironment *' but argument is of type 'struct DestReceiver *' test_decoding.c:276: warning: passing argument 6 of 'standard_ProcessUtility' from incompatible pointer type ../../src/include/tcop/utility.h:38: note: expected 'struct DestReceiver *' but argument is of type 'char *' test_decoding.c:276: error: too few arguments to function 'standard_ProcessUtility' test_decoding.c: At top level: test_decoding.c:285: warning: 'test_decoding_twophase_commit' was used with no prototype before its definition make: *** [test_decoding.o] Error 1 --- After applied both patches the regression test 'make check' failed. I think you should update expected/transactions.out file as well. $ cat src/test/regress/regression.diffs *** /home/masahiko/pgsql/source/postgresql/src/test/regress/expected/transactions.out Mon May 2 09:16:02 2016 --- /home/masahiko/pgsql/source/postgresql/src/test/regress/results/transactions.out Tue Apr 4 09:52:44 2017 *************** *** 43,58 **** -- Read-only tests CREATE TABLE writetest (a int); CREATE TEMPORARY TABLE temptest (a int); ! BEGIN; ! SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE; -- ok ! SELECT * FROM writetest; -- ok ! a ! --- ! (0 rows) ! ! SET TRANSACTION READ WRITE; --fail ! ERROR: transaction read-write mode must be set before any query ! COMMIT; BEGIN; SET TRANSACTION READ ONLY; -- ok SET TRANSACTION READ WRITE; -- ok --- 43,53 ---- -- Read-only tests CREATE TABLE writetest (a int); CREATE TEMPORARY TABLE temptest (a int); ! -- BEGIN; ! -- SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE; -- ok ! -- SELECT * FROM writetest; -- ok ! -- SET TRANSACTION READ WRITE; --fail ! -- COMMIT; BEGIN; SET TRANSACTION READ ONLY; -- ok SET TRANSACTION READ WRITE; -- ok ====================================================================== There are still some unnecessary code in v5 patch. --- +/* PREPARE callback */ +static void +pg_decode_prepare_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn, + XLogRecPtr prepare_lsn) +{ + TestDecodingData *data = ctx->output_plugin_private; + int backend_procno; + + // if (data->skip_empty_xacts && !data->xact_wrote_changes) + // return; + + OutputPluginPrepareWrite(ctx, true); + Could you please update these patches? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
pgsql-hackers by date: