Re: ERROR: subtransaction logged without previous top-level txn record - Mailing list pgsql-bugs
From | Arseny Sher |
---|---|
Subject | Re: ERROR: subtransaction logged without previous top-level txn record |
Date | |
Msg-id | 87tv3x72hh.fsf@ars-thinkpad Whole thread Raw |
In response to | Re: ERROR: subtransaction logged without previous top-level txn record (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: ERROR: subtransaction logged without previous top-level txn record
|
List | pgsql-bugs |
Amit Kapila <amit.kapila16@gmail.com> writes: >> That's weird, it reliably fails with expected error for me. There are >> already two s2_checkpoint's: first establishes potential (broken) >> restart_lsn (serializes snapshot after first xl_xact_assignment of s0 >> xact, but before first record of s1 xact), the second ensures >> s2_get_changes directly following it will actually advance the slot, >> > > In my case, s2_get_changes doesn't seem to be advancing the restart > lsn because when it processed running_xact by s2_checkpoint, the slots > confirm flush location (slot->data.confirmed_flush) was behind it. As > confirmed_flush was behind running_xact of s2_checkpoint, it couldn't > update slot->candidate_restart_lsn (in function > LogicalIncreaseRestartDecodingForSlot). I think the confirmed_flush > location will only be updated at the end of get_changes. This is the > reason I need extra get_changes call to generate an error. > > I will think and investigate this more, but thought of sharing the > current situation with you. There is something different going on in > my system or maybe the nature of test is like that. Ah, I think I know what's happening -- you have one more xl_running_xacts which catches the advancement -- similar issue is explained in the comment in oldest_xmin.spec. Try attached. From bd9e50d919cc07ec30f06c443d142f5f8510887c Mon Sep 17 00:00:00 2001 From: Arseny Sher <sher-ars@yandex.ru> Date: Wed, 23 Oct 2019 15:56:46 +0300 Subject: [PATCH 2/2] Stop demanding that top xact must be seen before subxact in decoding. Manifested as ERROR: subtransaction logged without previous top-level txn record this check forbids legit behaviours like - First xl_xact_assignment record is beyond reading, i.e. earlier restart_lsn. - After restart_lsn there is some change of a subxact. - After that, there is second xl_xact_assignment (for another subxact) revealing relationship between top and first subxact. Such transaction won't be streamed anyway because we hadn't seen it in full; confirmed_flush_lsn must be past all these records. Saying for sure whether xact of some record encountered after snapshot was deserialized can be streamed or not requires to know whether it wrote something before deserialization point -- if yes, it hasn't been seen in full and can't be decoded. Snapshot doesn't have such info, so there is no easy way to relax the check. --- contrib/test_decoding/Makefile | 2 +- .../test_decoding/expected/subxact_without_top.out | 39 +++++++++++++ .../test_decoding/specs/subxact_without_top.spec | 67 ++++++++++++++++++++++ src/backend/replication/logical/reorderbuffer.c | 3 - 4 files changed, 107 insertions(+), 4 deletions(-) create mode 100644 contrib/test_decoding/expected/subxact_without_top.out create mode 100644 contrib/test_decoding/specs/subxact_without_top.spec diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile index 4afb1d963e..f439c582a5 100644 --- a/contrib/test_decoding/Makefile +++ b/contrib/test_decoding/Makefile @@ -7,7 +7,7 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \ decoding_into_rel binary prepared replorigin time messages \ spill slot truncate ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \ - oldest_xmin snapshot_transfer + oldest_xmin snapshot_transfer subxact_without_top REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf diff --git a/contrib/test_decoding/expected/subxact_without_top.out b/contrib/test_decoding/expected/subxact_without_top.out new file mode 100644 index 0000000000..99ce998822 --- /dev/null +++ b/contrib/test_decoding/expected/subxact_without_top.out @@ -0,0 +1,39 @@ +Parsed test spec with 3 sessions + +starting permutation: s0_begin s0_first_subxact s2_checkpoint s1_begin s1_dml s0_many_subxacts s0_commit s2_checkpoint s2_get_changes_suppress_outputs2_get_changes_suppress_output s1_commit s2_get_changes +step s0_begin: BEGIN; +step s0_first_subxact: + DO LANGUAGE plpgsql $$ + BEGIN + BEGIN + INSERT INTO harvest VALUES (41); + EXCEPTION WHEN OTHERS THEN RAISE; + END; + END $$; + +step s2_checkpoint: CHECKPOINT; +step s1_begin: BEGIN; +step s1_dml: INSERT INTO harvest VALUES (43); +step s0_many_subxacts: select subxacts(); +subxacts + + +step s0_commit: COMMIT; +step s2_checkpoint: CHECKPOINT; +step s2_get_changes_suppress_output: SELECT null n FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids','0', 'skip-empty-xacts', '1') GROUP BY n; +n + + +step s2_get_changes_suppress_output: SELECT null n FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids','0', 'skip-empty-xacts', '1') GROUP BY n; +n + +step s1_commit: COMMIT; +step s2_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts','1'); +data + +BEGIN +table public.harvest: INSERT: apples[integer]:43 +COMMIT +?column? + +stop diff --git a/contrib/test_decoding/specs/subxact_without_top.spec b/contrib/test_decoding/specs/subxact_without_top.spec new file mode 100644 index 0000000000..6e9984da62 --- /dev/null +++ b/contrib/test_decoding/specs/subxact_without_top.spec @@ -0,0 +1,67 @@ +# Forces decoding session to see (and associate with toplevel before commit) +# subxact with previous top records (first xl_xact_assignment) beyond +# restart_lsn. Such partially seen xact won't be streamed as it must finish +# before confirmed_flush_lsn, but nevertheless this harmless schedule used to +# cause +# ERROR: subtransaction logged without previous top-level txn record +# failures. + +setup +{ + SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); -- must be first write inxact + CREATE TABLE harvest(apples integer); + CREATE OR REPLACE FUNCTION subxacts() returns void as $$ + BEGIN + FOR i in 1 .. 128 LOOP + BEGIN + INSERT INTO harvest VALUES (42); + EXCEPTION + WHEN OTHERS THEN + RAISE; + END; + END LOOP; + END; $$LANGUAGE 'plpgsql'; +} + +teardown +{ + DROP TABLE IF EXISTS harvest; + SELECT 'stop' FROM pg_drop_replication_slot('isolation_slot'); +} + +session "s0" +setup { SET synchronous_commit=on; } +step "s0_begin" { BEGIN; } +step "s0_first_subxact" { + DO LANGUAGE plpgsql $$ + BEGIN + BEGIN + INSERT INTO harvest VALUES (41); + EXCEPTION WHEN OTHERS THEN RAISE; + END; + END $$; +} +step "s0_many_subxacts" { select subxacts(); } +step "s0_commit" { COMMIT; } + +session "s1" +setup { SET synchronous_commit=on; } +step "s1_begin" { BEGIN; } +step "s1_dml" { INSERT INTO harvest VALUES (43); } +step "s1_commit" { COMMIT; } + +session "s2" +setup { SET synchronous_commit=on; } +step "s2_checkpoint" { CHECKPOINT; } +step "s2_get_changes" { SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0','skip-empty-xacts', '1'); } +step "s2_get_changes_suppress_output" { SELECT null n FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids','0', 'skip-empty-xacts', '1') GROUP BY n; } + +# First s2_checkpoint serializes snapshot (creates potential restart_lsn point) +# *after* initial subxact, i.e. first xl_xact_assignment is beyond it. +# s0_many_subxacts forces second xl_xact_assignment (issues > +# PGPROC_MAX_CACHED_SUBXIDS) to associate subxact with toplevel before commit. +# Second s2_checkpoint ensures s2_get_changes directly following it will +# advance the slot, establishing that restart_lsn. +# We do get_changes twice because if one more xl_running_xacts record had slipped +# before our CHECKPOINT, confirmed_flush will be advanced only to that record. +permutation "s0_begin" "s0_first_subxact" "s2_checkpoint" "s1_begin" "s1_dml" "s0_many_subxacts" "s0_commit" "s2_checkpoint""s2_get_changes_suppress_output" "s2_get_changes_suppress_output" "s1_commit" "s2_get_changes" diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index aeebbf243a..481277a1fd 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -838,9 +838,6 @@ ReorderBufferAssignChild(ReorderBuffer *rb, TransactionId xid, txn = ReorderBufferTXNByXid(rb, xid, true, &new_top, lsn, true); subtxn = ReorderBufferTXNByXid(rb, subxid, true, &new_sub, lsn, false); - if (new_top && !new_sub) - elog(ERROR, "subtransaction logged without previous top-level txn record"); - if (!new_sub) { if (rbtxn_is_known_subxact(subtxn)) -- 2.11.0 -- cheers, arseny
pgsql-bugs by date: