Re: logical decoding and replication of sequences - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: logical decoding and replication of sequences |
Date | |
Msg-id | 93af17c3-a78d-4918-5f6b-76dfeb2d48bd@enterprisedb.com Whole thread Raw |
In response to | Re: logical decoding and replication of sequences (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: logical decoding and replication of sequences
|
List | pgsql-hackers |
On 3/7/22 17:53, Tomas Vondra wrote: > On 2/28/22 12:46, Amit Kapila wrote: >> On Sat, Feb 12, 2022 at 6:04 AM Tomas Vondra >> <tomas.vondra@enterprisedb.com> wrote: >>> >>> On 2/10/22 19:17, Tomas Vondra wrote: >>>> I've polished & pushed the first part adding sequence decoding >>>> infrastructure etc. Attached are the two remaining parts. >>>> >>>> I plan to wait a day or two and then push the test_decoding part. The >>>> last part (for built-in replication) will need more work and maybe >>>> rethinking the grammar etc. >>>> >>> >>> I've pushed the second part, adding sequences to test_decoding. >>> >> >> The test_decoding is failing randomly in the last few days. I am not >> completely sure but they might be related to this work. The two of >> these appears to be due to the same reason: >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2022-02-25%2018%3A50%3A09 >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=locust&dt=2022-02-17%2015%3A17%3A07 >> >> TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: >> "reorderbuffer.c", Line: 1173, PID: 35013) >> 0 postgres 0x00593de0 ExceptionalCondition + 160\\0 >> > > This might be related to the issue reported by Amit, i.e. that > sequence_decode does not call ReorderBufferProcessXid(). If this keeps > failing, we'll have to add some extra debug info (logging LSN etc.), at > least temporarily. It'd be valuable to inspect the WAL too. > >> Another: >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2022-02-16%2006%3A21%3A48 >> >> --- /home/nm/farm/xlc32/HEAD/pgsql.build/contrib/test_decoding/expected/rewrite.out >> 2022-02-14 20:19:14.000000000 +0000 >> +++ /home/nm/farm/xlc32/HEAD/pgsql.build/contrib/test_decoding/results/rewrite.out >> 2022-02-16 07:42:18.000000000 +0000 >> @@ -126,6 +126,7 @@ >> table public.replication_example: INSERT: id[integer]:4 >> somedata[integer]:3 text[character varying]:null >> testcolumn1[integer]:null >> table public.replication_example: INSERT: id[integer]:5 >> somedata[integer]:4 text[character varying]:null >> testcolumn1[integer]:2 testcolumn2[integer]:1 >> COMMIT >> + sequence public.replication_example_id_seq: transactional:0 >> last_value: 38 log_cnt: 0 is_called:1 >> BEGIN >> table public.replication_example: INSERT: id[integer]:6 >> somedata[integer]:5 text[character varying]:null >> testcolumn1[integer]:3 testcolumn2[integer]:null >> COMMIT >> @@ -133,7 +134,7 @@ >> table public.replication_example: INSERT: id[integer]:7 >> somedata[integer]:6 text[character varying]:null >> testcolumn1[integer]:4 testcolumn2[integer]:null >> table public.replication_example: INSERT: id[integer]:8 >> somedata[integer]:7 text[character varying]:null >> testcolumn1[integer]:5 testcolumn2[integer]:null >> testcolumn3[integer]:1 >> COMMIT >> - (15 rows) >> + (16 rows) >> > > Interesting. I can think of one reason that might cause this - we log > the first sequence increment after a checkpoint. So if a checkpoint > happens in an unfortunate place, there'll be an extra WAL record. On > slow / busy machines that's quite possible, I guess. > I've tweaked the checkpoint_interval to make checkpoints more aggressive (set it to 1s), and it seems my hunch was correct - it produces failures exactly like this one. The best fix probably is to just disable decoding of sequences in those tests that are not aimed at testing sequence decoding. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: