Re: Add two missing tests in 035_standby_logical_decoding.pl - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Add two missing tests in 035_standby_logical_decoding.pl |
Date | |
Msg-id | CAA4eK1+-uuH+sC42_=wA=xPyzCN+U3VNeW_d-BRGvt-pGkNT+Q@mail.gmail.com Whole thread Raw |
In response to | Re: Add two missing tests in 035_standby_logical_decoding.pl ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>) |
Responses |
Re: Add two missing tests in 035_standby_logical_decoding.pl
|
List | pgsql-hackers |
On Fri, May 5, 2023 at 7:53 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: > > > On 5/5/23 2:28 PM, Amit Kapila wrote: > > On Fri, May 5, 2023 at 5:36 PM Drouvot, Bertrand > > <bertranddrouvot.pg@gmail.com> wrote: > > > > It seems due to some reason the current wal file is not switched due > > to some reason. > > Oh wait, here is a NON failing one: https://cirrus-ci.com/task/5086849685782528 (I modified the > .cirrus.yml so that we can download the "testrun.zip" file even if the test is not failing). > > So, in this testrun.zip we can see, that the test is ok: > > $ grep -i retain ./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding > [10:06:08.789](0.000s) ok 19 - invalidated logical slots do not lead to retaining WAL > > and that the WAL file we expect to be removed is: > > $ grep "WAL file is" ./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding > [10:06:08.789](0.925s) # BDT WAL file is /Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/000000010000000000000003 > > This WAL file has been removed by the standby: > > $ grep -i 000000010000000000000003 ./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_standby.log| grep -i recy > 2023-05-05 10:06:08.787 UTC [17521][checkpointer] DEBUG: 00000: recycled write-ahead log file "000000010000000000000003" > > But on the primary, it has been recycled way after that time: > > $ grep -i 000000010000000000000003 ./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_primary.log| grep -i recy > 2023-05-05 10:06:13.370 UTC [16785][checkpointer] DEBUG: 00000: recycled write-ahead log file "000000010000000000000003" > > As, the checkpoint on the primary after the WAL file switch only recycled (001 and 002): > > $ grep -i recycled ./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_primary.log > 2023-05-05 10:05:57.196 UTC [16785][checkpointer] LOG: 00000: checkpoint complete: wrote 4 buffers (3.1%); 0 WAL file(s)added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.027 s; sync files=0, longest=0.000 s, average=0.000s; distance=11219 kB, estimate=11219 kB; lsn=0/2000060, redo lsn=0/2000028 > 2023-05-05 10:06:08.138 UTC [16785][checkpointer] DEBUG: 00000: recycled write-ahead log file "000000010000000000000001" > 2023-05-05 10:06:08.138 UTC [16785][checkpointer] DEBUG: 00000: recycled write-ahead log file "000000010000000000000002" > 2023-05-05 10:06:08.138 UTC [16785][checkpointer] LOG: 00000: checkpoint complete: wrote 20 buffers (15.6%); 0 WAL file(s)added, 0 removed, 2 recycled; write=0.001 s, sync=0.001 s, total=0.003 s; sync files=0, longest=0.000 s, average=0.000s; distance=32768 kB, estimate=32768 kB; lsn=0/40000D0, redo lsn=0/4000098 > > > So, even on a successful test, we can see that the WAL file we expect to be removed on the standby has not been recycledon the primary before the test. > Okay, one possibility of not removing on primary is that at the time of checkpoint (when we compute RedoRecPtr), the wal_swtich and insert is not yet performed because in that case it will compute the RedoRecPtr as a location before those operations which would be 0000*3 file. However, it is not clear how is that possible except from a background checkpoint happening at that point but from LOGs, it appears that the checkpoint triggered by test has recycled the wal files. > > I think we need to add more DEBUG info to find that > > out. Can you please try to print 'RedoRecPtr', '_logSegNo', and > > recptr? > > > > Yes, will do. > Okay, thanks, please try to print similar locations on standby in CreateRestartPoint(). -- With Regards, Amit Kapila.
pgsql-hackers by date: