Thread: Re: Add a perl function in Cluster.pm to generate WAL
On Wed, Aug 24, 2022 at 6:42 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Tue, 16 Aug 2022 18:40:49 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in
> > On 2022-Aug-16, Andrew Dunstan wrote:
> >
> > > I don't think there's a hard and fast rule about it. Certainly the case
> > > would be more compelling if the functions were used across different TAP
> > > suites. The SSL suite has suite-specific modules. That's a pattern also
> > > worth considering. e.g something like.
> > >
> > > use FindBin qw($Bin);
> > > use lib $Bin;
> > > use MySuite;
> > >
> > > and then you put your common routines in MySuite.pm in the same
> > > directory as the TAP test files.
> >
> > Yeah, I agree with that for advance_wal. Regarding find_in_log, that
> > one seems general enough to warrant being in Cluster.pm -- consider
> > issues_sql_like, which also slurps_file($log). That could be unified a
> > little bit, I think.
>
> +1

Now that the generalized function for find_in_log() has been added as part of https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e25e5f7fc6b74c9d4ce82627e9145ef5537412e2, I'm proposing a generalized function for advance_wal(). Please find the attached patch.

I tried to replace the existing tests with the new cluster function advance_wal(). Please let me know if I'm missing any other tests. Also, this new function can be used by an in-progress feature - https://commitfest.postgresql.org/43/3663/.

Thoughts? FWIW, it's discussed here - https://www.postgresql.org/message-id/ZIKVd%2Ba43UfsIWJE%40paquier.xyz.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Mmm. It seems like the email I thought I'd sent failed to go out.

At Sun, 11 Jun 2023 07:14:54 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> On Wed, Aug 24, 2022 at 6:42 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > At Tue, 16 Aug 2022 18:40:49 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in
> > > On 2022-Aug-16, Andrew Dunstan wrote:
> > > Yeah, I agree with that for advance_wal. Regarding find_in_log, that
> > > one seems general enough to warrant being in Cluster.pm -- consider
> > > issues_sql_like, which also slurps_file($log). That could be unified a
> > > little bit, I think.
> >
> > +1
>
> Now that the generalized function for find_in_log() has been added as part
> of https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e25e5f7fc6b74c9d4ce82627e9145ef5537412e2,
> I'm proposing a generalized function for advance_wal(). Please find
> the attached patch.
>
> I tried to replace the existing tests with the new cluster function
> advance_wal(). Please let me know if I'm missing any other tests.
> Also, this new function can be used by an in-progress feature -
> https://commitfest.postgresql.org/43/3663/.
>
> Thoughts?

Thanks!

+ "CREATE TABLE tt (); DROP TABLE tt; SELECT pg_switch_wal();");

At least since 11, we can utilize pg_logical_emit_message() for this purpose. It's more lightweight and seems appropriate, not only because it doesn't cause any side effects but also because we don't have to worry about name conflicts.

- SELECT 'finished';",
- timeout => $PostgreSQL::Test::Utils::timeout_default));
-is($result[1], 'finished', 'check if checkpoint command is not blocked');
-
+$node_primary2->advance_wal(1);
+$node_primary2->safe_psql('postgres', 'CHECKPOINT;');

This test anticipates that the checkpoint could get blocked. Shouldn't we keep the timeout?

-$node_primary->safe_psql(
- 'postgres', "create table retain_test(a int);
- select pg_switch_wal();
- insert into retain_test values(1);
- checkpoint;");
+$node_primary->advance_wal(1);
+$node_primary->safe_psql('postgres', "checkpoint;");

The original test generated some WAL after the segment switch, which appears to be a significant characteristic of the test.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Jun 15, 2023 at 01:40:15PM +0900, Kyotaro Horiguchi wrote:
> + "CREATE TABLE tt (); DROP TABLE tt; SELECT pg_switch_wal();");
>
> At least since 11, we can utilize pg_logical_emit_message() for this
> purpose. It's more lightweight and seems appropriate, not only because
> it doesn't cause any side effects but also because we don't have to
> worry about name conflicts.

Making this as cheap as possible by design is a good concept for a common routine. +1.

> - SELECT 'finished';",
> - timeout => $PostgreSQL::Test::Utils::timeout_default));
> -is($result[1], 'finished', 'check if checkpoint command is not blocked');
> -
> +$node_primary2->advance_wal(1);
> +$node_primary2->safe_psql('postgres', 'CHECKPOINT;');
>
> This test anticipates that the checkpoint could get blocked. Shouldn't
> we keep the timeout?

Indeed, this would partially invalidate what's getting tested in light of 1816a1c6, where we run a secondary command after the checkpoint. So the last SELECT should remain around.

> -$node_primary->safe_psql(
> - 'postgres', "create table retain_test(a int);
> - select pg_switch_wal();
> - insert into retain_test values(1);
> - checkpoint;");
> +$node_primary->advance_wal(1);
> +$node_primary->safe_psql('postgres', "checkpoint;");
>
> The original test generated some WAL after the segment switch, which
> appears to be a significant characteristic of the test.

Still, does it not matter for this specific case? The logical slot has already been invalidated, so we don't care much about logical changes in WAL, do we?
--
Michael
Thanks for the comments.

At Fri, 16 Jun 2023 11:30:15 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> > -$node_primary->safe_psql(
> > - 'postgres', "create table retain_test(a int);
> > - select pg_switch_wal();
> > - insert into retain_test values(1);
> > - checkpoint;");
> > +$node_primary->advance_wal(1);
> > +$node_primary->safe_psql('postgres', "checkpoint;");
> >
> > The original test generated some WAL after the segment switch, which
> > appears to be a significant characteristic of the test.
>
> Still, does it not matter for this specific case? The logical slot has
> already been invalidated, so we don't care much about logical changes
> in WAL, do we?

The change itself doesn't seem to matter, but it seems intended to let the checkpoint trigger the removal of the last segment. However, I'm unsure how the insert would influence this that way. If my understanding is correct, then I'd support its removal.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Jun 16, 2023 at 8:00 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Jun 15, 2023 at 01:40:15PM +0900, Kyotaro Horiguchi wrote:
> > + "CREATE TABLE tt (); DROP TABLE tt; SELECT pg_switch_wal();");
> >
> > At least since 11, we can utilize pg_logical_emit_message() for this
> > purpose. It's more lightweight and seems appropriate, not only because
> > it doesn't cause any side effects but also because we don't have to
> > worry about name conflicts.
>
> Making this as cheap as possible by design is a good concept for a
> common routine. +1.

While it seems reasonable to use pg_logical_emit_message(), it won't work for all the cases - what if someone wants to advance WAL by a few WAL files? I think pg_switch_wal() is the way, no? For instance, the replslot_limit.pl test increases the WAL in a very calculated way - it increases it by 5 WAL files. So, -1 to using pg_logical_emit_message().

I understand the naming conflict for the table name used ('tt' in this case). If the table name 'tt' looks so simple and easy that someone might have tables with that name in their test files, we can generate a random table name in advance_wal, something like in the attached v2 patch.

> > - SELECT 'finished';",
> > - timeout => $PostgreSQL::Test::Utils::timeout_default));
> > -is($result[1], 'finished', 'check if checkpoint command is not blocked');
> > -
> > +$node_primary2->advance_wal(1);
> > +$node_primary2->safe_psql('postgres', 'CHECKPOINT;');
> >
> > This test anticipates that the checkpoint could get blocked. Shouldn't
> > we keep the timeout?
>
> Indeed, this would partially invalidate what's getting tested in light
> of 1816a1c6 where we run a secondary command after the checkpoint. So
> the last SELECT should remain around.

Changed.

> > -$node_primary->safe_psql(
> > - 'postgres', "create table retain_test(a int);
> > - select pg_switch_wal();
> > - insert into retain_test values(1);
> > - checkpoint;");
> > +$node_primary->advance_wal(1);
> > +$node_primary->safe_psql('postgres', "checkpoint;");
> >
> > The original test generated some WAL after the segment switch, which
> > appears to be a significant characteristic of the test.
>
> Still, does it not matter for this specific case? The logical slot has
> already been invalidated, so we don't care much about logical changes
> in WAL, do we?

Correct, the slot has already been invalidated and the test is verifying that WAL isn't retained by the invalidated slot, so essentially all it needs is to generate "some" WAL. So, using advance_wal there seems fine to me.

CFBot doesn't complain about anything - https://github.com/BRupireddy/postgres/tree/add_a_TAP_test_function_to_generate_WAL_v2.

Attached is the v2 patch. Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Wed, Jul 19, 2023 at 4:11 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Attached the v2 patch. Thoughts?

Rebase needed, attached v3 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Dec 18, 2023, at 2:39 AM, Bharath Rupireddy wrote:
> Rebase needed, attached v3 patch.
I think you don't understand the suggestion proposed by Michael and Kyotaro. If
you do a comparison with the following SQL commands:
euler=# select pg_walfile_name(pg_current_wal_lsn());
pg_walfile_name
--------------------------
000000010000000000000040
(1 row)
euler=# select pg_logical_emit_message(true, 'prefix', 'message4');
pg_logical_emit_message
-------------------------
0/400000A8
(1 row)
euler=# select pg_switch_wal();
pg_switch_wal
---------------
0/400000F0
(1 row)
euler=# create table cc (b int);
CREATE TABLE
euler=# drop table cc;
DROP TABLE
euler=# select pg_switch_wal();
pg_switch_wal
---------------
0/41017C88
(1 row)
euler=# select pg_walfile_name(pg_current_wal_lsn());
pg_walfile_name
--------------------------
000000010000000000000041
(1 row)
You get
$ pg_waldump 000000010000000000000040
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/40000028, prev 0/3F0001C0, desc: RUNNING_XACTS nextXid 295180 latestCompletedXid 295179 oldestRunningXid 295180
rmgr: LogicalMessage len (rec/tot): 65/ 65, tx: 295180, lsn: 0/40000060, prev 0/40000028, desc: MESSAGE transactional, prefix "prefix"; payload (8 bytes): 6D 65 73 73 61 67 65 34
rmgr: Transaction len (rec/tot): 46/ 46, tx: 295180, lsn: 0/400000A8, prev 0/40000060, desc: COMMIT 2023-12-18 08:35:06.821322 -03
rmgr: XLOG len (rec/tot): 24/ 24, tx: 0, lsn: 0/400000D8, prev 0/400000A8, desc: SWITCH
$ pg_waldump 000000010000000000000041
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/41000028, prev 0/400000D8, desc: RUNNING_XACTS nextXid 295181 latestCompletedXid 295180 oldestRunningXid 295181
rmgr: Storage len (rec/tot): 42/ 42, tx: 0, lsn: 0/41000060, prev 0/41000028, desc: CREATE base/33287/88102
rmgr: Heap2 len (rec/tot): 60/ 60, tx: 295181, lsn: 0/41000090, prev 0/41000060, desc: NEW_CID rel: 1663/33287/1247, tid: 14/16, cmin: 0, cmax: 4294967295, combo: 4294967295
rmgr: Heap len (rec/tot): 54/ 3086, tx: 295181, lsn: 0/410000D0, prev 0/41000090, desc: INSERT off: 16, flags: 0x00, blkref #0: rel 1663/33287/1247 blk 14 FPW
rmgr: Btree len (rec/tot): 53/ 5133, tx: 295181, lsn: 0/41000CE0, prev 0/410000D0, desc: INSERT_LEAF off: 252, blkref #0: rel 1663/33287/2703 blk 2 FPW
.
.
.
rmgr: Btree len (rec/tot): 72/ 72, tx: 295181, lsn: 0/41016E48, prev 0/41014F00, desc: INSERT_LEAF off: 111, blkref #0: rel 1663/33287/2674 blk 7
rmgr: Heap2 len (rec/tot): 69/ 69, tx: 295181, lsn: 0/41016E90, prev 0/41016E48, desc: PRUNE snapshotConflictHorizon: 295177, nredirected: 0, ndead: 7, nunused: 0, redirected: [], dead: [20, 21, 22, 23, 24, 26, 27], unused: [], blkref #0: rel 1663/33287/1249 blk 17
rmgr: Transaction len (rec/tot): 385/ 385, tx: 295181, lsn: 0/41016ED8, prev 0/41016E90, desc: INVALIDATION ; inval msgs: catcache 80 catcache 79 catcache 80 catcache 79 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 88102
rmgr: Standby len (rec/tot): 42/ 42, tx: 295181, lsn: 0/41017060, prev 0/41016ED8, desc: LOCK xid 295181 db 33287 rel 88102
rmgr: Transaction len (rec/tot): 405/ 405, tx: 295181, lsn: 0/41017090, prev 0/41017060, desc: COMMIT 2023-12-18 08:35:22.342122 -03; inval msgs: catcache 80 catcache 79 catcache 80 catcache 79 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 88102
rmgr: Standby len (rec/tot): 42/ 42, tx: 295182, lsn: 0/41017228, prev 0/41017090, desc: LOCK xid 295182 db 33287 rel 88102
rmgr: Heap2 len (rec/tot): 61/ 61, tx: 295182, lsn: 0/41017258, prev 0/41017228, desc: PRUNE snapshotConflictHorizon: 295177, nredirected: 0, ndead: 3, nunused: 0, redirected: [], dead: [9, 12, 15], unused: [], blkref #0: rel 1663/33287/2608 blk 3
rmgr: Heap2 len (rec/tot): 60/ 60, tx: 295182, lsn: 0/41017298, prev 0/41017258, desc: NEW_CID rel: 1663/33287/1247, tid: 14/17, cmin: 4294967295, cmax: 0, combo: 4294967295
rmgr: Heap len (rec/tot): 54/ 54, tx: 295182, lsn: 0/410172D8, prev 0/41017298, desc: DELETE xmax: 295182, off: 17, infobits: [KEYS_UPDATED], flags: 0x00, blkref #0: rel 1663/33287/1247 blk 14
.
.
.
rmgr: Heap2 len (rec/tot): 60/ 60, tx: 295182, lsn: 0/410178D8, prev 0/410178A0, desc: NEW_CID rel: 1663/33287/2608, tid: 3/24, cmin: 4294967295, cmax: 2, combo: 4294967295
rmgr: Heap len (rec/tot): 54/ 54, tx: 295182, lsn: 0/41017918, prev 0/410178D8, desc: DELETE xmax: 295182, off: 24, infobits: [KEYS_UPDATED], flags: 0x00, blkref #0: rel 1663/33287/2608 blk 3
rmgr: Transaction len (rec/tot): 321/ 321, tx: 295182, lsn: 0/41017950, prev 0/41017918, desc: INVALIDATION ; inval msgs: catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 relcache 88102 snapshot 2608
rmgr: Transaction len (rec/tot): 469/ 469, tx: 295182, lsn: 0/41017A98, prev 0/41017950, desc: COMMIT 2023-12-18 08:35:25.053905 -03; rels: base/33287/88102; dropped stats: 2/33287/88102; inval msgs: catcache 80 catcache 79 catcache 80 catcache 79 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 snapshot 2608 snapshot 2608 relcache 88102 snapshot 2608
rmgr: XLOG len (rec/tot): 24/ 24, tx: 0, lsn: 0/41017C70, prev 0/41017A98, desc: SWITCH
The difference is
euler=# select '0/400000A8'::pg_lsn - '0/40000028'::pg_lsn;
?column?
----------
128
(1 row)
euler=# select '0/41017A98'::pg_lsn - '0/41000028'::pg_lsn;
?column?
----------
96880
(1 row)
It is cheaper.
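The pg_lsn subtraction above can also be reproduced outside psql; here is a small standalone sketch (names like lsn_bytes are mine, not from the thread) that parses the textual X/Y LSN format and prints the same two byte counts:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert the textual "X/Y" pg_lsn format into an absolute byte position:
# X is the high 32 bits and Y the low 32 bits, both in hexadecimal.
sub lsn_bytes
{
	my ($lsn) = @_;
	my ($hi, $lo) = split qr{/}, $lsn;
	return hex($hi) * (2**32) + hex($lo);
}

# WAL emitted between the first record and the COMMIT in each segment above.
printf "pg_logical_emit_message: %d bytes\n",
  lsn_bytes('0/400000A8') - lsn_bytes('0/40000028');    # 128
printf "create/drop table:       %d bytes\n",
  lsn_bytes('0/41017A98') - lsn_bytes('0/41000028');    # 96880
```

This reproduces the 128-byte vs 96880-byte comparison from the psql session, which is the quantitative basis for "it is cheaper".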
On Mon, Dec 18, 2023 at 08:48:09AM -0300, Euler Taveira wrote:
> It is cheaper.

Agreed that this could just use a set of pg_logical_emit_message() when jumping across N segments. Another thing that seems quite important to me is to force a flush of WAL with the last segment switch, and the new "flush" option of pg_logical_emit_message() can be very handy for this purpose.
--
Michael
On Tue, Dec 19, 2023 at 9:51 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Dec 18, 2023 at 08:48:09AM -0300, Euler Taveira wrote:
> > It is cheaper.
>
> Agreed that this could just use a set of pg_logical_emit_message()
> when jumping across N segments.

Thanks. I missed the point of using pg_logical_emit_message() over CREATE .. DROP TABLE to generate WAL. And, I agree that it's better and relatively cheaper in terms of the amount of WAL generated.

> Another thing that seems quite
> important to me is to force a flush of WAL with the last segment
> switch, and the new "flush" option of pg_logical_emit_message() can
> be very handy for this purpose.

I used pg_logical_emit_message() in non-transactional mode without needing an explicit WAL flush, as pg_switch_wal() does a WAL flush at the end [1].

Attached v4 patch.

[1]
    /*
     * If this was an XLOG_SWITCH record, flush the record and the empty
     * padding space that fills the rest of the segment, and perform
     * end-of-segment actions (eg, notifying archiver).
     */
    if (class == WALINSERT_SPECIAL_SWITCH)
    {
        TRACE_POSTGRESQL_WAL_SWITCH();
        XLogFlush(EndPos);

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
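Putting the thread's conclusions together, the Cluster.pm routine under discussion would look roughly like the sketch below (this is an approximation, not necessarily the committed wording; the 'prefix'/'message' strings are illustrative): a non-transactional logical message to generate a cheap record with no catalog side effects, followed by a segment switch, repeated once per requested segment.

```perl
# Sketch of a PostgreSQL::Test::Cluster method. Advance WAL by $n
# segments: each iteration emits a cheap, non-transactional logical
# message (no catalog changes, no name conflicts) and then switches to a
# new segment; pg_switch_wal() flushes the XLOG_SWITCH record itself, so
# no explicit flush is needed.
sub advance_wal
{
	my ($self, $n) = @_;

	for (my $i = 0; $i < $n; $i++)
	{
		$self->safe_psql(
			'postgres', qq(
			SELECT pg_logical_emit_message(false, 'prefix', 'message');
			SELECT pg_switch_wal();
			));
	}
	return;
}
```

With this shape, a caller such as replslot_limit.pl can still advance by an exact number of segments, e.g. `$node->advance_wal(5);`, which was the objection to a single-message approach earlier in the thread.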
On Tue, Dec 19, 2023 at 11:25:50AM +0530, Bharath Rupireddy wrote:
> I used pg_logical_emit_message() in non-transactional mode without
> needing an explicit WAL flush as the pg_switch_wal() does a WAL flush
> at the end [1].

Indeed, that should be enough to answer my comment.

> Attached v4 patch.

LGTM, thanks. Euler, what do you think?
--
Michael
On Tue, Dec 19, 2023, at 8:00 PM, Michael Paquier wrote:
> On Tue, Dec 19, 2023 at 11:25:50AM +0530, Bharath Rupireddy wrote:
> > I used pg_logical_emit_message() in non-transactional mode without
> > needing an explicit WAL flush as the pg_switch_wal() does a WAL flush
> > at the end [1].
>
> Indeed, that should be enough to answer my comment.
>
> > Attached v4 patch.
>
> LGTM, thanks. Euler, what do you think?
LGTM.
On Wed, Dec 20, 2023 at 12:24:04AM -0300, Euler Taveira wrote:
> LGTM.

I was eyeing 020_messages.pl, and it has a pg_switch_wal() after a rolled-back transaction. 020_archive_status.pl creates a table, does one segment switch, then a checkpoint (the table is used afterwards). Perhaps these could be changed to use the new routine, but it does not seem like this improves the readability of the tests either, contrary to the ones updated here, where a fake table is created to force some records. What do you think? We have a few more pg_switch_wal() calls as well, but these rely on WAL being already generated beforehand.

I have added a comment about pg_logical_emit_message() being in non-transactional mode and the flush implied by pg_switch_wal(), as that's important, edited the whole a bit, then applied the patch.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
> I have added a comment about pg_logical_emit_message() being in
> non-transactional mode and the flush implied by pg_switch_wal() as
> that's important, edited a bit the whole, then applied the patch.

Buildfarm member skink has failed 3 times in 035_standby_logical_decoding.pl in the last couple of days:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2020%3A07%3A15
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2017%3A09%3A27
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-01%2020%3A10%3A18

These all look like

# poll_query_until timed out executing this query:
# select (confl_active_logicalslot = 1) from pg_stat_database_conflicts where datname = 'testdb'
# expecting this output:
# t
# last actual query output:
# f

although it's notable that two different tests are involved (vacuum vs. vacuum full).

I am not real sure what is happening there, but I see that c161ab74f changed some details of how that test works, so I wonder if it's responsible for these failures. The timing isn't a perfect match, since this commit went in two weeks ago, but I don't see any more-recent commits that seem like plausible explanations.

regards, tom lane
On Wed, Jan 03, 2024 at 06:39:29PM -0500, Tom Lane wrote:
> I am not real sure what is happening there, but I see that c161ab74f
> changed some details of how that test works, so I wonder if it's
> responsible for these failures. The timing isn't a perfect match,
> since this commit went in two weeks ago, but I don't see any
> more-recent commits that seem like plausible explanations.

Likely the INSERT query on retain_test that has been removed from the test is impacting the slot conflict analysis that we'd expect.
--
Michael
Hello Tom,

04.01.2024 02:39, Tom Lane wrote:
> Buildfarm member skink has failed 3 times in
> 035_standby_logical_decoding.pl in the last couple of days:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2020%3A07%3A15
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2017%3A09%3A27
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-01%2020%3A10%3A18
>
> These all look like
>
> # poll_query_until timed out executing this query:
> # select (confl_active_logicalslot = 1) from pg_stat_database_conflicts where datname = 'testdb'
> # expecting this output:
> # t
> # last actual query output:
> # f
>
> although it's notable that two different tests are involved
> (vacuum vs. vacuum full).
>
> I am not real sure what is happening there, but I see that c161ab74f
> changed some details of how that test works, so I wonder if it's
> responsible for these failures. The timing isn't a perfect match,
> since this commit went in two weeks ago, but I don't see any
> more-recent commits that seem like plausible explanations.

Reproduced here. As I can see in the failure logs you referenced, the first problem is:

# Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
# at t/035_standby_logical_decoding.pl line 224.

It reminded me of:
https://www.postgresql.org/message-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010%40gmail.com

It seems that it's not something new, and maybe my analysis is still valid. If so, VACUUM FREEZE/autovacuum = off should fix the issue.

Best regards,
Alexander
On Thu, Jan 04, 2024 at 04:00:01PM +0300, Alexander Lakhin wrote:
> Reproduced here.

Did you just make the run slow enough to show the failure with valgrind?

> As I can see in the failure logs you referenced, the first problem is:
> # Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
> # at t/035_standby_logical_decoding.pl line 224.
> It reminded me of:
> https://www.postgresql.org/message-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010%40gmail.com
>
> It seems that it's not something new, and maybe my analysis is still
> valid. If so, VACUUM FREEZE/autovacuum = off should fix the issue.

Not sure about that yet.
--
Michael
05.01.2024 02:48, Michael Paquier wrote:
> On Thu, Jan 04, 2024 at 04:00:01PM +0300, Alexander Lakhin wrote:
>> Reproduced here.
> Did you just make the run slow enough to show the failure with
> valgrind?

Yes, I just ran several test instances (with no extra modifications) under Valgrind with parallel as follows:

for i in `seq 20`; do cp -r src/test/recovery/ src/test/recovery_$i/; sed "s|src/test/recovery|src/test/recovery_$i|" -i src/test/recovery_$i/Makefile; done
for i in `seq 20`; do echo "iteration $i"; parallel --halt now,fail=1 -j20 --linebuffer --tag PROVE_TESTS="t/035*" NO_TEMP_INSTALL=1 make check -s -C src/test/recovery_{} PROVE_FLAGS="--timer" ::: `seq 20` || break; done

The test run fails for me on iterations 9, 8, 14 like so:
...
iteration 9
...
5
5 # Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
5 # at t/035_standby_logical_decoding.pl line 224.
5
5 # Failed test 'activeslot slot invalidation is logged with vacuum on pg_authid'
5 # at t/035_standby_logical_decoding.pl line 229.
13 [07:35:53] t/035_standby_logical_decoding.pl .. ok 432930 ms ( 0.02 usr 0.00 sys + 292.08 cusr 13.05 csys = 305.15 CPU)
13 [07:35:53]
13 All tests successful.
...

I've also reproduced it without Valgrind in a VM with the CPU slowed down to 5% (on iterations 2, 5, 4), where the average test duration is about 800 sec:

4
4 # Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
4 # at t/035_standby_logical_decoding.pl line 222.
4
4 # Failed test 'activeslot slot invalidation is logged with vacuum on pg_authid'
4 # at t/035_standby_logical_decoding.pl line 227.
6 [15:16:53] t/035_standby_logical_decoding.pl .. ok 763168 ms ( 0.68 usr 0.10 sys + 19.55 cusr 102.59 csys = 122.92 CPU)

>> As I can see in the failure logs you referenced, the first problem is:
>> # Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
>> # at t/035_standby_logical_decoding.pl line 224.
>> It reminded me of:
>> https://www.postgresql.org/message-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010%40gmail.com
>>
>> It seems that it's not something new, and maybe my analysis is still
>> valid. If so, VACUUM FREEZE/autovacuum = off should fix the issue.
> Not sure about that yet.

Your suspicion was proved right. After

git show c161ab74f src/test/recovery/t/035_standby_logical_decoding.pl | git apply -R

20 iterations with 20 tests in parallel performed successfully for me (twice).

So it looks like c161ab74f really made things worse.

Best regards,
Alexander
On Fri, Jan 05, 2024 at 11:00:00PM +0300, Alexander Lakhin wrote:
> Your suspicion was proved right. After
> git show c161ab74f src/test/recovery/t/035_standby_logical_decoding.pl | git apply -R
> 20 iterations with 20 tests in parallel performed successfully for me
> (twice).
>
> So it looks like c161ab74f really made things worse.

We have two different failures here, one when VACUUM fails for a shared relation:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2017%3A09%3A27
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-01%2020%3A10%3A18

And the second failure happens for VACUUM FULL with a shared relation:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2020%3A07%3A15

In the second case, the VACUUM FULL happens *BEFORE* the new advance_wal(), making c161ab74f unrelated, no?

Anyway, if one looks at the buildfarm logs, this failure is more ancient than c161ab74f. We have many of them before that, some reported back in October:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-10-19%2000%3A44%3A58
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-10-30%2013%3A39%3A20

I suspect on the contrary that c161ab74f may actually be helping here, because we've switched the CREATE TABLE/INSERT queries to not use a snapshot anymore, reducing the reasons why a slot conflict would happen? Or maybe that's just a matter of luck because the test is racy anyway. Anyway, this has the smell of a legit bug to me.

I am also a bit dubious about the choice of pg_authid as the shared catalog for the slot invalidation check. Isn't that potentially racy with the scans we may do on it at connection startup? Shouldn't something else be chosen, like pg_shdescription, as it is non-critical? I am adding in CC Bertrand and Andres, as author and committer behind befcd77d53217b.
--
Michael
07.01.2024 10:10, Michael Paquier wrote:
> On Fri, Jan 05, 2024 at 11:00:00PM +0300, Alexander Lakhin wrote:
>> Your suspicion was proved right. After
>> git show c161ab74f src/test/recovery/t/035_standby_logical_decoding.pl | git apply -R
>> 20 iterations with 20 tests in parallel performed successfully for me
>> (twice).
>>
>> So it looks like c161ab74f really made things worse.

After more waiting, I saw the test failure (with c161ab74f reverted) on iteration 17 in a VM where one test run takes up to 800 seconds.

> We have two different failures here, one when VACUUM fails for a
> shared relation:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2017%3A09%3A27
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-01%2020%3A10%3A18
>
> And the second failure happens for VACUUM FULL with a shared relation:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2020%3A07%3A15
>
> In the second case, the VACUUM FULL happens *BEFORE* the new
> advance_wal(), making c161ab74f unrelated, no?
>
> Anyway, if one looks at the buildfarm logs, this failure is more
> ancient than c161ab74f. We have many of them before that, some
> reported back in October:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-10-19%2000%3A44%3A58
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-10-30%2013%3A39%3A20

Yes, I wrote exactly about that upthread and referenced my previous investigation. But what I'm observing now is that the failure probability greatly increased with c161ab74f, so something really changed in the test behaviour. (I need a couple of days to investigate this.)

Best regards,
Alexander
On Sun, Jan 07, 2024 at 05:00:00PM +0300, Alexander Lakhin wrote:
> Yes, I wrote exactly about that upthread and referenced my previous
> investigation. But what I'm observing now is that the failure probability
> greatly increased with c161ab74f, so something really changed in the test
> behaviour. (I need a couple of days to investigate this.)

As far as I've cross-checked the logs between successful and failed runs on skink and my own machines (not having reproduced it locally, unfortunately), I did not notice a correlation with autovacuum running while VACUUM (with or without FULL) is executed on the catalogs.

Perhaps a next sensible step would be to plug in pg_waldump or pg_walinspect and get some sense from the WAL records, to see if we fail to detect an invalidation from the log contents, starting from an LSN retrieved slightly before the beginning of each scenario.

I would be tempted to add more increments of $Test::Builder::Level as well in the subroutines of the test, because it is kind of hard to find out where a failure comes from now. One needs to grep for the slot names, whose strings are built from prefixes and suffixes defined as arguments of these subroutines...
--
Michael
Hi,

On Mon, Jan 08, 2024 at 09:16:19AM +0900, Michael Paquier wrote:
> On Sun, Jan 07, 2024 at 05:00:00PM +0300, Alexander Lakhin wrote:
> > Yes, I wrote exactly about that upthread and referenced my previous
> > investigation. But what I'm observing now is that the failure probability
> > greatly increased with c161ab74f, so something really changed in the test
> > behaviour. (I need a couple of days to investigate this.)
>
> As far as I've cross-checked the logs between successful and failed
> runs on skink and my own machines (not reproduced it locally
> unfortunately), I did not notice a correlation with autovacuum running
> while VACUUM (with or without FULL) is executed on the catalogs.

If one is able to reproduce, would it be possible to change the test and launch the vacuum in verbose mode? That way, we could see if this is somehow due to [1] (meaning something holding back the global xmin).

BTW, I think we should resume working on [1] and have it fixed in all cases.

[1]: https://www.postgresql.org/message-id/d40d015f-03a4-1cf2-6c1f-2b9aca860762%40gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hello Bertrand, 08.01.2024 10:34, Bertrand Drouvot wrote: > If one is able to reproduce, would it be possible to change the test and launch > the vacuum in verbose mode? That way, we could see if this is somehow due to [1] > (means something holding global xmin). Yes, I've added (VERBOSE) and also cut down the test to catch the failure faster. The difference between a successful and a failed run: 2024-01-08 11:58:30.679 UTC [668363] 035_standby_logical_decoding.pl INFO: vacuuming "testdb.pg_catalog.pg_authid" 2024-01-08 11:58:30.679 UTC [668363] 035_standby_logical_decoding.pl INFO: finished vacuuming "testdb.pg_catalog.pg_authid": index scans: 1 pages: 0 removed, 1 remain, 1 scanned (100.00% of total) tuples: 1 removed, 15 remain, 0 are dead but not yet removable removable cutoff: 767, which was 0 XIDs old when operation ended new relfrozenxid: 767, which is 39 XIDs ahead of previous value frozen: 0 pages from table (0.00% of total) had 0 tuples frozen index scan needed: 1 pages from table (100.00% of total) had 1 dead item identifiers removed --- 2024-01-08 12:00:59.903 UTC [669119] 035_standby_logical_decoding.pl LOG: statement: VACUUM (VERBOSE) pg_authid; 2024-01-08 12:00:59.904 UTC [669119] 035_standby_logical_decoding.pl INFO: vacuuming "testdb.pg_catalog.pg_authid" 2024-01-08 12:00:59.904 UTC [669119] 035_standby_logical_decoding.pl INFO: finished vacuuming "testdb.pg_catalog.pg_authid": index scans: 0 pages: 0 removed, 1 remain, 1 scanned (100.00% of total) tuples: 0 removed, 16 remain, 1 are dead but not yet removable removable cutoff: 766, which was 1 XIDs old when operation ended new relfrozenxid: 765, which is 37 XIDs ahead of previous value frozen: 0 pages from table (0.00% of total) had 0 tuples frozen index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed The difference in WAL is essentially the same as I observed before [1]. 
rmgr: Transaction len (rec/tot):  82/  82, tx: 766, lsn: 0/0403D418, prev 0/0403D3D8, desc: COMMIT 2024-01-08 11:58:30.679035 UTC; inval msgs: catcache 11 catcache 10
rmgr: Heap2       len (rec/tot):  57/  57, tx:   0, lsn: 0/0403D470, prev 0/0403D418, desc: PRUNE snapshotConflictHorizon: 766, nredirected: 0, ndead: 1, isCatalogRel: T, nunused: 0, redirected: [], dead: [16], unused: [], blkref #0: rel 1664/0/1260 blk 0
rmgr: Btree       len (rec/tot):  52/  52, tx:   0, lsn: 0/0403D4B0, prev 0/0403D470, desc: VACUUM ndeleted: 1, nupdated: 0, deleted: [1], updated: [], blkref #0: rel 1664/0/2676 blk 1
rmgr: Btree       len (rec/tot):  52/  52, tx:   0, lsn: 0/0403D4E8, prev 0/0403D4B0, desc: VACUUM ndeleted: 1, nupdated: 0, deleted: [16], updated: [], blkref #0: rel 1664/0/2677 blk 1
rmgr: Heap2       len (rec/tot):  50/  50, tx:   0, lsn: 0/0403D520, prev 0/0403D4E8, desc: VACUUM nunused: 1, unused: [16], blkref #0: rel 1664/0/1260 blk 0
rmgr: Heap2       len (rec/tot):  64/8256, tx:   0, lsn: 0/0403D558, prev 0/0403D520, desc: VISIBLE snapshotConflictHorizon: 0, flags: 0x07, blkref #0: rel 1664/0/1260 fork vm blk 0 FPW, blkref #1: rel 1664/0/1260 blk 0
rmgr: Heap        len (rec/tot): 225/ 225, tx:   0, lsn: 0/0403F5B0, prev 0/0403D558, desc: INPLACE off: 26, blkref #0: rel 1663/16384/16410 blk 4

vs

rmgr: Transaction len (rec/tot):  82/  82, tx: 766, lsn: 0/0403F480, prev 0/0403F440, desc: COMMIT 2024-01-08 12:00:59.901582 UTC; inval msgs: catcache 11 catcache 10
rmgr: XLOG        len (rec/tot):  49/8241, tx:   0, lsn: 0/0403F4D8, prev 0/0403F480, desc: FPI_FOR_HINT , blkref #0: rel 1664/0/1260 fork fsm blk 2 FPW
rmgr: XLOG        len (rec/tot):  49/8241, tx:   0, lsn: 0/04041528, prev 0/0403F4D8, desc: FPI_FOR_HINT , blkref #0: rel 1664/0/1260 fork fsm blk 1 FPW
rmgr: XLOG        len (rec/tot):  49/8241, tx:   0, lsn: 0/04043578, prev 0/04041528, desc: FPI_FOR_HINT , blkref #0: rel 1664/0/1260 fork fsm blk 0 FPW
rmgr: Heap        len (rec/tot): 225/ 225, tx:   0, lsn: 0/040455C8, prev 0/04043578, desc: INPLACE off: 26, blkref #0: rel 1663/16384/16410 blk 4
(Complete logs and the modified test are attached.)

With FREEZE, 10 iterations with 20 tests in parallel succeeded for me
(while without it, I get failures on iterations 1 and 2).

[1] https://www.postgresql.org/message-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010%40gmail.com

Best regards,
Alexander
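[Editorial note: the FREEZE runs mentioned above would correspond to changing the vacuum statement visible in the server log to something like the following sketch; the exact statement in the attached modified test may differ:]

```sql
-- Same call as in the failing test, with the FREEZE option added.
-- FREEZE makes VACUUM behave as if vacuum_freeze_min_age were zero,
-- i.e. it freezes every tuple it is allowed to freeze.
VACUUM (VERBOSE, FREEZE) pg_authid;
```

Note that FREEZE does not change which dead tuples are removable (that is still governed by the global xmin horizon), so the stabilizing effect observed here is an empirical data point rather than an explanation.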
On Mon, Jan 08, 2024 at 08:00:00PM +0300, Alexander Lakhin wrote:
> Yes, I've added (VERBOSE) and also cut down the test to catch the failure faster.
> The difference between a successful and a failed run:
> tuples: 1 removed, 15 remain, 0 are dead but not yet removable
> [...]
> tuples: 0 removed, 16 remain, 1 are dead but not yet removable

Yep, it's clear that the horizon is not stable.

> With FREEZE, 10 iterations with 20 tests in parallel succeeded for me
> (while without it, I get failures on iterations 1,2).
>
> [1] https://www.postgresql.org/message-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010%40gmail.com

Alexander, does the test gain in stability once you begin using the
patch posted on [2], mentioned by Bertrand? (Also, perhaps we'd better
move the discussion to the other thread where the patch has been sent.)

[2]: https://www.postgresql.org/message-id/d40d015f-03a4-1cf2-6c1f-2b9aca860762@gmail.com

--
Michael
Hi,

On Tue, Jan 09, 2024 at 01:59:08PM +0900, Michael Paquier wrote:
> On Mon, Jan 08, 2024 at 08:00:00PM +0300, Alexander Lakhin wrote:
> > Yes, I've added (VERBOSE) and also cut down the test to catch the failure faster.

Thanks, Alexander!

> > The difference between a successful and a failed run:
> > tuples: 1 removed, 15 remain, 0 are dead but not yet removable
> > [...]
> > tuples: 0 removed, 16 remain, 1 are dead but not yet removable
>
> Yep, it's clear that the horizon is not stable.

Yeah.

> > With FREEZE, 10 iterations with 20 tests in parallel succeeded for me
> > (while without it, I get failures on iterations 1,2).
> >
> > [1] https://www.postgresql.org/message-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010%40gmail.com
>
> Alexander, does the test gain in stability once you begin using the
> patch posted on [2], mentioned by Bertrand?

Alexander, please find attached v3, which is more or less a rebased
version of it.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com