Re: Race conditions in 019_replslot_limit.pl - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Race conditions in 019_replslot_limit.pl
Date
Msg-id 20220215235157.jknqi73c5scietxe@alap3.anarazel.de
In response to Race conditions in 019_replslot_limit.pl  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Race conditions in 019_replslot_limit.pl
List pgsql-hackers
Hi,

On 2022-02-15 23:29:20 +0200, Heikki Linnakangas wrote:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2022-02-14%2006%3A30%3A04
> 
> [07:42:23] t/018_wal_optimize.pl ................ ok    12403 ms ( 0.00 usr  0.00 sys +  1.40 cusr  0.63 csys =  2.03 CPU)
> # poll_query_until timed out executing this query:
> # SELECT wal_status FROM pg_replication_slots WHERE slot_name = 'rep3'
> # expecting this output:
> # lost
> # last actual query output:
> # unreserved

The relevant bit from the log is:
2022-02-14 07:42:27.817 CET [6209f9d3.68bab:3] LOG:  database system is ready to accept read-only connections
2022-02-14 07:42:27.819 CET [6209f9d3.68bb7:1] LOG:  started streaming WAL from primary at 0/1B00000 on timeline 1
2022-02-14 07:42:27.819 CET [6209f9d3.68bb7:2] FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 00000001000000000000001B has already been removed
2022-02-14 07:42:27.822 CET [6209f9d3.68bb9:1] LOG:  started streaming WAL from primary at 0/1B00000 on timeline 1
2022-02-14 07:42:27.822 CET [6209f9d3.68bb9:2] FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 00000001000000000000001B has already been removed

I think what happened is that there was no WAL to receive between the start of
the primary and the $node_primary3->wait_for_catchup($node_standby3);

Because the slot is created without reserving WAL, the primary is free to
remove the WAL segments without ever creating a slot-based conflict. I think
that should be fixable by having the slot reserve WAL at creation time?
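As a rough sketch of that fix (not the test's actual code; it assumes the test
creates 'rep3' via pg_create_physical_replication_slot() on a TAP node handle
named $node_primary3), passing immediately_reserve = true would pin WAL from
the moment the slot exists:

        # Hypothetical: reserve WAL at slot creation instead of waiting for the
        # standby's first connection, so the primary cannot recycle the segments
        # before the standby ever streams anything.
        $node_primary3->safe_psql('postgres',
                "SELECT pg_create_physical_replication_slot('rep3', true)");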


> and:
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2022-02-15%2011%3A00%3A08
> 
> #   Failed test 'have walsender pid 3682154
> # 3682136'
> #   at t/019_replslot_limit.pl line 335.
> #                   '3682154
> # 3682136'
> #     doesn't match '(?^:^[0-9]+$)'
> 
> The latter looks like there are two walsenders active, which confuses the
> test.
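For context, and only as a reconstruction (the node handle name and exact
wording are assumptions, not quoted from the test), the failing check boils
down to reading a single walsender pid and matching it against a one-pid
pattern, which cannot succeed once two walsenders are running:

        # Reconstructed sketch of the assertion around t/019_replslot_limit.pl:335:
        # the query result must be exactly one pid, so "3682154\n3682136" fails
        # the ^[0-9]+$ match.
        my $senderpid = $node_primary->safe_psql('postgres',
                "SELECT pid FROM pg_stat_activity WHERE backend_type = 'walsender'");
        like($senderpid, qr/^[0-9]+$/, "have walsender pid $senderpid");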

Too bad there's no plain pid in the log_line_prefix. Looks like that used to be the
buildfarm default, and I haven't fixed that animal's configuration...

%c apparently is hex(process startup time).hex(pid), so we're looking
for 382f58... Confirmed by the slot name:
2022-02-15 12:10:20.874 CET [620b8a1c.382f58:5] LOG:  received replication command: CREATE_REPLICATION_SLOT "pg_basebackup_3682136" TEMPORARY PHYSICAL ( RESERVE_WAL)
which pg_basebackup builds using the backend pid:
        replication_slot = psprintf("pg_basebackup_%d", (int) PQbackendPID(param->bgconn));
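As an aside (illustrative only, not from the original report), both halves of
that %c value can be decoded to confirm they line up with the log timestamp
and the slot name:

        # Split the %c session id ("starttime.pid", both in hex) and convert back.
        my ($start_hex, $pid_hex) = split /\./, '620b8a1c.382f58';
        printf "start=%s pid=%d\n",
            scalar gmtime(hex $start_hex),   # Tue Feb 15 11:10:20 2022 UTC, i.e. 12:10:20 CET
            hex $pid_hex;                    # 3682136, matching pg_basebackup_3682136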

The logs for that pid are:
2022-02-15 12:10:20.873 CET [620b8a1c.382f58:1] LOG:  connection received: host=[local]
2022-02-15 12:10:20.874 CET [620b8a1c.382f58:2] LOG:  replication connection authorized: user=bf application_name=019_replslot_limit.pl
2022-02-15 12:10:20.874 CET [620b8a1c.382f58:3] LOG:  received replication command: SHOW data_directory_mode
2022-02-15 12:10:20.874 CET [620b8a1c.382f58:4] STATEMENT:  SHOW data_directory_mode
2022-02-15 12:10:20.874 CET [620b8a1c.382f58:5] LOG:  received replication command: CREATE_REPLICATION_SLOT "pg_basebackup_3682136" TEMPORARY PHYSICAL ( RESERVE_WAL)
2022-02-15 12:10:20.874 CET [620b8a1c.382f58:6] STATEMENT:  CREATE_REPLICATION_SLOT "pg_basebackup_3682136" TEMPORARY PHYSICAL ( RESERVE_WAL)
2022-02-15 12:10:20.875 CET [620b8a1c.382f58:7] LOG:  received replication command: IDENTIFY_SYSTEM
2022-02-15 12:10:20.875 CET [620b8a1c.382f58:8] STATEMENT:  IDENTIFY_SYSTEM
2022-02-15 12:10:20.875 CET [620b8a1c.382f58:9] LOG:  received replication command: START_REPLICATION SLOT "pg_basebackup_3682136" 0/600000 TIMELINE 1
2022-02-15 12:10:20.875 CET [620b8a1c.382f58:10] STATEMENT:  START_REPLICATION SLOT "pg_basebackup_3682136" 0/600000 TIMELINE 1

Even though the node has log_disconnections = true, and other processes indeed log
their disconnection, there's no disconnect entry for the above session until the
server is shut down, even though pg_basebackup clearly finished. Uh, huh?

I guess it's conceivable that the backend was still working through process
shutdown? But it doesn't seem too likely, given that several other connections
manage to get through entire connect / disconnect cycles?
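Whatever the cause, a sketch of how the test could defend itself (assumed
handle name $node_primary; this is not something 019_replslot_limit.pl does
today) would be to wait for any leftover walsender to go away before reading
the pid:

        # Hypothetical guard: poll until exactly one walsender remains, so a
        # lingering pg_basebackup walsender can't make the pid query return two rows.
        $node_primary->poll_query_until('postgres',
                "SELECT count(*) = 1 FROM pg_stat_activity WHERE backend_type = 'walsender'")
          or die "timed out waiting for stray walsenders to exit";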



> Has anyone looked into these yet?

Hadn't yet...

Greetings,

Andres Freund


