Re: Refactoring postmaster's code to cleanup after child exit - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Refactoring postmaster's code to cleanup after child exit
Date
Msg-id fd5e9523-78d3-4270-86b2-fd1b1eeb4fc9@iki.fi
Whole thread Raw
In response to Re: Refactoring postmaster's code to cleanup after child exit  (Andres Freund <andres@anarazel.de>)
Responses Re: Refactoring postmaster's code to cleanup after child exit
List pgsql-hackers
In short, all the 4 patches look good to me. Thanks for picking this up!

On 06/03/2025 22:16, Andres Freund wrote:
> On 2025-03-05 20:49:33 -0800, Noah Misch wrote:
>>> This behaviour makes it really hard to debug problems. It'd have been a lot
>>> easier to understand the problem if we'd seen psql's stderr before the test
>>> died.
>>>
>>> I guess that mean at the very least we'd need to put an eval {} around the
>>> ->pump() call., print $self->{stdout}, ->{stderr} and reraise an error?
>>
>> That sounds right.
> 
> In the attached patch I did that for wait_connect().  I did verify that it
> works by implementing the wait_connect() fix before fixing
> 002_connection_limits.pl, which fails if a sleep(1) is added just before the
> proc_exit(1) for FATAL.

+1. For the archives sake, I just want to clarify that this pump stuff 
is all about getting better error messages on a test failure. It doesn't 
help with the original issue.

This is all annoyingly complicated, but getting good error messages is 
worth it.

> On 2025-03-05 08:23:32 +0900, Michael Paquier wrote:>> Why not adding an injection point with a WARNING or a LOG
generated,
 
then
>> check the server logs for the code path taken based on the elog() generated
>> with the point name?
> 
> I think the log_min_messages approach is a lot simpler. If we need something
> like this more widely we can reconsider injection points...

+1. It's a little annoying to depend on a detail like the "client 
backend process exited" debug message, but seems like the best fix for now.

> I also attached a patch to improve connect_fails()/connect_ok() test names a
> bit. They weren't symmetric and I felt they were lacking in detail for the
> psql return code check.

+1.

While we're at it, attached are a few more cleanups I noticed.

> Another annoying and also funny problem I saw is this failure:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-06%2009%3A18%3A21
> 2025-03-06 10:42:02.552 UTC [372451][postmaster][:0] LOG:  1800 s is outside the valid range for parameter
"authentication_timeout"(1 s .. 600 s)
 
> 
> I had to increase PG_TEST_TIMEOUT_DEFAULT due to some other test timing out
> when run under valgrind (due to having to insert a lot of rows). But then this
> test runs into the above issue.
> 
> The easiest way seems to be to just limit PG_TEST_TIMEOUT_DEFAULT in this
> test.

LGTM

-- 
Heikki Linnakangas
Neon (https://neon.tech)
Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: dblink: Add SCRAM pass-through authentication
Next
From: Nathan Bossart
Date:
Subject: Re: Statistics Import and Export