Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
Msg-id tfc4r4sr5m47f42nwsymadafjr3h7sjskilq62s64ujdpi4oxc@dfgxzqcxjrjy
In response to Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible  (Jacob Champion <jacob.champion@enterprisedb.com>)
Responses Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
List pgsql-hackers
Hi,

On 2025-03-05 08:16:45 -0800, Jacob Champion wrote:
> From efc9fc3b3993601e9611131f229fbcaf4daa46f1 Mon Sep 17 00:00:00 2001
> From: Michael Paquier <michael@paquier.xyz>
> Date: Wed, 5 Mar 2025 13:30:43 +0900
> Subject: [PATCH 1/2] Fix race condition in pre-auth test
> 
> ---
>  src/test/authentication/t/007_pre_auth.pl | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/test/authentication/t/007_pre_auth.pl b/src/test/authentication/t/007_pre_auth.pl
> index a638226dbaf..90aaea4b5a6 100644
> --- a/src/test/authentication/t/007_pre_auth.pl
> +++ b/src/test/authentication/t/007_pre_auth.pl
> @@ -43,12 +43,14 @@ $psql->query_safe("SELECT injection_points_attach('init-pre-auth', 'wait')");
>  # authentication. Use the $psql connection handle for server interaction.
>  my $conn = $node->background_psql('postgres', wait => 0);
>  
> -# Wait for the connection to show up.
> +# Wait for the connection to show up in pg_stat_activity, with the wait_event
> +# of the injection point.
>  my $pid;
>  while (1)
>  {
>      $pid = $psql->query(
> -        "SELECT pid FROM pg_stat_activity WHERE state = 'starting';");
> +        qq{SELECT pid FROM pg_stat_activity
> +  WHERE state = 'starting' and wait_event = 'init-pre-auth';});
>      last if $pid ne "";

Unrelated to the change in this patch, but tests really shouldn't use while(1)
loops without a termination condition. If something is wrong, the test will
hang indefinitely, instead of timing out.  On the buildfarm that can take out
an animal if it hasn't configured a timeout (at least with autoconf; meson
terminates tests after a timeout).

I guess you can't use poll_query_until() here, but in that case you should
copy some of the timeout logic. Or, perhaps better, add a poll_query_until()
to BackgroundPsql.pm.
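
To illustrate the kind of timeout logic meant here, a minimal sketch of a
bounded polling loop follows. The helper name poll_until and its parameters
are hypothetical and not part of PostgreSQL's test modules; it only assumes
that the check is a code reference returning a string, as $psql->query() does
in the quoted test.

```perl
use strict;
use warnings;
use Time::HiRes qw(time usleep);

# Hypothetical helper (an assumption for illustration, not an existing API):
# call $check repeatedly until it returns a non-empty string or until
# $timeout seconds have elapsed. Returns the result, or undef on timeout.
sub poll_until
{
	my ($check, $timeout, $interval_us) = @_;
	$interval_us //= 100_000;    # pause 0.1s between attempts by default
	my $deadline = time() + $timeout;

	while (1)
	{
		my $result = $check->();
		return $result if defined $result && $result ne "";

		# Give up instead of looping forever.
		return undef if time() >= $deadline;
		usleep($interval_us);
	}
}

# Possible use in the test, assuming $psql is the BackgroundPsql handle
# from the quoted diff:
#
#   my $pid = poll_until(
#       sub {
#           $psql->query(
#               qq{SELECT pid FROM pg_stat_activity
#                  WHERE state = 'starting' and wait_event = 'init-pre-auth';});
#       },
#       $PostgreSQL::Test::Utils::timeout_default);
#   die "timed out waiting for pre-auth backend" unless defined $pid;
```

With something along these lines the test fails after the configured timeout
rather than hanging, regardless of whether the build system enforces a
timeout of its own.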

Greetings,

Andres Freund
