Thread: Re: Log connection establishment timings

Re: Log connection establishment timings

From
Guillaume Lelarge
Date:
Hello,

Le lun. 16 déc. 2024 à 22:00, Melanie Plageman <melanieplageman@gmail.com> a écrit :
Hi,

Users wishing to debug slow connection establishment have little
visibility into which steps take the most time. We don't expose any
stats and none of the logging includes durations.

The attached patch logs durations for authentication, forking the
Postgres backend, and the end-to-end connection establishment duration
starting from when the postmaster calls accept() and ending the first
time the forked backend is ready for query.

As an example of how this could be useful, I artificially slowed down
authentication and you can see that that is now visible in logs:

LOG:  backend ready for query. pid=86341. socket=10. connection
establishment times (ms): total: 8, fork: 0, authentication: 0
LOG:  backend ready for query. pid=86794. socket=10. connection
establishment times (ms): total: 108, fork: 0, authentication: 100

Two notes on implementation:

To make this portable, the timestamps captured in the postmaster
(socket creation time, fork initiation time) are passed through the
ClientSocket and BackendStartupData structures instead of simply saved
in backend local memory inherited by the child process.

Secondly, I used TimestampTz for the times, but it is pretty unclear
to me when instr_time should be used vs TimestampTz for such things.
instr_time says it is more portable, but there are all sorts of places
that use TimestampTz that would probably need to be portable.
Also, INSTR_TIME_SUBTRACT() isn't overflow safe, which must be because
the actual data type of ticks is platform dependent and so the caller
is responsible for not passing it values that would overflow when
subtracted.


Just a quick note. I like this patch a lot and it would help to diagnose some situations our customers may have. I tried the patch on my laptop, and it works fine. I'll find some time to read the code as well, but in the meantime, this looks like a nice thing to have in PostgreSQL.

Thanks Melanie.


--
Guillaume.

Re: Log connection establishment timings

From
Melanie Plageman
Date:
Thanks for taking a look!

On Mon, Jan 20, 2025 at 10:01 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> The patch needed a rebase due to 34486b6092e. I did it in v2 attached (it's
> a minor rebase due to the AmRegularBackendProcess() introduction in miscadmin.h).
>
> v2 could rely on AmRegularBackendProcess() instead of AmClientBackendProcess()
> but I kept it with AmClientBackendProcess() to reduce "my" changes as compared to
> v1.

Thanks for doing this! I have implemented your suggestion in attached v3.

> Regarding the TimestampTz vs instr_time choice, we have things like:
< -- snip -- >
> So with TimestampTz, we would:
>
> 1. return 0 if we moved the time backward
> 2. provide an inflated duration including the time jump (if the time move
> forward).
>
> But with instr_time (and on systems that support CLOCK_MONOTONIC) then
> pg_clock_gettime_ns() should not be affected by system time change IIUC.
>
> Though time changes are "rare", given the fact that those metrics could provide
> "inaccurate" measurements during that particular moment (time change) then it
> might be worth considering instr_time instead for this particular metric.

Great point. This all makes sense. I've switched to using instr_time in v3.

- Melanie

Attachment

Re: Log connection establishment timings

From
Melanie Plageman
Date:
Thanks for taking a look!

On Mon, Jan 20, 2025 at 12:53 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
>
> On Mon, Jan 20, 2025 at 7:01 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> > Though time changes are "rare", given the fact that those metrics could provide
> > "inaccurate" measurements during that particular moment (time change) then it
> > might be worth considering instr_time instead for this particular metric.
>
> +1, I think a CLOCK_MONOTONIC source should be used for this if we've got it.

Done in v3 (see [1]).

> For the EXEC_BACKEND case (which, to be honest, I don't know much
> about), I originally wondered if the fork_duration should include any
> of the shared memory manipulations or library reloads that are done to
> match Unix behavior. But I think I prefer the way the patch does it.

You mean if the EXEC_BACKEND not defined version should calculate the
end of fork_duration basically at the end of
postmaster_child_launch()?

> Can the current timestamp be recorded right at the beginning of
> SubPostmasterMain(), to avoid counting the time setting up GUCs and
> reading the variables file, or do we have to wait?

We actually don't have the startup data until after we
read_backend_variables(), so I did it as soon as I could after that.

You are right that this will include timing from some extra steps.
But, some of these steps are overhead unique to the slow process of
"forking" a backend when EXEC_BACKEND is defined anyway, so it
probably makes sense for them to be included in the timing of "fork"
for these backends.

> nit: conn_timing is currently declared in the "interrupts and crit
> section" part of miscadmin.h; should it be moved down to the
> general-purpose globals?

Whoops, it would help if I read comments and stuff. Thanks! Fixed in v3 in [1].

- Melanie

[1] https://www.postgresql.org/message-id/CAAKRu_YrNsA7-v5L9d318XZu%2BtPqcxp%2BctCGy2EGYrSt69ON%3DA%40mail.gmail.com



Re: Log connection establishment timings

From
Melanie Plageman
Date:
On Tue, Feb 25, 2025 at 3:23 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Thanks for doing this! I have implemented your suggestion in attached v3.

I missed an include in the EXEC_BACKEND not defined case. attached v4
is fixed up.

- Melanie

Attachment

Re: Log connection establishment timings

From
Fujii Masao
Date:

On 2025/02/26 6:36, Melanie Plageman wrote:
> On Tue, Feb 25, 2025 at 3:23 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>>
>> Thanks for doing this! I have implemented your suggestion in attached v3.
> 
> I missed an include in the EXEC_BACKEND not defined case. attached v4
> is fixed up.

Thanks for updating the patch!

+    /* Capture time Postmaster initiates fork for logging */
+    if (child_type == B_BACKEND)
+        INSTR_TIME_SET_CURRENT(((BackendStartupData *) startup_data)->fork_time);

When log_connections is enabled, walsender connections are also logged.
However, with the patch, it seems the connection time for walsenders isn't captured.
Is this intentional?


With the current patch, when log_connections is enabled, the connection time is always
captured, and which might introduce performance overhead. No? Some users who enable
log_connections may not want this extra detail and want to avoid such overhead.
So, would it make sense to extend log_connections with a new option like "timing" and
log the connection time only when "timing" is specified?


+                ereport(LOG,
+                        errmsg("backend ready for query. pid=%d. socket=%d. connection establishment times (ms):
total:%ld, fork: %ld, authentication: %ld",
 
+                               MyProcPid,
+                               (int) MyClientSocket->sock,

Why expose the socket's file descriptor? I'm not sure how users would use that information.


Including the PID seems unnecessary since it's already available via log_line_prefix with %p?


Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Log connection establishment timings

From
Bertrand Drouvot
Date:
Hi,

On Wed, Feb 26, 2025 at 01:46:19PM +0900, Fujii Masao wrote:
> 
> 
> On 2025/02/26 6:36, Melanie Plageman wrote:
> > On Tue, Feb 25, 2025 at 3:23 PM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > 
> > > Thanks for doing this! I have implemented your suggestion in attached v3.

Thanks for the new patch version!

> +    /* Capture time Postmaster initiates fork for logging */
> +    if (child_type == B_BACKEND)
> +        INSTR_TIME_SET_CURRENT(((BackendStartupData *) startup_data)->fork_time);
> 
> When log_connections is enabled, walsender connections are also logged.
> However, with the patch, it seems the connection time for walsenders isn't captured.
> Is this intentional?

Good point. I'm tempted to say that it should also be, specially because a
connection done as "psql replication=database" is of "walsender" backend type and
would not be reported.

> With the current patch, when log_connections is enabled, the connection time is always
> captured, and which might introduce performance overhead. No? Some users who enable
> log_connections may not want this extra detail and want to avoid such overhead.
> So, would it make sense to extend log_connections with a new option like "timing" and
> log the connection time only when "timing" is specified?

+1, I also think it's a good idea to let users decide if they want the timing
measurement overhead (and it's common practice with track_io_timing,
track_wal_io_timing, the newly track_cost_delay_timing for example)

> +                ereport(LOG,
> +                        errmsg("backend ready for query. pid=%d. socket=%d. connection establishment times (ms):
total:%ld, fork: %ld, authentication: %ld",
 
> +                               MyProcPid,
> +                               (int) MyClientSocket->sock,
> 
> Why expose the socket's file descriptor? I'm not sure how users would use that information.
> 
> 
> Including the PID seems unnecessary since it's already available via log_line_prefix with %p?

Yeah, we would get things like:

[1111539] LOG:  connection received: host=[local]
[1111539] LOG:  connection authenticated: user="postgres" method=trust
(/home/postgres/postgresql/pg_installed/pg18/data/pg_hba.conf:117)
[1111539] LOG:  connection authorized: user=postgres database=postgres application_name=psql
[1111539] LOG:  backend ready for query. pid=1111539. socket=9. connection establishment times (ms): total: 2, fork: 0,
authentication:0
 

I also wonder if "backend ready for query" is worth it. Maybe something like:

2025-02-26 06:44:23.265 UTC [1111539] LOG:  connection establishment times (ms): total: 2, fork: 0, authentication: 0

would be good enough?

A few random comments:

=== 1

+typedef struct ConnectionTiming
+{
+       instr_time      fork_duration;
+       instr_time      auth_duration;
+} ConnectionTiming;

As it's all about instr_time, I wonder if we could use an enum + array instead.
That's probably just a matter of taste but that sounds more flexible to extend
(should we want to add more timing in the future).

=== 2

+ConnectionTiming conn_timing = {0};

There is no padding in ConnectionTiming and anyway we just access its fields
so that's ok to initialize that way.

=== 3

Add a few words in the log_connections GUC doc? (anyway we will have to if
Fujii-san idea above about the timing is implemented)

=== 4

+               /* Calculate total fork duration in child backend for logging */
+               if (child_type == B_BACKEND)
+               {
+                       INSTR_TIME_SET_CURRENT(conn_timing.fork_duration);
+                       INSTR_TIME_SUBTRACT(conn_timing.fork_duration,
+                                                               ((BackendStartupData *) startup_data)->fork_time);
+               }
+
                /* Close the postmaster's sockets */
                ClosePostmasterPorts(child_type == B_LOGGER);

@@ -618,6 +630,14 @@ SubPostmasterMain(int argc, char *argv[])
        /* Read in the variables file */
        read_backend_variables(argv[2], &startup_data, &startup_data_len);

+       /* Calculate total fork duration in child backend for logging */
+       if (child_type == B_BACKEND)
+       {
+               INSTR_TIME_SET_CURRENT(conn_timing.fork_duration);
+               INSTR_TIME_SUBTRACT(conn_timing.fork_duration,
+                               ((BackendStartupData *) startup_data)->fork_time);
+       }

worth to add a helper function to avoid code duplication?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Log connection establishment timings

From
Guillaume Lelarge
Date:
On 26/02/2025 08:41, Bertrand Drouvot wrote:
> Hi,
> 
> On Wed, Feb 26, 2025 at 01:46:19PM +0900, Fujii Masao wrote:
>>
>>
>> On 2025/02/26 6:36, Melanie Plageman wrote:
>>> On Tue, Feb 25, 2025 at 3:23 PM Melanie Plageman
>>> <melanieplageman@gmail.com> wrote:
>>>>
>>>> Thanks for doing this! I have implemented your suggestion in attached v3.
> 
> Thanks for the new patch version!
> 

+1

>> +    /* Capture time Postmaster initiates fork for logging */
>> +    if (child_type == B_BACKEND)
>> +        INSTR_TIME_SET_CURRENT(((BackendStartupData *) startup_data)->fork_time);
>>
>> When log_connections is enabled, walsender connections are also logged.
>> However, with the patch, it seems the connection time for walsenders isn't captured.
>> Is this intentional?
> 
> Good point. I'm tempted to say that it should also be, specially because a
> connection done as "psql replication=database" is of "walsender" backend type and
> would not be reported.
> 

Agreed.

>> With the current patch, when log_connections is enabled, the connection time is always
>> captured, and which might introduce performance overhead. No? Some users who enable
>> log_connections may not want this extra detail and want to avoid such overhead.
>> So, would it make sense to extend log_connections with a new option like "timing" and
>> log the connection time only when "timing" is specified?
> 
> +1, I also think it's a good idea to let users decide if they want the timing
> measurement overhead (and it's common practice with track_io_timing,
> track_wal_io_timing, the newly track_cost_delay_timing for example)
> 

track_connection_delay_timing ? I'm fine with this, but I'm a bit afraid 
that it will lead us to an awful lot of GUCs for simple things.

>> +                ereport(LOG,
>> +                        errmsg("backend ready for query. pid=%d. socket=%d. connection establishment times (ms):
total:%ld, fork: %ld, authentication: %ld",
 
>> +                               MyProcPid,
>> +                               (int) MyClientSocket->sock,
>>
>> Why expose the socket's file descriptor? I'm not sure how users would use that information.
>>
>>
>> Including the PID seems unnecessary since it's already available via log_line_prefix with %p?
> 
> Yeah, we would get things like:
> 
> [1111539] LOG:  connection received: host=[local]
> [1111539] LOG:  connection authenticated: user="postgres" method=trust
(/home/postgres/postgresql/pg_installed/pg18/data/pg_hba.conf:117)
> [1111539] LOG:  connection authorized: user=postgres database=postgres application_name=psql
> [1111539] LOG:  backend ready for query. pid=1111539. socket=9. connection establishment times (ms): total: 2, fork:
0,authentication: 0
 
> 
> I also wonder if "backend ready for query" is worth it. Maybe something like:
> 
> 2025-02-26 06:44:23.265 UTC [1111539] LOG:  connection establishment times (ms): total: 2, fork: 0, authentication:
0
> 
> would be good enough?
> 

Sounds definitely better to me.

> A few random comments:
> 
> === 1
> 
> +typedef struct ConnectionTiming
> +{
> +       instr_time      fork_duration;
> +       instr_time      auth_duration;
> +} ConnectionTiming;
> 
> As it's all about instr_time, I wonder if we could use an enum + array instead.
> That's probably just a matter of taste but that sounds more flexible to extend
> (should we want to add more timing in the future).
> 
> === 2
> 
> +ConnectionTiming conn_timing = {0};
> 
> There is no padding in ConnectionTiming and anyway we just access its fields
> so that's ok to initialize that way.
> 
> === 3
> 
> Add a few words in the log_connections GUC doc? (anyway we will have to if
> Fujii-san idea above about the timing is implemented)
> 
> === 4
> 
> +               /* Calculate total fork duration in child backend for logging */
> +               if (child_type == B_BACKEND)
> +               {
> +                       INSTR_TIME_SET_CURRENT(conn_timing.fork_duration);
> +                       INSTR_TIME_SUBTRACT(conn_timing.fork_duration,
> +                                                               ((BackendStartupData *) startup_data)->fork_time);
> +               }
> +
>                  /* Close the postmaster's sockets */
>                  ClosePostmasterPorts(child_type == B_LOGGER);
> 
> @@ -618,6 +630,14 @@ SubPostmasterMain(int argc, char *argv[])
>          /* Read in the variables file */
>          read_backend_variables(argv[2], &startup_data, &startup_data_len);
> 
> +       /* Calculate total fork duration in child backend for logging */
> +       if (child_type == B_BACKEND)
> +       {
> +               INSTR_TIME_SET_CURRENT(conn_timing.fork_duration);
> +               INSTR_TIME_SUBTRACT(conn_timing.fork_duration,
> +                               ((BackendStartupData *) startup_data)->fork_time);
> +       }
> 
> worth to add a helper function to avoid code duplication?
> 
> Regards,
> 


-- 
Guillaume Lelarge
Consultant
https://dalibo.com