Re: Log connection establishment timings - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Log connection establishment timings
Date
Msg-id Z8ALQSr1o0AoGpAo@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
In response to Re: Log connection establishment timings  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
Hi,

On Wed, Feb 26, 2025 at 01:45:39PM -0500, Melanie Plageman wrote:
> Thanks for the continued review!
> 
> On Wed, Feb 26, 2025 at 2:41 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > On Wed, Feb 26, 2025 at 01:46:19PM +0900, Fujii Masao wrote:
> > >
> > > With the current patch, when log_connections is enabled, the connection time is always
> > > captured, and which might introduce performance overhead. No? Some users who enable
> > > log_connections may not want this extra detail and want to avoid such overhead.
> > > So, would it make sense to extend log_connections with a new option like "timing" and
> > > log the connection time only when "timing" is specified?
> >
> > +1, I also think it's a good idea to let users decide if they want the timing
> > measurement overhead (and it's common practice with track_io_timing,
> > track_wal_io_timing, the newly track_cost_delay_timing for example)
> 
> It seems to me like the extra timing collected and the one additional
> log message isn't enough overhead to justify its own guc (for now).

Agree. IIUC, I think that Fujii-san's idea was to extend log_connections with
a new option "timing" (i.e move it from ConfigureNamesBool to say
ConfigureNamesEnum with say on, off and timing?). I think that's a good idea.

I just did a quick check and changing a GUC from ConfigureNamesBool to 
ConfigureNamesEnum is something that has already been done in the past (see
240067b3b0f and ffd37740ee6 for example).

In my previous up-thead message, I did not mean to suggest to add a new GUC,
just saying that when new "timing" is measured then users have the choice to
enable or disable it.

> > > Including the PID seems unnecessary since it's already available via log_line_prefix with %p?
> >
> > Yeah, we would get things like:
> >
> > [1111539] LOG:  connection received: host=[local]
> > [1111539] LOG:  connection authenticated: user="postgres" method=trust
(/home/postgres/postgresql/pg_installed/pg18/data/pg_hba.conf:117)
> > [1111539] LOG:  connection authorized: user=postgres database=postgres application_name=psql
> > [1111539] LOG:  backend ready for query. pid=1111539. socket=9. connection establishment times (ms): total: 2,
fork:0, authentication: 0
 
> >
> > I also wonder if "backend ready for query" is worth it. Maybe something like:
> >
> > 2025-02-26 06:44:23.265 UTC [1111539] LOG:  connection establishment times (ms): total: 2, fork: 0, authentication:
0
> >
> > would be good enough?
> 
> Yes, thank you. v5 attached in [1] changes the wording as you recommend..

Thanks for the updated version!

> > +typedef struct ConnectionTiming
> > +{
> > +       instr_time      fork_duration;
> > +       instr_time      auth_duration;
> > +} ConnectionTiming;
> >
> > As it's all about instr_time, I wonder if we could use an enum + array instead.
> > That's probably just a matter of taste but that sounds more flexible to extend
> > (should we want to add more timing in the future).
> 
> I think we can change it later if we add many more. For now I prefer
> the clarity of accessing members by name. Especially because we don't
> have any code yet that loops through all of them or anything like
> that.

Yeah makes sense.

> > Add a few words in the log_connections GUC doc? (anyway we will have to if
> > Fujii-san idea above about the timing is implemented)
> 
> I took a stab at this in attached v6. I feel that what I have is a bit
> stilted, but I'm not sure how to fix it.

Yeah, that might be easier to reason about if we're going with Fujii-san
suggestion to extend log_connections with a new option?

> > +               /* Calculate total fork duration in child backend for logging */
> > +               if (child_type == B_BACKEND)
> > +               {
> > +                       INSTR_TIME_SET_CURRENT(conn_timing.fork_duration);
> > +                       INSTR_TIME_SUBTRACT(conn_timing.fork_duration,
> > +                                                               ((BackendStartupData *) startup_data)->fork_time);
> > +               }
> > +
> >                 /* Close the postmaster's sockets */
> >                 ClosePostmasterPorts(child_type == B_LOGGER);
> >
> > @@ -618,6 +630,14 @@ SubPostmasterMain(int argc, char *argv[])
> >         /* Read in the variables file */
> >         read_backend_variables(argv[2], &startup_data, &startup_data_len);
> >
> > +       /* Calculate total fork duration in child backend for logging */
> > +       if (child_type == B_BACKEND)
> > +       {
> > +               INSTR_TIME_SET_CURRENT(conn_timing.fork_duration);
> > +               INSTR_TIME_SUBTRACT(conn_timing.fork_duration,
> > +                               ((BackendStartupData *) startup_data)->fork_time);
> > +       }
> >
> > worth to add a helper function to avoid code duplication?
> 
> I've added INSTR_TIME_GET_DURATION_SINCE(start_time). Which I like
> because it seems generally useful.

Great idea! Could probably be used in other places but I did not check and
it's outside the scope of this thread anyway.

> It does not however cut down on LOC, so I'm somewhat on the fence.

I think that's somehow also around code maintenance (not only LOC), say for example
if we want to add more "child_type" in the check (no need to remember to update both
locations).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Yuki Seino
Date:
Subject: Re: Add “FOR UPDATE NOWAIT” lock details to the log.
Next
From: Amul Sul
Date:
Subject: Re: NOT ENFORCED constraint feature