Re: Log connection establishment timings - Mailing list pgsql-hackers
From | Bertrand Drouvot |
---|---|
Subject | Re: Log connection establishment timings |
Date | |
Msg-id | Z8ALQSr1o0AoGpAo@ip-10-97-1-34.eu-west-3.compute.internal Whole thread Raw |
In response to | Re: Log connection establishment timings (Melanie Plageman <melanieplageman@gmail.com>) |
List | pgsql-hackers |
Hi, On Wed, Feb 26, 2025 at 01:45:39PM -0500, Melanie Plageman wrote: > Thanks for the continued review! > > On Wed, Feb 26, 2025 at 2:41 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Wed, Feb 26, 2025 at 01:46:19PM +0900, Fujii Masao wrote: > > > > > > With the current patch, when log_connections is enabled, the connection time is always > > > captured, and which might introduce performance overhead. No? Some users who enable > > > log_connections may not want this extra detail and want to avoid such overhead. > > > So, would it make sense to extend log_connections with a new option like "timing" and > > > log the connection time only when "timing" is specified? > > > > +1, I also think it's a good idea to let users decide if they want the timing > > measurement overhead (and it's common practice with track_io_timing, > > track_wal_io_timing, the newly track_cost_delay_timing for example) > > It seems to me like the extra timing collected and the one additional > log message isn't enough overhead to justify its own guc (for now). Agree. IIUC, I think that Fujii-san's idea was to extend log_connections with a new option "timing" (i.e move it from ConfigureNamesBool to say ConfigureNamesEnum with say on, off and timing?). I think that's a good idea. I just did a quick check and changing a GUC from ConfigureNamesBool to ConfigureNamesEnum is something that has already been done in the past (see 240067b3b0f and ffd37740ee6 for example). In my previous up-thead message, I did not mean to suggest to add a new GUC, just saying that when new "timing" is measured then users have the choice to enable or disable it. > > > Including the PID seems unnecessary since it's already available via log_line_prefix with %p? > > > > Yeah, we would get things like: > > > > [1111539] LOG: connection received: host=[local] > > [1111539] LOG: connection authenticated: user="postgres" method=trust (/home/postgres/postgresql/pg_installed/pg18/data/pg_hba.conf:117) > > [1111539] LOG: connection authorized: user=postgres database=postgres application_name=psql > > [1111539] LOG: backend ready for query. pid=1111539. socket=9. connection establishment times (ms): total: 2, fork:0, authentication: 0 > > > > I also wonder if "backend ready for query" is worth it. Maybe something like: > > > > 2025-02-26 06:44:23.265 UTC [1111539] LOG: connection establishment times (ms): total: 2, fork: 0, authentication: 0 > > > > would be good enough? > > Yes, thank you. v5 attached in [1] changes the wording as you recommend.. Thanks for the updated version! > > +typedef struct ConnectionTiming > > +{ > > + instr_time fork_duration; > > + instr_time auth_duration; > > +} ConnectionTiming; > > > > As it's all about instr_time, I wonder if we could use an enum + array instead. > > That's probably just a matter of taste but that sounds more flexible to extend > > (should we want to add more timing in the future). > > I think we can change it later if we add many more. For now I prefer > the clarity of accessing members by name. Especially because we don't > have any code yet that loops through all of them or anything like > that. Yeah makes sense. > > Add a few words in the log_connections GUC doc? (anyway we will have to if > > Fujii-san idea above about the timing is implemented) > > I took a stab at this in attached v6. I feel that what I have is a bit > stilted, but I'm not sure how to fix it. Yeah, that might be easier to reason about if we're going with Fujii-san suggestion to extend log_connections with a new option? > > + /* Calculate total fork duration in child backend for logging */ > > + if (child_type == B_BACKEND) > > + { > > + INSTR_TIME_SET_CURRENT(conn_timing.fork_duration); > > + INSTR_TIME_SUBTRACT(conn_timing.fork_duration, > > + ((BackendStartupData *) startup_data)->fork_time); > > + } > > + > > /* Close the postmaster's sockets */ > > ClosePostmasterPorts(child_type == B_LOGGER); > > > > @@ -618,6 +630,14 @@ SubPostmasterMain(int argc, char *argv[]) > > /* Read in the variables file */ > > read_backend_variables(argv[2], &startup_data, &startup_data_len); > > > > + /* Calculate total fork duration in child backend for logging */ > > + if (child_type == B_BACKEND) > > + { > > + INSTR_TIME_SET_CURRENT(conn_timing.fork_duration); > > + INSTR_TIME_SUBTRACT(conn_timing.fork_duration, > > + ((BackendStartupData *) startup_data)->fork_time); > > + } > > > > worth to add a helper function to avoid code duplication? > > I've added INSTR_TIME_GET_DURATION_SINCE(start_time). Which I like > because it seems generally useful. Great idea! Could probably be used in other places but I did not check and it's outside the scope of this thread anyway. > It does not however cut down on LOC, so I'm somewhat on the fence. I think that's somehow also around code maintenance (not only LOC), say for example if we want to add more "child_type" in the check (no need to remember to update both locations). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: