Thread: Introduce "log_connection_stages" setting.
Hello, The problem we face is excessive logging of connection information that clutters the logs and in corner cases with many short-lived connections leads to disk space exhaustion. Current connection log lines share significant parts of the information - host, port, very close timestamps etc. - in the common case of a successful connection: 2022-07-12 12:17:39.369 UTC,,,12875,"10.2.101.63:35616",62cd6663.324b,1,"",2022-07-12 12:17:39 UTC,,0,LOG,00000,"connection received: host=10.2.101.63 port=35616",,,,,,,,,"","not initialized" 2022-07-12 12:17:39.374 UTC,"some_user","postgres",12875,"10.2.101.63:35616",62cd6663.324b,2,"authentication",2022-07-12 12:17:39 UTC,18/4571,0,LOG,00000,"connection authorized: user=some_user database=postgres SSL enabled (protocol=, cipher=, compression=)",,,,,,,,,"","client backend" 2022-07-12 12:17:39.430 UTC,"some_user","postgres",12875,"10.2.101.63:35616",62cd6663.324b,3,"idle",2022-07-12 12:17:39 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.060 user=some_user database=postgres host=10.2.101.63 port=35616",,,,,,,,,"","client backend" Removing some of the lines should not harm log-based investigations in most cases, but will shrink the logs improving readability and space consumption. I would like to get feedback on the following idea: Add the `log_connection_stages` setting of type “string” with possible values "received", "authorized", "authenticated", "disconnected", and "all", with "all" being the default. The setting would have effect only when `log_connections` is on. Example: log_connection_stages=’authorized,disconnected’. That also implies there would be no need for a separate "log_disconnection" setting. For the sake of completeness I have to mention omitting ‘received’ from `log_connection_stages` would lead to absence in logs info about connections that do not complete initialization: for them only the “connection received” line is currently logged. The attachment contains a log excerpt to clarify the situation I am talking about. Regards, Sergey.
Attachment
On Tue, Jul 12, 2022, at 10:52 AM, Sergey Dudoladov wrote:
The problem we face is excessive logging of connection informationthat clutters the logs and in corner cases with many short-livedconnections leads to disk space exhaustion.
You are proposing a fine-grained control over connection stages reported when
log_connections = on. It seems useful if you're only interested in (a)
malicious access or (b) authorized access (for audit purposes).
I would like to get feedback on the following idea:Add the `log_connection_stages` setting of type “string” with possiblevalues "received", "authorized", "authenticated", "disconnected", and"all", with "all" being the default.The setting would have effect only when `log_connections` is on.Example: log_connection_stages=’authorized,disconnected’.That also implies there would be no need for a separate"log_disconnection" setting.
Your proposal will add more confusion to the already-confused logging-related
GUCs. If you are proposing to introduce a fine-grained control, the first step
should be merge log_connections and log_disconnections into a new GUC (?) and
deprecate them. (I wouldn't introduce a new GUC that depends on the stage of
other GUC as you proposed.) There are 3 stages: connect (received), authorized
(authenticated), disconnect. You can also add 'all' that mimics log_connections
/ log_disconnections enabled. Another question is if we can reuse
log_connections for this improvement instead of a new GUC. In this case, you
would turn the boolean value into an enum value. Will it cause trouble while
upgrading to this new version? It is one of the reasons to create a new GUC. I
would suggest log_connection_messages or log_connection (with the 's' in the
end -- since it is too similar to the current GUC name, I'm afraid it is not a
good name for it).
Hello, Thank you for the constructive feedback. > Your proposal will add more confusion to the already-confused logging-related GUCs. > I wouldn't introduce a new GUC that depends on the stage of other GUC as you proposed. Agreed, coupling a new GUC with "log_connections" is likely to lead to extra confusion. > There are 3 stages: connect (received), authorized (authenticated), disconnect. I've taken connection stages and terminology from the existing log messages. The reason I have separated "authorized" and "authenticated" are [1] and [2] usages of "log_connections"; "received" is mentioned at [3]. >> Example: log_connection_stages=’authorized,disconnected’. > would turn the boolean value into an enum value I have thought about enums too, but we need to cover arbitrary combinations of message types, for example log only "received" and "disconnected". Hence the proposed type "string" with individual values within the string really drawn from an enum. > merge log_connections and log_disconnections into a new GUC (?) and deprecate them. Are there any specific deprecation guidelines ? I have not found any after a quick search for GUC deprecation in Google and commit history. A deprecation scheme could look like that: 1. Mention in the docs "log_(dis)connections" are deprecated in favor of "log_connection_stages" 2. Map "log_(dis)connections" to relevant values of "log_connection_stages" in code if the latter is unset. 3. Complain in the logs if a conflict arises between the old params and "log_connection_stages", with "log_connection_stages" taking the precedence. Regards, Sergey [1] https://github.com/postgres/postgres/blob/3f8148c256e067dc2e8929ed174671ba7dc3339c/src/backend/utils/init/postinit.c#L257-L262 [2] https://github.com/postgres/postgres/blob/02c408e21a6e78ff246ea7a1beb4669634fa9c4c/src/backend/libpq/auth.c#L372 [3] https://github.com/postgres/postgres/blob/master/src/backend/postmaster/postmaster.c#L4393
On Thu, Jul 14, 2022, at 8:20 AM, Sergey Dudoladov wrote:
I've taken connection stages and terminology from the existing log messages.The reason I have separated "authorized" and "authenticated" are [1]and [2] usages of "log_connections";"received" is mentioned at [3].
After checking the commit 9afffcb833d, I agree that you need 4 stages:
connected, authorized, authenticated, and disconnected.
I have thought about enums too, but we need to cover arbitrarycombinations of message types, for example log only "received" and"disconnected".Hence the proposed type "string" with individual values within thestring really drawn from an enum.
Ooops. I said enum but I meant string list.
Are there any specific deprecation guidelines ? I have not found anyafter a quick search for GUC deprecation in Google and commit history.A deprecation scheme could look like that:1. Mention in the docs "log_(dis)connections" are deprecated in favorof "log_connection_stages"2. Map "log_(dis)connections" to relevant values of"log_connection_stages" in code if the latter is unset.3. Complain in the logs if a conflict arises between the old paramsand "log_connection_stages", with "log_connection_stages"taking the precedence.
No. AFAICS in this case, you just remove log_connections and log_disconnections
and create the new one (see for example the commit 88e98230268 that replace
checkpoint_segments with min_wal_size and max_wal_size). We don't generally
keep ConfigureNames* entries for deprecated GUCs. Unless you are renaming a GUC
-- see map_old_guc_names; that's not the case. When we remove a GUC, we are
introducing an incompatibility so the only place it will be mentioned is the
release notes (there is a section called "Migration to Version X" that lists
all incompatibilities). From the developer's point of view, you only need to
mention in the commit message that this commit is introducing an
incompatibility. Hence, when it is time to write the release notes, the
information about the removal and the new replacement will be added.
Hi hackers, I've sketched an initial patch version; feedback is welcome. Regards, Sergey Dudoladov
Attachment
On Tue, Nov 08, 2022 at 07:30:38PM +0100, Sergey Dudoladov wrote: > + <entry>Logs reception of a connection. At this point a connection has been received, but no further work has beendone: receipt > + <entry>Logs the original identity that an authentication method employs to identify a user. In most cases, theidentity string equals the PostgreSQL username, s/equals/matches > +/* check_hook: validate new log_connection_messages value */ > +bool > +check_log_connection_messages(char **newval, void **extra, GucSource source) > +{ > + char *rawname; > + List *namelist; > + ListCell *l; > + char *log_connection_messages = *newval; > + bool *myextra; > + > + /* > + * Set up the "extra" struct actually used by assign_log_connection_messages. > + */ > + myextra = (bool *) guc_malloc(LOG, 4 * sizeof(bool)); This function hardcodes each of the 4 connections: > + if (pg_strcasecmp(stage, "received") == 0) > + myextra[0] = true; It'd be better to use #defines or enums for these. > --- a/src/backend/tcop/postgres.c > +++ b/src/backend/tcop/postgres.c > @@ -84,8 +84,11 @@ const char *debug_query_string; /* client-supplied query string */ > /* Note: whereToSendOutput is initialized for the bootstrap/standalone case */ > CommandDest whereToSendOutput = DestDebug; > > -/* flag for logging end of session */ > -bool Log_disconnections = false; > +/* flags for logging information about session state */ > +bool Log_disconnected = false; > +bool Log_authenticated = false; > +bool Log_authorized = false; > +bool Log_received = false; I think this ought to be an integer with flag bits, rather than 4 booleans (I don't know, but there might be more later?). Then, the implementation follows the user-facing GUC and also follows log_destination. -- Justin
On Thu, Nov 17, 2022 at 09:36:29AM -0600, Justin Pryzby wrote: > On Tue, Nov 08, 2022 at 07:30:38PM +0100, Sergey Dudoladov wrote: ... Also (I didn't realize there was a CF entry), this is failing tests. http://cfbot.cputube.org/sergey-dudoladov.html -- Justin
Hi, +1 for the idea! > + <entry><literal>authenticated</literal></entry> > + <entry>Logs the original identity that an authentication method employs to identify a user. In most cases, theidentity string equals the PostgreSQL username, > + but some third-party authentication methods may alter the original user identifier before the server stores it.Failed authentication is always logged regardless of the value of this setting.</entry> I think the documentation needs to be rewrapped; those are very long lines. On 11/17/22 07:36, Justin Pryzby wrote: > This function hardcodes each of the 4 connections: > >> + if (pg_strcasecmp(stage, "received") == 0) >> + myextra[0] = true; > > It'd be better to use #defines or enums for these. Hardcoding seems reasonable to me, if this is the only place we're doing string comparison. >> --- a/src/backend/tcop/postgres.c >> +++ b/src/backend/tcop/postgres.c >> @@ -84,8 +84,11 @@ const char *debug_query_string; /* client-supplied query string */ >> /* Note: whereToSendOutput is initialized for the bootstrap/standalone case */ >> CommandDest whereToSendOutput = DestDebug; >> >> -/* flag for logging end of session */ >> -bool Log_disconnections = false; >> +/* flags for logging information about session state */ >> +bool Log_disconnected = false; >> +bool Log_authenticated = false; >> +bool Log_authorized = false; >> +bool Log_received = false; > > I think this ought to be an integer with flag bits, rather than 4 > booleans (I don't know, but there might be more later?). Then, the > implementation follows the user-facing GUC and also follows > log_destination. Agreed. Or at the very least, follow what's done with wal_consistency_checking? But I think flag bits would be better. The tests should be expanded for cases other than 'all'. As to the failing test cases: it looks like there's a keyword issue with ALTER SYSTEM and 'all', but trying to fix it by quoting also fails. I think it's because of GUC_LIST_QUOTE -- is there a reason that's used here? I don't think we'd need any special characters in future option names. wal_consistency_checking is very similar, and it just uses GUC_LIST_INPUT. Thanks, --Jacob
Hello, hackers Thank you for the reviews. I've modified the patch to incorporate your suggestions: + flag bits are now used to encode different connection stages + failing tests are now fixed. It was not a keyword issue but rather "check_log_connection_messages" not allocating memory properly in the special case log_connection_messages = 'all' + the GUC option is now only GUC_LIST_INPUT + typo fixes and line rewrapping in the docs Regards, Sergey
Attachment
Thanks for updating the patch. It's currently failing check-world, due to a test that was added on January 23 (a9dc7f941): http://cfbot.cputube.org/sergey-dudoladov.html [19:15:57.101] Summary of Failures: [19:15:57.101] [19:15:57.101] 250/251 postgresql:ldap / ldap/002_bindpasswd ERROR 1.38s 2023-01-30 19:15:52.427 GMT [56038] LOG: unrecognized configuration parameter "log_connections" in file "/tmp/cirrus-ci-build/build/testrun/ldap/002_bindpasswd/data/t_002_bindpasswd_node_data/pgdata/postgresql.conf"line 839 > + received, but no further work has been done: Postgres is about to start say "PostgreSQL" to match the rest of the docs. > + GUC_check_errmsg("Invalid value '%s'", stage); This message shouldn't be uppercased. > + GUC_check_errdetail("Valid values to use in the list are 'received', 'authenticated', 'authorized', 'disconnected',and 'all'." > + "If 'all' is present, it must be the only value in the list."); Maybe "all" should be first ? There's no spaces before "If": | 2023-01-31 00:17:48.906 GMT [5676] DETALLE: Valid values to use in the list are 'received', 'authenticated', 'authorized','disconnected', and 'all'.If 'all' is present, it must be the only value in the list. > +/* flags for logging information about session state */ > +int Log_connection_messages = LOG_CONNECTION_ALL; The initial value here is overwritten by the GUC default during startup. For consistency, the integer should be initialized to 0. > +extern PGDLLIMPORT int Log_connection_messages; > + > +/* Bitmap for logging connection messages */ > +#define LOG_CONNECTION_RECEIVED 1 > +#define LOG_CONNECTION_AUTHENTICATED 2 > +#define LOG_CONNECTION_AUTHORIZED 4 > +#define LOG_CONNECTION_DISCONNECTED 8 > +#define LOG_CONNECTION_ALL 15 Maybe the integers should be written like (1<<0).. And maybe ALL should be 0xffff ? More nitpicks: > + Causes stages of each attempted connection to the server to be logged. Example: <literal>authorized,disconnected</literal>. "Causes the specified stages of each connection attempt .." > + The default is the empty string meaning nothing is logged. ".. string, which means that nothing is logged" > + <entry>Logs the original identity that an authentication method employs > + to identify a user. In most cases, the identity string matches the ".. original identity used by an authentication method ..' -- Justin
Hi again, Justin, thank you for the fast review. The new version is attached. Regards, Sergey Dudoladov
Attachment
On 2/1/23 11:59, Sergey Dudoladov wrote: > Justin, thank you for the fast review. The new version is attached. This is looking very good. One bigger comment: > + myextra = (int *) guc_malloc(ERROR, sizeof(int)); > + *myextra = newlogconnect; If I've understood Tom correctly in [1], both of these guc_mallocs should be using a loglevel less than ERROR, to avoid forcing a postmaster exit when out of memory. (I used WARNING in that thread instead, which seemed to be acceptable.) And a couple of nitpicks: > + Causes the specified stages of each connection attempt to the server to be logged. Example: <literal>authorized,disconnected</literal>. Long line; should be rewrapped. > + else { > + GUC_check_errcode(ERRCODE_INVALID_PARAMETER_VALUE); > + GUC_check_errmsg("invalid value '%s'", stage); > + GUC_check_errdetail("Valid values to use in the list are 'all', 'received', 'authenticate > + " If 'all' is present, it must be the only value in the list."); I think the errmsg here should reuse the standard message format invalid value for parameter "%s": "%s" both for consistency and ease of translation. Thanks! --Jacob [1] https://www.postgresql.org/message-id/2012342.1658356951%40sss.pgh.pa.us
Jacob Champion <jchampion@timescale.com> writes: > This is looking very good. One bigger comment: >> + myextra = (int *) guc_malloc(ERROR, sizeof(int)); >> + *myextra = newlogconnect; > If I've understood Tom correctly in [1], both of these guc_mallocs > should be using a loglevel less than ERROR, to avoid forcing a > postmaster exit when out of memory. (I used WARNING in that thread > instead, which seemed to be acceptable.) Actually, preferred practice is as seen in e.g. check_datestyle: myextra = (int *) guc_malloc(LOG, 2 * sizeof(int)); if (!myextra) return false; myextra[0] = newDateStyle; myextra[1] = newDateOrder; *extra = (void *) myextra; which gives the guc.c functions an opportunity to manage the failure. A quick grep shows that there are existing check functions that did not get that memo, e.g. check_recovery_target_lsn. We ought to clean them up. This is, of course, not super important unless you're allocating something quite large; the odds of going OOM in the postmaster should be pretty small. regards, tom lane
On 3/2/23 14:56, Tom Lane wrote: > Jacob Champion <jchampion@timescale.com> writes: >> If I've understood Tom correctly in [1], both of these guc_mallocs >> should be using a loglevel less than ERROR, to avoid forcing a >> postmaster exit when out of memory. (I used WARNING in that thread >> instead, which seemed to be acceptable.) > > Actually, preferred practice is as seen in e.g. check_datestyle: > > myextra = (int *) guc_malloc(LOG, 2 * sizeof(int)); > if (!myextra) > return false; > myextra[0] = newDateStyle; > myextra[1] = newDateOrder; > *extra = (void *) myextra; > > which gives the guc.c functions an opportunity to manage the > failure. Ah, thanks for the correction. (My guc_strdup(WARNING, ...) calls may need to be cleaned up too, then.) --Jacob
Hello, I have attached the fourth version of the patch. Regards, Sergey.
Attachment
> On 16 May 2023, at 20:51, Sergey Dudoladov <sergey.dudoladov@gmail.com> wrote: > I have attached the fourth version of the patch. This version fails the ldap_password test on all platforms on pg_ctl failing to start: [14:48:10.544] --- stdout --- [14:48:10.544] # executing test in /tmp/cirrus-ci-build/build/testrun/ldap_password_func/001_mutated_bindpasswd group ldap_password_functest 001_mutated_bindpasswd [14:48:10.544] # waiting for slapd to accept requests... [14:48:10.544] # setting up PostgreSQL instance [14:48:10.544] Bail out! pg_ctl start failed [14:48:10.544] # test failed Updating src/test/modules/ldap_password_func/t/001_mutated_bindpasswd.pl with the new GUC might solve the problem from skimming this. Please send a fixed version, I'm marking the patch Waiting on Author in the meantime. -- Daniel Gustafsson
> On 3 Jul 2023, at 15:57, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 16 May 2023, at 20:51, Sergey Dudoladov <sergey.dudoladov@gmail.com> wrote: > >> I have attached the fourth version of the patch. > > This version fails the ldap_password test on all platforms on pg_ctl failing to start: > > [14:48:10.544] --- stdout --- > [14:48:10.544] # executing test in /tmp/cirrus-ci-build/build/testrun/ldap_password_func/001_mutated_bindpasswd group ldap_password_functest 001_mutated_bindpasswd > [14:48:10.544] # waiting for slapd to accept requests... > [14:48:10.544] # setting up PostgreSQL instance > [14:48:10.544] Bail out! pg_ctl start failed > [14:48:10.544] # test failed > > Updating src/test/modules/ldap_password_func/t/001_mutated_bindpasswd.pl with > the new GUC might solve the problem from skimming this. > > Please send a fixed version, I'm marking the patch Waiting on Author in the > meantime. With the patch failing tests and the thread stalled with no update I am marking this returned with feedback. Please feel free to resubmit to a future CF. -- Daniel Gustafsson