Thread: Logging which interface was connected to in log_line_prefix
Someone on -general was asking about this, as they are listening on multiple IPs and would like to know which exact one clients were hitting. I took a quick look and we already have that information, so I grabbed some stuff from inet_server_addr and added it as part of a "%L" (for 'local interface'). Quick patch / POC attached.
Cheers,
Greg
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed Hi I did a quick test on this patch and it seems to work as expected. Originally I thought the patch would add the name of "localinterface" such as "eth0", "eth1", "lo"... etc as %L log prefix format. Instead, it formats the local interface IPaddresses , but I think it is fine too. I have tested this new addition with various types of IPs including IPv4, IPv4 and IPv6 local loop back addresses, globalIPv6 address, linked local IPv6 address with interface specifier, it seems to format these IPs correctly There is a comment in the patch that states: /* We do not need clean_ipv6_addr here: just report verbatim */ I am not quite sure what it means, but I am guessing it means that the patch does not need to format the IPv6 addresses inany specific way. For example, removing leading zeros or compressing consecutive zeros to make a IPv6 address shorter.It may not be necessary to indicate this in a comment because In my test, if any of my interface's IPv6 address haveconsecutive zeroes like this: 2000:0000:0000:0000:0000:0000:200:cafe/64, my network driver (Ubuntu 18.04) will formatit as 2000::200:cafe, and the patch of course will read it as 2000::200:cafe, which is ... correct and clean. thank you Cary Huang www.highgo.ca
Thank you for taking the time to review this. I've attached a new rebased version, which has no significant changes.
There is a comment in the patch that states:
/* We do not need clean_ipv6_addr here: just report verbatim */
I am not quite sure what it means, but I am guessing it means that the patch does not need to format the IPv6 addresses in any specific way.
Yes, basically correct. There is a kluge (their word, not mine) in utils/adt/network.c to strip the zone - see the comment for the clean_ipv6_addr() function in that file. I added the patch comment in case some future person wonders why we don't "clean up" the ipv6 address, like other places in the code base do. We don't need to pass it back to anything else, so we can simply output the correct version, zone and all.
Cheers,
Greg
Attachment
On 06.03.24 16:59, Greg Sabino Mullane wrote: > Someone on -general was asking about this, as they are listening on > multiple IPs and would like to know which exact one clients were > hitting. I took a quick look and we already have that information, so I > grabbed some stuff from inet_server_addr and added it as part of a "%L" > (for 'local interface'). Quick patch / POC attached. I was confused by this patch title. This feature does not log the interface (like "eth0" or "lo"), but the local address. Please adjust the terminology. I noticed that for Unix-domain socket connections, %r and %h write "[local]". I think that should be done for this new placeholder as well.
On 01.05.24 19:04, Greg Sabino Mullane wrote: > Thank you for taking the time to review this. I've attached a new > rebased version, which has no significant changes. > > There is a comment in the patch that states: > > /* We do not need clean_ipv6_addr here: just report verbatim */ > > I am not quite sure what it means, but I am guessing it means that > the patch does not need to format the IPv6 addresses in any specific > way. > > > Yes, basically correct. There is a kluge (their word, not mine) in > utils/adt/network.c to strip the zone - see the comment for the > clean_ipv6_addr() function in that file. I added the patch comment in > case some future person wonders why we don't "clean up" the ipv6 > address, like other places in the code base do. We don't need to pass it > back to anything else, so we can simply output the correct version, zone > and all. clean_ipv6_addr() needs to be called before trying to convert a string representation into inet/cidr types. This is not what is happening here. So the comment is not applicable.
Peter, thank you for the feedback. Attached is a new patch with "address" rather than "interface", plus a new default of "local" if there is no address. I also removed the questionable comment, and updated the commitfest title.
Cheers,
Greg
Attachment
On 5/24/24 22:33, Greg Sabino Mullane wrote: > Peter, thank you for the feedback. Attached is a new patch with > "address" rather than "interface", plus a new default of "local" if > there is no address. I also removed the questionable comment, and > updated the commitfest title. I tried the updated patch and it behaved as expected with [local] being logged for peer connections and an IP being logged for host connections. One thing -- the changes in postgresql.conf.sample should use tabs to match the other lines. The patch uses spaces. I also find the formatting in log_status_format() pretty awkward but I guess that will be handled by pgindent. Regards, -David
Thanks for the review. Please find attached a new version with proper tabs and indenting.
Cheers,
Greg
Attachment
On 7/11/24 23:09, Greg Sabino Mullane wrote: > Thanks for the review. Please find attached a new version with proper > tabs and indenting. This looks good to me now. +1 overall for the feature. Regards, -David
Hi Greg On 11.07.24 18:09, Greg Sabino Mullane wrote: > Thanks for the review. Please find attached a new version with proper > tabs and indenting. > > Cheers, > Greg > I'm testing this new log prefix and I'm wondering whether the following behaviour is expected. The value of '%L' is different in the following cases: postgres=# SHOW log_line_prefix; log_line_prefix ----------------- %m [%p] -> %L (1 row) -- postgres=# SELECT 1/0; 2024-11-18 16:00:42.720 CET [3135117] -> 192.168.178.27 ERROR: division by zero 2024-11-18 16:00:42.720 CET [3135117] -> 192.168.178.27 STATEMENT: SELECT 1/0; -- postgres=# SELECT pg_reload_conf(); 2024-11-18 16:01:23.273 CET [3114980] -> [local] LOG: received SIGHUP, reloading configuration files -- postgres=# CHECKPOINT; 2024-11-18 16:01:46.758 CET [3114981] -> [local] LOG: checkpoint starting: immediate force wait 2024-11-18 16:01:46.769 CET [3114981] -> [local] LOG: checkpoint complete: wrote 0 buffers (0.0%), wrote 0 SLRU buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.012 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=25924 kB; lsn=0/26166430, redo lsn=0/261663D8 Is it supposed to be like this? Thanks for the patch! -- Jim
On Mon, Nov 18, 2024 at 10:07 AM Jim Jones <jim.jones@uni-muenster.de> wrote:
2024-11-18 16:00:42.720 CET [3135117] -> 192.168.178.27 STATEMENT:
...
2024-11-18 16:01:23.273 CET [3114980] -> [local] LOG: received SIGHUP,
...
2024-11-18 16:01:46.769 CET [3114981] -> [local] LOG: checkpoint
Is it supposed to be like this?
Great question. I think "supposed to" is a bit of a stretch, but I presume it's the difference between a client connecting and using its connection information versus an already existing backend process, which is always going to be "local".
Overall this makes sense, as that checkpoint example above is coming from the checkpointer background process at 3114981, not the backend process that happened to trigger it. And 3114981 has no way of knowing the details of the caller's connection.
FWIW, the patch still applies cleanly to head as of 2/27/2025, so no rebase needed.
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
On 27.02.25 14:54, Greg Sabino Mullane wrote: > Great question. I think "supposed to" is a bit of a stretch, but I > presume it's the difference between a client connecting and using its > connection information versus an already existing backend process, > which is always going to be "local". > > Overall this makes sense, as that checkpoint example above is coming > from the checkpointer background process at 3114981, not the backend > process that happened to trigger it. And 3114981 has no way of knowing > the details of the caller's connection. > In that case, it LGTM. I revisited this patch and tested it with two different computers (for client and server). Initially, I was momentarily confused by the logged address format, which varies depending on the client's format. However, I found that %h behaves just like this, so I guess it is ok. postgres=# SHOW log_line_prefix; log_line_prefix ----------------------- %m [%p]: L=%L, h=%h (1 row) 2025-03-02 18:19:07.859 CET [2246150]: L=192.168.178.27, h=192.168.178.79 ERROR: division by zero 2025-03-02 18:19:07.859 CET [2246150]: L=192.168.178.27, h=192.168.178.79 STATEMENT: SELECT 1/0 2025-03-02 18:19:19.327 CET [2246291]: L=2a02:...:7591, h=2a02:...:4a7 ERROR: division by zero 2025-03-02 18:19:19.327 CET [2246291]: L=2a02:...:7591, h=2a02:...:4a7 STATEMENT: SELECT 1/0 Best, Jim
Jim Jones <jim.jones@uni-muenster.de> writes: > In that case, it LGTM. I looked at 0002 briefly. I don't have any particular objection to the proposed feature, but I'm quite concerned about the potential performance implications of doing a new pg_getnameinfo_all() call for every line of log output. I think that needs to be cached somehow. It's a little tricky since we may pass through this function one or more times before the socket info has been filled in, but it seems do-able. "Local address" doesn't convey a lot to my mind, and Jim's confusion about what "[local]" means seems tied to that. Can we think of a different explanation for the docs? Also, in %h we use "[local]" to signify a Unix socket, but this patch prints that for both the Unix-socket case and the case of no client connection at all. I think "[none]" or some such would be a better idea than "[local]" for background processes. The patch pays no attention to the "padding" feature that all the other switch cases honor. Also, the coding style is randomly unlike project style in various details, notably paren placement and the use of "0 == foo" to mean "!foo". regards, tom lane
Oh, one other thing: pg_getnameinfo_all is perfectly capable of dealing with a Unix-socket address, so I think you should drop the tests on ss_family and let pg_getnameinfo_all be in charge of producing "[local]" for that case. regards, tom lane
Thanks for all the feedback. Please find attached a version which prints "[none]" as the default value, "[local]" for a socket, and otherwise whatever pg_getnameinfo_all spits out. I cleaned up the coding, respected padding, removed the family checks, and expanded the docs a tiny bit to give the reader more context as to what "local address" means. I also looked into alternatives to the term "local address" but that still seems the most correct and commonly used term.
I have not attempted the caching change yet.
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
Attachment
Greg Sabino Mullane <htamfids@gmail.com> writes: > I have not attempted the caching change yet. After some thought I concluded that caching the local-address string in MyProcPort itself would be the most robust way of making that work. Otherwise you need some way to update the cache when MyProcPort is created (in case the process already emitted some messages before that), and the patch starts to spread into other places. I think 0004 attached is about committable, but there is one definitional point that is troubling me slightly: our choice to emit "[none]" when there's no port isn't consistent with the log_line_prefix documentation's statement that ... Some escapes are only recognized by session processes, and will be treated as empty by background processes such as the main server process. Since we've marked %L as "Session only" = yes, this implies that it should print as an empty string not "[none]". We could either 1. Ignore the inconsistency, commit 0004 as-is. 2. Change the output to be an empty string in background processes. This is consistent, but it goes against our upthread feeling that "[none]" would avoid confusion. 3. Mark %L as "Session only" = no. This seems a little weird, but it'd also be consistent. 4. Add something to the above-quoted text about %L being an exception. I don't really care for #3 or #4, but I'm ambivalent between #1 and #2. I think the worry about confusion originated when the patch would print "[local]" for either a Unix socket or a background process, and that certainly was confusing. "[local]" versus an empty string is not so ambiguous, so maybe it's fine. Thoughts? regards, tom lane diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index fea683cb49c..a8542fe41ce 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7689,6 +7689,12 @@ local0.* /var/log/postgresql <entry>Remote host name or IP address</entry> <entry>yes</entry> </row> + <row> + <entry><literal>%L</literal></entry> + <entry>Local address (the IP address on the server that the + client connected to)</entry> + <entry>yes</entry> + </row> <row> <entry><literal>%b</literal></entry> <entry>Backend type</entry> diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index 8a6b6905079..d6299633ab7 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -67,6 +67,7 @@ #endif #include "access/xact.h" +#include "common/ip.h" #include "libpq/libpq.h" #include "libpq/pqformat.h" #include "mb/pg_wchar.h" @@ -3084,6 +3085,38 @@ log_status_format(StringInfo buf, const char *format, ErrorData *edata) appendStringInfoSpaces(buf, padding > 0 ? padding : -padding); break; + case 'L': + { + const char *local_host; + + if (MyProcPort) + { + if (MyProcPort->local_host[0] == '\0') + { + /* + * First time through: cache the lookup, since it + * might not have trivial cost. + */ + (void) pg_getnameinfo_all(&MyProcPort->laddr.addr, + MyProcPort->laddr.salen, + MyProcPort->local_host, + sizeof(MyProcPort->local_host), + NULL, 0, + NI_NUMERICHOST | NI_NUMERICSERV); + } + local_host = MyProcPort->local_host; + } + else + { + /* Background process, or connection not yet made */ + local_host = "[none]"; + } + if (padding != 0) + appendStringInfo(buf, "%*s", padding, local_host); + else + appendStringInfoString(buf, local_host); + } + break; case 'r': if (MyProcPort && MyProcPort->remote_host) { diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index ff56a1f0732..f154906c2fa 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -603,6 +603,7 @@ # %d = database name # %r = remote host and port # %h = remote host + # %L = local address # %b = backend type # %p = process ID # %P = process ID of parallel group leader diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h index 0d1f1838f73..d6e671a6382 100644 --- a/src/include/libpq/libpq-be.h +++ b/src/include/libpq/libpq-be.h @@ -139,6 +139,9 @@ typedef struct Port int remote_hostname_errcode; /* see above */ char *remote_port; /* text rep of remote port */ + /* local_host is filled only if needed (see log_status_format) */ + char local_host[64]; /* ip addr of local socket for client conn */ + /* * Information that needs to be saved from the startup packet and passed * into backend execution. "char *" fields are NULL if not set.
On Sun, Apr 06, 2025 at 06:01:01PM -0400, Tom Lane wrote: > I don't really care for #3 or #4, but I'm ambivalent between #1 and > #2. I think the worry about confusion originated when the patch > would print "[local]" for either a Unix socket or a background > process, and that certainly was confusing. "[local]" versus > an empty string is not so ambiguous, so maybe it's fine. I'd suggest the addition of this data to csvlog.c and jsonlog.c, perhaps only adding this information if local_host[0] is not '\0' rather than assigning a default "[none]" all the time to save some space in the entries generated. config.sgml would also need a refresh: - runtime-config-logging-jsonlog-keys-values for the new key/value pair. - runtime-config-logging-csvlog for the CREATE TABLE and the list of columns. Perhaps it would be time to switch the list of columns to use a proper table instead of a raw list for the CSV docs, for clarity. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > I'd suggest the addition of this data to csvlog.c and jsonlog.c, > perhaps only adding this information if local_host[0] is not '\0' > rather than assigning a default "[none]" all the time to save some > space in the entries generated. I think that's completely impractical for csvlog: changing the row type for CSV output is a serious compatibility break, and there is exactly zero evidence that the local address info is worth that. JSON has less of a compatibility problem, but I still doubt that the local address info is worth the space it'd take, for about 99% of users. Also note that we have no good way for the user to specify what fields she wants in jsonlog, otherwise we could make it appear only if asked for. regards, tom lane
On Sun, Apr 6, 2025 at 6:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
1. Ignore the inconsistency, commit 0004 as-is.
2. Change the output to be an empty string in background processes.
This is consistent, but it goes against our upthread feeling that "[none]" would avoid confusion.
I lean for #1. Yes, there is some inconsistency, but it feels like the right thing to do, and this is a feature I suspect not many people will use anyway.
--
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
Greg Sabino Mullane <htamfids@gmail.com> writes: > On Sun, Apr 6, 2025 at 6:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> 1. Ignore the inconsistency, commit 0004 as-is. >> >> 2. Change the output to be an empty string in background processes. >> This is consistent, but it goes against our upthread feeling that >> "[none]" would avoid confusion. > I lean for #1. Yes, there is some inconsistency, but it feels like the > right thing to do, and this is a feature I suspect not many people will use > anyway. Hearing no other comments, I pushed 0004 as-is. It's trivial enough to replace "[none]" with "" if somebody makes the case for that. regards, tom lane
On Sun, Apr 6, 2025 at 8:50 PM Greg Sabino Mullane <htamfids@gmail.com> wrote: > I lean for #1. Yes, there is some inconsistency, but it feels like the right thing to do, and this is a feature I suspectnot many people will use anyway. I just saw the commit message here and thought I would show up to say that it sounds like a cool feature. I agree that not many people will use it, but when you need it, you're probably going to be really happy to have it. The only thing that makes me a little bit sad is that we don't seem to have added this to pg_stat_activity. I feel like that would be the dream here: in the rare situation where you're not sure which interface is connecting to a client, having this in pg_stat_activity would allow you to clarify the situation without needing to change your logging. Maybe it's not super-important; in a pinch, at least on Linux, you can probably use netstat to figure out what's happening. However, putting it in pg_stat_activity would, at least for me, would make the information routinely available without extra steps. If you have to ask a user to run an extra tool, they have to have it installed and know how to use it and run it at the same time they collect the pg_stat_activity data and so on; if you just add a column to pg_stat_activity then the information just shows up as part of routine data gathering. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > The only thing that makes me a little bit sad is that we don't seem to > have added this to pg_stat_activity. Hmm, that seems like it'd be a completely separate discussion. My main objection to the idea is that if we do that then everybody will pay the overhead for it, whether they use it or not. regards, tom lane
On Mon, Apr 7, 2025 at 11:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > The only thing that makes me a little bit sad is that we don't seem to > > have added this to pg_stat_activity. > > Hmm, that seems like it'd be a completely separate discussion. Yes, not something we should try to squeeze in right before freeze, just a thought! > My main objection to the idea is that if we do that then everybody > will pay the overhead for it, whether they use it or not. I was hoping that the overhead would be small enough that nobody would really care. -- Robert Haas EDB: http://www.enterprisedb.com