Thread: protocol support for labels
pgconf.dev is coming up soon. I won't be able to make it to Montreal, but I saw that Dave Cramer posted on twitter asking about postgres protocol topics. while I don't have a patch, I wanted to send an email about something: it'd be great to have a place - perhaps like application_name - for arbitrary labels; and we need protocol support to pass those labels inline with the query instead of having to require a separate round trip. it would also be good if these labels were somehow accessible in logging formats. observability frameworks like OpenTelemetry support tracing through all layers of a stack, and trace_ids can even be passed into sql with extensions like sqlcommenter. however sqlcommenter puts the trace_id into a comment which effectively breaks all the utility of pg_stat_statements since each query has a unique trace_id. if protocol enhancements are on the table, I think it would be great for this topic to get a little discussion. -Jeremy
On Tue, 11 Mar 2025 at 11:09, Jeremy Schneider <schneider@ardentperf.com> wrote: > observability frameworks like OpenTelemetry support tracing through all > layers of a stack, and trace_ids can even be passed into sql with > extensions like sqlcommenter. however sqlcommenter puts the trace_id > into a comment which effectively breaks all the utility of > pg_stat_statements since each query has a unique trace_id. > There are some other use-cases: 1) RO-RW routing. Users can pass target-session-attrs to the server within query labels to hint about its need. Useful when some kind of proxy (odyssey,pgbouncer,spqr,pgpool II, pgcat, pg_doorman) is used. 2) pg_comment_stats uses comments in the query to accumulate statistics. [0] However, I don't think PostgreSQL community is open for this big change for much (non-big) matters. [0] git@github.com:munakoiso/pg_comment_stats.git -- Best regards, Kirill Reshke
> On Mar 11, 2025, at 3:03 AM, Kirill Reshke <reshkekirill@gmail.com> wrote: > > On Tue, 11 Mar 2025 at 11:09, Jeremy Schneider <schneider@ardentperf.com> wrote: > >> observability frameworks like OpenTelemetry support tracing through all >> layers of a stack, and trace_ids can even be passed into sql with >> extensions like sqlcommenter. however sqlcommenter puts the trace_id >> into a comment which effectively breaks all the utility of >> pg_stat_statements since each query has a unique trace_id. >> > There are some other use-cases: > 1) RO-RW routing. Users can pass target-session-attrs to the server > within query labels to hint about its need. Useful when some kind of > proxy (odyssey,pgbouncer,spqr,pgpool II, pgcat, pg_doorman) is used. > 2) pg_comment_stats uses comments in the query to accumulate statistics. [0] Thinking a bit more, my root issue is specifically around pg_stat_statements so maybe it’s also solvable with some changesto how query jumbling is done But that topic seems like one where we’d never get consensus Should query jumbling for calculating query_id be customizable somehow? How would we resolve multiple concurrent opinionsabout how queries should be jumbled (eg if comment_stats needs different tweaks than sqlcommenter)? Was there previousdiscussion about this already? I’ll need to go search mailing list history a bit -Jeremy Sent from my TI-83
The usecase that I think might be useful is to have a database client send metadata along with a query.
This partially is possible today by setting application_name, but that is a separate request, it would be great if that could be sent along with the query in one go.
Another option to pass metadata is to add a comment (/* .. */), but a comment cannot be set for a prepared statement, because the statement is prepared first and then later invoked on runtime, which executes a query that is fixed.
Frits Hoogland
On 11 Mar 2025, at 15:49, Jeremy Schneider <schneider@ardentperf.com> wrote:On Mar 11, 2025, at 3:03 AM, Kirill Reshke <reshkekirill@gmail.com> wrote:
On Tue, 11 Mar 2025 at 11:09, Jeremy Schneider <schneider@ardentperf.com> wrote:observability frameworks like OpenTelemetry support tracing through allThere are some other use-cases:
layers of a stack, and trace_ids can even be passed into sql with
extensions like sqlcommenter. however sqlcommenter puts the trace_id
into a comment which effectively breaks all the utility of
pg_stat_statements since each query has a unique trace_id.
1) RO-RW routing. Users can pass target-session-attrs to the server
within query labels to hint about its need. Useful when some kind of
proxy (odyssey,pgbouncer,spqr,pgpool II, pgcat, pg_doorman) is used.
2) pg_comment_stats uses comments in the query to accumulate statistics. [0]
Thinking a bit more, my root issue is specifically around pg_stat_statements so maybe it’s also solvable with some changes to how query jumbling is done
But that topic seems like one where we’d never get consensus
Should query jumbling for calculating query_id be customizable somehow? How would we resolve multiple concurrent opinions about how queries should be jumbled (eg if comment_stats needs different tweaks than sqlcommenter)? Was there previous discussion about this already? I’ll need to go search mailing list history a bit
-Jeremy
Sent from my TI-83
Dave Cramer
www.postgres.rocks
On Tue, 11 Mar 2025 at 12:23, Frits Hoogland <frits.hoogland@gmail.com> wrote:
The usecase that I think might be useful is to have a database client send metadata along with a query.This partially is possible today by setting application_name, but that is a separate request, it would be great if that could be sent along with the query in one go.Another option to pass metadata is to add a comment (/* .. */), but a comment cannot be set for a prepared statement, because the statement is prepared first and then later invoked on runtime, which executes a query that is fixed.Frits HooglandOn 11 Mar 2025, at 15:49, Jeremy Schneider <schneider@ardentperf.com> wrote:On Mar 11, 2025, at 3:03 AM, Kirill Reshke <reshkekirill@gmail.com> wrote:
On Tue, 11 Mar 2025 at 11:09, Jeremy Schneider <schneider@ardentperf.com> wrote:observability frameworks like OpenTelemetry support tracing through allThere are some other use-cases:
layers of a stack, and trace_ids can even be passed into sql with
extensions like sqlcommenter. however sqlcommenter puts the trace_id
into a comment which effectively breaks all the utility of
pg_stat_statements since each query has a unique trace_id.
1) RO-RW routing. Users can pass target-session-attrs to the server
within query labels to hint about its need. Useful when some kind of
proxy (odyssey,pgbouncer,spqr,pgpool II, pgcat, pg_doorman) is used.
2) pg_comment_stats uses comments in the query to accumulate statistics. [0]
Thinking a bit more, my root issue is specifically around pg_stat_statements so maybe it’s also solvable with some changes to how query jumbling is done
But that topic seems like one where we’d never get consensus
Should query jumbling for calculating query_id be customizable somehow? How would we resolve multiple concurrent opinions about how queries should be jumbled (eg if comment_stats needs different tweaks than sqlcommenter)? Was there previous discussion about this already? I’ll need to go search mailing list history a bit
-Jeremy
Sent from my TI-83
Jeremy,
Thanks for this. I am hoping to get consensus of ideas for changes to the protocol.
Please join the discussion here https://discord.gg/bWum3hbM as well.
Krill, change starts with requests and without them nothing will happen. Please post your ideas.
Dave
On Tue, Mar 11, 2025 at 05:23:14PM +0100, Frits Hoogland wrote: > The usecase that I think might be useful is to have a database client send metadata along with a query. > This partially is possible today by setting application_name, but that is a separate request, it would be great if thatcould be sent along with the query in one go. > Another option to pass metadata is to add a comment (/* .. */), but a comment cannot be set for a prepared statement, becausethe statement is prepared first and then later invoked on runtime, which executes a query that is fixed. How about using a `set_config()` to deonte the "application_name" (and any other details) for the _next_ query, then have those details appear in the pg_stat_statements rows and logs? Clients would send a `SELECT set_config(...)` and also the next query one after the other without waiting for the response to the first. The server could similarly batch the two responses. Look ma', no protocol change. Nico --
On Tue, 11 Mar 2025 14:03:12 -0500 Nico Williams <nico@cryptonector.com> wrote: > How about using a `set_config()` to deonte the "application_name" (and > any other details) for the _next_ query, then have those details > appear in the pg_stat_statements rows and logs? > > Clients would send a `SELECT set_config(...)` and also the next query > one after the other without waiting for the response to the first. > The server could similarly batch the two responses. Isn't multiple queries in one packet only possible with the simple protocol, but not possible with the extended protocol? So this would be entirely incompatible with prepared/parameterized statements? -Jeremy
On Wed, Mar 12, 2025 at 11:39:57PM -0700, Jeremy Schneider wrote: > Isn't multiple queries in one packet only possible with the simple > protocol, but not possible with the extended protocol? So this would be > entirely incompatible with prepared/parameterized statements? It's TCP. Packet and segment boundaries are not exposed to the application, so PG doesn't know that you sent "two queries in one packet" (or in one segment). It just reads one query, processes it, then it reads the next one, processes it, etc. So all that's needed here is for a well-defined GUC that can be set with set_config() and whose purpose is to decorate subsequent queries in pg_stat_statements and/or logs. You could set it once for all your queries or once per-query -- whatever. As long as the client-side API can batch one of these calls to set_config() followed by the actual query you want to do in the same write() on the socket (TCP_CORK helps) then this idea will be barely noticeable performance-wise. Nico --