Thread: Proposal - Allow extensions to set a Plan Identifier
Hi, A patch by Lukas Fittl [1] introduces the pg_stat_plans extension, which exposes execution plans along with execution statistics. As part of this work, a new infrastructure is being proposed to compute a plan identifier (planId), which is exposed in both pg_stat_plans and pg_stat_activity. Exposing planId in pg_stat_activity enables users to identify long-running or high-load plans and correlate them with plan text from the extension. One of open question in [1] is which components of a plan tree should be considered when computing planId. Lukas has created a wiki [2] for discussion, but reaching a consensus may take time—possibly in time for v19, but it's uncertain. Meanwhile, existing extensions like pg_stat_monitor [3] compute a planId and store the plan text, but they lack a way to expose planId in pg_stat_activity. This limits their usefulness, as identifying top or long-running plans from pg_stat_activity is critical for monitoring. To address this, I propose leveraging work from [1] and allow extensions to set a planId This could be implemented sooner than an in-core computed planId. Proposal Overview: 1/ Add a planId field to PgBackendStatus. 2/ Add a planId field to PlannedStmt. 3/ Create APIs for extensions to set the current planId. Storing planId in PlannedStmt ensures it is available with cached plans. One consideration is whether reserving a 64-bit integer in PgBackendStatus and PlannedStmt is acceptable, given that planId may not always be used. However, this is already the case for queryId when compute_query_id is disabled and no extension sets it, so it may not be a concern. Looking forward to feedback on this approach. If there is agreement, I will work on preparing the patches. Regards, Sami Imseih Amazon Web Services (AWS) [1] https://www.postgresql.org/message-id/flat/CAP53Pkyow59ajFMHGpmb1BK9WHDypaWtUsS_5DoYUEfsa_Hktg%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAP53PkxocbNr%2BeRag3FEJp3-7S1U80FspOg8UQjO902TWMG%3D6A%40mail.gmail.com [3] https://github.com/percona/pg_stat_monitor
Hi, On Wed, Feb 12, 2025 at 05:44:20PM -0600, Sami Imseih wrote: > Meanwhile, existing extensions like pg_stat_monitor [3] compute a planId and > store the plan text, but they lack a way to expose planId in pg_stat_activity. > This limits their usefulness, as identifying top or long-running plans from > pg_stat_activity is critical for monitoring. Agree. > To address this, I propose leveraging work from [1] and allow extensions > to set a planId This could be implemented sooner than an in-core > computed planId. I think that makes sense and then the idea would be later on to move to something like 5fd9dfa5f50, but for the "planId": is my understanding correct? > Proposal Overview: > > 1/ Add a planId field to PgBackendStatus. > 2/ Add a planId field to PlannedStmt. > 3/ Create APIs for extensions to set the current planId. Note sure 3 is needed (MyBEEntry is available publicly and PlannedStmt also through some hooks). But did not look close enough and would probably depends of the implementation, so let's see. > Storing planId in PlannedStmt ensures it is available with > cached plans. > > One consideration is whether reserving a 64-bit integer in PgBackendStatus > and PlannedStmt is acceptable, given that planId may not always be used. From what I can see PgBackendStatus is currently 432 bytes and PlannedStmt is 152 bytes. It looks like that adding an uint64 at the end of those structs would not add extra padding (that's what "ptype /o struct <struct>" tells me) so would increase to 440 bytes and 160 bytes respectively. I don't think that's an issue. > However, this is already the case for queryId when compute_query_id is > disabled and no extension sets it Yup. > Looking forward to feedback on this approach. > If there is agreement, I will work on preparing the patches. That sounds like a plan for me but better to wait for others to share their thoughts too. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Thanks for the feedback! > I think that makes sense and then the idea would be later on to move to something > like 5fd9dfa5f50, but for the "planId": is my understanding correct? correct. This is adding infrastructure to eventually have an in-core planId; but in the meanwhile give extensions that already compute a planId the ability to broadcast the planId in pg_stat_activity. > > Proposal Overview: > > > > 1/ Add a planId field to PgBackendStatus. > > 2/ Add a planId field to PlannedStmt. > > 3/ Create APIs for extensions to set the current planId. > > Note sure 3 is needed (MyBEEntry is available publicly and PlannedStmt also > through some hooks). But did not look close enough and would probably depends > of the implementation, so let's see. I don't think direct setting of values is a good idea. We will need an API similar to pgstat_report_query_id which ensures we are only reporting top level planIds -and- in the case of multiple extensions with the capability to set a planId, only the first one in the stack wins. pgstat_report_query_id does allow for forcing a queryId ( force flag which is false by default ), and I think this new API should allow the same. > > Storing planId in PlannedStmt ensures it is available with > > cached plans. > > > > One consideration is whether reserving a 64-bit integer in PgBackendStatus > > and PlannedStmt is acceptable, given that planId may not always be used. > > From what I can see PgBackendStatus is currently 432 bytes and PlannedStmt > is 152 bytes. It looks like that adding an uint64 at the end of those structs > would not add extra padding (that's what "ptype /o struct <struct>" tells me) > so would increase to 440 bytes and 160 bytes respectively. I don't think that's > an issue. Correct, thanks for adding this detail. -- Sami
On 14/2/2025 08:21, Michael Paquier wrote: > On Thu, Feb 13, 2025 at 11:10:27AM -0600, Sami Imseih wrote: >> I don't think direct setting of values is a good idea. We will need an API >> similar to pgstat_report_query_id which ensures we are only reporting top >> level planIds -and- in the case of multiple extensions with the capability >> to set a planId, only the first one in the stack wins. pgstat_report_query_id >> does allow for forcing a queryId ( force flag which is false by default ), and >> I think this new API should allow the same. > I'm obviously siding with this proposal because we have an ask to > track this kind of data, and because this kind of data is kind of hard > to track across a stack of extensions through the core backend code. > > Point is, would others be interested in this addition or just object > to it because it touches the core code? I have already implemented it twice in different ways as a core patch. In my projects, we need to track queryId and plan node ID for two reasons: 1. Optimisational decisions made during transformation/path generation stages up to the end of execution to correct them in the future. 2. Cache information about the query tree/node state to use it for statistical purposes. In my experience, we don't need a single plan_id field; we just need an 'extended list' pointer at the end of the Plan, PlannedStmt, Query, and RelOptInfo structures and a hook at the end of the create_plan_recurse() to allow passing some info from the path generator to the plan tree. An extension may add its data to the list (we may register an extensible node type to be sure we don't interfere with other extensions) and manipulate it in a custom way and with custom UI. Generally, it makes the optimiser internals more open to extensions. -- regards, Andrei Lepikhov
On Sun, Feb 16, 2025 at 5:34 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Sat, Feb 15, 2025 at 10:29:41AM +0100, Andrei Lepikhov wrote: > > I have already implemented it twice in different ways as a core patch. > > In my projects, we need to track queryId and plan node ID for two reasons: > > Are these available in the public somewhere or is that simply what > Sami is proposing? Andrei, you mention "plan node ID" which if I understand correctly will be a good thing to expose to determine which part of the plan most of the time is spent. This is similar to an idea I raised in a different thread for explain plan progress [1]. However, I am proposing something different, which is we track a plan_id of the full execution plan. Some extensions may hash the text version of the plan, others may choose to do something more elaborate such as "jumbling" a plan tree. The point becomes is that monitoring extensions such as pg_stat_monitor and others can set the plan_id in core so it's available in backend status. > > 1. Optimisational decisions made during transformation/path generation > > stages up to the end of execution to correct them in the future. > > 2. Cache information about the query tree/node state to use it for > > statistical purposes. > > Gathering of statistical data based on a node tree is one reason, > where it may or may not be required to walk through a path. > Influencing the plan used with an already-generated one (where hints > could be used) was the second one, mostly replacing a plan in the > planner hook. Influencing the paths in a plan or a subplan didn't > really matter much with hints to drive the paths. > > > In my experience, we don't need a single plan_id field; we just need an > > 'extended list' pointer at the end of the Plan, PlannedStmt, Query, and > > RelOptInfo structures and a hook at the end of the create_plan_recurse() to > > allow passing some info from the path generator to the plan tree. > > An extension may add its data to the list (we may register an extensible > > node type to be sure we don't interfere with other extensions) and > > manipulate it in a custom way and with custom UI. > > Generally, it makes the optimiser internals more open to extensions. > > Sounds to me that this maps with the addition of a "private" area to > some of the plan structures to allow extensions to attach some data > that would be reused elsewhere, which is rather independent than > what's suggested here? +1, such a private area is different from what is being proposed. [1] https://www.postgresql.org/message-id/CAA5RZ0uGDKWxqUCMrsWKV425T2f6mqJsXKg6chq%2BWuyCwNPUGw%40mail.gmail.com. -- Sami
I put together patches to do as is being proposed. v1-0001: 1. Adds a planId field in PlannedStmt 2. Added an st_plan_id fields in PgBackendStatus 3. APIs to report and to retrieve a planId to PgBackendStatus An extension is able to set a planId in PlannedStmt directly, and while they can do the same for PgBackendStatus, I felt it is better to provide similar APIs for this as we do for queryId. Unless the extension passed force=true to the API, this will ensure that a planId already set either by a top-level statement or by another extension cannot be set when a plan is active. Also, the extension does not need to worry about resetting the planId at the end of execution as this will follow the same mechanism as is currently being done for queryId. This will remove responsibility from the extension author to have to manage this aspect. The extension should normally compute the planId in the planner or ExecutorStart hook. If the planId is computed in the planner_hook, the extension can set the planId in plannedStmt making the planId available to subsequent executions of a cached plan. What this patch does not cover is adding a "Plan Identifier" to explain output or to logs. Also, there is no core GUC like compute_query_id, since it does not make sense if we're not computing a planId in core. v2-0001: This adds a planId to pg_stat_get_activity ( not pg_stat_activity ). An extension can offer its own view, similar to pg_stat_activity, which can include planId. Note that there are no documentation updates required here as we don't have per-field documentation for pg_stat_get_activity. These 2 patches can probably be combined , but will leave them like this for now. Looking forward to feedback. Regards Sami
Attachment
Hi, Thank you for the patches. They're very important for many extensions. I've debugged them using simple extensions pg_stat_statements and auto_explain, specifically checking cases involving PlannedStmt and pg_stat_get_activity - , and haven't encountered any issues so far. However, I have a question: what value should planid have when we execute the standard planner without using a planner_hook? For example, if pg_stat_statementsis disabled, would planid default to zero? If zero is the expected default, should we explicitly write this behavior? result->planid = 0; -- Best regards, Ilia Evdokimov, Tantor Labs LLC.
On 18.03.2025 14:46, Ilia Evdokimov wrote: > > However, I have a question: what value should planid have when we > execute the standard planner without using a planner_hook? For > example, if pg_stat_statementsis disabled, would planid default to > zero? If zero is the expected default, should we explicitly write this > behavior? > > result->planid = 0; > Sorry for spam. I found the answer this question in thread [0]. [0]: https://www.postgresql.org/message-id/flat/CAP53Pkyow59ajFMHGpmb1BK9WHDypaWtUsS_5DoYUEfsa_Hktg%40mail.gmail.com -- Best regards, Ilia Evdokimov, Tantor Labs LLC.
On Thu, Feb 20, 2025 at 05:03:19PM -0600, Sami Imseih wrote: > v2-0001: > > This adds a planId to pg_stat_get_activity ( not pg_stat_activity ). > An extension > can offer its own view, similar to pg_stat_activity, which can include planId. > > Note that there are no documentation updates required here as we don't > have per-field documentation for pg_stat_get_activity. > > These 2 patches can probably be combined , but will leave them like > this for now. The split makes more sense to me, and FWIW I see more value in 0001 that provides an anchor on the backend side for extensions to plug in a plan ID that can be tracked across multiple layers. I am less convinced about the need for 0002 that adds this information to pg_stat_activity at this stage, as the getter routine in 0001 is enough to know what's happening, roughly. +void +pgstat_report_plan_id(uint64 plan_id, bool force) +{ + volatile PgBackendStatus *beentry = MyBEEntry; + + if (!beentry || !pgstat_track_activities) + return; + + if (beentry->st_plan_id != 0 && !force) + return; Right, and that's in line with the cases I've seen because we wanted to be able to optionally control if only the top-level plan ID should be stored or not, which is useful when going through the planner hook at some sub-level. Hmm. Shouldn't we document all such expectations, which are similar to a query ID? Our experience with pgss in this area offers some insights, which I guess folks are rather used to and could be applied to plan IDs. In short, it would be nice to do 0001 for this release and start from that (with more documentation around the expectations of these interfaces). For 0002 we could always see later about exposing it in a system view, even if it would offer the advantage of more consistency across a single call of pgstat_get_local_beentry_by_index() for each backend entry. -- Michael
Attachment
> > This adds a planId to pg_stat_get_activity ( not pg_stat_activity ). > > An extension > > can offer its own view, similar to pg_stat_activity, which can include planId. > > > > Note that there are no documentation updates required here as we don't > > have per-field documentation for pg_stat_get_activity. > > > > These 2 patches can probably be combined , but will leave them like > > this for now. > > The split makes more sense to me, and FWIW I see more value in 0001 > that provides an anchor on the backend side for extensions to plug in > a plan ID that can be tracked across multiple layers. Yes, I think this is a step in the right direction for this functionality. > I am less > convinced about the need for 0002 that adds this information to > pg_stat_activity at this stage, as the getter routine in 0001 is > enough to know what's happening, roughly. I agree. I don't think exposing this in a system view is required at this time. Once extensions start to use it, and it becomes obvious it should be exposed in a system view, that discussion can be had at that time. > Hmm. Shouldn't we document all such expectations, > which are similar to a query ID? I had a lazy comment :) * For the same reasons as the query identifiers (see above), but, I went ahead and commented it similar to how we document pgstat_report_query_id and pgstat_get_my_query_id routines. attached is v2-0001 -- Sami Imseih Amazon Web Services (AWS)
Attachment
On Thu, Mar 20, 2025 at 09:46:55PM -0400, Sami Imseih wrote: > * For the same reasons as the query identifiers (see above), > > but, I went ahead and commented it similar to how we document > pgstat_report_query_id and pgstat_get_my_query_id routines. > attached is v2-0001 Looks mostly OK from here.. Except for two things. @@ -2016,6 +2017,17 @@ exec_bind_message(StringInfo input_message) */ cplan = GetCachedPlan(psrc, params, NULL, NULL); + foreach(lc, cplan->stmt_list) + { + PlannedStmt *plan = lfirst_node(PlannedStmt, lc); + + if (plan->planId != UINT64CONST(0)) + { + pgstat_report_plan_id(plan->planId, false); + break; + } + } In exec_bind_message(), the comment at the top of PortalDefineQuery() tells to not put any code between this call and the GetCachedPlan() that could issue an error. pgstat_report_plan_id() is OK, but I'd rather play it safe and set the ID once the portal is defined, based on portal->stmts instead. That's the same as your patch, but slightly safer in the long-term, especially if pgstat_report_plan_id() is twisted in such a way that it introduces a path where it ERRORs. planner() is the sole place in the core code where the planner hook can be called. Shouldn't we have at least a call to pgstat_report_plan_id() after planning a query? At least that should be the behavior I'd expect, where a module pushes a planId to a PlannedStmt, then core publishes it to the backend entry in non-force mode. bind and execute message paths are OK as they stand, where we set a plan ID once their portal is defined from its planned statements. With some adjustments to some comments and the surroundings of the code, I get the attached. What do you think? -- Michael
Attachment
> In exec_bind_message(), the comment at the top of PortalDefineQuery() > tells to not put any code between this call and the GetCachedPlan() > that could issue an error. pgstat_report_plan_id() is OK, but I'd > rather play it safe and set the ID once the portal is defined, based > on portal->stmts instead. That's the same as your patch, but slightly > safer in the long-term, especially if pgstat_report_plan_id() is > twisted in such a way that it introduces a path where it ERRORs. Yes that makes sense. As long as we report plan_id before PortalStart, for obvious reasons. > planner() is the sole place in the core code where the planner hook > can be called. Shouldn't we have at least a call to > pgstat_report_plan_id() after planning a query? At least that should > be the behavior I'd expect, where a module pushes a planId to a > PlannedStmt, then core publishes it to the backend entry in non-force > mode. I agree. I was just thinking we rely on the exec_ routines to report the plan_id at the start. But, besides the good reason you give, reporting (slightly) earlier is better for monitoring tools; as it reduces the likelihood they find an empty plan_id. Overall, v3 LGTM -- Sami Imseih Amazon Web Services (AWS)
On Fri, Mar 21, 2025 at 09:25:24AM -0400, Sami Imseih wrote: >> planner() is the sole place in the core code where the planner hook >> can be called. Shouldn't we have at least a call to >> pgstat_report_plan_id() after planning a query? At least that should >> be the behavior I'd expect, where a module pushes a planId to a >> PlannedStmt, then core publishes it to the backend entry in non-force >> mode. > > I agree. I was just thinking we rely on the exec_ routines to report the plan_id > at the start. But, besides the good reason you give, reporting > (slightly) earlier is > better for monitoring tools; as it reduces the likelihood they find an empty > plan_id. Yep. pgstat_report_plan_id() is not something that extensions should do, but they should report the planId in the PlannedStmt and let the backend do the rest. > Overall, v3 LGTM Thanks. If anybody has any objections and/or comments, now would be a good time. I'll revisit this thread at the beginning of next week. -- Michael
Attachment
On 3/22/25 06:48, Michael Paquier wrote: > On Fri, Mar 21, 2025 at 09:25:24AM -0400, Sami Imseih wrote: >>> planner() is the sole place in the core code where the planner hook >>> can be called. Shouldn't we have at least a call to >>> pgstat_report_plan_id() after planning a query? At least that should >>> be the behavior I'd expect, where a module pushes a planId to a >>> PlannedStmt, then core publishes it to the backend entry in non-force >>> mode. >> >> I agree. I was just thinking we rely on the exec_ routines to report the plan_id >> at the start. But, besides the good reason you give, reporting >> (slightly) earlier is >> better for monitoring tools; as it reduces the likelihood they find an empty >> plan_id. > > Yep. pgstat_report_plan_id() is not something that extensions should > do, but they should report the planId in the PlannedStmt and let the > backend do the rest. > >> Overall, v3 LGTM > > Thanks. If anybody has any objections and/or comments, now would be a > good time. I'll revisit this thread at the beginning of next week. The last time I wrote the email, I mistakenly thought about plan_node_id. I apologize for that. planId actually looks less controversial than queryId or plan_node_id. At the same time, it is not free from the different logic that extensions may incorporate into this value: I can imagine, for example, the attempt of uniquely numbering plans related to the same queryId, plain hashing of the plan tree, hashing without constants, etc. So, it seems that extensions may conflict with the same field. Are we sure that will happen if they are loaded in different orders in neighbouring backends? I'm not very good at stat collector guts, but would it be possible to allow extensions to report their state in standard tuple format? In that case, doubts about queryId or planId may be resolved. -- regards, Andrei Lepikhov
On Sat, Mar 22, 2025 at 11:50:06PM +0100, Andrei Lepikhov wrote: > planId actually looks less controversial than queryId or plan_node_id. At > the same time, it is not free from the different logic that extensions may > incorporate into this value: I can imagine, for example, the attempt of > uniquely numbering plans related to the same queryId, plain hashing of the > plan tree, hashing without constants, etc. One plan ID should have one single query ID, but the opposite may be not necessarily true: a query ID could be associated with multiple plan patterns (aka multiple plan IDs). What this offers is that we know about which plan one query is currently using in live, for a given query ID. > So, it seems that extensions may conflict with the same field. Are we sure > that will happen if they are loaded in different orders in neighbouring > backends? Depends on what kind of computation once wants to do, and surely without some coordination across the extensions you are using these cannot be shared, but it's no different from the concept of a query ID. The use cases I've seen where this field is useful is when splitting code that uses the planner hook for several purposes across more than one extension. Three of them which are independent, still want to treat know about a plan ID: - Set/get an existing node tree plan based on a specific ID. - Hash calculation of a tree (like Lukas proposal). - Statistics related to the plan tree used (aka custom cumulative stats play here). > I'm not very good at stat collector guts, but would it be possible to allow > extensions to report their state in standard tuple format? In that case, > doubts about queryId or planId may be resolved. I am not exactly sure what you mean here. We are only going to have one plan ID set for each query execution. Any stats plan data related to a plan ID surely needs to be upper-bounded if you don't want to bloat pgstats, with the query ID of the query string it relates to stored in it and the plan ID used as hash key, but it does not change that we're only going to always have one single plan ID in a PlannedStmt or in a backend entry doing a query execution, like a query ID. -- Michael
Attachment
> On Sat, Mar 22, 2025 at 11:50:06PM +0100, Andrei Lepikhov wrote: > > planId actually looks less controversial than queryId or plan_node_id. At > > the same time, it is not free from the different logic that extensions may > > incorporate into this value: I can imagine, for example, the attempt of > > uniquely numbering plans related to the same queryId, plain hashing of the > > plan tree, hashing without constants, etc. I think plan_node_id is probably the least controversial because that value comes straight from core, and different extensions cannot have their own interpretation of what that value could be. Computing a plan_id is even more open for interpretation than it is for query_id perhaps, which is why giving this ability to extensions will be useful. Different extensions may choose to compute a plan_id different ways, but they all should be able to push this value into core, and this is what this patch will allow for. FWIW, Lukas did start a Wiki [0] to open the discussion for what parts of the plan should be used to compute a plan_id, and maybe we can in the future compite a plan_id in core by default. > > So, it seems that extensions may conflict with the same field. Are we sure > > that will happen if they are loaded in different orders in neighbouring > > backends? > > Depends on what kind of computation once wants to do, and surely > without some coordination across the extensions you are using these > cannot be shared, but it's no different from the concept of a query > ID. correct. This was also recently raised in the "making Explain extensible" [0] thread also. That is the nature of extensions, and coordination is required. [0] https://wiki.postgresql.org/wiki/Plan_ID_Jumbling [1] https://www.postgresql.org/message-id/CA%2BTgmoZ8sTodL-orXQm51_npNxuDAS86BS5kC8t0LULneShRbg%40mail.gmail.com -- Sami Imseih Amazon Web Services (AWS)
On 3/23/25 01:01, Michael Paquier wrote: > On Sat, Mar 22, 2025 at 11:50:06PM +0100, Andrei Lepikhov wrote: >> planId actually looks less controversial than queryId or plan_node_id. At >> the same time, it is not free from the different logic that extensions may >> incorporate into this value: I can imagine, for example, the attempt of >> uniquely numbering plans related to the same queryId, plain hashing of the >> plan tree, hashing without constants, etc. > > One plan ID should have one single query ID, I'm not sure I understand what do you mean by that. QueryId reflects syntactic structure of the query, but planId - semantics. For example: SELECT relname FROM pg_class JOIN pg_am USING (oid); SELECT relname FROM pg_am JOIN pg_class USING (oid); have different queryIds: -6493293269785547447 and 4243743071053231833 but the same plan: Hash Join Hash Cond: (pg_class.oid = pg_am.oid) -> Seq Scan on pg_class -> Hash -> Seq Scan on pg_am You can invent multiple other cases here - remember query tree transformations we made before optimisation. > but the opposite may be > not necessarily true: a query ID could be associated with multiple > plan patterns (aka multiple plan IDs). What this offers is that we > know about which plan one query is currently using in live, for a > given query ID. Okay, as I've said before, it seems practical. I just worry because I see that people don't understand that queryId is a more or less arbitrary value that may or may not group queries that differ only by constants to a single "Query Class". I think the same attitude will be paid to planId. It is okay for statistics. However, non-statistical extensions need more guarantees and want to put different logic into these values. For now, we have examples of only statistical extensions in open-source and may live with a proposed solution. > >> So, it seems that extensions may conflict with the same field. Are we sure >> that will happen if they are loaded in different orders in neighbouring >> backends? > > Depends on what kind of computation once wants to do, and surely > without some coordination across the extensions you are using these > cannot be shared, but it's no different from the concept of a query > ID. Hmm, queryId generation code lies in the core and we already came to terms that this field has only a statistical purpose. Do you want to commit planId generation code? -- regards, Andrei Lepikhov
On 3/23/25 04:22, Sami Imseih wrote: >> On Sat, Mar 22, 2025 at 11:50:06PM +0100, Andrei Lepikhov wrote: >>> planId actually looks less controversial than queryId or plan_node_id. At >>> the same time, it is not free from the different logic that extensions may >>> incorporate into this value: I can imagine, for example, the attempt of >>> uniquely numbering plans related to the same queryId, plain hashing of the >>> plan tree, hashing without constants, etc. > > I think plan_node_id is probably the least controversial because that value > comes straight from core, and different extensions cannot have their own > interpretation of what that value could be. I meant the proposal to let extensions calculate this field. Someone may want node_id to be unique inside the plan; for someone - it should reflect the nature of the node, and it may be the same even inside one plan, etc. In this case, it is highly controversial. > > Computing a plan_id is even more open for interpretation than it is > for query_id perhaps, which is why giving this ability to extensions > will be useful. Different extensions may choose to compute a plan_id > different ways, but they all should be able to push this value into core, > and this is what this patch will allow for. That's why I worry about allowing extensions to compute it. Remember: for now, an extension can't be sure that a conflicting one is not already loaded in the backend. The code [1] may relieve this problem. [1] https://www.postgresql.org/message-id/flat/441661.1742683797@sss.pgh.pa.us#7640020deb6497fe049ac4839eaeb33d -- regards, Andrei Lepikhov
> > but the opposite may be > > not necessarily true: a query ID could be associated with multiple > > plan patterns (aka multiple plan IDs). What this offers is that we > > know about which plan one query is currently using in live, for a > > given query ID. > Okay, as I've said before, it seems practical. I just worry because I > see that people don't understand that queryId is a more or less > arbitrary value that may or may not group queries that differ only by > constants to a single "Query Class". I think the same attitude will be > paid to planId. It is okay for statistics. However, non-statistical > extensions need more guarantees and want to put different logic into > these values. > For now, we have examples of only statistical extensions in open-source > and may live with a proposed solution. statistical/monitoring reasons are an important use case. Detecting plan changes, load attributed to a specific plan, etc. However, I also do see other extensions that could implement a plan_id that has meaning beyond statistical/monitoring purposes. Plan management/enforcement is another use-case. > >> So, it seems that extensions may conflict with the same field. Are we sure > >> that will happen if they are loaded in different orders in neighbouring > >> backends? > > > > Depends on what kind of computation once wants to do, and surely > > without some coordination across the extensions you are using these > > cannot be shared, but it's no different from the concept of a query > > ID. > Hmm, queryId generation code lies in the core and we already came to > terms that this field has only a statistical purpose. Do you want to > commit planId generation code? But, extensions don't necessarily need to rely on the core queryId. They can generate their own queryId. We have it documented for compute_query_id as such [0] "Note that an external module can alternatively be used if the in-core query identifier computation method is not acceptable" [0] https://www.postgresql.org/docs/current/runtime-config-statistics.html#GUC-COMPUTE-QUERY-ID -- Sami Imseih Amazon Web Services (AWS)
On 3/23/25 15:56, Sami Imseih wrote: >> Hmm, queryId generation code lies in the core and we already came to >> terms that this field has only a statistical purpose. Do you want to >> commit planId generation code? > > But, extensions don't necessarily need to rely on the core queryId. They > can generate their own queryId. > We have it documented for compute_query_id as such [0] > "Note that an external module can alternatively be used if the in-core query > identifier computation method is not acceptable" In theory they don't, but in practice they must. This mess especially boring because one of problematic extensions at the same time the most popular one - pg_stat_statements. People forget to follow strict instructions about load order of extensions and think it is the extension instability when one of their query plans isn't captured, but should. So, it may be not an issue in a cloud provider predefined installations, but a headache for custom configurations. -- regards, Andrei Lepikhov
On Sun, Mar 23, 2025 at 04:30:04PM +0100, Andrei Lepikhov wrote: > So, it may be not an issue in a cloud provider predefined installations, but > a headache for custom configurations. Sure, I mean, but it's not really related to the issue discussed on this thread, so.. It sounds like we could improve the documentation about the GUCs that hold a list of library names and load them in the order specified, so as we could point somewhere rather that just saying "don't do that". Another thing is extensions that have the idea of not tracking a hook previously registered, and missing to call it. Another one is that the previous hook can be called before a module's look, or after it. Depending on how an out-of-core extension is maintained, there are a lot of things that can be done. -- Michael
Attachment
On Sat, Mar 22, 2025 at 10:22:37PM -0500, Sami Imseih wrote: > I think plan_node_id is probably the least controversial because that value > comes straight from core, and different extensions cannot have their own > interpretation of what that value could be. Depends. An extension can plug in what they want. The point is that the key used to identify a single plan is up to what extensions think is relevant in a plan. That's heavily subject to interpretation. What's not really subject to interpretation is that an extension cannot know it should set and/or use as key identifier without something that some portion pf the code structures knows about, or these extensions have an inter-dependency. Anyway, there are also the arguments about the set timing, reset timing, the extended protocol argument, etc. So I've applied the patch for now, to start with something. > FWIW, Lukas did start a Wiki [0] to open the discussion for what parts > of the plan should be used to compute a plan_id, and maybe we can > in the future compite a plan_id in core by default. Let's see where this leads.. I suspect that this is going to take some time, assuming that we're ever able to settle on a clear definition. Perhaps we will, or perhaps we will not. -- Michael
Attachment
On Sun, Mar 23, 2025 at 9:43 PM Michael Paquier <michael@paquier.xyz> wrote:
So I've applied the patch for now, to start with
something.
Thanks for committing that, I think that's a great starting point for 18!
Ideally we can also land the jumble funcs work in the other thread to allow extensions to re-use the in-core logic for jumbling expressions found in plan node trees.
> FWIW, Lukas did start a Wiki [0] to open the discussion for what parts
> of the plan should be used to compute a plan_id, and maybe we can
> in the future compite a plan_id in core by default.
Let's see where this leads.. I suspect that this is going to take
some time, assuming that we're ever able to settle on a clear
definition. Perhaps we will, or perhaps we will not.
I think there is a good chance we'll land on a reasonable planid calculation for core (or at least a pg_stat_plans in contrib that does its own tree walk) within the PG19 development cycle - I think plan IDs are actually less complicated than query IDs in terms of what should be considered unique - but maybe that's just my perspective :)
I'll be at PGConf.dev this year, would be great to do an unconference session to discuss this further.
Thanks,
Lukas
Lukas Fittl
Hi! I started reviewing it and noticed that your code repeated this cycle maybe it would be better to put it in a separate function, for example in the form of a name like "analyze_stmts"? or is it possible to create a macro for them? @@ -2016,6 +2017,17 @@ exec_bind_message(StringInfo input_message) */ cplan = GetCachedPlan(psrc, params, NULL, NULL); + foreach(lc, cplan->stmt_list) + { + PlannedStmt *plan = lfirst_node(PlannedStmt, lc); + + if (plan->planId != UINT64CONST(0)) + { + pgstat_report_plan_id(plan->planId, false); + break; + } + } + /* * Now we can define the portal. * @@ -2170,6 +2182,17 @@ exec_execute_message(const char *portal_name, long max_rows) } } + foreach(lc, portal->stmts) + { + PlannedStmt *stmt = lfirst_node(PlannedStmt, lc); + + if (stmt->planId != UINT64CONST(0)) + { + pgstat_report_plan_id(stmt->planId, false); + break; + } + } + cmdtagname = GetCommandTagNameAndLen(portal->commandTag, &cmdtaglen); -- Regards, Alena Rybakina Postgres Professional
> Hi! I started reviewing it and noticed that your code repeated this > cycle maybe it would be better to put it in a separate function, for > example in the form of a name like "analyze_stmts"? > > or is it possible to create a macro for them? Perhaps it may be better to stick this all in a macro, but I did it that way because we have similar pattern for queryId, with some repeating patterns for looping through a list of Query or PlannedStmt. As this patch already got committed, feel free to propose a patch for this improvement. -- Sami Imseih Amazon Web Services (AWS)
> Ideally we can also land the jumble funcs work in the other thread to allow extensions to re-use the > in-core logic for jumbling expressions found in plan node trees. IIUC, there should be a refactor I am working on at this moment to make that possible [0] >> > FWIW, Lukas did start a Wiki [0] to open the discussion for what parts >> > of the plan should be used to compute a plan_id, and maybe we can >> > in the future compite a plan_id in core by default. >> >> Let's see where this leads.. I suspect that this is going to take >> some time, assuming that we're ever able to settle on a clear >> definition. Perhaps we will, or perhaps we will not. > > > I think there is a good chance we'll land on a reasonable planid calculation for core I agree > I'll be at PGConf.dev this year, would be great to do an unconference session to discuss this further. That will be a good idea! [0] https://www.postgresql.org/message-id/Z9khOo14yzZHCnmn%40paquier.xyz
On Mon, Mar 24, 2025 at 02:36:05PM -0500, Sami Imseih wrote: >> Ideally we can also land the jumble funcs work in the other thread >> to allow extensions to re-use the >> in-core logic for jumbling expressions found in plan node trees. > > IIUC, there should be a refactor I am working on at this moment to make that > possible [0] That would be nice, thanks. -- Michael
Attachment
> On Mon, Mar 24, 2025 at 02:36:05PM -0500, Sami Imseih wrote: > >> Ideally we can also land the jumble funcs work in the other thread > >> to allow extensions to re-use the > >> in-core logic for jumbling expressions found in plan node trees. > > > > IIUC, there should be a refactor I am working on at this moment to make that > > possible [0] > > That would be nice, thanks. As mentioned in [0], I spent some time thinking about making some of the core jumbling facilities available to plugins, and I have several patches to share. First, none of the changes change the names of the files from queryjumblefuncs.c to something more generic, as there was some resistance to that idea in this thread. I will keep it as is for now, and this can be revisited at a later time, if need be. So, v1-0001 provides the following to an extension: InitializeJumbleState - to initialize a jumble state JumbleNode - To jumble an expression HashJumbleState - To produce the 64-bit hash from the jumble state Also, an Assert was added inside RecordConstLocation to assert that the jumble state was initialized with record_clocations. I know it was mentioned above by both Michael and Andrei that AppendJumble should not be exposed. I am not sure I agree with that. I think it should be exposed, along with JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING and _jumbleList. I am afraid that if we don't expose these routines/macros, the extension will have to re-invent those wheels. This is not included in v1-0001, but If there is no objection, I can update the patch. Thoughts? [0] https://www.postgresql.org/message-id/Z9khOo14yzZHCnmn%40paquier.xyz
Attachment
On Mon, Mar 24, 2025 at 06:47:59PM -0500, Sami Imseih wrote: > I know it was mentioned above by both Michael and Andrei that > AppendJumble should not be exposed. I am not sure I agree with > that. I think it should be exposed, along with > JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING > and _jumbleList. > > I am afraid that if we don't expose these routines/macros, the > extension will have > to re-invent those wheels. This is not included in v1-0001, but If > there is no objection, > I can update the patch. Thoughts? Hmm, yes, the macros could prove to be useful to allow extensions to compile their own code the manipulations they want to do on the node structures they'd like to touch, but wouldn't that mean that they would copy in some way a portion of what gen_node_support.pl does? Perhaps we should think more carefully about this part, and refactor this script to work as a perl module so as it could be reused by some external code, at least for the parsing of the headers and the attributes assigned to each nodes and their attributes? This is just to drop an idea in the bucket of ideas, trying to think about the potential larger picture. -- Michael
Attachment
On Tue, Mar 25, 2025 at 12:13 AM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Mar 24, 2025 at 06:47:59PM -0500, Sami Imseih wrote:
> I know it was mentioned above by both Michael and Andrei that
> AppendJumble should not be exposed. I am not sure I agree with
> that. I think it should be exposed, along with
> JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING
> and _jumbleList.
>
> I am afraid that if we don't expose these routines/macros, the
> extension will have
> to re-invent those wheels. This is not included in v1-0001, but If
> there is no objection,
> I can update the patch. Thoughts?
Hmm, yes, the macros could prove to be useful to allow extensions to
compile their own code the manipulations they want to do on the node
structures they'd like to touch, but wouldn't that mean that they
would copy in some way a portion of what gen_node_support.pl does?
I agree with Sami that we should expose AppendJumble (it'd be necessary for any extension that wants to re-use the jumbling logic), and I don't see a reason to skip over the convenience macros, but also not hard to recreate those. AFAIK you could get by without having access to _jumbleList, since you can just call JumbleNode which can jumble lists.
The initial patch I had posted over at the other thread [0], (patch 0003 to be precise in the initial set, which was partially based on work Sami had done previously), showed what I think is the minimum we should enable:
case T_IndexScan:
{
IndexScan *splan = (IndexScan *) plan;
...
{
IndexScan *splan = (IndexScan *) plan;
...
JUMBLE_VALUE(splan->indexid);
JumbleNode(jstate, (Node *) splan->indexqual);
...
...
i.e. allow someone to defer to core logic for expressions, but require them to implement their own (manual) way of writing the jumble functions/conditions for the more limited set of expression-containing nodes they care about (like plan nodes).
Of course, thinking about other use cases, I could imagine someone would potentially be interested to change core jumbling logic for only a specific node (e.g. implementing a query ID that considers constant values to be significant), but I'd argue that making that level of customization work is out of scope / hard to do in general.
Perhaps we should think more carefully about this part, and refactor
this script to work as a perl module so as it could be reused by some
external code, at least for the parsing of the headers and the
attributes assigned to each nodes and their attributes?
I've been thinking about how to rework things for a PG18-requiring pg_stat_plans extension I'd like to publish, and if I were to choose a node attribute-based approach I'd probably:
1. Check out the upstream Postgres source for the given version I'm generating jumble code for
2. Modify the headers as needed to have the node attributes I want
3. Call the gen_node_support.pl via the build process
4. Copy out the resulting jumble node funcs/conds
Even with the state currently committed this is already possible, but (4) ends up with a lot of duplicated code in the extension. Assuming we have a way to call jumbleNode + AppendJumble and macros, that step could only keep the jumble conds/funcs that are not defined in core.
Even with the state currently committed this is already possible, but (4) ends up with a lot of duplicated code in the extension. Assuming we have a way to call jumbleNode + AppendJumble and macros, that step could only keep the jumble conds/funcs that are not defined in core.
So based on that workflow I'm not sure turning gen_node_support.pl into a re-usable module would help, but that's just my perspective without having built this out (yet).
Thanks,
Lukas
--
Lukas Fittl
On 3/25/25 00:47, Sami Imseih wrote: > I know it was mentioned above by both Michael and Andrei that > AppendJumble should not be exposed. I am not sure I agree with > that. I think it should be exposed, along with > JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING > and _jumbleList. It would be helpful if you could provide an example to support the reasoning behind exposing this stuff. I assume you want to influence the logic of jumbling, correct? If that’s the case, it might be beneficial to add a callback to the JumbleState that is triggered during each node jumbling attempt. This could facilitate moving clocations code and data into pg_stat_statements and give developers the opportunity to influence the jumbling process, adding extra meaning to their hashes. One reason this might be necessary is that generating the same hash for both a Param node and a Const node could lead to a more generalized hash, allowing us to group more queries under the same value. -- regards, Andrei Lepikhov
> On 3/25/25 00:47, Sami Imseih wrote: > > I know it was mentioned above by both Michael and Andrei that > > AppendJumble should not be exposed. I am not sure I agree with > > that. I think it should be exposed, along with > > JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING > > and _jumbleList. > It would be helpful if you could provide an example to support the > reasoning behind exposing this stuff. I assume you want to influence the > logic of jumbling, correct? I was actually thinking more about the case which Lukas mentions above [0], which is jumbling a Plan. So, an extension having access to AppendJumble ( and the macros ) will allow it to create a routine similar to JumbleNode but that can deal with the plannodes. Of course, the same pattern can be applied to other types of nodes. > Hmm, yes, the macros could prove to be useful to allow extensions to > compile their own code the manipulations they want to do on the node > structures they'd like to touch, but wouldn't that mean that they > would copy in some way a portion of what gen_node_support.pl does? I feel like gen_node_support.pl should just deal with generating functions for core jumbling logic, and extensions should be able to build out their jumbling funcs, and do it simply if they have access to AppendJumble and friends. > So based on that workflow I'm not sure turning gen_node_support.pl into a re-usable module > would help, but that's just my perspective without having built this out (yet). > 1. Check out the upstream Postgres source for the given version I'm generating jumble code for > 2. Modify the headers as needed to have the node attributes I want > 3. Call the gen_node_support.pl via the build process > 4. Copy out the resulting jumble node funcs/conds I like this approach, and the artifacts from #4 will be packed with the extension. [0] https://www.postgresql.org/message-id/CAP53Pkw7HD%2B3sssn5UcASWLn%2Ba9oRJOM9hXteDCg64JJ662bHQ%40mail.gmail.com -- Sami Imseih Amazon Web Services (AWS)
On Tue, Mar 25, 2025 at 04:23:15PM -0500, Sami Imseih wrote: > On 3/25/25 00:47, Sami Imseih wrote: >> 1. Check out the upstream Postgres source for the given version I'm generating jumble code for >> 2. Modify the headers as needed to have the node attributes I want >> 3. Call the gen_node_support.pl via the build process >> 4. Copy out the resulting jumble node funcs/conds > > I like this approach, and the artifacts from #4 will be packed with > the extension. So this comes down to forking the Postgres code to do the job. What I had in mind was a slightly different flow, where we would be able to push custom node attributes between the header parsing and the generation of the extension code. If we do that, there would be no need to fork the Postgres code: extensions could force the definitions they want to the nodes they want to handle, as long as these do not need to touch the in-core query jumble logic, of course. Perhaps we should give more time and thoughts to the concepts we want to expose at this level of the code for extensions. Hence I would side with not rushing things, and consider our options more carefully for v19 with what we would consider to be better design. -- Michael
Attachment
So this comes down to forking the Postgres code to do the job. What I
had in mind was a slightly different flow, where we would be able to
push custom node attributes between the header parsing and the
generation of the extension code. If we do that, there would be no
need to fork the Postgres code: extensions could force the definitions
they want to the nodes they want to handle, as long as these do not
need to touch the in-core query jumble logic, of course.
Allowing extensions to generate code for custom node attributes could be
taken up in 19. What about for 18 we think about exposing AppendJumble,
JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING?
—
Sami Imseih
Amazon Web Services (AWs)
On Tue, Mar 25, 2025 at 8:12 PM Sami Imseih <samimseih@gmail.com> wrote:
So this comes down to forking the Postgres code to do the job. What I
had in mind was a slightly different flow, where we would be able to
push custom node attributes between the header parsing and the
generation of the extension code. If we do that, there would be no
need to fork the Postgres code: extensions could force the definitions
they want to the nodes they want to handle, as long as these do not
need to touch the in-core query jumble logic, of course.Allowing extensions to generate code for custom node attributes could betaken up in 19. What about for 18 we think about exposing AppendJumble,JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING?
+1, I think exposing the node jumbling logic + ability to jumble values for extensions would make a big difference to get into 18, specifically for allowing extensions to do plan ID calculations efficiently.
Attached a rebased version that accounts for the changes in f31aad9b0 and ad9a23bc4, and also makes AppendJumble* available, as well as two helper defines JUMBLE_VALUE and JUMBLE_VALUE_STRING. These are intended to work on values, not nodes, because I think that requiring the caller to have a local "expr" variable doesn't seem like a good API.
I've also intentionally reduced the API surface area and not allowed initializing a jumble state that records constant locations / not exported RecordConstLocations. I think the utility of that seems less clear (possibly out-of-core queryid customization with a future extension hook in jumbleNode) and requires more refinement.
Thoughts?
Thanks,
Lukas
Lukas Fittl
Attachment
On Mon, Mar 31, 2025 at 1:28 PM Lukas Fittl <lukas@fittl.com> wrote: > > On Tue, Mar 25, 2025 at 8:12 PM Sami Imseih <samimseih@gmail.com> wrote: >>> >>> So this comes down to forking the Postgres code to do the job. What I >>> had in mind was a slightly different flow, where we would be able to >>> push custom node attributes between the header parsing and the >>> generation of the extension code. If we do that, there would be no >>> need to fork the Postgres code: extensions could force the definitions >>> they want to the nodes they want to handle, as long as these do not >>> need to touch the in-core query jumble logic, of course. >> >> >> Allowing extensions to generate code for custom node attributes could be >> taken up in 19. What about for 18 we think about exposing AppendJumble, >> JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING? > > > +1, I think exposing the node jumbling logic + ability to jumble values for extensions would make a big difference to getinto 18, specifically for allowing extensions to do plan ID calculations efficiently I think getting these APIs into 18 is super important, especially with the optimizations made in ad9a23bc4; I will reiterate that extension developers should not have to re-invent these wheels :) > I've also intentionally reduced the API surface area and not allowed initializing a jumble state that records constantlocations / not exported RecordConstLocations. > I think the utility of that seems less clear (possibly out-of-core queryid customization with a future extension hook injumbleNode) and requires more refinement. I agree with that primarily because I don't know of the use case at this time in which an extension will need to record a constant location. I reviewed the patch and I only have one comment. I still think we need an Assert inside RecordConstantLocation to make sure clocations is allocated if the caller intends to record constant locations. RecordConstLocation(JumbleState *jstate, int location, bool squashed) { + Assert(jstate->clocations); + -- Sami Imseih Amazon Web Services (AWS)
> I reviewed the patch and I only have one comment. I still think > we need an Assert inside RecordConstantLocation to make sure clocations > is allocated if the caller intends to record constant locations. > > RecordConstLocation(JumbleState *jstate, int location, bool squashed) > { > + Assert(jstate->clocations); > + Here is v3 with the Assert in place -- Sami Imseih Amazon Web Services (AWS)