Thread: Eliminating SPI from RI triggers - take 2
Hi, I had proposed $subject for some RI trigger functions in the last dev cycle [1]. Briefly, the proposal was to stop using an SQL query (using the SPI interface) for RI checks that could be done by directly scanning the primary/unique key index of the referenced table, which must always be there. While acknowledging that the patch showed a clear performance benefit, Tom gave the feedback that doing so only for some RI checks but not others is not very desirable [2]. The other cases include querying the referencing table when deleting from the referenced table to handle the referential action clause. Two main hurdles to not using an SQL query for those cases that I hadn't addressed were: 1) What should the hard-coded plan be? Referencing table may not always have an index on the queried foreign key columns. Even if there is one, it's not clear if scanning it is *always* better than scanning the whole table to find the matching rows. 2) While the RI check functions for RESTRICT and NO ACTION actions issue a `SELECT ... LIMIT 1` query, those for CASCADE and SET actions issue a `UPDATE SET / DELETE`. I had no good idea as to how much of the executor functionality would need to be replicated in order to perform the update/delete actions without leaving the ri_triggers.c module. We had an unconference session to discuss these concerns at this year's PGCon, whose minutes can be found at [3]. Among other suggestions, one was to only stop using the SPI interface to issue the RI check/action queries, while continuing to use the same SQL queries as now. That means creating a copy in ri_triggers.c of the functionality of SPI_prepare(), which creates the CachedPlanSource for the query, and of SPI_execute_plan(), which executes a CachedPlan obtained from that CachedPlanSource to produce the result tuples if any. That may not have the same performance boost as skipping the planner/plancache and the executor altogether, but at least it becomes easier to check the difference between semantic behaviors of an RI query implemented as SQL and another implemented using some hard-coded plan if we choose to do, because the logic would no longer be divided between ri_trigger.c and spi.c. I think that will, at least to some degree, alleviate the concerns that Tom expressed about the previous effort. So, I hacked together a patch (attached 0001) that invents an "RI plan" construct (struct RIPlan) to replace the use of an "SPI plan" (struct _SPI_plan). While the latter encapsulates the CachedPlanSource of an RI query directly, I decided to make it an option for a given RI trigger to specify whether it would like to have its RIPlan store CachedPlanSource if its check is still implemented as an SQL query or something else if the implementation will be a hard-coded plan. RIPlan contains callbacks to create, execute, validate, and free a plan that implements a given RI query. For example, an RI plan for checks implemented as SQL will call the callback ri_SqlStringPlanCreate() to parse the query and allocate a CachedPlanSource and ri_SqlStringPlanExecute() to a CachedPlan and executes it PlannedStmt using the executor interface directly. Remaining callbacks ri_SqlStringPlanIsValid() and ri_SqlStringPlanFree() use CachedPlanIsValid() and DropCachedPlan(), respectively, to validate and free a CachedPlan. With that in place, I decided to rebase my previous patch [1] to use this new interface and the result is attached 0002. One notable improvement over the previous standalone patch is that the snapshot setting logic need no longer be in function implementing the proposed hard-coded plan for RI check triggers. That logic and other configuration needed before executing the plan is now a part of the top-level ri_PerformCheck() function that is shared between various RI plan implementations. So whether an RI check or action is implemented using SQL plan or a hard-code plan, the execution should proceed with the effectively same config/environment. I will continue investigating what to do about points (1) and (2) mentioned above and see if we can do away with using SQL in the remaining cases. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com [1] https://postgr.es/m/CA+HiwqGkfJfYdeq5vHPh6eqPKjSbfpDDY+j-kXYFePQedtSLeg@mail.gmail.com [2] https://postgr.es/m/3400437.1649363527%40sss.pgh.pa.us [3] https://wiki.postgresql.org/wiki/PgCon_2022_Developer_Unconference#Removing_SPI_from_RI_trigger_implementation
Attachment
On Thu, Jun 30, 2022 at 11:23 PM Amit Langote <amitlangote09@gmail.com> wrote: > > I will continue investigating what to do about points (1) and (2) > mentioned above and see if we can do away with using SQL in the > remaining cases. Hi Amit, looks like isolation tests are failing in cfbot: https://cirrus-ci.com/task/6642884727275520 Note also the uninitialized variable warning that cfbot picked up; that may or may not be related. Thanks, --Jacob
On Wed, Jul 6, 2022 at 3:24 AM Jacob Champion <jchampion@timescale.com> wrote: > On Thu, Jun 30, 2022 at 11:23 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > I will continue investigating what to do about points (1) and (2) > > mentioned above and see if we can do away with using SQL in the > > remaining cases. > > Hi Amit, looks like isolation tests are failing in cfbot: > > https://cirrus-ci.com/task/6642884727275520 > > Note also the uninitialized variable warning that cfbot picked up; > that may or may not be related. Thanks for the heads up. Yeah, I noticed the warning when I compiled with a different set of gcc parameters, though not the isolation test failures, so not sure what the bot is running into. Attaching updated patches which fix the warning and a few other issues I noticed. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
On Wed, Jul 6, 2022 at 11:55 AM Amit Langote <amitlangote09@gmail.com> wrote: > On Wed, Jul 6, 2022 at 3:24 AM Jacob Champion <jchampion@timescale.com> wrote: > > On Thu, Jun 30, 2022 at 11:23 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > > > I will continue investigating what to do about points (1) and (2) > > > mentioned above and see if we can do away with using SQL in the > > > remaining cases. > > > > Hi Amit, looks like isolation tests are failing in cfbot: > > > > https://cirrus-ci.com/task/6642884727275520 > > > > Note also the uninitialized variable warning that cfbot picked up; > > that may or may not be related. > > Thanks for the heads up. > > Yeah, I noticed the warning when I compiled with a different set of > gcc parameters, though not the isolation test failures, so not sure > what the bot is running into. > > Attaching updated patches which fix the warning and a few other issues > I noticed. Hmm, cfbot is telling me that detach-partition-concurrently-2 is failing on Cirrus-CI [1]. Will look into it. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com [1] https://cirrus-ci.com/task/5253369525698560?logs=test_world#L317
On Fri, Jul 1, 2022 at 2:23 AM Amit Langote <amitlangote09@gmail.com> wrote: > So, I hacked together a patch (attached 0001) that invents an "RI > plan" construct (struct RIPlan) to replace the use of an "SPI plan" > (struct _SPI_plan). > > With that in place, I decided to rebase my previous patch [1] to use > this new interface and the result is attached 0002. I think inventing something like RIPlan is probably reasonable, but I'm not sure how much it really does to address the objections that were raised previously. How do we know that ri_LookupKeyInPkRel does all the same things that executing a plan would have done? I see that function contains permission-checking logic, for example, as well as snapshot-related logic, and maybe there are other subsystems to worry about, like rules or triggers or row-level security. Maybe there's no answer to that problem other than careful manual verification, because after all the only way to be 100% certain we're doing all the things that would happen if you executed a plan is to execute a plan, which kind of defeats the point of the whole thing. All I'm saying is that I'm not sure that this refactoring in and of itself addresses that concern. As far as 0002 goes, the part I'm most skeptical about is this: +static bool +ri_LookupKeyInPkRelPlanIsValid(RI_Plan *plan) +{ + /* Never store anything that can be invalidated. */ + return true; +} Isn't that leaving rather a lot on the table? ri_LookupKeyInPkRel is going to be called a lot of times and do a lot of things over and over again that maybe only need to be done once, like checking permissions and looking up the operators to use and reopening the index. And all the stuff ExecGetLeafPartitionForKey does too, yikes that's a lot of stuff. Now maybe that's what Tom wants, I don't know. Certainly, the existing SQL-based implementation is going to do that stuff on every call, too; I'm just not sure that's a good thing. I think there's some debate to be had here over what behavior we need to preserve exactly vs. what we can and should change. For instance, it seems clear to me that leaving out permissions checks altogether would be not OK, but if this implementation arranged to cache the results of a permission check and the SQL-based implementations don't, is that OK? Maybe Tom would argue that it isn't, because he considers that a part of the user-visible behavior, but I'm not sure that's the right view of it. I think what we're promising the user is that we will check permissions, not that we're going to do it separately for every trigger firing, or even that every kind of trigger is going to do it exactly the same number of times as every other trigger. I think we need some input from Tom (and perhaps others) on how rigidly we need to maintain the high-level behavior here before we can really say much about whether the implementation is as good as it can be. I suspect, though, that there's more that can be done here in terms of sharing code. For instance, picking on the permissions checking logic, presumably that's something that every non-SQL implementation would need to do. But the rest of what's in ri_LookupKeyInPkRel() is specific to one particular kind of trigger. If we had multiple non-SQL trigger types, we'd want to somehow have common logic for permissions checking for all of them. I also suspect that we ought to have a separation between planning and execution even for non-SQL based things. You don't really have that here. What that ought to look like, though, depends on the answers to the questions above, about how exactly we think we need to reproduce the existing behavior. I find my ego slightly wounded by the comment that "the partition descriptor machinery has a hack that assumes that the queries originating in this module push the latest snapshot in the transaction-snapshot mode." It's true that the partition descriptor machinery gives different answers depending on the active snapshot, but, err, is that a hack, or just a perfectly reasonable design decision? An alternative might be for PartitionDirectoryLookup to take a snapshot as an explicit argument rather than relying on the global variable to get that information from context. I generally feel that we rely too much on global variables where we should be passing around explicit parameters, so if you're just arguing that explicit parameters would be better here, then I agree and just didn't think of it. If you're arguing that making the answer depend on the snapshot is itself a bad idea, I don't agree with that. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > ... I think there's some > debate to be had here over what behavior we need to preserve exactly > vs. what we can and should change. For sure. For example, people occasionally complain because user-defined triggers can defeat RI integrity checks. Should we change that? I dunno, but if we're not using the standard executor then there's at least some room to consider it. I think people would be upset if we stopped firing user triggers at all; but if triggers couldn't defeat RI actions short of throwing a transaction-aborting error, I believe a lot of people would consider that an improvement. > For instance, it seems clear to me > that leaving out permissions checks altogether would be not OK, but if > this implementation arranged to cache the results of a permission > check and the SQL-based implementations don't, is that OK? Maybe Tom > would argue that it isn't, because he considers that a part of the > user-visible behavior, but I'm not sure that's the right view of it. Uh ... if such caching behavior is at all competently implemented, it will be transparent because the cache will notice and respond to events that should change its outputs. So I don't foresee a semantic problem there. It may well be that it's practical to cache permissions-check info for RI checks when it isn't for more general queries, so looking into ideas like that seems well within scope here. (Or then again, maybe we should be building a more general permissions cache?) I'm too tired to have more than that to say right now, but I agree that there is room for discussion about exactly what behavior we want to preserve. regards, tom lane
On Fri, Jul 8, 2022 at 10:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Uh ... if such caching behavior is at all competently implemented, > it will be transparent because the cache will notice and respond to > events that should change its outputs. Well, that assumes that we emit appropriate invalidations in every place where permissions are updated, and take appropriate locks every place where they are checked. I think that the first one might be too optimistic, and the second one is definitely too optimistic. For instance, consider pg_proc_ownercheck. There's no lock of any kind taken on the function here, and at least in typical cases, I don't think the caller takes one either. Compare the extensive tap-dancing around locking and permissions checking in RangeVarGetRelidExtended against the blithe unconcern in FuncnameGetCandidates. I believe that of all the types of SQL objects in the system, only relations have anything like proper interlocking against concurrent DDL. Other examples of not caring at all include LookupCollation() and LookupTypeNameExtended(). There's just no heavyweight locking here at all, and so no invalidation based on sinval messages can ever be reliable. GRANT and REVOKE don't take proper locks, either, even on tables: rhaas=# begin; BEGIN rhaas=*# lock table pgbench_accounts; LOCK TABLE rhaas=*# Then, in another session: rhaas=# create role foo; CREATE ROLE rhaas=# grant select on pgbench_accounts to foo; GRANT rhaas=# Executing "SELECT * FROM pgbench_accounts" in the other session would have blocked, but the GRANT has no problem at all. I don't see that any of this is this patch's job to fix. If nobody's cared enough to fix it any time in the past 20 years, or just didn't want to pay the locking cost, well then we probably don't need to do it now either. But I think it means that even the slightest change in the timing or frequency of permissions checks is in theory a user-visible change, because there are no grounds for assuming that the permissions on any of the objects involved aren't changing while the query is executing. -- Robert Haas EDB: http://www.enterprisedb.com
On Sat, Jul 9, 2022 at 1:15 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jul 1, 2022 at 2:23 AM Amit Langote <amitlangote09@gmail.com> wrote: > > So, I hacked together a patch (attached 0001) that invents an "RI > > plan" construct (struct RIPlan) to replace the use of an "SPI plan" > > (struct _SPI_plan). > > > > With that in place, I decided to rebase my previous patch [1] to use > > this new interface and the result is attached 0002. > Thanks for taking a look at this. I'll try to respond to other points in a separate email, but I wanted to clarify something about below: > I find my ego slightly wounded by the comment that "the partition > descriptor machinery has a hack that assumes that the queries > originating in this module push the latest snapshot in the > transaction-snapshot mode." It's true that the partition descriptor > machinery gives different answers depending on the active snapshot, > but, err, is that a hack, or just a perfectly reasonable design > decision? I think my calling it a hack of "partition descriptor machinery" is not entirely fair (sorry), because it's talking about the following comment in find_inheritance_children_extended(), which describes it as being a hack, so I mentioned the word "hack" in my comment too: /* * Cope with partitions concurrently being detached. When we see a * partition marked "detach pending", we omit it from the returned set * of visible partitions if caller requested that and the tuple's xmin * does not appear in progress to the active snapshot. (If there's no * active snapshot set, that means we're not running a user query, so * it's OK to always include detached partitions in that case; if the * xmin is still running to the active snapshot, then the partition * has not been detached yet and so we include it.) * * The reason for this hack is that we want to avoid seeing the * partition as alive in RI queries during REPEATABLE READ or * SERIALIZABLE transactions: such queries use a different snapshot * than the one used by regular (user) queries. */ That bit came in to make DETACH CONCURRENTLY produce sane answers for RI queries in some cases. I guess my comment should really have said something like: HACK: find_inheritance_children_extended() has a hack that assumes that the queries originating in this module push the latest snapshot in transaction-snapshot mode. > An alternative might be for PartitionDirectoryLookup to take > a snapshot as an explicit argument rather than relying on the global > variable to get that information from context. I generally feel that > we rely too much on global variables where we should be passing around > explicit parameters, so if you're just arguing that explicit > parameters would be better here, then I agree and just didn't think of > it. If you're arguing that making the answer depend on the snapshot is > itself a bad idea, I don't agree with that. No, I'm not arguing that using a snapshot there is wrong and haven't really thought hard about an alternative. I tend to agree passing a snapshot explicitly might be better than using ActiveSnapshot stuff for this. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
On Wed, Jul 13, 2022 at 8:59 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Sat, Jul 9, 2022 at 1:15 AM Robert Haas <robertmhaas@gmail.com> wrote: > Thanks for taking a look at this. I'll try to respond to other points > in a separate email, but I wanted to clarify something about below: > > > I find my ego slightly wounded by the comment that "the partition > > descriptor machinery has a hack that assumes that the queries > > originating in this module push the latest snapshot in the > > transaction-snapshot mode." It's true that the partition descriptor > > machinery gives different answers depending on the active snapshot, > > but, err, is that a hack, or just a perfectly reasonable design > > decision? > > I think my calling it a hack of "partition descriptor machinery" is > not entirely fair (sorry), because it's talking about the following > comment in find_inheritance_children_extended(), which describes it as > being a hack, so I mentioned the word "hack" in my comment too: > > /* > * Cope with partitions concurrently being detached. When we see a > * partition marked "detach pending", we omit it from the returned set > * of visible partitions if caller requested that and the tuple's xmin > * does not appear in progress to the active snapshot. (If there's no > * active snapshot set, that means we're not running a user query, so > * it's OK to always include detached partitions in that case; if the > * xmin is still running to the active snapshot, then the partition > * has not been detached yet and so we include it.) > * > * The reason for this hack is that we want to avoid seeing the > * partition as alive in RI queries during REPEATABLE READ or > * SERIALIZABLE transactions: such queries use a different snapshot > * than the one used by regular (user) queries. > */ > > That bit came in to make DETACH CONCURRENTLY produce sane answers for > RI queries in some cases. > > I guess my comment should really have said something like: > > HACK: find_inheritance_children_extended() has a hack that assumes > that the queries originating in this module push the latest snapshot > in transaction-snapshot mode. Posting a new version with this bit fixed; cfbot complained that 0002 needed a rebase over 3592e0ff98. I will try to come up with a patch to enhance the PartitionDirectory interface to allow passing the snapshot to use when scanning pg_inherits explicitly, so we won't need the above "hack". -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
On Thu, Aug 4, 2022 at 1:05 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Wed, Jul 13, 2022 at 8:59 PM Amit Langote <amitlangote09@gmail.com> wrote: > > That bit came in to make DETACH CONCURRENTLY produce sane answers for > > RI queries in some cases. > > > > I guess my comment should really have said something like: > > > > HACK: find_inheritance_children_extended() has a hack that assumes > > that the queries originating in this module push the latest snapshot > > in transaction-snapshot mode. > > Posting a new version with this bit fixed; cfbot complained that 0002 > needed a rebase over 3592e0ff98. > > I will try to come up with a patch to enhance the PartitionDirectory > interface to allow passing the snapshot to use when scanning > pg_inherits explicitly, so we won't need the above "hack". Sorry about the delay. So I came up with such a patch that is attached as 0003. The main problem I want to fix with it is the need for RI_FKey_check() to "force"-push the latest snapshot that the PartitionDesc code wants to use to correctly include or omit a detach-pending partition from the view of that function's RI query. Scribbling on ActiveSnapshot that way means that *all* scans involved in the execution of that query now see a snapshot that they shouldn't likely be seeing; a bug resulting from this has been demonstrated in a test case added by the commit 00cb86e75d. The fix is to make RI_FKey_check(), or really its RI_Plan's execution function ri_LookupKeyInPkRel() added by patch 0002, pass the latest snapshot explicitly as a parameter of PartitionDirectoryLookup(), which passes it down to the PartitionDesc code. No need to manipulate ActiveSnapshot. The actual fix is in patch 0004, which I extracted out of 0002 to keep the latter a mere refactoring patch without any semantic changes (though a bit more on that below). BTW, I don't know of a way to back-patch a fix like this for the bug, because there is no way other than ActiveSnapshot to pass the desired snapshot to the PartitionDesc code if the only way we get to that code is by executing an SQL query plan. 0003 moves the relevant logic out of find_inheritance_children_extended() into its callers. The logic of deciding which snapshot to use to determine if a detach-pending partition should indeed be omitted from the consideration of a caller based on the result of checking the visibility of the corresponding pg_inherits row with the snapshot; it just uses ActiveSnapshot now. Given the problems with using ActiveSnapshot mentioned above, I think it is better to make the callers decide the snapshot and pass it using a parameter named omit_detached_snapshot. Only PartitionDesc code actually cares about sending anything but the parent query's ActiveSnapshot, so the PartitionDesc and PartitionDirectory interface has been changed to add the same omit_detached_snapshot parameter. find_inheritance_children(), the other caller used in many sites that look at a table's partitions, defaults to using ActiveSnapshot, which does not seem problematic. Furthermore, only RI_FKey_check() needs to pass anything other than ActiveSnapshot, so other users of PartitionDesc, like user queries, still default to using the ActiveSnapshot, which doesn't have any known problems either. 0001 and 0002 are mostly unchanged in this version, except I took out the visibility bug-fix from 0002 into 0004 described above, which looks better using the interface added by 0003 anyway. I need to address the main concern that it's still hard to be sure that the patch in its current form doesn't break any user-level semantics of these RI check triggers and other concerns about the implementation that Robert expressed in [1]. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com [1] https://www.postgresql.org/message-id/CA%2BTgmoaiTNj4DgQy42OT9JmTTP1NWcMV%2Bke0i%3D%2Ba7%3DVgnzqGXw%40mail.gmail.com
Attachment
On Thu, Sep 29, 2022 at 1:46 PM Amit Langote <amitlangote09@gmail.com> wrote: > Sorry about the delay. > > So I came up with such a patch that is attached as 0003. > > The main problem I want to fix with it is the need for RI_FKey_check() > to "force"-push the latest snapshot that the PartitionDesc code wants > to use to correctly include or omit a detach-pending partition from > the view of that function's RI query. Scribbling on ActiveSnapshot > that way means that *all* scans involved in the execution of that > query now see a snapshot that they shouldn't likely be seeing; a bug > resulting from this has been demonstrated in a test case added by the > commit 00cb86e75d. > > The fix is to make RI_FKey_check(), or really its RI_Plan's execution > function ri_LookupKeyInPkRel() added by patch 0002, pass the latest > snapshot explicitly as a parameter of PartitionDirectoryLookup(), > which passes it down to the PartitionDesc code. No need to manipulate > ActiveSnapshot. The actual fix is in patch 0004, which I extracted > out of 0002 to keep the latter a mere refactoring patch without any > semantic changes (though a bit more on that below). BTW, I don't know > of a way to back-patch a fix like this for the bug, because there is > no way other than ActiveSnapshot to pass the desired snapshot to the > PartitionDesc code if the only way we get to that code is by executing > an SQL query plan. > > 0003 moves the relevant logic out of > find_inheritance_children_extended() into its callers. The logic of > deciding which snapshot to use to determine if a detach-pending > partition should indeed be omitted from the consideration of a caller > based on the result of checking the visibility of the corresponding > pg_inherits row with the snapshot; it just uses ActiveSnapshot now. > Given the problems with using ActiveSnapshot mentioned above, I think > it is better to make the callers decide the snapshot and pass it using > a parameter named omit_detached_snapshot. Only PartitionDesc code > actually cares about sending anything but the parent query's > ActiveSnapshot, so the PartitionDesc and PartitionDirectory interface > has been changed to add the same omit_detached_snapshot parameter. > find_inheritance_children(), the other caller used in many sites that > look at a table's partitions, defaults to using ActiveSnapshot, which > does not seem problematic. Furthermore, only RI_FKey_check() needs to > pass anything other than ActiveSnapshot, so other users of > PartitionDesc, like user queries, still default to using the > ActiveSnapshot, which doesn't have any known problems either. > > 0001 and 0002 are mostly unchanged in this version, except I took out > the visibility bug-fix from 0002 into 0004 described above, which > looks better using the interface added by 0003 anyway. I need to > address the main concern that it's still hard to be sure that the > patch in its current form doesn't break any user-level semantics of > these RI check triggers and other concerns about the implementation > that Robert expressed in [1]. Oops, I apparently posted the wrong 0004, containing a bug that crashes `make check`. Fixed version attached. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
On Thu, Sep 29, 2022 at 4:43 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Thu, Sep 29, 2022 at 1:46 PM Amit Langote <amitlangote09@gmail.com> wrote: > > Sorry about the delay. > > > > So I came up with such a patch that is attached as 0003. > > > > The main problem I want to fix with it is the need for RI_FKey_check() > > to "force"-push the latest snapshot that the PartitionDesc code wants > > to use to correctly include or omit a detach-pending partition from > > the view of that function's RI query. Scribbling on ActiveSnapshot > > that way means that *all* scans involved in the execution of that > > query now see a snapshot that they shouldn't likely be seeing; a bug > > resulting from this has been demonstrated in a test case added by the > > commit 00cb86e75d. > > > > The fix is to make RI_FKey_check(), or really its RI_Plan's execution > > function ri_LookupKeyInPkRel() added by patch 0002, pass the latest > > snapshot explicitly as a parameter of PartitionDirectoryLookup(), > > which passes it down to the PartitionDesc code. No need to manipulate > > ActiveSnapshot. The actual fix is in patch 0004, which I extracted > > out of 0002 to keep the latter a mere refactoring patch without any > > semantic changes (though a bit more on that below). BTW, I don't know > > of a way to back-patch a fix like this for the bug, because there is > > no way other than ActiveSnapshot to pass the desired snapshot to the > > PartitionDesc code if the only way we get to that code is by executing > > an SQL query plan. > > > > 0003 moves the relevant logic out of > > find_inheritance_children_extended() into its callers. The logic of > > deciding which snapshot to use to determine if a detach-pending > > partition should indeed be omitted from the consideration of a caller > > based on the result of checking the visibility of the corresponding > > pg_inherits row with the snapshot; it just uses ActiveSnapshot now. > > Given the problems with using ActiveSnapshot mentioned above, I think > > it is better to make the callers decide the snapshot and pass it using > > a parameter named omit_detached_snapshot. Only PartitionDesc code > > actually cares about sending anything but the parent query's > > ActiveSnapshot, so the PartitionDesc and PartitionDirectory interface > > has been changed to add the same omit_detached_snapshot parameter. > > find_inheritance_children(), the other caller used in many sites that > > look at a table's partitions, defaults to using ActiveSnapshot, which > > does not seem problematic. Furthermore, only RI_FKey_check() needs to > > pass anything other than ActiveSnapshot, so other users of > > PartitionDesc, like user queries, still default to using the > > ActiveSnapshot, which doesn't have any known problems either. > > > > 0001 and 0002 are mostly unchanged in this version, except I took out > > the visibility bug-fix from 0002 into 0004 described above, which > > looks better using the interface added by 0003 anyway. I need to > > address the main concern that it's still hard to be sure that the > > patch in its current form doesn't break any user-level semantics of > > these RI check triggers and other concerns about the implementation > > that Robert expressed in [1]. > > Oops, I apparently posted the wrong 0004, containing a bug that > crashes `make check`. > > Fixed version attached. Here's another version that hopefully fixes the crash reported by Cirrus CI [1] that is not reliably reproducible. I suspect it may have to do with error_context_stack not being reset when ri_LookupKeyInPkRel() does an early return; the `return false` in that case was wrong too: @@ -2693,7 +2693,7 @@ ri_LookupKeyInPkRel(struct RI_Plan *plan, * looking for. */ if (leaf_pk_rel == NULL) - return false; + goto done; ... +done: /* * Pop the error context stack */ -- Thanks, Amit Langote EDB: http://www.enterprisedb.com [1] https://cirrus-ci.com/task/4901906421121024 (permalink?)
Attachment
On Thu, Sep 29, 2022 at 6:09 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Thu, Sep 29, 2022 at 4:43 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Thu, Sep 29, 2022 at 1:46 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > Sorry about the delay. > > > > > > So I came up with such a patch that is attached as 0003. > > > > > > The main problem I want to fix with it is the need for RI_FKey_check() > > > to "force"-push the latest snapshot that the PartitionDesc code wants > > > to use to correctly include or omit a detach-pending partition from > > > the view of that function's RI query. Scribbling on ActiveSnapshot > > > that way means that *all* scans involved in the execution of that > > > query now see a snapshot that they shouldn't likely be seeing; a bug > > > resulting from this has been demonstrated in a test case added by the > > > commit 00cb86e75d. > > > > > > The fix is to make RI_FKey_check(), or really its RI_Plan's execution > > > function ri_LookupKeyInPkRel() added by patch 0002, pass the latest > > > snapshot explicitly as a parameter of PartitionDirectoryLookup(), > > > which passes it down to the PartitionDesc code. No need to manipulate > > > ActiveSnapshot. The actual fix is in patch 0004, which I extracted > > > out of 0002 to keep the latter a mere refactoring patch without any > > > semantic changes (though a bit more on that below). BTW, I don't know > > > of a way to back-patch a fix like this for the bug, because there is > > > no way other than ActiveSnapshot to pass the desired snapshot to the > > > PartitionDesc code if the only way we get to that code is by executing > > > an SQL query plan. > > > > > > 0003 moves the relevant logic out of > > > find_inheritance_children_extended() into its callers. The logic of > > > deciding which snapshot to use to determine if a detach-pending > > > partition should indeed be omitted from the consideration of a caller > > > based on the result of checking the visibility of the corresponding > > > pg_inherits row with the snapshot; it just uses ActiveSnapshot now. > > > Given the problems with using ActiveSnapshot mentioned above, I think > > > it is better to make the callers decide the snapshot and pass it using > > > a parameter named omit_detached_snapshot. Only PartitionDesc code > > > actually cares about sending anything but the parent query's > > > ActiveSnapshot, so the PartitionDesc and PartitionDirectory interface > > > has been changed to add the same omit_detached_snapshot parameter. > > > find_inheritance_children(), the other caller used in many sites that > > > look at a table's partitions, defaults to using ActiveSnapshot, which > > > does not seem problematic. Furthermore, only RI_FKey_check() needs to > > > pass anything other than ActiveSnapshot, so other users of > > > PartitionDesc, like user queries, still default to using the > > > ActiveSnapshot, which doesn't have any known problems either. > > > > > > 0001 and 0002 are mostly unchanged in this version, except I took out > > > the visibility bug-fix from 0002 into 0004 described above, which > > > looks better using the interface added by 0003 anyway. I need to > > > address the main concern that it's still hard to be sure that the > > > patch in its current form doesn't break any user-level semantics of > > > these RI check triggers and other concerns about the implementation > > > that Robert expressed in [1]. > > > > Oops, I apparently posted the wrong 0004, containing a bug that > > crashes `make check`. > > > > Fixed version attached. > > Here's another version that hopefully fixes the crash reported by > Cirrus CI [1] that is not reliably reproducible. And cfbot #1, which failed a bit after the above one, is not happy with my failing to include utils/snapshot.h in a partdesc.h to which I added: @@ -65,9 +66,11 @@ typedef struct PartitionDescData extern PartitionDesc RelationGetPartitionDesc(Relation rel, bool omit_detached); +extern PartitionDesc RelationGetPartitionDescExt(Relation rel, bool omit_detached, + Snapshot omit_detached_snapshot); extern PartitionDirectory CreatePartitionDirectory(MemoryContext mcxt, bool omit_detached); -extern PartitionDesc PartitionDirectoryLookup(PartitionDirectory, Relation); +extern PartitionDesc PartitionDirectoryLookup(PartitionDirectory, Relation, Snapshot); So, here's a final revision for today. Sorry for the noise. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
Hi, On 2022-09-29 18:18:10 +0900, Amit Langote wrote: > So, here's a final revision for today. Sorry for the noise. This appears to fail on 32bit systems. Seems the new test is indeed worthwhile... https://cirrus-ci.com/task/6581521615159296?logs=test_world_32#L406 [19:12:24.452] Summary of Failures: [19:12:24.452] [19:12:24.452] 2/243 postgresql:main / main/regress FAIL 45.08s (exitstatus 1) [19:12:24.452] 4/243 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade ERROR 71.96s [19:12:24.452] 32/243 postgresql:recovery / recovery/027_stream_regress ERROR 45.84s Unfortunately ccf36ea2580f66abbc37f27d8c296861ffaad9bf seems to not have suceeded in capture the test files of the 32bit build (and perhaps broke it for 64bit builds as well?), so I can't see the regression.diffs contents. [19:12:24.387] alter_table ... FAILED 4546 ms ... [19:12:24.387] ======================== [19:12:24.387] 1 of 211 tests failed. [19:12:24.387] ======================== [19:12:24.387] ... Greetings, Andres Freund
Hi, On 2022-10-01 18:21:15 -0700, Andres Freund wrote: > On 2022-09-29 18:18:10 +0900, Amit Langote wrote: > > So, here's a final revision for today. Sorry for the noise. > > This appears to fail on 32bit systems. Seems the new test is indeed > worthwhile... > > https://cirrus-ci.com/task/6581521615159296?logs=test_world_32#L406 > > [19:12:24.452] Summary of Failures: > [19:12:24.452] > [19:12:24.452] 2/243 postgresql:main / main/regress FAIL 45.08s(exit status 1) > [19:12:24.452] 4/243 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade ERROR 71.96s > [19:12:24.452] 32/243 postgresql:recovery / recovery/027_stream_regress ERROR 45.84s > > Unfortunately ccf36ea2580f66abbc37f27d8c296861ffaad9bf seems to not have > suceeded in capture the test files of the 32bit build (and perhaps broke it > for 64bit builds as well?), so I can't see the regression.diffs contents. Oh, that appears to have been an issue on the CI side (*), while uploading the logs. The previous run did catch the error: diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/alter_table.out /tmp/cirrus-ci-build/build-32/testrun/main/regress/results/alter_table.out --- /tmp/cirrus-ci-build/src/test/regress/expected/alter_table.out 2022-09-30 15:05:49.930613669 +0000 +++ /tmp/cirrus-ci-build/build-32/testrun/main/regress/results/alter_table.out 2022-09-30 15:11:21.050383258 +0000 @@ -672,6 +672,8 @@ ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable; -- Check it actually works INSERT INTO FKTABLE VALUES(42); -- should succeed +ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" +DETAIL: Key (ftest1)=(42) is not present in table "pktable". INSERT INTO FKTABLE VALUES(43); -- should fail ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" DETAIL: Key (ftest1)=(43) is not present in table "pktable". Greetings, Andres Freund * Error from upload stream: rpc error: code = Unknown desc =
On Sun, Oct 2, 2022 at 10:24 AM Andres Freund <andres@anarazel.de> wrote: > On 2022-10-01 18:21:15 -0700, Andres Freund wrote: > > On 2022-09-29 18:18:10 +0900, Amit Langote wrote: > > > So, here's a final revision for today. Sorry for the noise. > > > > This appears to fail on 32bit systems. Seems the new test is indeed > > worthwhile... > > > > https://cirrus-ci.com/task/6581521615159296?logs=test_world_32#L406 > > > > [19:12:24.452] Summary of Failures: > > [19:12:24.452] > > [19:12:24.452] 2/243 postgresql:main / main/regress FAIL 45.08s(exit status 1) > > [19:12:24.452] 4/243 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade ERROR 71.96s > > [19:12:24.452] 32/243 postgresql:recovery / recovery/027_stream_regress ERROR 45.84s > > > > Unfortunately ccf36ea2580f66abbc37f27d8c296861ffaad9bf seems to not have > > suceeded in capture the test files of the 32bit build (and perhaps broke it > > for 64bit builds as well?), so I can't see the regression.diffs contents. > > Oh, that appears to have been an issue on the CI side (*), while uploading the > logs. The previous run did catch the error: > > diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/alter_table.out /tmp/cirrus-ci-build/build-32/testrun/main/regress/results/alter_table.out > --- /tmp/cirrus-ci-build/src/test/regress/expected/alter_table.out 2022-09-30 15:05:49.930613669 +0000 > +++ /tmp/cirrus-ci-build/build-32/testrun/main/regress/results/alter_table.out 2022-09-30 15:11:21.050383258 +0000 > @@ -672,6 +672,8 @@ > ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable; > -- Check it actually works > INSERT INTO FKTABLE VALUES(42); -- should succeed > +ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" > +DETAIL: Key (ftest1)=(42) is not present in table "pktable". > INSERT INTO FKTABLE VALUES(43); -- should fail > ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" > DETAIL: Key (ftest1)=(43) is not present in table "pktable". Thanks for the heads up. Hmm, this I am not sure how to reproduce on my own, so I am currently left with second-guessing what may be going wrong on 32 bit machines with whichever of the 4 patches. For now, I'll just post 0001, which I am claiming has no semantic changes (proof pending), to rule out that that one's responsible. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
On Fri, Oct 7, 2022 at 6:26 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Sun, Oct 2, 2022 at 10:24 AM Andres Freund <andres@anarazel.de> wrote: > > On 2022-10-01 18:21:15 -0700, Andres Freund wrote: > > > On 2022-09-29 18:18:10 +0900, Amit Langote wrote: > > > > So, here's a final revision for today. Sorry for the noise. > > > > > > This appears to fail on 32bit systems. Seems the new test is indeed > > > worthwhile... > > > > > > https://cirrus-ci.com/task/6581521615159296?logs=test_world_32#L406 > > > > > > [19:12:24.452] Summary of Failures: > > > [19:12:24.452] > > > [19:12:24.452] 2/243 postgresql:main / main/regress FAIL 45.08s(exit status 1) > > > [19:12:24.452] 4/243 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade ERROR 71.96s > > > [19:12:24.452] 32/243 postgresql:recovery / recovery/027_stream_regress ERROR 45.84s > > > > > > Unfortunately ccf36ea2580f66abbc37f27d8c296861ffaad9bf seems to not have > > > suceeded in capture the test files of the 32bit build (and perhaps broke it > > > for 64bit builds as well?), so I can't see the regression.diffs contents. > > > > Oh, that appears to have been an issue on the CI side (*), while uploading the > > logs. The previous run did catch the error: > > > > diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/alter_table.out /tmp/cirrus-ci-build/build-32/testrun/main/regress/results/alter_table.out > > --- /tmp/cirrus-ci-build/src/test/regress/expected/alter_table.out 2022-09-30 15:05:49.930613669 +0000 > > +++ /tmp/cirrus-ci-build/build-32/testrun/main/regress/results/alter_table.out 2022-09-30 15:11:21.050383258 +0000 > > @@ -672,6 +672,8 @@ > > ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable; > > -- Check it actually works > > INSERT INTO FKTABLE VALUES(42); -- should succeed > > +ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" > > +DETAIL: Key (ftest1)=(42) is not present in table "pktable". > > INSERT INTO FKTABLE VALUES(43); -- should fail > > ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" > > DETAIL: Key (ftest1)=(43) is not present in table "pktable". > > Thanks for the heads up. Hmm, this I am not sure how to reproduce on > my own, so I am currently left with second-guessing what may be going > wrong on 32 bit machines with whichever of the 4 patches. > > For now, I'll just post 0001, which I am claiming has no semantic > changes (proof pending), to rule out that that one's responsible. Nope, not 0001. Here's 0001+0002. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
On 2022-Oct-07, Amit Langote wrote: > > Thanks for the heads up. Hmm, this I am not sure how to reproduce on > > my own, so I am currently left with second-guessing what may be going > > wrong on 32 bit machines with whichever of the 4 patches. > > > > For now, I'll just post 0001, which I am claiming has no semantic > > changes (proof pending), to rule out that that one's responsible. > > Nope, not 0001. Here's 0001+0002. Please note that you can set up a github repository so that cirrus-ci tests whatever patches you like, without having to post them to pg-hackers. See src/tools/ci/README, it takes three minutes if you already have the account and repository. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On Fri, Oct 7, 2022 at 19:15 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2022-Oct-07, Amit Langote wrote:
> > Thanks for the heads up. Hmm, this I am not sure how to reproduce on
> > my own, so I am currently left with second-guessing what may be going
> > wrong on 32 bit machines with whichever of the 4 patches.
> >
> > For now, I'll just post 0001, which I am claiming has no semantic
> > changes (proof pending), to rule out that that one's responsible.
>
> Nope, not 0001. Here's 0001+0002.
Please note that you can set up a github repository so that cirrus-ci
tests whatever patches you like, without having to post them to
pg-hackers. See src/tools/ci/README, it takes three minutes if you
already have the account and repository.
Ah, that’s right. Will do so, thanks for the suggestion.
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
EDB: http://www.enterprisedb.com
On Fri, Oct 7, 2022 at 7:17 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Fri, Oct 7, 2022 at 19:15 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> On 2022-Oct-07, Amit Langote wrote: >> > > Thanks for the heads up. Hmm, this I am not sure how to reproduce on >> > > my own, so I am currently left with second-guessing what may be going >> > > wrong on 32 bit machines with whichever of the 4 patches. >> > > >> > > For now, I'll just post 0001, which I am claiming has no semantic >> > > changes (proof pending), to rule out that that one's responsible. >> > >> > Nope, not 0001. Here's 0001+0002. I had forgotten to actually attach anything with that email. >> Please note that you can set up a github repository so that cirrus-ci >> tests whatever patches you like, without having to post them to >> pg-hackers. See src/tools/ci/README, it takes three minutes if you >> already have the account and repository. > > Ah, that’s right. Will do so, thanks for the suggestion. I'm waiting to hear from GitHub Support to resolve an error I'm facing trying to add Cirrus CI to my account. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
On Thu, Sep 29, 2022 at 12:47 AM Amit Langote <amitlangote09@gmail.com> wrote: > [ patches ] While looking over this thread I came across this code: /* For data reading, executor always omits detached partitions */ if (estate->es_partition_directory == NULL) estate->es_partition_directory = CreatePartitionDirectory(estate->es_query_cxt, false); But CreatePartitionDirectory is declared like this: extern PartitionDirectory CreatePartitionDirectory(MemoryContext mcxt, bool omit_detached); So the comment seems to say the opposite of what the code does. The code seems to match the explanation in the commit message for 71f4c8c6f74ba021e55d35b1128d22fb8c6e1629, so I am guessing that perhaps s/always/never/ is needed here. I also noticed that ExecCreatePartitionPruneState no longer exists in the code but is still referenced in src/test/modules/delay_execution/specs/partition-addition.spec Regarding 0003, it seems unfortunate that find_inheritance_children_extended() will now have 6 arguments 4 of which have to do with detached partition handling. That is a lot of detached partition handling, and it's hard to reason about. I don't see an obvious way of simplifying things very much, but I wonder if we could at least have the new omit_detached_snapshot snapshot replace the existing bool omit_detached flag. Something like the attached incremental patch. Probably we need to go further than the attached, though. I don't think that PartitionDirectoryLookup() should be getting any new arguments. The whole point of that function is that it's supposed to ensure that the returned value is stable, and the comments say so. But with these changes it isn't any more, because it depends on the snapshot you pass. It seems fine to specify when you create the partition directory that you want it to show a different, still-stable view of the world, but as written, it seems to me to undermine the idea that the return value is expected to be stable at all. Is there a way we can avoid that? -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
On Wed, Oct 12, 2022 at 2:27 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Sep 29, 2022 at 12:47 AM Amit Langote <amitlangote09@gmail.com> wrote: > > [ patches ] > > While looking over this thread I came across this code: Thanks for looking. > /* For data reading, executor always omits detached partitions */ > if (estate->es_partition_directory == NULL) > estate->es_partition_directory = > CreatePartitionDirectory(estate->es_query_cxt, false); > > But CreatePartitionDirectory is declared like this: > > extern PartitionDirectory CreatePartitionDirectory(MemoryContext mcxt, > bool omit_detached); > > So the comment seems to say the opposite of what the code does. The > code seems to match the explanation in the commit message for > 71f4c8c6f74ba021e55d35b1128d22fb8c6e1629, so I am guessing that > perhaps s/always/never/ is needed here. I think you are right. In commit 8aba9322511 that fixed a bug in this area, we have this hunk: - /* Executor must always include detached partitions */ + /* For data reading, executor always omits detached partitions */ if (estate->es_partition_directory == NULL) estate->es_partition_directory = - CreatePartitionDirectory(estate->es_query_cxt, true); + CreatePartitionDirectory(estate->es_query_cxt, false); The same commit also renamed the include_detached parameter of CreatePartitionDirectory() to omit_detached but the comment change didn't quite match with that. I will fix this and other related comments to be consistent about using the word "omit". Will include them in the updated 0003. > I also noticed that ExecCreatePartitionPruneState no longer exists in > the code but is still referenced in > src/test/modules/delay_execution/specs/partition-addition.spec It looks like we missed that reference in commit 297daa9d435 wherein we renamed it to just CreatePartitionPruneState(). I have posted a patch to fix this. > Regarding 0003, it seems unfortunate that > find_inheritance_children_extended() will now have 6 arguments 4 of > which have to do with detached partition handling. That is a lot of > detached partition handling, and it's hard to reason about. I don't > see an obvious way of simplifying things very much, but I wonder if we > could at least have the new omit_detached_snapshot snapshot replace > the existing bool omit_detached flag. Something like the attached > incremental patch. Yeah, I was wondering the same too and don't see a reason why we couldn't do it that way. I have merged your incremental patch into 0003. > Probably we need to go further than the attached, though. I don't > think that PartitionDirectoryLookup() should be getting any new > arguments. The whole point of that function is that it's supposed to > ensure that the returned value is stable, and the comments say so. But > with these changes it isn't any more, because it depends on the > snapshot you pass. It seems fine to specify when you create the > partition directory that you want it to show a different, still-stable > view of the world, but as written, it seems to me to undermine the > idea that the return value is expected to be stable at all. Is there a > way we can avoid that? Ok, I think it makes sense to have CreatePartitionDirectory take in the snapshot and store it in PartitionDirectoryData for use during each subsequent PartitionDirectoryLookup(). So we'll be replacing the current omit_detached flag in PartitionDirectoryData, just as we are doing for the interface functions. Done that way in 0003. Regarding 0002, which introduces ri_LookupKeyInPkRel(), I realized that it may have been initializing the ScanKeys wrongly. It was using ScanKeyInit(), which uses InvalidOid for sk_subtype, causing the index AM / btree code to use the wrong comparison functions when PK and FK column types don't match. That may have been a reason for 32-bit machine failures pointed out by Andres upthread. I've fixed it by using ScanKeyEntryInitialize() to pass the opfamily-specified right argument (FK column) type OID. Attached updated patches. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
On Sat, Oct 15, 2022 at 1:47 AM Amit Langote <amitlangote09@gmail.com> wrote: > I have merged your incremental patch into 0003. Note that if someone goes to commit 0003, they would have no idea that I contributed to the effort. You should probably try to keep a running list of co-authors, reviewers, or other people that need to be acknowledged in your draft commit messages. On that note, I think that the commit messages for 0001 and to some extent 0002 need some more work. In particular, it seems like the commit message for 0001 is entirely concerned with what the patch does and says nothing about why it's a good idea. In my opinion, a good commit message needs to do both, ideally but not always in less space than this patch takes to do only one of those things. 0002 has the same problem to a lesser degree, since it is perhaps not so hard to infer that the reason for avoiding the SQL query is performance. I am wondering if the ordering for this patch series needs to be rethought. The commit message for 0004 reads as if it is fixing a bug introduced by earlier patches in the series. If that is not correct, maybe it can be made clearer. If it is correct, then that's not good, because we don't want to commit buggy patches and then make follow-up commits to remove the bugs. If a planned commit needs new infrastructure to avoid being buggy, the commits adding that infrastructure should happen first. But I think the bigger problem for this patch set is that the design-level feedback from https://www.postgresql.org/message-id/CA%2BTgmoaiTNj4DgQy42OT9JmTTP1NWcMV%2Bke0i%3D%2Ba7%3DVgnzqGXw%40mail.gmail.com hasn't really been addressed, AFAICS. ri_LookupKeyInPkRelPlanIsValid is still trivial in v7, and that still seems wrong to me. And I still don't know how we're going to avoid changing the semantics in ways that are undesirable, or even knowing precisely what we did change. If we don't have answers to those questions, then I suspect that this patch set isn't going anywhere. -- Robert Haas EDB: http://www.enterprisedb.com
Hi, On 2022-10-15 14:47:05 +0900, Amit Langote wrote: > Attached updated patches. These started to fail to build recently: [04:43:33.046] ccache cc -Isrc/backend/postgres_lib.a.p -Isrc/include -I../src/include -Isrc/include/storage -Isrc/include/utils-Isrc/include/catalog -Isrc/include/nodes -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall-Winvalid-pch -g -fno-strict-aliasing -fwrapv -fexcess-precision=standard -D_GNU_SOURCE -Wmissing-prototypes -Wpointer-arith-Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wshadow=compatible-local-Wformat-security -Wdeclaration-after-statement -Wno-format-truncation -Wno-stringop-truncation-fPIC -pthread -DBUILDING_DLL -MD -MQ src/backend/postgres_lib.a.p/executor_execPartition.c.o -MFsrc/backend/postgres_lib.a.p/executor_execPartition.c.o.d -o src/backend/postgres_lib.a.p/executor_execPartition.c.o -c../src/backend/executor/execPartition.c [04:43:33.046] ../src/backend/executor/execPartition.c: In function ‘ExecGetLeafPartitionForKey’: [04:43:33.046] ../src/backend/executor/execPartition.c:1679:19: error: too few arguments to function ‘build_attrmap_by_name_if_req’ [04:43:33.046] 1679 | AttrMap *map = build_attrmap_by_name_if_req(RelationGetDescr(root_rel), [04:43:33.046] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ [04:43:33.046] In file included from ../src/include/access/tupconvert.h:17, [04:43:33.046] from ../src/include/nodes/execnodes.h:32, [04:43:33.046] from ../src/include/executor/execPartition.h:16, [04:43:33.046] from ../src/backend/executor/execPartition.c:21: [04:43:33.046] ../src/include/access/attmap.h:47:17: note: declared here [04:43:33.046] 47 | extern AttrMap *build_attrmap_by_name_if_req(TupleDesc indesc, [04:43:33.046] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ Regards, Andres
On Mon, 17 Oct 2022 at 14:59, Robert Haas <robertmhaas@gmail.com> wrote: > > On Sat, Oct 15, 2022 at 1:47 AM Amit Langote <amitlangote09@gmail.com> wrote: > > But I think the bigger problem for this patch set is that the > design-level feedback from > https://www.postgresql.org/message-id/CA%2BTgmoaiTNj4DgQy42OT9JmTTP1NWcMV%2Bke0i%3D%2Ba7%3DVgnzqGXw%40mail.gmail.com > hasn't really been addressed, AFAICS. ri_LookupKeyInPkRelPlanIsValid > is still trivial in v7, and that still seems wrong to me. And I still > don't know how we're going to avoid changing the semantics in ways > that are undesirable, or even knowing precisely what we did change. If > we don't have answers to those questions, then I suspect that this > patch set isn't going anywhere. Amit, do you plan to work on this patch for this commitfest (and therefore this release?). And do you think it has a realistic chance of being ready for commit this month? It looks to me like you have some good feedback and can progress and are unlikely to finish this patch for this release. In which case maybe we can move it forward to the next release? -- Gregory Stark As Commitfest Manager
Hi Greg, On Tue, Mar 21, 2023 at 3:54 AM Gregory Stark (as CFM) <stark.cfm@gmail.com> wrote: > On Mon, 17 Oct 2022 at 14:59, Robert Haas <robertmhaas@gmail.com> wrote: > > But I think the bigger problem for this patch set is that the > > design-level feedback from > > https://www.postgresql.org/message-id/CA%2BTgmoaiTNj4DgQy42OT9JmTTP1NWcMV%2Bke0i%3D%2Ba7%3DVgnzqGXw%40mail.gmail.com > > hasn't really been addressed, AFAICS. ri_LookupKeyInPkRelPlanIsValid > > is still trivial in v7, and that still seems wrong to me. And I still > > don't know how we're going to avoid changing the semantics in ways > > that are undesirable, or even knowing precisely what we did change. If > > we don't have answers to those questions, then I suspect that this > > patch set isn't going anywhere. > > Amit, do you plan to work on this patch for this commitfest (and > therefore this release?). And do you think it has a realistic chance > of being ready for commit this month? Unfortunately, I don't think so. > It looks to me like you have some good feedback and can progress and > are unlikely to finish this patch for this release. In which case > maybe we can move it forward to the next release? Yes, that's what I am thinking too at this point. I agree with Robert's point that changing the implementation from an SQL query plan to a hand-rolled C function is going to change the semantics in some known and perhaps many unknown ways. Until I have enumerated all those semantic changes, it's hard to judge whether the hand-rolled implementation is correct to begin with. I had started doing that a few months back but couldn't keep up due to some other work. An example I had found of a thing that would be broken by taking out the executor out of the equation, as the patch does, is the behavior of an update under READ COMMITTED isolation, whereby a PK tuple being checked for existence is concurrently updated and thus needs to rechecked whether it still satisfies the RI query's conditions. The executor has the EvalPlanQual() mechanism to do that, but while the hand-rolled implementation did refactor ExecLockRows() to allow doing the tuple-locking without a PlanState, it gave no consideration to handling rechecking under READ COMMITTED isolation. There may be other such things and I think I'd better look for them carefully in the next cycle than in the next couple of weeks for this release. My apologies that I didn't withdraw the patch sooner. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
> On 21 Mar 2023, at 06:03, Amit Langote <amitlangote09@gmail.com> wrote: > On Tue, Mar 21, 2023 at 3:54 AM Gregory Stark (as CFM) <stark.cfm@gmail.com> wrote: >> On Mon, 17 Oct 2022 at 14:59, Robert Haas <robertmhaas@gmail.com> wrote: >>> But I think the bigger problem for this patch set is that the >>> design-level feedback from >>> https://www.postgresql.org/message-id/CA%2BTgmoaiTNj4DgQy42OT9JmTTP1NWcMV%2Bke0i%3D%2Ba7%3DVgnzqGXw%40mail.gmail.com >>> hasn't really been addressed, AFAICS. ri_LookupKeyInPkRelPlanIsValid >>> is still trivial in v7, and that still seems wrong to me. And I still >>> don't know how we're going to avoid changing the semantics in ways >>> that are undesirable, or even knowing precisely what we did change. If >>> we don't have answers to those questions, then I suspect that this >>> patch set isn't going anywhere. >> >> Amit, do you plan to work on this patch for this commitfest (and >> therefore this release?). And do you think it has a realistic chance >> of being ready for commit this month? > > Unfortunately, I don't think so. This thread has stalled with the patch not building and/or applying for a while, so I am going to mark this Returned with Feebdback. Please feel free to resubmit to a future CF when there is renewed interest/time to work on this. -- Daniel Gustafsson
On Mon, Jul 10, 2023 at 5:27 PM Daniel Gustafsson <daniel@yesql.se> wrote: > > On 21 Mar 2023, at 06:03, Amit Langote <amitlangote09@gmail.com> wrote: > > On Tue, Mar 21, 2023 at 3:54 AM Gregory Stark (as CFM) <stark.cfm@gmail.com> wrote: > >> On Mon, 17 Oct 2022 at 14:59, Robert Haas <robertmhaas@gmail.com> wrote: > > >>> But I think the bigger problem for this patch set is that the > >>> design-level feedback from > >>> https://www.postgresql.org/message-id/CA%2BTgmoaiTNj4DgQy42OT9JmTTP1NWcMV%2Bke0i%3D%2Ba7%3DVgnzqGXw%40mail.gmail.com > >>> hasn't really been addressed, AFAICS. ri_LookupKeyInPkRelPlanIsValid > >>> is still trivial in v7, and that still seems wrong to me. And I still > >>> don't know how we're going to avoid changing the semantics in ways > >>> that are undesirable, or even knowing precisely what we did change. If > >>> we don't have answers to those questions, then I suspect that this > >>> patch set isn't going anywhere. > >> > >> Amit, do you plan to work on this patch for this commitfest (and > >> therefore this release?). And do you think it has a realistic chance > >> of being ready for commit this month? > > > > Unfortunately, I don't think so. > > This thread has stalled with the patch not building and/or applying for a > while, so I am going to mark this Returned with Feebdback. Agreed, I was about to do so myself. I'll give this another try later in the cycle. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com