Re: Eliminating SPI from RI triggers - take 2 - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Eliminating SPI from RI triggers - take 2 |
Date | |
Msg-id | CA+TgmoaiTNj4DgQy42OT9JmTTP1NWcMV+ke0i=+a7=VgnzqGXw@mail.gmail.com Whole thread Raw |
In response to | Eliminating SPI from RI triggers - take 2 (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: Eliminating SPI from RI triggers - take 2
Re: Eliminating SPI from RI triggers - take 2 |
List | pgsql-hackers |
On Fri, Jul 1, 2022 at 2:23 AM Amit Langote <amitlangote09@gmail.com> wrote: > So, I hacked together a patch (attached 0001) that invents an "RI > plan" construct (struct RIPlan) to replace the use of an "SPI plan" > (struct _SPI_plan). > > With that in place, I decided to rebase my previous patch [1] to use > this new interface and the result is attached 0002. I think inventing something like RIPlan is probably reasonable, but I'm not sure how much it really does to address the objections that were raised previously. How do we know that ri_LookupKeyInPkRel does all the same things that executing a plan would have done? I see that function contains permission-checking logic, for example, as well as snapshot-related logic, and maybe there are other subsystems to worry about, like rules or triggers or row-level security. Maybe there's no answer to that problem other than careful manual verification, because after all the only way to be 100% certain we're doing all the things that would happen if you executed a plan is to execute a plan, which kind of defeats the point of the whole thing. All I'm saying is that I'm not sure that this refactoring in and of itself addresses that concern. As far as 0002 goes, the part I'm most skeptical about is this: +static bool +ri_LookupKeyInPkRelPlanIsValid(RI_Plan *plan) +{ + /* Never store anything that can be invalidated. */ + return true; +} Isn't that leaving rather a lot on the table? ri_LookupKeyInPkRel is going to be called a lot of times and do a lot of things over and over again that maybe only need to be done once, like checking permissions and looking up the operators to use and reopening the index. And all the stuff ExecGetLeafPartitionForKey does too, yikes that's a lot of stuff. Now maybe that's what Tom wants, I don't know. Certainly, the existing SQL-based implementation is going to do that stuff on every call, too; I'm just not sure that's a good thing. I think there's some debate to be had here over what behavior we need to preserve exactly vs. what we can and should change. For instance, it seems clear to me that leaving out permissions checks altogether would be not OK, but if this implementation arranged to cache the results of a permission check and the SQL-based implementations don't, is that OK? Maybe Tom would argue that it isn't, because he considers that a part of the user-visible behavior, but I'm not sure that's the right view of it. I think what we're promising the user is that we will check permissions, not that we're going to do it separately for every trigger firing, or even that every kind of trigger is going to do it exactly the same number of times as every other trigger. I think we need some input from Tom (and perhaps others) on how rigidly we need to maintain the high-level behavior here before we can really say much about whether the implementation is as good as it can be. I suspect, though, that there's more that can be done here in terms of sharing code. For instance, picking on the permissions checking logic, presumably that's something that every non-SQL implementation would need to do. But the rest of what's in ri_LookupKeyInPkRel() is specific to one particular kind of trigger. If we had multiple non-SQL trigger types, we'd want to somehow have common logic for permissions checking for all of them. I also suspect that we ought to have a separation between planning and execution even for non-SQL based things. You don't really have that here. What that ought to look like, though, depends on the answers to the questions above, about how exactly we think we need to reproduce the existing behavior. I find my ego slightly wounded by the comment that "the partition descriptor machinery has a hack that assumes that the queries originating in this module push the latest snapshot in the transaction-snapshot mode." It's true that the partition descriptor machinery gives different answers depending on the active snapshot, but, err, is that a hack, or just a perfectly reasonable design decision? An alternative might be for PartitionDirectoryLookup to take a snapshot as an explicit argument rather than relying on the global variable to get that information from context. I generally feel that we rely too much on global variables where we should be passing around explicit parameters, so if you're just arguing that explicit parameters would be better here, then I agree and just didn't think of it. If you're arguing that making the answer depend on the snapshot is itself a bad idea, I don't agree with that. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date:
Previous
From: Bharath RupireddyDate:
Subject: Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Next
From: Bharath RupireddyDate:
Subject: Re: pg_walcleaner - new tool to detect, archive and delete the unneeded wal files (was Re: pg_archivecleanup - add the ability to detect, archive and delete the unneeded wal files on the primary)