Thread: oat_post_create expected behavior
Hello,
I was using an object access hook for oat_post_create access while creating an extension and expected that I would be able to query for the newly created extension with get_extension_oid(), but it was returning InvalidOid. However, the same process works for triggers, so I was wondering what the expected behavior is?
From the documentation in objectaccess.h, it doesn't mention anything about CommandCounterIncrement() for POST_CREATE which implied to me that it wasn't something I would need to worry about.
One option I thought of was this patch where CCI is called before the access hook so that the new tuple is visible in the hook. Another option would be to revise the documentation to reflect the expected behavior.
Thanks,
Mary Xu
Attachment
On Thu, Jun 2, 2022 at 6:37 PM Mary Xu <yxu2162@gmail.com> wrote: > I was using an object access hook for oat_post_create access while creating an extension and expected that I would be ableto query for the newly created extension with get_extension_oid(), but it was returning InvalidOid. However, the sameprocess works for triggers, so I was wondering what the expected behavior is? > From the documentation in objectaccess.h, it doesn't mention anything about CommandCounterIncrement() for POST_CREATE whichimplied to me that it wasn't something I would need to worry about. > One option I thought of was this patch where CCI is called before the access hook so that the new tuple is visible in thehook. Another option would be to revise the documentation to reflect the expected behavior. I don't think a proposal to add CommandCounterIncrement() calls just for the convenience of object access hooks has much chance of being accepted. Possibly there is some work that could be done to ensure consistent placement of the calls to post-create hooks so that either all of them happen before, or all of them happen after, a CCI has occurred, but I'm not sure whether or not that is feasible. Currently, I don't think we promise anything about whether a post-create hook call will occur before or after a CCI, which is why sepgsql_schema_post_create(), sepgsql_schema_post_create(), and sepgsql_attribute_post_create() perform a catalog scan using SnapshotSelf, while sepgsql_database_post_create() uses get_database_oid(). You might want to adopt a similar technique. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, 2022-06-06 at 10:51 -0400, Robert Haas wrote: > I don't think a proposal to add CommandCounterIncrement() calls just > for the convenience of object access hooks has much chance of being > accepted. Out of curiosity, why not? The proposed patch only runs it if the object access hook is set. Do you see a situation where it would be confusing that an earlier DDL change is visible? And if so, would it make more sense to call CCI unconditionally? Also, would it ever be reasonable for such a hook to call CCI itself? As you say, it could use SnapshotSelf, but sometimes you might want to call routines that assume they can use an MVCC snapshot. This question applies to the OAT_POST_ALTER hook as well as OAT_POST_CREATE. > Possibly there is some work that could be done to ensure > consistent placement of the calls to post-create hooks so that either > all of them happen before, or all of them happen after, a CCI has > occurred, but I'm not sure whether or not that is feasible. I like the idea of having a test in place so that we at least know when something changes. Otherwise it would be pretty easy to break an extension by adjusting some code. > Currently, > I don't think we promise anything about whether a post-create hook > call will occur before or after a CCI, which is why > sepgsql_schema_post_create(), sepgsql_schema_post_create(), and > sepgsql_attribute_post_create() perform a catalog scan using > SnapshotSelf, while sepgsql_database_post_create() uses > get_database_oid(). You might want to adopt a similar technique. It would be good to document this a little better though: * OAT_POST_CREATE should be invoked just after the object is created. * Typically, this is done after inserting the primary catalog records and * associated dependencies. doesn't really give any guidance, while the comment for alter does: * OAT_POST_ALTER should be invoked just after the object is altered, * but before the command counter is incremented. An extension using the * hook can use a current MVCC snapshot to get the old version of the tuple, * and can use SnapshotSelf to get the new version of the tuple. Regards, Jeff Davis PS: I added this to the July CF: https://commitfest.postgresql.org/38/3676/
On Mon, Jun 6, 2022 at 1:35 PM Jeff Davis <pgsql@j-davis.com> wrote: > Out of curiosity, why not? The proposed patch only runs it if the > object access hook is set. Do you see a situation where it would be > confusing that an earlier DDL change is visible? And if so, would it > make more sense to call CCI unconditionally? Well, I think that a fair amount of work has been done previously to cut down on unnecessary CCIs. I suspect Tom in particular is likely to object to adding a whole bunch more of them, and I think that objection would have some merit. I definitely think if we were going to do it, it would need to be unconditional. Otherwise I think we'll end up with bugs, because nobody's going to go test all of that code with and without an object access hook installed every time they tweak some DDL-related code. > Also, would it ever be reasonable for such a hook to call CCI itself? > As you say, it could use SnapshotSelf, but sometimes you might want to > call routines that assume they can use an MVCC snapshot. This question > applies to the OAT_POST_ALTER hook as well as OAT_POST_CREATE. I definitely wouldn't want to warrant that it doesn't break anything. I think the extension can do it at its own risk, but I wouldn't recommend it. OAT_POST_ALTER is unlike OAT_POST_CREATE in that OAT_POST_ALTER documents that it should be called after a CCI, whereas OAT_POST_CREATE does not make a representation either way. > > Possibly there is some work that could be done to ensure > > consistent placement of the calls to post-create hooks so that either > > all of them happen before, or all of them happen after, a CCI has > > occurred, but I'm not sure whether or not that is feasible. > > I like the idea of having a test in place so that we at least know when > something changes. Otherwise it would be pretty easy to break an > extension by adjusting some code. Sure. I find writing meaningful tests for this kind of stuff hard, but there are plenty of people around here who are better at figuring out how to test obscure scenarios than I. > > Currently, > > I don't think we promise anything about whether a post-create hook > > call will occur before or after a CCI, which is why > > sepgsql_schema_post_create(), sepgsql_schema_post_create(), and > > sepgsql_attribute_post_create() perform a catalog scan using > > SnapshotSelf, while sepgsql_database_post_create() uses > > get_database_oid(). You might want to adopt a similar technique. > > It would be good to document this a little better though: > > * OAT_POST_CREATE should be invoked just after the object is created. > * Typically, this is done after inserting the primary catalog records > and > * associated dependencies. > > doesn't really give any guidance, while the comment for alter does: > > * OAT_POST_ALTER should be invoked just after the object is altered, > * but before the command counter is incremented. An extension using > the > * hook can use a current MVCC snapshot to get the old version of the > tuple, > * and can use SnapshotSelf to get the new version of the tuple. Yeah, that comment could be made more clear. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, 2022-06-06 at 13:43 -0400, Robert Haas wrote: > Yeah, that comment could be made more clear. I still don't understand what the rule is. Is the rule that OAT_POST_CREATE must always use SnapshotSelf for any catalog access? And if so, do we need to update code in contrib extensions to follow that rule? Regards, Jeff Davis
On Mon, Jun 6, 2022 at 3:46 PM Jeff Davis <pgsql@j-davis.com> wrote: > On Mon, 2022-06-06 at 13:43 -0400, Robert Haas wrote: > > Yeah, that comment could be made more clear. > > I still don't understand what the rule is. > > Is the rule that OAT_POST_CREATE must always use SnapshotSelf for any > catalog access? And if so, do we need to update code in contrib > extensions to follow that rule? I don't think there is a rule in the sense that you want there to be one. We sometimes call the object access hook before the CCI, and sometimes after, and the sepgsql code knows which cases are handled which ways and proceeds differently on that basis. If we went and changed the sepgsql code that uses system catalog lookups to use SnapshotSelf instead, I think it would still work, but it would be less efficient, so that doesn't seem like a desirable change to me. If it's possible to make the hook placement always happen after a CCI, then we could change the sepgsql code to always use catalog lookups, which would probably be more efficient but likely require adding some CCI calls, which may attract objections from Tom --- or maybe it won't. Absent either of those things, I'm inclined to just make the comment clearly state that we're not consistent about it. That's not great, but it may be the best we can do. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jun 6, 2022 at 1:35 PM Jeff Davis <pgsql@j-davis.com> wrote: >> Out of curiosity, why not? The proposed patch only runs it if the >> object access hook is set. Do you see a situation where it would be >> confusing that an earlier DDL change is visible? And if so, would it >> make more sense to call CCI unconditionally? > Well, I think that a fair amount of work has been done previously to > cut down on unnecessary CCIs. I suspect Tom in particular is likely to > object to adding a whole bunch more of them, and I think that > objection would have some merit. We've gotten things to the point where a no-op CCI is pretty cheap, so I'm not sure there is a performance concern here. I do wonder though if there are semantic or bug-hazard considerations. A CCI that occurs only if a particular hook is loaded seems pretty scary from a testability standpoint. > I definitely think if we were going to do it, it would need to be > unconditional. Otherwise I think we'll end up with bugs, because > nobody's going to go test all of that code with and without an object > access hook installed every time they tweak some DDL-related code. Right, same thing I'm saying. I also think we should discourage people from doing cowboy CCIs inside their OAT hooks, because that makes the testability problem even worse. Maybe that means we need to uniformly move the CREATE hooks to after a system-provided CCI, but I've not thought hard about the implications of that. regards, tom lane
On Mon, 2022-06-06 at 17:11 -0400, Tom Lane wrote: > Right, same thing I'm saying. I also think we should discourage > people from doing cowboy CCIs inside their OAT hooks, because that > makes the testability problem even worse. Maybe that means we > need to uniformly move the CREATE hooks to after a system-provided > CCI, but I've not thought hard about the implications of that. Uniformly moving the post-create hooks after CCI might not be as convenient as I thought at first. Many extensions using post-create hooks will also want to use post-alter hooks, and it would be difficult to reuse extension code between those two hooks. It's probably better to just always specify the snapshot unless you're sure you won't need a post-alter hook. It would be nice if it was easier to enforce that these hooks do the right thing, though. Regards, Jeff Davis
> Right, same thing I'm saying. I also think we should discourage > people from doing cowboy CCIs inside their OAT hooks, because that > makes the testability problem even worse. Maybe that means we > need to uniformly move the CREATE hooks to after a system-provided > CCI, but I've not thought hard about the implications of that. I like this approach, however, I am relatively new to the PG scene and am not sure how or what I should look into in terms of the implications that Tom mentioned. Are there any tips? What should be the next course of action here? I could update my patch to always call CCI before the create hooks. Thanks, Mary Xu On Fri, Jul 1, 2022 at 11:12 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Mon, 2022-06-06 at 17:11 -0400, Tom Lane wrote: > > Right, same thing I'm saying. I also think we should discourage > > people from doing cowboy CCIs inside their OAT hooks, because that > > makes the testability problem even worse. Maybe that means we > > need to uniformly move the CREATE hooks to after a system-provided > > CCI, but I've not thought hard about the implications of that. > > Uniformly moving the post-create hooks after CCI might not be as > convenient as I thought at first. Many extensions using post-create > hooks will also want to use post-alter hooks, and it would be difficult > to reuse extension code between those two hooks. It's probably better > to just always specify the snapshot unless you're sure you won't need a > post-alter hook. > > It would be nice if it was easier to enforce that these hooks do the > right thing, though. > > Regards, > Jeff Davis > >
On Tue, 2022-08-02 at 13:30 -0700, Mary Xu wrote: > > Right, same thing I'm saying. I also think we should discourage > > people from doing cowboy CCIs inside their OAT hooks, because that > > makes the testability problem even worse. Maybe that means we > > need to uniformly move the CREATE hooks to after a system-provided > > CCI, but I've not thought hard about the implications of that. > > I like this approach, however, I am relatively new to the PG scene > and > am not sure how or what I should look into in terms of the > implications that Tom mentioned. Are there any tips? What should be > the next course of action here? I could update my patch to always > call > CCI before the create hooks. I didn't see a clear consensus that we should call OAT_POST_CREATE after CCI, so I went ahead and updated the comment. We can always update the behavior later when we do have consensus, but until that time, at least the comment will be more helpful. If you are satisfied you can mark the CF issue as "committed", or you can leave it open if you think it's still unresolved. -- Jeff Davis PostgreSQL Contributor Team - AWS