Thread: oat_post_create expected behavior

oat_post_create expected behavior

From
Mary Xu
Date:
Hello,

I was using an object access hook for oat_post_create access while creating an extension and expected that I would be able to query for the newly created extension with get_extension_oid(), but it was returning InvalidOid. However, the same process works for triggers, so I was wondering what the expected behavior is? 
From the documentation in objectaccess.h, it doesn't mention anything about CommandCounterIncrement() for POST_CREATE which implied to me that it wasn't something I would need to worry about. 
One option I thought of was this patch where CCI is called before the access hook so that the new tuple is visible in the hook. Another option would be to revise the documentation to reflect the expected behavior. 

Thanks,

Mary Xu
Attachment

Re: oat_post_create expected behavior

From
Robert Haas
Date:
On Thu, Jun 2, 2022 at 6:37 PM Mary Xu <yxu2162@gmail.com> wrote:
> I was using an object access hook for oat_post_create access while creating an extension and expected that I would be
ableto query for the newly created extension with get_extension_oid(), but it was returning InvalidOid. However, the
sameprocess works for triggers, so I was wondering what the expected behavior is? 
> From the documentation in objectaccess.h, it doesn't mention anything about CommandCounterIncrement() for POST_CREATE
whichimplied to me that it wasn't something I would need to worry about. 
> One option I thought of was this patch where CCI is called before the access hook so that the new tuple is visible in
thehook. Another option would be to revise the documentation to reflect the expected behavior. 

I don't think a proposal to add CommandCounterIncrement() calls just
for the convenience of object access hooks has much chance of being
accepted. Possibly there is some work that could be done to ensure
consistent placement of the calls to post-create hooks so that either
all of them happen before, or all of them happen after, a CCI has
occurred, but I'm not sure whether or not that is feasible. Currently,
I don't think we promise anything about whether a post-create hook
call will occur before or after a CCI, which is why
sepgsql_schema_post_create(), sepgsql_schema_post_create(), and
sepgsql_attribute_post_create() perform a catalog scan using
SnapshotSelf, while sepgsql_database_post_create() uses
get_database_oid(). You might want to adopt a similar technique.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: oat_post_create expected behavior

From
Jeff Davis
Date:
On Mon, 2022-06-06 at 10:51 -0400, Robert Haas wrote:
> I don't think a proposal to add CommandCounterIncrement() calls just
> for the convenience of object access hooks has much chance of being
> accepted.

Out of curiosity, why not? The proposed patch only runs it if the
object access hook is set. Do you see a situation where it would be
confusing that an earlier DDL change is visible? And if so, would it
make more sense to call CCI unconditionally?

Also, would it ever be reasonable for such a hook to call CCI itself?
As you say, it could use SnapshotSelf, but sometimes you might want to
call routines that assume they can use an MVCC snapshot. This question
applies to the OAT_POST_ALTER hook as well as OAT_POST_CREATE.

>  Possibly there is some work that could be done to ensure
> consistent placement of the calls to post-create hooks so that either
> all of them happen before, or all of them happen after, a CCI has
> occurred, but I'm not sure whether or not that is feasible. 

I like the idea of having a test in place so that we at least know when
something changes. Otherwise it would be pretty easy to break an
extension by adjusting some code.

> Currently,
> I don't think we promise anything about whether a post-create hook
> call will occur before or after a CCI, which is why
> sepgsql_schema_post_create(), sepgsql_schema_post_create(), and
> sepgsql_attribute_post_create() perform a catalog scan using
> SnapshotSelf, while sepgsql_database_post_create() uses
> get_database_oid(). You might want to adopt a similar technique.

It would be good to document this a little better though:

 * OAT_POST_CREATE should be invoked just after the object is created.
 * Typically, this is done after inserting the primary catalog records
and
 * associated dependencies.

doesn't really give any guidance, while the comment for alter does:

 * OAT_POST_ALTER should be invoked just after the object is altered,
 * but before the command counter is incremented.  An extension using
the
 * hook can use a current MVCC snapshot to get the old version of the
tuple,
 * and can use SnapshotSelf to get the new version of the tuple.


Regards,
    Jeff Davis


PS: I added this to the July CF: 
https://commitfest.postgresql.org/38/3676/




Re: oat_post_create expected behavior

From
Robert Haas
Date:
On Mon, Jun 6, 2022 at 1:35 PM Jeff Davis <pgsql@j-davis.com> wrote:
> Out of curiosity, why not? The proposed patch only runs it if the
> object access hook is set. Do you see a situation where it would be
> confusing that an earlier DDL change is visible? And if so, would it
> make more sense to call CCI unconditionally?

Well, I think that a fair amount of work has been done previously to
cut down on unnecessary CCIs. I suspect Tom in particular is likely to
object to adding a whole bunch more of them, and I think that
objection would have some merit.

I definitely think if we were going to do it, it would need to be
unconditional. Otherwise I think we'll end up with bugs, because
nobody's going to go test all of that code with and without an object
access hook installed every time they tweak some DDL-related code.

> Also, would it ever be reasonable for such a hook to call CCI itself?
> As you say, it could use SnapshotSelf, but sometimes you might want to
> call routines that assume they can use an MVCC snapshot. This question
> applies to the OAT_POST_ALTER hook as well as OAT_POST_CREATE.

I definitely wouldn't want to warrant that it doesn't break anything.
I think the extension can do it at its own risk, but I wouldn't
recommend it.

OAT_POST_ALTER is unlike OAT_POST_CREATE in that OAT_POST_ALTER
documents that it should be called after a CCI, whereas
OAT_POST_CREATE does not make a representation either way.

> >  Possibly there is some work that could be done to ensure
> > consistent placement of the calls to post-create hooks so that either
> > all of them happen before, or all of them happen after, a CCI has
> > occurred, but I'm not sure whether or not that is feasible.
>
> I like the idea of having a test in place so that we at least know when
> something changes. Otherwise it would be pretty easy to break an
> extension by adjusting some code.

Sure. I find writing meaningful tests for this kind of stuff hard, but
there are plenty of people around here who are better at figuring out
how to test obscure scenarios than I.

> > Currently,
> > I don't think we promise anything about whether a post-create hook
> > call will occur before or after a CCI, which is why
> > sepgsql_schema_post_create(), sepgsql_schema_post_create(), and
> > sepgsql_attribute_post_create() perform a catalog scan using
> > SnapshotSelf, while sepgsql_database_post_create() uses
> > get_database_oid(). You might want to adopt a similar technique.
>
> It would be good to document this a little better though:
>
>  * OAT_POST_CREATE should be invoked just after the object is created.
>  * Typically, this is done after inserting the primary catalog records
> and
>  * associated dependencies.
>
> doesn't really give any guidance, while the comment for alter does:
>
>  * OAT_POST_ALTER should be invoked just after the object is altered,
>  * but before the command counter is incremented.  An extension using
> the
>  * hook can use a current MVCC snapshot to get the old version of the
> tuple,
>  * and can use SnapshotSelf to get the new version of the tuple.

Yeah, that comment could be made more clear.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: oat_post_create expected behavior

From
Jeff Davis
Date:
On Mon, 2022-06-06 at 13:43 -0400, Robert Haas wrote:
> Yeah, that comment could be made more clear.

I still don't understand what the rule is.

Is the rule that OAT_POST_CREATE must always use SnapshotSelf for any
catalog access? And if so, do we need to update code in contrib
extensions to follow that rule?

Regards,
    Jeff Davis





Re: oat_post_create expected behavior

From
Robert Haas
Date:
On Mon, Jun 6, 2022 at 3:46 PM Jeff Davis <pgsql@j-davis.com> wrote:
> On Mon, 2022-06-06 at 13:43 -0400, Robert Haas wrote:
> > Yeah, that comment could be made more clear.
>
> I still don't understand what the rule is.
>
> Is the rule that OAT_POST_CREATE must always use SnapshotSelf for any
> catalog access? And if so, do we need to update code in contrib
> extensions to follow that rule?

I don't think there is a rule in the sense that you want there to be
one. We sometimes call the object access hook before the CCI, and
sometimes after, and the sepgsql code knows which cases are handled
which ways and proceeds differently on that basis. If we went and
changed the sepgsql code that uses system catalog lookups to use
SnapshotSelf instead, I think it would still work, but it would be
less efficient, so that doesn't seem like a desirable change to me. If
it's possible to make the hook placement always happen after a CCI,
then we could change the sepgsql code to always use catalog lookups,
which would probably be more efficient but likely require adding some
CCI calls, which may attract objections from Tom --- or maybe it
won't. Absent either of those things, I'm inclined to just make the
comment clearly state that we're not consistent about it. That's not
great, but it may be the best we can do.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: oat_post_create expected behavior

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 6, 2022 at 1:35 PM Jeff Davis <pgsql@j-davis.com> wrote:
>> Out of curiosity, why not? The proposed patch only runs it if the
>> object access hook is set. Do you see a situation where it would be
>> confusing that an earlier DDL change is visible? And if so, would it
>> make more sense to call CCI unconditionally?

> Well, I think that a fair amount of work has been done previously to
> cut down on unnecessary CCIs. I suspect Tom in particular is likely to
> object to adding a whole bunch more of them, and I think that
> objection would have some merit.

We've gotten things to the point where a no-op CCI is pretty cheap,
so I'm not sure there is a performance concern here.  I do wonder
though if there are semantic or bug-hazard considerations.  A CCI
that occurs only if a particular hook is loaded seems pretty scary
from a testability standpoint.

> I definitely think if we were going to do it, it would need to be
> unconditional. Otherwise I think we'll end up with bugs, because
> nobody's going to go test all of that code with and without an object
> access hook installed every time they tweak some DDL-related code.

Right, same thing I'm saying.  I also think we should discourage
people from doing cowboy CCIs inside their OAT hooks, because that
makes the testability problem even worse.  Maybe that means we
need to uniformly move the CREATE hooks to after a system-provided
CCI, but I've not thought hard about the implications of that.

            regards, tom lane



Re: oat_post_create expected behavior

From
Jeff Davis
Date:
On Mon, 2022-06-06 at 17:11 -0400, Tom Lane wrote:
> Right, same thing I'm saying.  I also think we should discourage
> people from doing cowboy CCIs inside their OAT hooks, because that
> makes the testability problem even worse.  Maybe that means we
> need to uniformly move the CREATE hooks to after a system-provided
> CCI, but I've not thought hard about the implications of that.

Uniformly moving the post-create hooks after CCI might not be as
convenient as I thought at first. Many extensions using post-create
hooks will also want to use post-alter hooks, and it would be difficult
to reuse extension code between those two hooks. It's probably better
to just always specify the snapshot unless you're sure you won't need a
post-alter hook.

It would be nice if it was easier to enforce that these hooks do the
right thing, though.

Regards,
    Jeff Davis





Re: oat_post_create expected behavior

From
Mary Xu
Date:
> Right, same thing I'm saying.  I also think we should discourage
> people from doing cowboy CCIs inside their OAT hooks, because that
> makes the testability problem even worse.  Maybe that means we
> need to uniformly move the CREATE hooks to after a system-provided
> CCI, but I've not thought hard about the implications of that.

I like this approach, however, I am relatively new to the PG scene and
am not sure how or what I should look into in terms of the
implications that Tom mentioned. Are there any tips? What should be
the next course of action here? I could update my patch to always call
CCI before the create hooks.

Thanks,

Mary Xu

On Fri, Jul 1, 2022 at 11:12 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Mon, 2022-06-06 at 17:11 -0400, Tom Lane wrote:
> > Right, same thing I'm saying.  I also think we should discourage
> > people from doing cowboy CCIs inside their OAT hooks, because that
> > makes the testability problem even worse.  Maybe that means we
> > need to uniformly move the CREATE hooks to after a system-provided
> > CCI, but I've not thought hard about the implications of that.
>
> Uniformly moving the post-create hooks after CCI might not be as
> convenient as I thought at first. Many extensions using post-create
> hooks will also want to use post-alter hooks, and it would be difficult
> to reuse extension code between those two hooks. It's probably better
> to just always specify the snapshot unless you're sure you won't need a
> post-alter hook.
>
> It would be nice if it was easier to enforce that these hooks do the
> right thing, though.
>
> Regards,
>         Jeff Davis
>
>



Re: oat_post_create expected behavior

From
Jeff Davis
Date:
On Tue, 2022-08-02 at 13:30 -0700, Mary Xu wrote:
> > Right, same thing I'm saying.  I also think we should discourage
> > people from doing cowboy CCIs inside their OAT hooks, because that
> > makes the testability problem even worse.  Maybe that means we
> > need to uniformly move the CREATE hooks to after a system-provided
> > CCI, but I've not thought hard about the implications of that.
>
> I like this approach, however, I am relatively new to the PG scene
> and
> am not sure how or what I should look into in terms of the
> implications that Tom mentioned. Are there any tips? What should be
> the next course of action here? I could update my patch to always
> call
> CCI before the create hooks.

I didn't see a clear consensus that we should call OAT_POST_CREATE
after CCI, so I went ahead and updated the comment. We can always
update the behavior later when we do have consensus, but until that
time, at least the comment will be more helpful.

If you are satisfied you can mark the CF issue as "committed", or you
can leave it open if you think it's still unresolved.

--
Jeff Davis
PostgreSQL Contributor Team - AWS