Thread: Fix dropped object handling in pg_event_trigger_ddl_commands

Fix dropped object handling in pg_event_trigger_ddl_commands

From
Sven Klemm
Date:
Hello,

when creating an event trigger for ddl_command_end that calls
pg_event_trigger_ddl_commands certain statements will cause the
trigger to fail with a cache lookup error. The error happens on
master, 13 and 12 I didnt test any previous versions.

trg=# ALTER TABLE t ALTER COLUMN f1 SET DATA TYPE bigint, ALTER COLUMN
f1 DROP IDENTITY;
ERROR: XX000: cache lookup failed for relation 13476892
CONTEXT: PL/pgSQL function ddl_end() line 5 at FOR over SELECT rows
LOCATION: getRelationTypeDescription, objectaddress.c:4178

For the ALTER DATA TYPE we create a command to adjust the sequence
which gets recorded in the event trigger commandlist, which leads to
the described failure when the sequence is dropped as part of another
ALTER TABLE subcommand and information about the sequence can no
longer be looked up.

To reproduce:
CREATE OR REPLACE FUNCTION ddl_end()
RETURNS event_trigger AS $$
DECLARE
r RECORD;
BEGIN
FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
RAISE NOTICE 'ddl_end: % %', r.command_tag, r.object_type;
END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER ddl_end ON ddl_command_end EXECUTE PROCEDURE ddl_end();

CREATE TABLE t(f1 int NOT NULL GENERATED ALWAYS AS IDENTITY);
ALTER TABLE t ALTER COLUMN f1 DROP IDENTITY, ALTER COLUMN f1 SET DATA
TYPE bigint;

I tried really hard to look for a different way to detect this error
earlier but since the subcommands are processed independently i
couldnt come up with a non-invasive version. Someone more familiar
with this code might have an idea for a better solution.

Any thoughts?

https://www.postgresql.org/message-id/CAMCrgp39V7JQA_Gc+JaEZV3ALOU1ZG=Pwyk3oDpTq7F6Z0JSmg@mail.gmail.com
--
Regards, Sven Klemm

Attachment

Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Sven Klemm
Date:
On Sun, Apr 18, 2021 at 2:12 PM Sven Klemm <sven@timescale.com> wrote:
> when creating an event trigger for ddl_command_end that calls
> pg_event_trigger_ddl_commands certain statements will cause the
> trigger to fail with a cache lookup error. The error happens on
> master, 13 and 12 I didnt test any previous versions.
>
> trg=# ALTER TABLE t ALTER COLUMN f1 SET DATA TYPE bigint, ALTER COLUMN
> f1 DROP IDENTITY;
> ERROR: XX000: cache lookup failed for relation 13476892
> CONTEXT: PL/pgSQL function ddl_end() line 5 at FOR over SELECT rows
> LOCATION: getRelationTypeDescription, objectaddress.c:4178

Any opinions on the patch? Is this not worth the effort to fix or is
there a better way to fix this?

https://www.postgresql.org/message-id/CAMCrgp2R1cEXU53iYKtW6yVEp2_yKUz+z=3-CTrYpPP+xryRtg@mail.gmail.com

-- 
Regards, Sven Klemm



Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Alvaro Herrera
Date:
On 2021-Apr-25, Sven Klemm wrote:

> On Sun, Apr 18, 2021 at 2:12 PM Sven Klemm <sven@timescale.com> wrote:
> > when creating an event trigger for ddl_command_end that calls
> > pg_event_trigger_ddl_commands certain statements will cause the
> > trigger to fail with a cache lookup error. The error happens on
> > master, 13 and 12 I didnt test any previous versions.
> >
> > trg=# ALTER TABLE t ALTER COLUMN f1 SET DATA TYPE bigint, ALTER COLUMN
> > f1 DROP IDENTITY;
> > ERROR: XX000: cache lookup failed for relation 13476892
> > CONTEXT: PL/pgSQL function ddl_end() line 5 at FOR over SELECT rows
> > LOCATION: getRelationTypeDescription, objectaddress.c:4178
> 
> Any opinions on the patch? Is this not worth the effort to fix or is
> there a better way to fix this?

Hello, I haven't looked at this but judging from the general shape of
function and error message, it seems clearly a bug that needs to be
fixed somehow.  I'll try to make time to look at it sometime soon, but I
have other bugs to investigate and fix, so it may be some time.

I fear your proposal of ignoring the object may be the best we can do,
but I don't like it much.

-- 
Álvaro Herrera                            39°49'30"S 73°17'W
"La verdad no siempre es bonita, pero el hambre de ella sí"



Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Aleksander Alekseev
Date:
Hi hackers,

> Any opinions on the patch? Is this not worth the effort to fix or is
> there a better way to fix this?

I confirm that the bug still exists in master (be57f216). The patch
fixes it and looks good to me. I changed the wording a little and
added a regression test. The updated patch is in the attachment. I
added this thread to the CF and updated the status to "Ready for
Committer".

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Michael Paquier
Date:
On Mon, Jun 07, 2021 at 12:44:42PM +0300, Aleksander Alekseev wrote:
> I confirm that the bug still exists in master (be57f216). The patch
> fixes it and looks good to me. I changed the wording a little and
> added a regression test. The updated patch is in the attachment. I
> added this thread to the CF and updated the status to "Ready for
> Committer".

FWIW, that looks rather natural to me to me to just ignore the object
if it has already been dropped here.  The same kind of rules apply to
tables dropped  with DROP TABLE which would not show up as of
pg_event_trigger_ddl_commands(), but one can get a list as of
pg_event_trigger_dropped_objects().

Alvaro, were you planning to look at that?  I have not looked at the
patch in details.  missing_ok is available in getObjectIdentity() only
since v14, so this cannot be backpatched :/
--
Michael

Attachment

Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Alvaro Herrera
Date:
On 2021-Jun-09, Michael Paquier wrote:

> On Mon, Jun 07, 2021 at 12:44:42PM +0300, Aleksander Alekseev wrote:
> > I confirm that the bug still exists in master (be57f216). The patch
> > fixes it and looks good to me. I changed the wording a little and
> > added a regression test. The updated patch is in the attachment. I
> > added this thread to the CF and updated the status to "Ready for
> > Committer".
> 
> FWIW, that looks rather natural to me to me to just ignore the object
> if it has already been dropped here.  The same kind of rules apply to
> tables dropped  with DROP TABLE which would not show up as of
> pg_event_trigger_ddl_commands(), but one can get a list as of
> pg_event_trigger_dropped_objects().

Oh, that parallel didn't occur to me.  I agree it seems a useful
precedent.

> Alvaro, were you planning to look at that?  I have not looked at the
> patch in details.  

I have it on my list of things to look at, but it isn't priority.  If
you to mess with it, please be my guest.

> missing_ok is available in getObjectIdentity() only
> since v14, so this cannot be backpatched :/

Ooh, yeah, I forgot about that.  And that one was pretty invasive ...

I'm not sure if we can reasonably implement a fix for older releases.
I mean, it's a relatively easy test: do a syscache search for the object
or a catalog indexscan (easy to do with get_object_property_data-based
API), and if the object is gone, skip getObjectTypeDescription and
getObjectIdentity.  But maybe this is too much code to add to stable
releases ...

-- 
Álvaro Herrera       Valdivia, Chile



Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Michael Paquier
Date:
On Wed, Jun 09, 2021 at 09:55:08AM -0400, Alvaro Herrera wrote:
> I'm not sure if we can reasonably implement a fix for older releases.
> I mean, it's a relatively easy test: do a syscache search for the object
> or a catalog indexscan (easy to do with get_object_property_data-based
> API), and if the object is gone, skip getObjectTypeDescription and
> getObjectIdentity.  But maybe this is too much code to add to stable
> releases ...

Except that these syscache lookups need to be done on an object-type
basis, which is basically what getObjectDescription() & friends now do
where the logic makes sure that we have a correct objectId <-> cacheId
mapping for the syscache lookups.  So that would be roughly copying
into event_trigger.c what objectaddress.c does now, but for the back
branches.  It would be better to just backport the changes to support
missing_ok in objectaddress.c if we go down this road, but the
invasiveness makes that much more complicated.

Another thing is that ATExecDropIdentity() does performDeletion() by
using PERFORM_DELETION_INTERNAL, which does not feed the dropped
sequence to pg_event_trigger_dropped_objects().  That feels
inconsistent with CREATE TABLE GENERATED that would feed to event
triggers the CREATE SEQUENCE and ALTER SEQUENCE commands, as well as
ALTER TABLE SET DATA TYPE on a generated column that feeds an internal
ALTER SEQUENCE.

An extra idea we could consider, as the drop events are logged before
the other ALTER TABLE subcommands, is to look at the event triggers
registered when the generated sequence is dropped and to erase from
existence any previous events associated to it, but that would make
the logic weak as hell.  In all this stuff, simply ignoring the
objects still sounds like a fair and simple course of action to take
if they are gone at the time an event trigger checks after them within
pg_event_trigger_ddl_commands().  Now, this approach makes my spider
sense tingle.
--
Michael

Attachment

Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Michael Paquier
Date:
On Thu, Jun 10, 2021 at 05:07:28PM +0900, Michael Paquier wrote:
> Except that these syscache lookups need to be done on an object-type
> basis, which is basically what getObjectDescription() & friends now do
> where the logic makes sure that we have a correct objectId <-> cacheId
> mapping for the syscache lookups.  So that would be roughly copying
> into event_trigger.c what objectaddress.c does now, but for the back
> branches.  It would be better to just backport the changes to support
> missing_ok in objectaddress.c if we go down this road, but the
> invasiveness makes that much more complicated.

I have been looking at that more this morning, and I have convinced
myself that skipping objects should work fine.  The test added at the
bottom of event_trigger.sql was making the file a bit messier though,
and there are already tests for relations when it comes to dropped
objects.  So let's do a bit of consolidation while on it with an extra
event trigger on ddl_command_end and relations on the schema evttrig.

This one already included some cases for serial columns, so that's
natural to me to extend the area for identity columns.  I have also
added a case for a serial column dropped, while on it.  The last thing
is the addition of r.object_identity from
pg_event_trigger_ddl_commands() in the data generated for the output
messages, so as the output is as complete as possible.

Regarding the back-branches, I am tempted to do nothing.  The APIs are
just not here to do the job.  On top of being an invasive change, it
took 4 years for somebody to complain on this matter, as this exists
since 10.  That's not worth the risk/cost.
--
Michael

Attachment

Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Aleksander Alekseev
Date:
Hi Michael,

> /* The type can never be NULL */
> type = getObjectTypeDescription(&addr, true);

The last argument should be `false` then.


--
Best regards,
Aleksander Alekseev

Attachment

Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Michael Paquier
Date:
On Fri, Jun 11, 2021 at 11:00:40AM +0300, Aleksander Alekseev wrote:
> The last argument should be `false` then.

Hm, nope.  I think that we had better pass true as argument here.

First, this is more consistent with the identity lookup (OK, it does
not matter as we would have discarded the object after the identity
lookup anyway, but any future shuffling of this code may not be that
wise).  Second, now that I look at it, getObjectTypeDescription() can
never be NULL as we have fallback names for relations, routines and
constraints for all object types so the buffer will be filled with
some data.  Let's replace the bottom of getObjectTypeDescription()
that returns now NULL by Assert(buffer.len > 0).  This code is new as
of v14, so it is better to adjust that sooner than later.
--
Michael

Attachment

Re: Fix dropped object handling in pg_event_trigger_ddl_commands

From
Michael Paquier
Date:
On Fri, Jun 11, 2021 at 09:36:57PM +0900, Michael Paquier wrote:
> Hm, nope.  I think that we had better pass true as argument here.

The main patch has been applied as of 2d689ba.

> First, this is more consistent with the identity lookup (OK, it does
> not matter as we would have discarded the object after the identity
> lookup anyway, but any future shuffling of this code may not be that
> wise).  Second, now that I look at it, getObjectTypeDescription() can
> never be NULL as we have fallback names for relations, routines and
> constraints for all object types so the buffer will be filled with
> some data.  Let's replace the bottom of getObjectTypeDescription()
> that returns now NULL by Assert(buffer.len > 0).  This code is new as
> of v14, so it is better to adjust that sooner than later.

And this has been simplified with b56b83a.
--
Michael

Attachment