Thread: BUG #15631: Generated as identity field in a temporary table with on commit drop corrupts system catalogs

The following bug has been logged on the website:

Bug reference:      15631
Logged by:          Serge Latyntsev
Email address:      dnsl48@gmail.com
PostgreSQL version: 11.1
Operating system:   Alpine
Description:

When using `generated by default as identity` in a temporary table with `on
commit drop`, but without starting a transaction, system catalogs get
corrupted and won't let create temporary tables anymore.

Here's how to reproduce (within a docker container):

$ docker exec -it $(docker run -d --rm postgres:11.1-alpine) bash
# psql postgres postgres
# create temporary table foo ( bar int generated by default as identity ) on
commit drop;
CREATE TABLE
# \q
# psql postgres postgres 
# create temporary table a (b varchar);
ERROR:  could not open relation with OID 16388
LINE 1: create temporary table a (b varchar);


PG Bug reporting form <noreply@postgresql.org> writes:
> When using `generated by default as identity` in a temporary table with `on
> commit drop`, but without starting a transaction, system catalogs get
> corrupted and won't let create temporary tables anymore.

Interesting.  I can reproduce that (don't need the docker bit...).
The temp table goes away, but the sequence is still there, and so
are its pg_depend entries --- including one saying that it depends
on the table.  That bollixes later attempts to clean out the temp
namespace, since deletion tries to recurse to the missing table.

Even more interesting, it works if you wrap the CREATE in begin/commit.

I don't have time to probe further right now, but I believe what is
wrong is that we're missing a CommandCounterIncrement call after the
sequence is created, or at least after its internal dependency on the
table column is created.  When the transaction commits immediately,
dependency.c fails to see the internal dependency entry and so it
doesn't remove the sequence.  Wrapped in a transaction, there's a
CCI somewhere and all is well.

Peter, will you take this?

            regards, tom lane


On 11/02/2019 21:33, Tom Lane wrote:
> I don't have time to probe further right now, but I believe what is
> wrong is that we're missing a CommandCounterIncrement call after the
> sequence is created, or at least after its internal dependency on the
> table column is created.  When the transaction commits immediately,
> dependency.c fails to see the internal dependency entry and so it
> doesn't remove the sequence.  Wrapped in a transaction, there's a
> CCI somewhere and all is well.

Right.  I think it would be good to put a CommandCounterIncrement() at
the top of PreCommit_on_commit_actions().  That ensures that the
dependency code always see the latest state.

> That bollixes later attempts to clean out the temp
> namespace, since deletion tries to recurse to the missing table.

Should there be some warnings when this happens?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 11/02/2019 21:33, Tom Lane wrote:
>> I don't have time to probe further right now, but I believe what is
>> wrong is that we're missing a CommandCounterIncrement call after the
>> sequence is created, or at least after its internal dependency on the
>> table column is created.  When the transaction commits immediately,
>> dependency.c fails to see the internal dependency entry and so it
>> doesn't remove the sequence.  Wrapped in a transaction, there's a
>> CCI somewhere and all is well.

> Right.  I think it would be good to put a CommandCounterIncrement() at
> the top of PreCommit_on_commit_actions().  That ensures that the
> dependency code always see the latest state.

Hm, I'd be more inclined to find where the sequence creation is happening
and add a CCI at the end, because that comports better with the general
plan for inserting CCIs.  There may be other issues of this same sort with
doing-X-just-after-identity-sequence-creation if you don't fix it that
way.

>> That bollixes later attempts to clean out the temp
>> namespace, since deletion tries to recurse to the missing table.

> Should there be some warnings when this happens?

Like what?

            regards, tom lane


On Tue, Feb 12, 2019 at 09:27:50AM -0500, Tom Lane wrote:
> Hm, I'd be more inclined to find where the sequence creation is happening
> and add a CCI at the end, because that comports better with the general
> plan for inserting CCIs.  There may be other issues of this same sort with
> doing-X-just-after-identity-sequence-creation if you don't fix it that
> way.

Agreed.  I don't think that it is the correct logic to put an
after-the-fact CCI just before executing any drop or truncate actions.
It should happen after the creation of the new object so as it becomes
correctly visible within the transaction.
--
Michael

Attachment
On Wed, Feb 13, 2019 at 11:38:17AM +0900, Michael Paquier wrote:
> Agreed.  I don't think that it is the correct logic to put an
> after-the-fact CCI just before executing any drop or truncate actions.
> It should happen after the creation of the new object so as it becomes
> correctly visible within the transaction.

The problem comes from process_owned_by() in sequence.c which has
added in v10 some handling for internal dependencies in the case of an
identity sequence, and the dependency link between the sequence and
its relation is added there.

Another thing I was wondering is if we should add the CCI directly to
recordMultipleDependencies() for internal dependencies, still it seems
to me that it would be an overkill as some dependency registrers may
do the CCI by themselves after adding the pg_depend link and doing
some other operations, so I discarded the idea.

The patch attached solves the problem, for consistency I would suggest
doing the CCI even for auto dependencies.
--
Michael

Attachment
On Wed, Feb 13, 2019 at 01:41:05PM +0900, Michael Paquier wrote:
> The patch attached solves the problem, for consistency I would suggest
> doing the CCI even for auto dependencies.

I have added this proposal of bug fix to the next CF for now:
https://commitfest.postgresql.org/22/1997/
--
Michael

Attachment
On 2019-02-13 05:41, Michael Paquier wrote:
> On Wed, Feb 13, 2019 at 11:38:17AM +0900, Michael Paquier wrote:
>> Agreed.  I don't think that it is the correct logic to put an
>> after-the-fact CCI just before executing any drop or truncate actions.
>> It should happen after the creation of the new object so as it becomes
>> correctly visible within the transaction.
> 
> The problem comes from process_owned_by() in sequence.c which has
> added in v10 some handling for internal dependencies in the case of an
> identity sequence, and the dependency link between the sequence and
> its relation is added there.
> 
> Another thing I was wondering is if we should add the CCI directly to
> recordMultipleDependencies() for internal dependencies, still it seems
> to me that it would be an overkill as some dependency registrers may
> do the CCI by themselves after adding the pg_depend link and doing
> some other operations, so I discarded the idea.
> 
> The patch attached solves the problem, for consistency I would suggest
> doing the CCI even for auto dependencies.

What is the general coding principle here?  "You need an CCI $WHEN"?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> What is the general coding principle here?  "You need an CCI $WHEN"?

The general principle is "there should be a CCI after each independent
set of data/catalog changes".  You don't typically need CCI between
the primitive actions of a single DDL statement like CREATE SEQUENCE,
because you know that those actions are independent and don't look at
each others' output.  But you need one at the end, in case whatever
happens next should be able to see the results of the statement.

In another universe the rule might have been "CCI before each independent
set of actions", rather than after.  But we haven't done it that way.
Mixing those two styles as a method of fixing bugs seems like a horrid
idea: you eventually end up with fore-AND-aft CCIs everywhere, because
nobody knows what the preceding or following statements might've done.
For better or worse, the PG rule is CCI-after, and the stuff that does
DDL on identity sequences has to fall in line.

            regards, tom lane


On 2019-02-15 15:43, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>> What is the general coding principle here?  "You need an CCI $WHEN"?
> 
> The general principle is "there should be a CCI after each independent
> set of data/catalog changes".  You don't typically need CCI between
> the primitive actions of a single DDL statement like CREATE SEQUENCE,
> because you know that those actions are independent and don't look at
> each others' output.  But you need one at the end, in case whatever
> happens next should be able to see the results of the statement.

Yeah, I'm OK on that, but that's not really the problem here.  This is
not a case where a later step in a complex DDL command needs to see what
an earlier step did.  This is about that something later in the
transaction needs to see what happened earlier in the transaction.  This
does not seem to be the job of each individual DDL command; they don't
know what someone later might want to look at.  Otherwise many DDL
command implementations are lacking this CCI.  I think the CCI should be
more like at the end of ProcessUtility().

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Fri, Feb 15, 2019 at 05:03:29PM +0100, Peter Eisentraut wrote:
> Yeah, I'm OK on that, but that's not really the problem here.  This is
> not a case where a later step in a complex DDL command needs to see what
> an earlier step did.  This is about that something later in the
> transaction needs to see what happened earlier in the transaction.  This
> does not seem to be the job of each individual DDL command; they don't
> know what someone later might want to look at.  Otherwise many DDL
> command implementations are lacking this CCI.  I think the CCI should be
> more like at the end of ProcessUtility().

Not all utility commands need a CCI, for example take VACUUM.
--
Michael

Attachment
Michael Paquier <michael@paquier.xyz> writes:
> On Fri, Feb 15, 2019 at 05:03:29PM +0100, Peter Eisentraut wrote:
>> Yeah, I'm OK on that, but that's not really the problem here.  This is
>> not a case where a later step in a complex DDL command needs to see what
>> an earlier step did.  This is about that something later in the
>> transaction needs to see what happened earlier in the transaction.  This
>> does not seem to be the job of each individual DDL command; they don't
>> know what someone later might want to look at.  Otherwise many DDL
>> command implementations are lacking this CCI.  I think the CCI should be
>> more like at the end of ProcessUtility().

> Not all utility commands need a CCI, for example take VACUUM.

BEGIN and COMMIT are more convincing counterexamples ...

            regards, tom lane


On 2019-02-13 05:41, Michael Paquier wrote:
> The patch attached solves the problem, for consistency I would suggest
> doing the CCI even for auto dependencies.

OK, let's go with this.  Attached is your patch with my comment wording.
 Also a test case, but I don't suggest committing that, it's a bit too
weird.  (The effect of the test is to cause catalog corruption that will
cause a later test to fail.)  We can leave it in the archive for posterity.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment
On Fri, Mar 08, 2019 at 12:00:40PM +0100, Peter Eisentraut wrote:
> OK, let's go with this.  Attached is your patch with my comment wording.

Thanks for double-checking.  Your change looks fine to me.

>  Also a test case, but I don't suggest committing that, it's a bit too
> weird.  (The effect of the test is to cause catalog corruption that will
> cause a later test to fail.)  We can leave it in the archive for posterity.

Letting the test case out is fine for me too.
--
Michael

Attachment
On 2019-03-09 02:35, Michael Paquier wrote:
> On Fri, Mar 08, 2019 at 12:00:40PM +0100, Peter Eisentraut wrote:
>> OK, let's go with this.  Attached is your patch with my comment wording.
> 
> Thanks for double-checking.  Your change looks fine to me.
> 
>>  Also a test case, but I don't suggest committing that, it's a bit too
>> weird.  (The effect of the test is to cause catalog corruption that will
>> cause a later test to fail.)  We can leave it in the archive for posterity.
> 
> Letting the test case out is fine for me too.

I looked into backpatching this, but the test case doesn't fail in PG10.
 I ran a round of bisecting but didn't arrive at a sensible result.  The
test case appears to be a bit random in its failure modes.  The issue
could in principle extend further back, since the code in question isn't
really new.  Any ideas or suggestions?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Mon, Mar 11, 2019 at 09:41:14PM +0100, Peter Eisentraut wrote:
> I looked into backpatching this, but the test case doesn't fail in PG10.
>  I ran a round of bisecting but didn't arrive at a sensible result.  The
> test case appears to be a bit random in its failure modes.  The issue
> could in principle extend further back, since the code in question isn't
> really new.  Any ideas or suggestions?

I can reproduce the issue on a v10 server, for example:
=# create temporary table foo ( bar int generated by default as identity ) on
commit drop;
CREATE TABLE
=# \q
$ psql
=# create temporary table a (b varchar);
ERROR:  XX000: could not open relation with OID 16389

Perhaps you did not start with a fresh server or did not reset the
connection properly?  This counts.
--
Michael

Attachment
On 2019-03-12 04:46, Michael Paquier wrote:
> I can reproduce the issue on a v10 server, for example:
> =# create temporary table foo ( bar int generated by default as identity ) on
> commit drop;
> CREATE TABLE
> =# \q
> $ psql
> =# create temporary table a (b varchar);
> ERROR:  XX000: could not open relation with OID 16389

I've been trying to understand why the equivalent case with serial does
not fail even though the code is mostly the same, that is,

create temporary table foo ( bar serial ) on commit drop;

It turns out that there is some funny business going on that has only
been invisible so far.

If you run the above command with serial, the sequence is not temporary
and is not dropped.  After the table is dropped (on commit), you still
have stale dependency entries lying around (start from empty instance to
get matching OIDs):

╔═════════╤═══════╤══════════╤════════════╤══════════╤═════════════╤═════════╗
║ classid │ objid │ objsubid │ refclassid │ refobjid │ refobjsubid │
deptype ║
╠═════════╪═══════╪══════════╪════════════╪══════════╪═════════════╪═════════╣
║    1259 │ 16386 │        0 │       2615 │    16384 │           0 │ n
    ║
║    1259 │ 16386 │        0 │       1259 │    16388 │           1 │ a
    ║
╚═════════╧═══════╧══════════╧════════════╧══════════╧═════════════╧═════════╝

(These are sequence -> namespace and sequence -> column.)

You can see that the catalog is faulty at this point by running

select pg_describe_object(refclassid, refobjid, refobjsubid) from pg_depend;

This is all eventually cleaned up because the sequence is in the pg_temp
schema and so will be cleaned up with the schema.

If you run the command with the identity syntax, you get almost the same
stale dependency entries:

╔═════════╤═══════╤══════════╤════════════╤══════════╤═════════════╤═════════╗
║ classid │ objid │ objsubid │ refclassid │ refobjid │ refobjsubid │
deptype ║
╠═════════╪═══════╪══════════╪════════════╪══════════╪═════════════╪═════════╣
║    1259 │ 16386 │        0 │       2615 │    16384 │           0 │ n
    ║
║    1259 │ 16386 │        0 │       1259 │    16388 │           1 │ i
    ║
╚═════════╧═══════╧══════════╧════════════╧══════════╧═════════════╧═════════╝

It's only because of the different deptype that something chokes when it
tries to clean up the temp schema.

Adding a CommandCounterIncrement() somewhere does fix all this.  I was
thinking another option for placing this call would be in
ProcessUtilitySlow():

                  /* Need CCI between commands */
-                 if (lnext(l) != NULL)
                      CommandCounterIncrement();

I think we should also make the implicitly created sequence temporary.
Even though the permanent sequence is cleaned up properly, we should
avoid having those sequences write to the WAL.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Wed, Mar 13, 2019 at 10:28:11AM +0100, Peter Eisentraut wrote:
>                   /* Need CCI between commands */
> -                 if (lnext(l) != NULL)
>                       CommandCounterIncrement();

Hmm.  We could actually live with this suggestion, and this impacts
only CREATE TABLE and ALTER TABLE statements.  I would still add a CCI
after the internal dependency between the identity sequence and its
root table is recorded though as there could be other callers of the
internal sequence API, so a CCI only in utility.c may not be enough.

> I think we should also make the implicitly created sequence temporary.
> Even though the permanent sequence is cleaned up properly, we should
> avoid having those sequences write to the WAL.

Indeed, sounds good to me.
--
Michael

Attachment
Hi Peter,

On Tue, Mar 26, 2019 at 11:45:27AM +0900, Michael Paquier wrote:
> On Wed, Mar 13, 2019 at 10:28:11AM +0100, Peter Eisentraut wrote:
>>                   /* Need CCI between commands */
>> -                 if (lnext(l) != NULL)
>>                       CommandCounterIncrement();
>
> Hmm.  We could actually live with this suggestion, and this impacts
> only CREATE TABLE and ALTER TABLE statements.  I would still add a CCI
> after the internal dependency between the identity sequence and its
> root table is recorded though as there could be other callers of the
> internal sequence API, so a CCI only in utility.c may not be enough.

Is there something else you are considering for this patch?  This
previous suggestion looks fine to live with, at least for me.  Tom,
perhaps you have some extra input on the matter and would prefer a
more restrictive location for the CCI?

>> I think we should also make the implicitly created sequence temporary.
>> Even though the permanent sequence is cleaned up properly, we should
>> avoid having those sequences write to the WAL.
>
> Indeed, sounds good to me.

This is at least v13 material, so I would suggest to discard this item
for now.
--
Michael

Attachment
On 2019-04-25 02:59, Michael Paquier wrote:
> Is there something else you are considering for this patch?  This
> previous suggestion looks fine to live with, at least for me.  Tom,
> perhaps you have some extra input on the matter and would prefer a
> more restrictive location for the CCI?

I didn't find that solution very principled either.  I now think the
best fix is to have a CCI at the end of standard_ProcessUtility().  It's
very possible that other commands could also create catalog entries that
some on-commit action would like to see.  It wouldn't be sensible to
chase down each command separately.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 2019-04-25 02:59, Michael Paquier wrote:
>> Is there something else you are considering for this patch?  This
>> previous suggestion looks fine to live with, at least for me.  Tom,
>> perhaps you have some extra input on the matter and would prefer a
>> more restrictive location for the CCI?

> I didn't find that solution very principled either.  I now think the
> best fix is to have a CCI at the end of standard_ProcessUtility().  It's
> very possible that other commands could also create catalog entries that
> some on-commit action would like to see.  It wouldn't be sensible to
> chase down each command separately.

Seems reasonable to me.  Do we have a concrete patch with that?
The minor-release deadline is getting closer ...

            regards, tom lane



On Fri, Apr 26, 2019 at 11:09:36AM -0400, Tom Lane wrote:
> Seems reasonable to me.  Do we have a concrete patch with that?
> The minor-release deadline is getting closer ...

Isn't that what [1] is for?  If you think that having a CCI directly
in utility.c is fine, that's fine by me.

[1]: https://postgr.es/m/1a7ee3de-f450-606f-5c89-39e45c8052f8@2ndquadrant.com
--
Michael

Attachment
On 2019-04-27 02:10, Michael Paquier wrote:
> On Fri, Apr 26, 2019 at 11:09:36AM -0400, Tom Lane wrote:
>> Seems reasonable to me.  Do we have a concrete patch with that?
>> The minor-release deadline is getting closer ...
> 
> Isn't that what [1] is for?  If you think that having a CCI directly
> in utility.c is fine, that's fine by me.

Committed and backpatched to 11 and 10.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services