Thread: Bug with temporary child of a permanent table after recovery

Bug with temporary child of a permanent table after recovery

From
Heikki Linnakangas
Date:
Spotted by accident while working on a patch:

Open psql and do:

CREATE TABLE uctest(f1 int, f2 text);
-- Create a temporary child of the permanent table
CREATE TEMP TABLE ucchild () inherits (uctest);

In another terminal:
pg_ctl stop -m immediate
pg_ctl start

psql (9.3devel)
Type "help" for help.

postgres=# SELECT * FROM uctest;
ERROR:  could not open file "base/12030/t2_16392": No such file or directory

This goes back to 9.1.

- Heikki

Re: Bug with temporary child of a permanent table after recovery

From
Tom Lane
Date:
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> Spotted by accident while working on a patch:
> Open psql and do:

> CREATE TABLE uctest(f1 int, f2 text);
> -- Create a temporary child of the permanent table
> CREATE TEMP TABLE ucchild () inherits (uctest);

> In another terminal:
> pg_ctl stop -m immediate
> pg_ctl start

> psql (9.3devel)
> Type "help" for help.

> postgres=# SELECT * FROM uctest;
> ERROR:  could not open file "base/12030/t2_16392": No such file or directory

> This goes back to 9.1.

I believe this used to work before Robert redesigned temp relations.
The underlying physical file got removed during postmaster restart, but
at this point the catalog entry for the temp table is still there, and
the planner is failing to disregard it as intended.  The problem is that
the RELATION_IS_OTHER_TEMP macro

#define RELATION_IS_OTHER_TEMP(relation) \
    ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP \
    && (relation)->rd_backend != MyBackendId)

is getting fooled by a chance collision of backend IDs --- that is,
the failure only occurs in sessions with the same BackendId that the
original table creator had.

It worked before because the "is my temp table" test was based on
whether the table belonged to myTempNamespace, which would be InvalidOid
until the backend had started using the temp namespace --- and one of
the things involved in that is to run around and clean out any leftover
tables in that schema.

I'm not sure where rd_backend gets set up, but maybe we can fix this
by not allowing rd_backend to acquire a valid value unless we've begun
using the temp namespace.

            regards, tom lane

Re: Bug with temporary child of a permanent table after recovery

From
Tom Lane
Date:
I wrote:
> I'm not sure where rd_backend gets set up, but maybe we can fix this
> by not allowing rd_backend to acquire a valid value unless we've begun
> using the temp namespace.

The key bit of code seems to be this in RelationBuildDesc():

    switch (relation->rd_rel->relpersistence)
        ...
        case RELPERSISTENCE_TEMP:
            if (isTempOrToastNamespace(relation->rd_rel->relnamespace))
                relation->rd_backend = MyBackendId;
            else
            {
                /*
                 * If it's a local temp table, but not one of ours, we have to
                 * use the slow, grotty method to figure out the owning
                 * backend.
                 */
                relation->rd_backend =
                    GetTempNamespaceBackendId(relation->rd_rel->relnamespace);
                Assert(relation->rd_backend != InvalidBackendId);
            }

where the second part of the if() will happily set rd_backend to
MyBackendId if that's what comes out of the namespace name, never
mind that we ought not to consider it "our" temp table.

The pre-9.1 coding in this area got it right:

    relation->rd_istemp = relation->rd_rel->relistemp;
    if (relation->rd_istemp)
        relation->rd_islocaltemp = isTempOrToastNamespace(relation->rd_rel->relnamespace);
    else
        relation->rd_islocaltemp = false;

There was no way to consider a temp table "ours" unless it satisfied
isTempOrToastNamespace.

Possibly we could reset rd_backend to InvalidBackendId here if the
computed value is equal to MyBackendId, but it seems a bit fragile.
That would prevent us from computing a correct pathname for the
underlying file --- not that that file should be there anymore,
but this could hinder relation creation for instance.

Perhaps a better idea is to not overload rd_backend to serve both
the "physical name of file" purpose and the "is it my temp table"
purpose.  We could add an additional relcache field with the
three possible states "not temp, my temp, somebody else's temp"
and make sure that the third state gets selected when there's
a chance collision like this.  Or resurrect the old rd_istemp and
rd_islocaltemp flags.

Thoughts?

            regards, tom lane

Re: Bug with temporary child of a permanent table after recovery

From
Jeff Davis
Date:
On Fri, 2012-12-14 at 17:56 -0500, Tom Lane wrote:
> Perhaps a better idea is to not overload rd_backend to serve both
> the "physical name of file" purpose and the "is it my temp table"
> purpose.  We could add an additional relcache field with the
> three possible states "not temp, my temp, somebody else's temp"
> and make sure that the third state gets selected when there's
> a chance collision like this.  Or resurrect the old rd_istemp and
> rd_islocaltemp flags.
>
> Thoughts?

Rather than bring back that flag, can we just use isTempOrToastNamespace
within RELATION_IS_OTHER_TEMP?

  #define RELATION_IS_OTHER_TEMP(relation) \
      ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP \
      && !isTempOrToastNamespace((relation)->rd_rel->relnamespace))

(I haven't analyzed the above code very carefully; it's just for
illustration purposes).

Regards,
    Jeff Davis

Re: Bug with temporary child of a permanent table after recovery

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Fri, 2012-12-14 at 17:56 -0500, Tom Lane wrote:
>> Perhaps a better idea is to not overload rd_backend to serve both
>> the "physical name of file" purpose and the "is it my temp table"
>> purpose.  We could add an additional relcache field with the
>> three possible states "not temp, my temp, somebody else's temp"
>> and make sure that the third state gets selected when there's
>> a chance collision like this.  Or resurrect the old rd_istemp and
>> rd_islocaltemp flags.

> Rather than bring back that flag, can we just use isTempOrToastNamespace
> within RELATION_IS_OTHER_TEMP?

>   #define RELATION_IS_OTHER_TEMP(relation) \
>       ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP \
>       && !isTempOrToastNamespace((relation)->rd_rel->relnamespace))

Meh.  I think this would work, but we would also have to change
RELATION_IS_LOCAL to use isTempOrToastNamespace similarly.  While
RELATION_IS_OTHER_TEMP doesn't seem to be used in any
performance-critical places (since it's generally checked no more than
once per command), I'm less willing to assume that it's okay to add
cycles to RELATION_IS_LOCAL.

On the whole I think resurrecting the rd_islocaltemp flag might be the
best thing.  We can keep the use of "relpersistence ==
RELPERSISTENCE_TEMP" to do what rd_istemp used to do, it's just the
equation of "rd_backend == MyBackendId" with "is my temp table" that's
bogus.  And this way gets rid of some unnecessary cross-branch coding
differences.

            regards, tom lane

Re: Bug with temporary child of a permanent table after recovery

From
Tom Lane
Date:
I wrote:
> On the whole I think resurrecting the rd_islocaltemp flag might be the
> best thing.  We can keep the use of "relpersistence ==
> RELPERSISTENCE_TEMP" to do what rd_istemp used to do, it's just the
> equation of "rd_backend == MyBackendId" with "is my temp table" that's
> bogus.  And this way gets rid of some unnecessary cross-branch coding
> differences.

Here is a draft patch for this.  Thoughts/objections?

One thing I noticed that maybe needs more work is that tablecmds.c in
general seems mighty willing to hack upon temp tables belonging to other
sessions.  I added tests for that in the places where there already were
checks on relpersistence, but I wonder whether we ought not simply
forbid all forms of ALTER on nonlocal temp rels, right up at the top.

            regards, tom lane


Attachment

Re: Bug with temporary child of a permanent table after recovery

From
Jeff Davis
Date:
On Sun, 2012-12-16 at 21:17 -0500, Tom Lane wrote:
> I wrote:
> > On the whole I think resurrecting the rd_islocaltemp flag might be the
> > best thing.  We can keep the use of "relpersistence ==
> > RELPERSISTENCE_TEMP" to do what rd_istemp used to do, it's just the
> > equation of "rd_backend == MyBackendId" with "is my temp table" that's
> > bogus.  And this way gets rid of some unnecessary cross-branch coding
> > differences.
>
> Here is a draft patch for this.  Thoughts/objections?

It looks good to me.

> One thing I noticed that maybe needs more work is that tablecmds.c in
> general seems mighty willing to hack upon temp tables belonging to other
> sessions.  I added tests for that in the places where there already were
> checks on relpersistence, but I wonder whether we ought not simply
> forbid all forms of ALTER on nonlocal temp rels, right up at the top.

Do you see any path where an ALTER can get a hold on a non-local temp
table? Or do you just mean as a sanity check? Either way, blocking it at
the top sounds good to me.

Regards,
    Jeff Davis

Re: Bug with temporary child of a permanent table after recovery

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Sun, 2012-12-16 at 21:17 -0500, Tom Lane wrote:
>> One thing I noticed that maybe needs more work is that tablecmds.c in
>> general seems mighty willing to hack upon temp tables belonging to other
>> sessions.  I added tests for that in the places where there already were
>> checks on relpersistence, but I wonder whether we ought not simply
>> forbid all forms of ALTER on nonlocal temp rels, right up at the top.

> Do you see any path where an ALTER can get a hold on a non-local temp
> table? Or do you just mean as a sanity check? Either way, blocking it at
> the top sounds good to me.

Well, if you're not a superuser then you shouldn't be able to access
non-local temp tables at all, because they live in schemas you won't
have permissions for.  So what we're discussing here is mainly just
protecting superusers from shooting themselves in the foot.  But, given
that there are any protections at all against non-local temp tables in
the code (and there are quite a number of such checks), I'm wondering
why ALTER TABLE doesn't have a blanket rejection.  Even for catalog
updates that don't involve direct touches of the file data (which is
certainly unsafe because of local vs shared buffer handling), it seems
risky because we also have various places that skip taking locks on
local temp tables.

            regards, tom lane