Thread: Ye olde "relation doesn't quite exist" problem

Ye olde "relation doesn't quite exist" problem

From
Tom Lane
Date:
I just gave an incorrect answer on pgsql-interfaces concerning the
following bug, which has been around for quite a while:

regression=> create table bug1 (f1 int28 primary key);
NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'bug1_pkey' for table 'bug1'
ERROR:  Can't find a default operator class for type 22.

-- That's fine, since we have no index support for int28.  But:

regression=> create table bug1 (f1 int28);
ERROR:  Relation 'bug1' already exists

-- Oops.

regression=> drop table bug1;
ERROR:  Relation 'bug1' does not exist

-- Double oops.


I'm pretty sure I recall a discussion to the effect that CREATE TABLE
was failing in this case because pgsql/data/base/dbname/bug1 had already
been created and wasn't deleted at transaction abort.  That may have
been the case in older versions of Postgres, but we seem to have fixed
that problem: with current sources the database file *is* removed at
transaction abort.  Unfortunately the bug still persists :-(.

Some quick tracing indicates that the reason the second CREATE TABLE
fails is that there's still an entry for bug1 in the system cache: the
search in RelnameFindRelid(),       tuple = SearchSysCacheTuple(RELNAME,
PointerGetDatum(relname),                                  0, 0, 0);
 
is finding an entry!  (If you quit the backend and start a new one,
things go back to normal, since the new backend's relcache doesn't
have the bogus entry.)

So, apparently the real problem is that the relname cache is not cleaned
of bogus entries inserted during a failed transaction.  This strikes me
as a fairly serious problem, especially if it applies to all the
SysCache tables.  That could lead to all kinds of erroneous behavior
after an aborted transaction.  I think this is a "must fix" issue...
        regards, tom lane


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Bruce Momjian
Date:
> I'm pretty sure I recall a discussion to the effect that CREATE TABLE
> was failing in this case because pgsql/data/base/dbname/bug1 had already
> been created and wasn't deleted at transaction abort.  That may have
> been the case in older versions of Postgres, but we seem to have fixed
> that problem: with current sources the database file *is* removed at
> transaction abort.  Unfortunately the bug still persists :-(.
> 
> Some quick tracing indicates that the reason the second CREATE TABLE
> fails is that there's still an entry for bug1 in the system cache: the
> search in RelnameFindRelid(),
>         tuple = SearchSysCacheTuple(RELNAME,
>                                     PointerGetDatum(relname),
>                                     0, 0, 0);
> is finding an entry!  (If you quit the backend and start a new one,
> things go back to normal, since the new backend's relcache doesn't
> have the bogus entry.)
> 
> So, apparently the real problem is that the relname cache is not cleaned
> of bogus entries inserted during a failed transaction.  This strikes me
> as a fairly serious problem, especially if it applies to all the
> SysCache tables.  That could lead to all kinds of erroneous behavior
> after an aborted transaction.  I think this is a "must fix" issue...

OK, let me give two ideas here.  First, we could create a linked list of
all cache additions that happen inside a transaction.  If the
transaction aborts, we can invalidate all the cache entries in the list.
Second, we could just invalidate the entire cache on a transaction
abort.  Third, we could somehow invalidate the cache on transaction
abort "only" if there was some system table modification in the
transaction.  The third one seems a little harder.  Because the linked
list could get large, we could do a linked list, and if it gets too
large, do an full invalidation.  Also, there may be a way to spin
through the cache and remove all entries marked as part of the aborted
transaction.

Seems like this not something for 6.5.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> OK, let me give two ideas here.  First, we could create a linked list of
> all cache additions that happen inside a transaction.  If the
> transaction aborts, we can invalidate all the cache entries in the list.
> Second, we could just invalidate the entire cache on a transaction
> abort.  Third, we could somehow invalidate the cache on transaction
> abort "only" if there was some system table modification in the
> transaction.  The third one seems a little harder.

Yes, the second one was the quick-and-dirty answer that occurred to me.
That would favor apps that seldom incur errors (no extra overhead to
keep track of cache changes), but would be bad news for those that
often incur errors (unnecessary cache reloads).

Is there room in the SysCaches for the transaction ID of the last
transaction to modify each entry?  That would provide an easy and
inexpensive way of finding the ones to zap when the current xact is
aborted, I would think: abort would just scan all the caches looking
for entries with the current xact ID, and invalidate only those entries.
The cost in the no-error case would just be storing an additional
field whenever an entry is modified; seems cheap enough.  However,
if there are a lot of different places in the code that can create/
modify a cache entry, this could be a fair amount of work (and it'd
carry the risk of missing some places...).

> Seems like this not something for 6.5.

I think we really ought to do *something*.  I'd settle for the
brute-force blow-away-all-the-caches answer for now, though.
        regards, tom lane


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Bruce Momjian
Date:
> Is there room in the SysCaches for the transaction ID of the last
> transaction to modify each entry?  That would provide an easy and
> inexpensive way of finding the ones to zap when the current xact is
> aborted, I would think: abort would just scan all the caches looking
> for entries with the current xact ID, and invalidate only those entries.
> The cost in the no-error case would just be storing an additional
> field whenever an entry is modified; seems cheap enough.  However,
> if there are a lot of different places in the code that can create/
> modify a cache entry, this could be a fair amount of work (and it'd
> carry the risk of missing some places...).

Yes, I think we could put it in, though it may have to sequential scan
to remove the entries.

> 
> > Seems like this not something for 6.5.
> 
> I think we really ought to do *something*.  I'd settle for the
> brute-force blow-away-all-the-caches answer for now, though.

OK.  I wonder if there are any problems with that.  I do that in heap.c:
           /*            * This is heavy-handed, but appears necessary bjm 1999/02/01            *
SystemCacheRelationFlushed(relid)is not enough either.            */           RelationForgetRelation(relid);
ResetSystemCache();

as part of a temp table creation to remove any non-temp table entry in
the cache.  I could not find another way, and because the temp table
creation doesn't cause problems, this could probably be used in
transaction abort too.


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Tom Lane
Date:
I have dealt with this bug:

> test=> create table bug1 (f1 int28 primary key);
> ERROR:  Can't find a default operator class for type 22.
> -- That's expected, since we have no index support for int28.  But now:
> test=> create table bug1 (f1 int28);
> ERROR:  Relation 'bug1' already exists

It is not real clear to me why the existing syscache invalidation
mechanism (CatalogCacheIdInvalidate() etc) fails to handle this case,
but it doesn't.  Perhaps it is because the underlying pg_class tuple
never actually makes it to "confirmed good" status, so the SI code
figures it can ignore it.

I think the correct place to handle the problem is in
SystemCacheRelationFlushed() in catcache.c.  That routine is called by
RelationFlushRelation() (which does the same task for the relcache).
Unfortunately, it was only handling one aspect of the cache-update
problem: it was cleaning out the cache associated with a system table
when the *system table's* relcache entry was flushed.  It didn't scan
the cache contents to see if any of the records are associated with a
non-system table that's being flushed.

For the moment, I have made it call ResetSystemCache() --- that is, just
flush *all* the cache entries.  Scanning the individual entries to find
the ones referencing the given relID would require knowing exactly which
column to look in for each kind of system cache, which is more knowledge
than catcache.c actually has.  Eventually we could improve it.

This means it is no longer necessary for heap.c or index.c to call
ResetSystemCache() when handling a temp table --- their calls to
RelationForgetRelation are sufficient.  I have applied those changes
as well.
        regards, tom lane


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Vadim Mikheev
Date:
Tom Lane wrote:
> 
> I think the correct place to handle the problem is in
> SystemCacheRelationFlushed() in catcache.c.  That routine is called by
> RelationFlushRelation() (which does the same task for the relcache).
> Unfortunately, it was only handling one aspect of the cache-update
> problem: it was cleaning out the cache associated with a system table
> when the *system table's* relcache entry was flushed.  It didn't scan
> the cache contents to see if any of the records are associated with a
> non-system table that's being flushed.
> 
> For the moment, I have made it call ResetSystemCache() --- that is, just
> flush *all* the cache entries.  Scanning the individual entries to find ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Isn't is tooooo bad for performance ?!

> the ones referencing the given relID would require knowing exactly which
> column to look in for each kind of system cache, which is more knowledge
> than catcache.c actually has.  Eventually we could improve it.
> 
> This means it is no longer necessary for heap.c or index.c to call
> ResetSystemCache() when handling a temp table --- their calls to
> RelationForgetRelation are sufficient.  I have applied those changes
> as well.

Vadim


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Bruce Momjian
Date:
> For the moment, I have made it call ResetSystemCache() --- that is, just
> flush *all* the cache entries.  Scanning the individual entries to find
> the ones referencing the given relID would require knowing exactly which
> column to look in for each kind of system cache, which is more knowledge
> than catcache.c actually has.  Eventually we could improve it.
> 
> This means it is no longer necessary for heap.c or index.c to call
> ResetSystemCache() when handling a temp table --- their calls to
> RelationForgetRelation are sufficient.  I have applied those changes
> as well.

Thanks.  I am a little confused.  I thought you just flushed only on
elog()/abort.  How does the new code work.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Vadim Mikheev
Date:
Bruce Momjian wrote:
> 
> > For the moment, I have made it call ResetSystemCache() --- that is, just
> > flush *all* the cache entries.  Scanning the individual entries to find
> > the ones referencing the given relID would require knowing exactly which
> > column to look in for each kind of system cache, which is more knowledge
> > than catcache.c actually has.  Eventually we could improve it.
> >
> > This means it is no longer necessary for heap.c or index.c to call
> > ResetSystemCache() when handling a temp table --- their calls to
> > RelationForgetRelation are sufficient.  I have applied those changes
> > as well.
> 
> Thanks.  I am a little confused.  I thought you just flushed only on
^^^^^^^^^^^^^^^^^^^^
> elog()/abort.  How does the new code work. ^^^^^^^^^^^^
It seems as more right thing to do.

Vadim


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Tom Lane
Date:
Vadim Mikheev <vadim@krs.ru> writes:
> Bruce Momjian wrote:
>> Thanks.  I am a little confused.  I thought you just flushed only on
>                                                   ^^^^^^^^^^^^^^^^^^^^
>> elog()/abort.  How does the new code work.
>   ^^^^^^^^^^^^
> It seems as more right thing to do.

What I just committed does the cache flush whenever
RelationFlushRelation is called --- in particular, elog/abort will
cause it to happen if there are any created-in-current-transaction
relations to be disposed of.  But otherwise, no flush.

The obvious question about that is "what about modifications to
cacheable tuples that are not triggered by a relation creation?"
I think that those cases are OK because they are covered by the
shared-invalidation code.  At least, we have no bug reports to
prove the contrary...
        regards, tom lane


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Tom Lane
Date:
Vadim Mikheev <vadim@krs.ru> writes:
> Tom Lane wrote:
>> For the moment, I have made it call ResetSystemCache() --- that is, just
>> flush *all* the cache entries.  Scanning the individual entries to find
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Isn't is tooooo bad for performance ?!

It's not ideal, by any means.  But I don't know how to fix it better
right now, and we've got a release to ship ...
        regards, tom lane


Re: [HACKERS] Ye olde "relation doesn't quite exist" problem

From
Bruce Momjian
Date:
Added to TODO:

* elog() flushes cache, try invalidating just entries from current xact


> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > OK, let me give two ideas here.  First, we could create a linked list of
> > all cache additions that happen inside a transaction.  If the
> > transaction aborts, we can invalidate all the cache entries in the list.
> > Second, we could just invalidate the entire cache on a transaction
> > abort.  Third, we could somehow invalidate the cache on transaction
> > abort "only" if there was some system table modification in the
> > transaction.  The third one seems a little harder.
> 
> Yes, the second one was the quick-and-dirty answer that occurred to me.
> That would favor apps that seldom incur errors (no extra overhead to
> keep track of cache changes), but would be bad news for those that
> often incur errors (unnecessary cache reloads).
> 
> Is there room in the SysCaches for the transaction ID of the last
> transaction to modify each entry?  That would provide an easy and
> inexpensive way of finding the ones to zap when the current xact is
> aborted, I would think: abort would just scan all the caches looking
> for entries with the current xact ID, and invalidate only those entries.
> The cost in the no-error case would just be storing an additional
> field whenever an entry is modified; seems cheap enough.  However,
> if there are a lot of different places in the code that can create/
> modify a cache entry, this could be a fair amount of work (and it'd
> carry the risk of missing some places...).
> 
> > Seems like this not something for 6.5.
> 
> I think we really ought to do *something*.  I'd settle for the
> brute-force blow-away-all-the-caches answer for now, though.
> 
>             regards, tom lane
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026