Home > mailing lists

Re: [HACKERS] couldn't rollback cache ? - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [HACKERS] couldn't rollback cache ?
Date	September 17, 1999 12:59:04
Msg-id	29176.937576653@sss.pgh.pa.us Whole thread Raw
In response to	couldn't rollback cache ? ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
Responses	RE: [HACKERS] couldn't rollback cache ?
List	pgsql-hackers

Tree view

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> But as far as I see,neither relation cache nor system catalog cache 
> aren't be rollbacked correctly.
> I added time_qualification_check to SearchSysCache() on trial
> (see the patch at the end of this posting).

Hmm.  This must be a bug of very long standing; surprising it hasn't
been noticed before.  I think you are probably right, because a little
glimpsing shows that SearchSysCache() is the *only* place in the whole
system where HeapKeyTest() is called directly --- everyone else goes
through HeapTupleSatisfies() which adds a timequal check of one sort or
another.  I don't know the timequal stuff at all, but it seems likely
that we want one here.  (Vadim, is this fix right?)

> After this change,
>     abort;
>     ABORT
>     select * from t1;
>     ERROR:  Relation t1 does not have attribute dt1

> Seems relation cache is not invalidated yet.
> I also tried to add time_qualification_check to RelationId(Name)-
> CacheGetRelation(). But unfortunately,Relation doesn't have
> such a information.

I think the real bug here is in inval.c: see
InvalidationMessageCacheInvalidate, which scans pending SI messages
at abort time.  If we had committed, we'd have sent an SI message
telling other backends to refresh their relcache entries for t1;
so there is an entry for t1 in the pending-SI-message list.  We can
use that entry to tell us to invalidate our own relcache entry instead.
It looks like this is done correctly for tuple SI messages but not for
relation SI messages --- and in fact the code sez        /* XXX ignore this--is this correct ??? */
Evidently not.  (BTW, please add some comments to this code!  It's
not obvious that what it's doing is throwing away cache entries that
have been changed by a transaction now being aborted.)

It would probably be a good idea to switch the order of the operations
in AtAbort_Cache() in xact.c, so that relcache reference counts get
reset to 0 before we do the pending-SI-message scan.  That way, we could
just discard the bogus relcache entry and not waste time rebuilding an
entry we might not need again.  (This might even be essential to avoid
an error in the aborted-table-create case; not sure.  The routine
RelationCacheAbort() didn't exist till last week, so the present call
order is certainly not gospel.)

We could probably do this in other ways too, like marking all relcache
entries with the transaction ID of their last change and using that to
detect what to throw away.  But the SI message queue is already there
so might as well use it...
        regards, tom lane

pgsql-hackers by date:

From: Hannu Krosing
Date: 17 September 1999, 09:51:02
Subject: Re: [HACKERS] Hierarchical query?

From: Tom Lane
Date: 17 September 1999, 13:21:04
Subject: Re: [HACKERS] Bug

Re: [HACKERS] couldn't rollback cache ? - Mailing list pgsql-hackers

Previous

Next