Re: mvcc catalo gsnapshots and TopTransactionContext - Mailing list pgsql-hackers

From Andres Freund
Subject Re: mvcc catalo gsnapshots and TopTransactionContext
Date
Msg-id 20130806031214.GC31572@alap2.anarazel.de
Whole thread Raw
In response to Re: mvcc catalo gsnapshots and TopTransactionContext  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On 2013-08-05 13:09:31 -0400, Noah Misch wrote:
> On Fri, Jul 12, 2013 at 11:42:23AM +0200, Andres Freund wrote:
> > On 2013-07-11 15:09:45 -0400, Tom Lane wrote:
> > > It never has been, and never will be, allowed to call the catcache code
> > > without being in a transaction.  What do you think will happen if the
> > > requested row isn't in cache?  A table access, that's what, and that
> > > absolutely requires being in a transaction.
> 
> > I'd like to add an Assert like in the attached patch making sure we're
> > in a transaction. Otherwise it's far too easy not to hit an error during
> > development because everything is cached - and syscache usage isn't
> > always obvious from the outside to the naive or the tired.
> 
> The following test case reliably hits this new assertion:
> 
> create table t (c int);
> begin;
> create index on t (c);
> savepoint q;
> insert into t values (1);
> select 1/0;
> 
> Stack trace:
> 
> #2  0x0000000000761159 in ExceptionalCondition (conditionName=<value optimized out>, errorType=<value optimized out>,
fileName=<valueoptimized out>, lineNumber=<value optimized out>) at assert.c:54
 
> #3  0x000000000074fc53 in SearchCatCache (cache=0xc53048, v1=139479, v2=0, v3=0, v4=0) at catcache.c:1072
> #4  0x000000000075c011 in SearchSysCache (cacheId=0, key1=369539, key2=6, key3=18446744073709551615, key4=0) at
syscache.c:909
> #5  0x0000000000757e57 in RelationReloadIndexInfo (relation=0x7f2dd5fffea0) at relcache.c:1770
> #6  0x000000000075817b in RelationClearRelation (relation=0x7f2dd5fffea0, rebuild=1 '\001') at relcache.c:1926
> #7  0x00000000007588c6 in RelationFlushRelation (relationId=139479) at relcache.c:2076
> #8  RelationCacheInvalidateEntry (relationId=139479) at relcache.c:2138
> #9  0x000000000075185d in LocalExecuteInvalidationMessage (msg=0xcde778) at inval.c:546
> #10 0x0000000000751185 in ProcessInvalidationMessages (hdr=0xc0d390, func=0x7517d0 <LocalExecuteInvalidationMessage>)
atinval.c:422
 
> #11 0x0000000000751a3f in AtEOSubXact_Inval (isCommit=0 '\000') at inval.c:973
> #12 0x00000000004cb986 in AbortSubTransaction () at xact.c:4250
> #13 0x00000000004cbf05 in AbortCurrentTransaction () at xact.c:2863
> #14 0x00000000006968f6 in PostgresMain (argc=1, argv=0x7fff7cf71610, dbname=0xc13b60 "test", username=0xbedc88 "nm")
atpostgres.c:3848
 
> #15 0x000000000064c2b5 in BackendRun () at postmaster.c:4044
> #16 BackendStartup () at postmaster.c:3733
> #17 ServerLoop () at postmaster.c:1571
> #18 0x000000000064fa8d in PostmasterMain (argc=1, argv=0xbe84a0) at postmaster.c:1227
> #19 0x00000000005e2dcd in main (argc=1, argv=0xbe84a0) at main.c:196
> 
> When we call AtEOSubXact_Inval() or AtEOXact_Inval() with a relation still
> open, we can potentially get a relcache rebuild and therefore a syscache
> lookup as shown above.  CommitSubTransaction() is also potentially affected,
> though I don't have an SQL-level test case for that.  It calls
> CommandCounterIncrement() after moving to TRANS_COMMIT.  That CCI had better
> find no invalidations of open relations, or we'll make syscache lookups.  (In
> simple tests, any necessary invalidations tend to happen at the CCI in
> CommitTransactionCommand(), so the later CCI does in fact find nothing to do.
> I have little confidence that should be counted upon, though.)
> 
> How might we best rearrange things to avoid these hazards?

Man. This is a nasty one. And seems to have been there for a long
time. I am rather surprised that we haven't seen this as a cause of
problems.
Deferring invalidation processing a good bit seems to be the only way to
deal with this I can see but that seems to require an uncomfortable
amount of shuffling around in xact.c.

I'll think about it more, but so far I haven't seen anything close to
nice.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: HeapTupleSatisfiesDirty fails to test HEAP_XMAX_IS_LOCKED_ONLY for TransactionIdIsInProgress(...)
Next
From: Craig Ringer
Date:
Subject: Re: HeapTupleSatisfiesDirty fails to test HEAP_XMAX_IS_LOCKED_ONLY for TransactionIdIsInProgress(...)