Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors) - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)
Date	May 10, 2016 12:09:08
Msg-id	CA+TgmoZ0PzoGMRBU-NOEx8YLEf5LeBF8TXiht5r0zeEVNpeT7g@mail.gmail.com
In response to	Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors) (Andres Freund <andres@anarazel.de>)
Responses	Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)
List	pgsql-hackers

On Tue, May 10, 2016 at 3:05 AM, Andres Freund <andres@anarazel.de> wrote:
> The easy way to trigger this problem would be to have an oid wraparound
> - but the WAL shows that that's not the case here.  I've not figured
> that one out entirely (and won't tonight). But I do see WAL records
> like:
> rmgr: XLOG        len (rec/tot):      4/    30, tx:          0, lsn: 2/12004018, prev 2/12003288, desc: NEXTOID 4302693
> rmgr: XLOG        len (rec/tot):      4/    30, tx:          0, lsn: 2/1327EA08, prev 2/1327DC60, desc: NEXTOID 4302693
> i.e. two NEXTOID records allocating the same range, which obviously
> doesn't seem right.  There's also every now and then close by ranges:
> rmgr: XLOG        len (rec/tot):      4/    30, tx:          0, lsn: 1/9A404DB8, prev 1/9A404270, desc: NEXTOID 3311455
> rmgr: XLOG        len (rec/tot):      4/    30, tx:    7814505, lsn: 1/9A4EC888, prev 1/9A4EB9D0, desc: NEXTOID 3311461
>
>
> As far as I can see something like the above, or an oid wraparound, are
> pretty much deadly for toast.
>
> Is anybody ready with a good defense for SatisfiesToast not doing any
> actual liveliness checks?

I assume that this was installed as a performance optimization, and I
don't really see why it shouldn't be safe, or at least why it couldn't
be made safe.  I
assume that the wraparound case was deemed safe because at that time
the idea of 4 billion OIDs getting used with old transactions still
active seemed inconceivable.  It seems to me that the real question
here is how you're getting two calls to XLogPutNextOid() with the same
value of ShmemVariableCache->nextOid, and the answer, as it seems to
me, must be that LWLocks are broken.  Either two processes are
managing to hold OidGenLock in exclusive mode at the same time, or
they're acquiring it in quick succession but without the second
process seeing all of the updates performed by the first process.
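
To make that reasoning concrete, here is a minimal sketch of batched OID
allocation under an exclusive lock. It is only loosely modeled on the idea
of logging a NEXTOID record whenever a prefetched batch runs out; it is not
the PostgreSQL code. The names OidState, alloc_oid, OID_PREFETCH and the
spinlock built on atomic_flag are all assumptions made for illustration
(the real code uses an LWLock), and OID wraparound handling is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define OID_PREFETCH 8192u  /* hypothetical batch size for this sketch */
#define MAX_LOGGED   64     /* capacity of the in-memory "WAL" below */

typedef struct {
    atomic_flag lock;            /* stand-in for OidGenLock */
    uint32_t next_oid;           /* stand-in for the shared nextOid counter */
    uint32_t oid_count;          /* OIDs left before the next record is due */
    uint32_t logged[MAX_LOGGED]; /* values "written" as NEXTOID records */
    size_t n_logged;
} OidState;

static void oid_state_init(OidState *s, uint32_t start)
{
    atomic_flag_clear(&s->lock);
    s->next_oid = start;
    s->oid_count = 0;
    s->n_logged = 0;
}

/* Hand out one OID; log a NEXTOID value whenever a new batch is started. */
static uint32_t alloc_oid(OidState *s)
{
    /* Acquire: spin until we own the lock; the acquire ordering makes the
     * previous holder's writes to next_oid/oid_count visible to us. */
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;

    if (s->oid_count == 0) {
        /* Batch exhausted: record the end of the next batch. */
        if (s->n_logged < MAX_LOGGED)
            s->logged[s->n_logged++] = s->next_oid + OID_PREFETCH;
        s->oid_count = OID_PREFETCH;
    }

    uint32_t result = s->next_oid++;
    s->oid_count--;

    /* Release: publish our updates before the next holder can enter. */
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
    return result;
}

/* Returns 1 if every logged value is greater than the one before it. */
static int logged_values_strictly_increase(const OidState *s)
{
    for (size_t i = 1; i < s->n_logged; i++)
        if (s->logged[i] <= s->logged[i - 1])
            return 0;
    return 1;
}
```

In this model, two records carrying the same value can only appear if
two callers are inside the locked region at once, or if the second caller
reads a stale next_oid because the release/acquire pairing did not hold.
Those are the two failure modes described above.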

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
