Re: GetOldestXmin going backwards is dangerous after all - Mailing list pgsql-hackers

From Tom Lane
Subject Re: GetOldestXmin going backwards is dangerous after all
Date
Msg-id 26936.1359762981@sss.pgh.pa.us
In response to Re: GetOldestXmin going backwards is dangerous after all  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Feb 1, 2013 at 2:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In any case, I no longer have much faith in the idea that letting
>> GetOldestXmin go backwards is really safe.

> That is admittedly kind of weird behavior, but I think one could
> equally blame this on CLUSTER.  This is hardly the first time we've
> had to patch CLUSTER's handling of TOAST tables (cf commits
> 21b446dd0927f8f2a187d9461a0d3f11db836f77,
> 7b0d0e9356963d5c3e4d329a917f5fbb82a2ef05,
> 83b7584944b3a9df064cccac06822093f1a83793) and it doesn't seem unlikely
> that we might go the full ten rounds.

Yeah, but I'm not sure whether CLUSTER is the appropriate blamee or
whether it's more like the canary in the coal mine, first to expose
problematic behaviors elsewhere.  The general problem here is really
that we're cleaning out toast tuples while the referencing main-heap
tuple still physically exists.  How safe do you think that is?  That
did not ever happen before we decoupled autovacuuming of main and toast
tables, either --- so a good case could be made that that idea is
fundamentally broken.

> Having said that, I agree that a fix in GetOldestXmin() would be nice
> if we could find one, but since the comment describes at least three
> different ways the value can move backwards, I'm not sure that there's
> really a practical solution there, especially if you want something we
> can back-patch.

Well, if we were tracking the latest value in shared memory, we could
certainly clamp to that to ensure it didn't go backwards.  The problem
is where to find storage for a per-DB value.
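
As a rough illustration of the clamping idea only (not actual PostgreSQL
code): the sketch below assumes a hypothetical per-database slot,
DbOldestXminSlot, which would have to live in shared memory and be
updated under a lock.  The names clamp_oldest_xmin and xid_precedes are
made up for this example, and the comparison is a simplified
wraparound-aware test that is valid for normal XIDs only.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Simplified wraparound-aware comparison: true if a is logically older
 * than b.  Assumes both are normal (non-special) XIDs.
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Hypothetical per-database slot.  Finding storage for this is exactly
 * the open problem; here it is just a plain struct.
 */
typedef struct DbOldestXminSlot
{
    TransactionId last_returned;    /* newest value handed out so far */
    bool          valid;            /* false until the first call */
} DbOldestXminSlot;

/*
 * Clamp a freshly computed oldest-xmin so the value reported for this
 * database never moves backwards.  The caller is assumed to hold
 * whatever lock protects the slot.
 */
static TransactionId
clamp_oldest_xmin(DbOldestXminSlot *slot, TransactionId computed)
{
    if (slot->valid && xid_precedes(computed, slot->last_returned))
        return slot->last_returned;     /* would go backwards: clamp */

    slot->last_returned = computed;     /* moved forward: remember it */
    slot->valid = true;
    return computed;
}
```

With such a slot, a computed value of 900 arriving after 1000 has
already been reported would be returned as 1000.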

I thought about storing each session's latest value in its PGPROC and
taking the max over same-DB sessions during GetOldestXmin's ProcArray
scan, but that doesn't work because an autovacuum process might
disappear and thus destroy the needed info just before CLUSTER looks
for it.
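
To make that failure mode concrete, here is a toy model (again not real
PostgreSQL code).  ProcSlot stands in for PGPROC, last_oldest_xmin is a
hypothetical field, and floor_from_live_sessions and session_exit are
invented names; XIDs are assumed to be normal, with 0 meaning "no value
recorded".

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t Oid;

#define MAX_PROCS 4

/* Toy stand-in for a PGPROC entry. */
typedef struct ProcSlot
{
    bool          in_use;
    Oid           database_id;
    TransactionId last_oldest_xmin;     /* hypothetical; 0 = none */
} ProcSlot;

/* Simplified wraparound-aware comparison for normal XIDs. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Newest last_oldest_xmin recorded by any live session in the given
 * database, or 0 if no live session has recorded one.
 */
static TransactionId
floor_from_live_sessions(const ProcSlot *procs, int nprocs, Oid db)
{
    TransactionId floor = 0;

    for (int i = 0; i < nprocs; i++)
    {
        if (!procs[i].in_use || procs[i].database_id != db ||
            procs[i].last_oldest_xmin == 0)
            continue;
        if (floor == 0 || xid_precedes(floor, procs[i].last_oldest_xmin))
            floor = procs[i].last_oldest_xmin;
    }
    return floor;
}

/* Session exit releases the slot, and the recorded value goes with it. */
static void
session_exit(ProcSlot *proc)
{
    proc->in_use = false;
    proc->last_oldest_xmin = 0;
}
```

If an autovacuum worker recorded 1000 and another session in the same
database recorded 900, the floor is 1000 only while the worker is alive;
once session_exit() runs for the worker, the floor drops back to 900.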
        regards, tom lane


