Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum - Mailing list pgsql-hackers

From daveg
Subject Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
Date
Msg-id 20110308025128.GB21941@sonic.net
In response to Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
On Tue, Mar 08, 2011 at 02:07:53AM +0000, Greg Stark wrote:
> On Mon, Mar 7, 2011 at 11:53 PM, daveg <daveg@sonic.net> wrote:
> >> Looking at the code, I don't see how that situation could arise, though.
> >> The value calculated by GetOldestXmin() should never move backwards. And
> >> GetOldestXmin() is called in lazy_vacuum_rel(), after it has acquired a
> >> lock on the table, which should protect from a race condition where two
> >> vacuums could run on the table one after another, in a way where the
> >> later vacuum runs with an OldestXmin calculated before the first vacuum.
> >>
> >> Hmm, fiddling with vacuum_defer_cleanup_age on the fly could cause that,
> >> though. You don't do that, do you?
> >
> > No.
> >
> > I've updated the patch to collect db and schema and added Merlins patch as
> > well and run it for a while. The attached log is all the debug messages
> > for pg_statistic page 333 from one database. I've also attached the two
> > most recent page images for that particular page, the last digits in the
> > filename are the hour and minute of when the page was saved.
>
>
> Well from that log you definitely have OldestXmin going backwards. And
> not by a little bit either. at 6:33 it set the all_visible flag and
> then at 7:01 it was almost 1.3 million transactions earlier. In fact
> to precisely the same value that was in use for a transaction at 1:38.
> That seems like a bit of a coincidence though it's not repeated
> earlier.
>
> It also seems odd that it happens only with this one block of this one table.

It happens on several tables, across thousands of blocks, in dozens of
databases, every day, on four separate but similar hosts. I have over 240MB
of page images saved from one host so far, but assumed that would be too much
to send, so I filtered it down to the one block.
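
For readers who want to check their own debug output for the same symptom, here is a minimal sketch of the test Greg describes: flag any point where OldestXmin moves backwards between consecutive observations. It assumes the log has already been parsed into (label, OldestXmin) pairs; the function names and the sample values are hypothetical and are not taken from the attached log. The comparison is done modulo 2^32, the way PostgreSQL orders normal transaction ids; special XIDs are ignored here.

```python
from typing import Iterable, List, Tuple

XID_MASK = 0xFFFFFFFF


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b.

    Uses a signed 32-bit difference so the ordering survives wraparound.
    Only meaningful for normal XIDs (assumption: no special XIDs in input).
    """
    diff = (a - b) & XID_MASK
    return diff >= 0x80000000  # i.e. the signed 32-bit difference is negative


def find_regressions(
    observations: Iterable[Tuple[str, int]],
) -> List[Tuple[str, str, int]]:
    """Report each consecutive pair where OldestXmin moved backwards.

    Each result is (earlier_label, later_label, distance_in_xids).
    """
    regressions = []
    prev = None
    for label, xmin in observations:
        if prev is not None and xid_precedes(xmin, prev[1]):
            distance = (prev[1] - xmin) & XID_MASK
            regressions.append((prev[0], label, distance))
        prev = (label, xmin)
    return regressions


# Hypothetical sample shaped like the pattern discussed above:
# the value seen at 07:01 falls back to the one seen at 01:38.
sample = [("01:38", 5_000_000), ("06:33", 6_300_000), ("07:01", 5_000_000)]
for earlier, later, distance in find_regressions(sample):
    print(f"OldestXmin went backwards by {distance} between {earlier} and {later}")
```

Comparing only consecutive observations keeps the sketch simple; tracking the highest value seen so far would also catch a regression that is spread over several steps.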


> What does SHOW ALL show for the current settings in effect? And what

Attached.

> was process 23896, are there any other log messages from it? When did
> it start?

The process appears to be an autovacuum worker; all it logged were the
attached messages from the debugging patch.

Thanks

-dg

--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.
