Re: Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
Date
Msg-id 4D75E201.20402@enterprisedb.com
In response to Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
On 08.03.2011 04:07, Greg Stark wrote:
> Well from that log you definitely have OldestXmin going backwards. And
> not by a little bit either. at 6:33 it set the all_visible flag and
> then at 7:01 it was almost 1.3 million transactions earlier. In fact
> to precisely the same value that was in use for a transaction at 1:38.
> That seems like a bit of a coincidence though it's not repeated
> earlier.

Yep. After staring at GetOldestXmin() again, it finally struck me how 
OldestXmin can move backwards. You need two databases for it, which 
probably explains why this has been so elusive.

Here's how to reproduce that:

CREATE DATABASE foodb;
CREATE DATABASE bardb;

session 1, in foodb:

foodb=# begin isolation level serializable;
BEGIN
foodb=# CREATE TABLE foo (a int4); -- just something to force this xact to have an xid
CREATE TABLE
foodb=#

(leave the transaction open)

session 2, in bardb:

bardb=# CREATE TABLE foo AS SELECT 1;
SELECT
bardb=# vacuum foo; -- to set the PD_ALL_VISIBLE flag
VACUUM

session 3, in bardb:

bardb=# begin isolation level serializable;
BEGIN
bardb=# SELECT 1;
 ?column?
----------
        1
(1 row)

(leave transaction open)

session 2, in bardb:

bardb=# vacuum foo;
WARNING:  PD_ALL_VISIBLE flag was incorrectly set in relation "foo" page 0 (OldestXmin 803)
VACUUM
bardb=#


When there are no other transactions active in the same database,
GetOldestXmin() returns just latestCompletedXid. When you open a
transaction in the same database after that, its xid will be above
latestCompletedXid, but its xmin includes transactions from all
databases, and there might be a transaction in some other database with
an xid that precedes the value that GetOldestXmin() returned earlier.
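
To make the sequence concrete, here is a toy Python model of the two
computations involved. It is only a sketch, not PostgreSQL code: the
Backend class and both function names are invented for illustration,
xid 803 is taken from the WARNING above, xid 804 for session 2's
CREATE TABLE AS is an assumption, xid wraparound is ignored, and the
starting point is modelled as latestCompletedXid + 1 (the off-by-one
does not affect the argument).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical stand-in for one proc-array entry."""
    database: str
    xid: Optional[int] = None   # transaction id, if one has been assigned
    xmin: Optional[int] = None  # xmin of the backend's snapshot, if any


def snapshot_xmin(backends: List[Backend], latest_completed_xid: int) -> int:
    """Snapshot xmin: considers running xids in ALL databases."""
    xmin = latest_completed_xid + 1
    for b in backends:
        if b.xid is not None:
            xmin = min(xmin, b.xid)
    return xmin


def get_oldest_xmin(backends: List[Backend], latest_completed_xid: int,
                    my_database: str) -> int:
    """Per-database horizon: only looks at backends in my_database."""
    result = latest_completed_xid + 1
    for b in backends:
        if b.database != my_database:
            continue
        for x in (b.xid, b.xmin):
            if x is not None:
                result = min(result, x)
    return result


def replay():
    """Replay the three-session scenario; return both horizons seen by VACUUM."""
    # Session 1: open transaction in foodb holding xid 803.
    backends = [Backend("foodb", xid=803)]
    # Session 2: CREATE TABLE AS in bardb committed with (assumed) xid 804.
    latest_completed_xid = 804
    # First VACUUM in bardb: no other transactions are active in bardb.
    first = get_oldest_xmin(backends, latest_completed_xid, "bardb")
    # Session 3: takes a snapshot in bardb; its xmin spans all databases.
    s3 = Backend("bardb")
    s3.xmin = snapshot_xmin(backends, latest_completed_xid)
    backends.append(s3)
    # Second VACUUM in bardb now sees session 3's xmin.
    second = get_oldest_xmin(backends, latest_completed_xid, "bardb")
    return first, second
```

In this model the first VACUUM computes a horizon above xid 804 and the
second one computes 803, so the tuple inserted by xid 804 no longer
precedes OldestXmin even though the page was already marked all-visible.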

I'm not sure what to do about that. One idea is to track two xmin values
in the proc array, one that includes transactions in all databases, and
another that only includes transactions in the same database.
GetOldestXmin() (when allDbs is false) would only pay attention to the
latter. It would add a few instructions to GetSnapshotData(), though.
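
A rough Python sketch of how the two-xmin idea could behave, under the
same kind of toy assumptions: the Backend class, the field names
xmin_all and xmin_db, and both functions are hypothetical, xids 803 and
804 are illustrative, and wraparound is ignored. This is not a patch,
only a model of the bookkeeping.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical proc-array entry carrying two xmin values."""
    database: str
    xid: Optional[int] = None
    xmin_all: Optional[int] = None  # oldest running xid in any database
    xmin_db: Optional[int] = None   # oldest running xid in own database only


def take_snapshot(me: Backend, backends: List[Backend],
                  latest_completed_xid: int) -> None:
    """Compute both xmin values in one pass over the backends."""
    xmin_all = xmin_db = latest_completed_xid + 1
    for b in backends:
        if b.xid is None:
            continue
        xmin_all = min(xmin_all, b.xid)
        if b.database == me.database:
            xmin_db = min(xmin_db, b.xid)
    me.xmin_all, me.xmin_db = xmin_all, xmin_db


def get_oldest_xmin(backends: List[Backend], latest_completed_xid: int,
                    my_database: str) -> int:
    """Per-database horizon that pays attention to xmin_db only."""
    result = latest_completed_xid + 1
    for b in backends:
        if b.database != my_database:
            continue
        for x in (b.xid, b.xmin_db):
            if x is not None:
                result = min(result, x)
    return result


def replay_two_xmins():
    """Same scenario as the reproduction, with the second xmin tracked."""
    backends = [Backend("foodb", xid=803)]   # session 1, open in foodb
    latest_completed_xid = 804               # session 2's insert committed
    first = get_oldest_xmin(backends, latest_completed_xid, "bardb")
    s3 = Backend("bardb")                    # session 3, snapshot in bardb
    backends.append(s3)
    take_snapshot(s3, backends, latest_completed_xid)
    second = get_oldest_xmin(backends, latest_completed_xid, "bardb")
    return first, second, s3
```

Here session 3 still gets an all-databases xmin of 803 for visibility
checks, but the per-database horizon stays where the first VACUUM left
it, at the cost of the extra comparison per backend in take_snapshot().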

Another idea is to give up on the warning when it appears that 
oldestxmin has moved backwards, and assume that it's actually fine. We 
could still warn in other cases where the flag appears to be incorrectly 
set, like if there is a deleted tuple on the page.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

