Re: GetOldestXmin going backwards is dangerous after all - Mailing list pgsql-hackers

From Tom Lane
Subject Re: GetOldestXmin going backwards is dangerous after all
Msg-id 27553.1359764642@sss.pgh.pa.us
In response to Re: GetOldestXmin going backwards is dangerous after all  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: GetOldestXmin going backwards is dangerous after all  (Andres Freund <andres@2ndquadrant.com>)
Re: GetOldestXmin going backwards is dangerous after all  (Robert Haas <robertmhaas@gmail.com>)
Re: GetOldestXmin going backwards is dangerous after all  (Simon Riggs <simon@2ndQuadrant.com>)
Re: GetOldestXmin going backwards is dangerous after all  (Simon Riggs <simon@2ndQuadrant.com>)
Re: GetOldestXmin going backwards is dangerous after all  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> Having said that, I agree that a fix in GetOldestXmin() would be nice
> if we could find one, but since the comment describes at least three
> different ways the value can move backwards, I'm not sure that there's
> really a practical solution there, especially if you want something we
> can back-patch.

Actually, wait a second.  As you say, the comment describes three known
ways to make it go backwards.  It strikes me that all three are fixable:
 * if allDbs is FALSE and there are no transactions running in the current
 * database, GetOldestXmin() returns latestCompletedXid. If a transaction
 * begins after that, its xmin will include in-progress transactions in other
 * databases that started earlier, so another call will return a lower value.

The reason this is a problem is that GetOldestXmin ignores XIDs of
processes that are connected to other DBs.  It now seems to me that this
is a flat-out bug.  It can ignore their xmins, but it should include
their XIDs, because the point of considering those XIDs is that they may
contribute to the xmins of snapshots computed in the future by processes
in our own DB.  And snapshots never exclude any XIDs on the basis of
which DB they're in.  (They can't really, since we can't know when the
snap is taken whether it might be used to examine shared catalogs.)
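
To make the argument concrete, here is a minimal sketch in C of the two behaviors. It is a toy model, not the actual GetOldestXmin() code: SketchProc, sketch_oldest_xmin(), the upper_bound parameter (standing in for the value derived from latestCompletedXid) and the count_foreign_xids switch are all hypothetical names, and XID wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t SketchXid;          /* hypothetical stand-in for TransactionId */
#define SKETCH_INVALID_XID 0u        /* "no XID / no xmin" marker */

typedef struct SketchProc
{
    uint32_t  db;    /* database this backend is connected to */
    SketchXid xid;   /* its running transaction's XID, or invalid */
    SketchXid xmin;  /* xmin of its current snapshot, or invalid */
} SketchProc;

/* Lower cur to cand if cand is valid and older (wraparound ignored). */
static SketchXid
sketch_min(SketchXid cur, SketchXid cand)
{
    return (cand != SKETCH_INVALID_XID && cand < cur) ? cand : cur;
}

/*
 * Toy oldest-xmin computation for database my_db.
 *
 * count_foreign_xids = 0: skip backends in other databases entirely
 *                         (the behavior described as a bug above).
 * count_foreign_xids = 1: ignore their xmins but still count their XIDs,
 *                         since a future snapshot in my_db will see them.
 */
SketchXid
sketch_oldest_xmin(const SketchProc *procs, size_t n, uint32_t my_db,
                   SketchXid upper_bound, int count_foreign_xids)
{
    assert(upper_bound != SKETCH_INVALID_XID);

    SketchXid result = upper_bound;

    for (size_t i = 0; i < n; i++)
    {
        int same_db = (procs[i].db == my_db);

        if (same_db || count_foreign_xids)
            result = sketch_min(result, procs[i].xid);
        if (same_db)
            result = sketch_min(result, procs[i].xmin);
    }
    return result;
}
```

In this model, take one backend in database 2 running XID 100 and ask for the oldest xmin of database 1 with an upper bound of 200. With count_foreign_xids = 0 the answer is 200; once a backend in database 1 takes a snapshot whose xmin is 100, the next call returns 100, so the value has gone backwards. With count_foreign_xids = 1 both calls return 100.
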
 * There are also replication-related effects: a walsender
 * process can set its xmin based on transactions that are no longer running
 * in the master but are still being replayed on the standby, thus possibly
 * making the GetOldestXmin reading go backwards.  In this case there is a
 * possibility that we lose data that the standby would like to have, but
 * there is little we can do about that --- data is only protected if the
 * walsender runs continuously while queries are executed on the standby.
 * (The Hot Standby code deals with such cases by failing standby queries
 * that needed to access already-removed data, so there's no integrity bug.)

This is just bogus.  Why don't we make it a requirement on walsenders
that they never move their advertised xmin backwards (or initially set
it to less than the prevailing global xmin)?  There's no real benefit to
allowing them to try to move the global xmin backwards, because any data
that they might hope to protect that way could be gone already.
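
The proposed rule amounts to a clamp on whatever xmin the walsender is asked to advertise. A minimal sketch in C, assuming a hypothetical helper sketch_clamp_walsender_xmin() and the same simplified XID type that ignores wraparound; none of these names come from the walsender code:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t SketchXid;          /* hypothetical stand-in for TransactionId */
#define SKETCH_INVALID_XID 0u        /* "nothing advertised yet" marker */

/*
 * Return the xmin a walsender should advertise under the proposed rule.
 *
 * advertised  - xmin it currently advertises, or invalid if none yet
 * requested   - xmin it would like to advertise (must be valid)
 * global_xmin - prevailing global xmin at the time of the request
 *
 * The floor is the current advertised value if there is one, otherwise
 * the global xmin; a request below the floor is raised to the floor.
 */
SketchXid
sketch_clamp_walsender_xmin(SketchXid advertised, SketchXid requested,
                            SketchXid global_xmin)
{
    assert(requested != SKETCH_INVALID_XID);

    SketchXid floor_xid = (advertised != SKETCH_INVALID_XID)
        ? advertised
        : global_xmin;

    return (requested < floor_xid) ? floor_xid : requested;
}
```

Using the advertised value alone as the floor is enough once it is set, because an advertised xmin already holds the global xmin at or below it. A request older than the floor is simply raised, which matches the point that the data it hoped to protect could be gone already.
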
 * The return value is also adjusted with vacuum_defer_cleanup_age, so
 * increasing that setting on the fly is another easy way to make
 * GetOldestXmin() move backwards, with no consequences for data integrity.

And as for that, it's been pretty clear for a while that allowing
vacuum_defer_cleanup_age to change on the fly was a bad idea we'd
eventually have to undo.  The day of reckoning has arrived: it needs
to be PGC_POSTMASTER.
        regards, tom lane


