On 2013-02-01 19:24:02 -0500, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Having said that, I agree that a fix in GetOldestXmin() would be nice
> > if we could find one, but since the comment describes at least three
> > different ways the value can move backwards, I'm not sure that there's
> > really a practical solution there, especially if you want something we
> > can back-patch.
>
> Actually, wait a second. As you say, the comment describes three known
> ways to make it go backwards. It strikes me that all three are fixable:
>
> * if allDbs is FALSE and there are no transactions running in the current
> * database, GetOldestXmin() returns latestCompletedXid. If a transaction
> * begins after that, its xmin will include in-progress transactions in other
> * databases that started earlier, so another call will return a lower value.
>
> The reason this is a problem is that GetOldestXmin ignores XIDs of
> processes that are connected to other DBs. It now seems to me that this
> is a flat-out bug. It can ignore their xmins, but it should include
> their XIDs, because the point of considering those XIDs is that they may
> contribute to the xmins of snapshots computed in the future by processes
> in our own DB. And snapshots never exclude any XIDs on the basis of
> which DB they're in. (They can't really, since we can't know when the
> snap is taken whether it might be used to examine shared catalogs.)
>
> * The return value is also adjusted with vacuum_defer_cleanup_age, so
> * increasing that setting on the fly is another easy way to make
> * GetOldestXmin() move backwards, with no consequences for data integrity.
>
> And as for that, it's been pretty clear for awhile that allowing
> vacuum_defer_cleanup_age to change on the fly was a bad idea we'd
> eventually have to undo. The day of reckoning has arrived: it needs
> to be PGC_POSTMASTER.

ISTM that the original problem can still occur, even after Simon's
commit:

1) start with -c vacuum_defer_cleanup_age=0
2) autovacuum vacuums "test";
3) restart with -c vacuum_defer_cleanup_age=10000
4) autovacuum vacuums "test"'s toast table;

That should result in about the same ERROR, shouldn't it?

Given that there doesn't yet seem to be a fix that people agree on, and
that it "only" results in a transient error, I think the fix for this
should be pushed after the next point release.

Greetings,
Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services