Re: GetOldestXmin going backwards is dangerous after all - Mailing list pgsql-hackers

From Andres Freund
Subject Re: GetOldestXmin going backwards is dangerous after all
Date
Msg-id 20130202142437.GC8956@awork2.anarazel.de
Whole thread Raw
In response to Re: GetOldestXmin going backwards is dangerous after all  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: GetOldestXmin going backwards is dangerous after all  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 2013-02-01 19:24:02 -0500, Tom Lane wrote:
>  * There are also replication-related effects: a walsender
>  * process can set its xmin based on transactions that are no longer running
>  * in the master but are still being replayed on the standby, thus possibly
>  * making the GetOldestXmin reading go backwards.  In this case there is a
>  * possibility that we lose data that the standby would like to have, but
>  * there is little we can do about that --- data is only protected if the
>  * walsender runs continuously while queries are executed on the standby.
>  * (The Hot Standby code deals with such cases by failing standby queries
>  * that needed to access already-removed data, so there's no integrity bug.)
> 
> This is just bogus.  Why don't we make it a requirement on walsenders
> that they never move their advertised xmin backwards (or initially set
> it to less than the prevailing global xmin)?  There's no real benefit to
> allowing them to try to move the global xmin backwards, because any data
> that they might hope to protect that way could be gone already.

The problem imo has multiple sides.

a) We currently don't have a way to assign the minimal valid xmin to the
current backend. I already need that for logical replication, so we
might just as well add it now:

http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=commitdiff;h=84f6ed9550b65b0584589977877e9a997daeb61d

With that its just a matter of taking ProcArrayLock exlusively before
calling GetOldestXmin and assigning it to MyPgXact->xmin before
releasing it.

b) We don't assign the xmin early enough, we only set it when the first
feedback message arrives, but we should set it when walsender starts
streaming.

c) After a disconnect the feedback message will rather likely ask for an
xmin horizon thats not valid anymore on the primary. If the disconnect
was short enough often enough that doesn't matter because nothing has
been cleared out, but it doesn't really work all that well.
Thats still better than setting it to the currently valid minimal xmin
horizon because it prevents cleanup from that moment on.
I don't see how this can be significantly improved without persistent
knowledge about standbys.

>  * The return value is also adjusted with vacuum_defer_cleanup_age, so
>  * increasing that setting on the fly is another easy way to make
>  * GetOldestXmin() move backwards, with no consequences for data integrity.
> 
> And as for that, it's been pretty clear for awhile that allowing
> vacuum_defer_cleanup_age to change on the fly was a bad idea we'd
> eventually have to undo.  The day of reckoning has arrived: it needs
> to be PGC_POSTMASTER.

Not sure if that would really solve anything, similar problems seem to
exist even if the postmaster has been restarted inbetween.


Btw, I wonder if that comment in GetOldestXmin still has good
justification, now that we allow cascading standbys:
> Also note that we intentionally don't apply vacuum_defer_cleanup_age
> on standby servers.


Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: autovacuum not prioritising for-wraparound tables
Next
From: Pavel Stehule
Date:
Subject: Re: proposal - assign result of query to psql variable