Re: Disabled features on Hot Standby - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Disabled features on Hot Standby
Msg-id CA+Tgmobq22kF9LkGLsm2QqeRVGirr1sFWkXGqb=7BVjQ7Bm9ew@mail.gmail.com
In response to Re: Disabled features on Hot Standby  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Disabled features on Hot Standby  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Disabled features on Hot Standby  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Re: Disabled features on Hot Standby  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Fri, Jan 13, 2012 at 11:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I think it should be you that comes up with a fix, not for me to
> respond to your concerns about how hard it is. Many things that don't
> fully work are rejected for that reason.

Well, I disagree.  The fact that all-visible info can't be trusted in
standby mode is a problem that has existed since Hot Standby was
committed, and I don't feel obliged to fix it just because I was
involved in developing a new feature that happens to rely on
all-visible info.  I'm sorry to butt heads with you on this one, but
this limitation has been long-known and discussed many times before on
pgsql-hackers, and I'm not going to drop everything and start working
on this just because you seem to think that I should.

> Having said that, I have input that seems to solve the problem.
>
> Many WAL records have latestRemovedXid on them. We can use the same
> idea with XLOG_HEAP2_VISIBLE records, so we add a field to send the
> latest vacrelstats->latestRemovedXid. That then creates a recovery
> snapshot conflict that would cancel any query that might then see a
> page of the vis map that was written when the xmin was later than on
> the standby. If replication disconnects briefly and a vimap bit is
> updated that would cause a problem, just as the same situation would
> cause a problem because of other record types.

That could create a lot of recovery conflicts when
hot_standby_feedback=off, I think, but it might work when
hot_standby_feedback=on.  I don't fully understand the
latestRemovedXid machinery, but I guess the idea would be to kill any
standby transaction whose proc->xmin precedes the oldest committed
xmin or xmax on the page.  If hot_standby_feedback=on then there
shouldn't be any, except in the case where it's just been enabled or
the SR connection is bouncing.
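
For illustration, the conflict rule guessed at above can be sketched roughly as follows. This is a simplified model, not PostgreSQL source: the names xid_precedes and standby_xact_conflicts and the cutoff_xid parameter are hypothetical, and which XID the WAL record should carry as the cutoff is exactly the open question. The only thing borrowed from the real system is the idea that XIDs are 32-bit counters compared modulo 2^32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: XIDs are 32-bit counters that wrap around. */
typedef uint32_t TransactionId;

#define INVALID_XID ((TransactionId) 0)

/*
 * Wraparound-aware "a is older than b" for normal XIDs.
 * Special (reserved) XIDs are deliberately ignored in this sketch.
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Hypothetical check: would a standby transaction holding snapshot
 * xmin 'standby_xmin' have to be cancelled when a record carrying
 * 'cutoff_xid' is replayed?  Here a conflict means the standby xmin
 * is at or before the cutoff.
 */
static bool
standby_xact_conflicts(TransactionId standby_xmin, TransactionId cutoff_xid)
{
    if (standby_xmin == INVALID_XID)
        return false;           /* backend holds no snapshot */

    return !xid_precedes(cutoff_xid, standby_xmin);
}
```

Under this model, with hot_standby_feedback=on the primary would normally not advance the cutoff past any standby xmin, so the check should rarely fire.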

Also, what happens if an all-visible bit gets set on the standby
through some other mechanism - e.g. restored from an FPI or
XLOG_HEAP_NEWPAGE?  I'm not sure whether we ever do an FPI of the
visibility map page itself, but we certainly do it for the heap pages.
So it might be that this infrastructure would (somewhat bizarrely)
trust the visibility map bits but not the PD_ALL_VISIBLE bits.  I'm
hoping Heikki or Tom will comment on this thread, because I think
there are a bunch of subtle issues here and that we could easily screw
it up by trying to plow through the problem too hastily.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

