On Thu, 2011-02-17 at 13:38 +0900, Fujii Masao wrote:
> On Thu, Feb 17, 2011 at 4:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > Hot Standby feedback for avoidance of cleanup conflicts on standby.
> > Standby optionally sends back information about oldestXmin of queries
> > which is then checked and applied to the WALSender's proc->xmin.
> > GetOldestXmin() is modified slightly to agree with GetSnapshotData(),
> > so that all backends on primary include WALSender within their snapshots.
> > Note this does nothing to change the snapshot xmin on either master or
> > standby. Feedback piggybacks on the standby reply message.
> > vacuum_defer_cleanup_age is no longer used on standby, though parameter
> > still exists on primary, since some use cases still exist.
>
> I have some more comments about this change.
>
> Something like the following description should be in the doc.
>
> hot_standby_feedback has no effect if either hot_standby is off or
> wal_receiver_status_interval is zero.
The docs are going to need some work after 3-4 related major changes hit
them. I'm not picking up on individual sentences right now.
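
For reference while the docs are pending, the condition in the proposed
sentence can be sketched as a single predicate. This is only an
illustration of the wording above, not code from the patch: the function
name feedback_effective() is made up, and the arguments simply mirror the
three settings by name.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: returns true only when
 * hot_standby_feedback can actually have an effect, i.e. hot_standby is
 * on and wal_receiver_status_interval is non-zero.
 */
static bool
feedback_effective(bool hot_standby,
                   bool hot_standby_feedback,
                   int wal_receiver_status_interval)
{
    return hot_standby
        && hot_standby_feedback
        && wal_receiver_status_interval > 0;
}
```
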
> + if (MyProc->xmin != newxmin)
> + {
> + LWLockAcquire(ProcArrayLock, LW_SHARED);
> + MyProc->xmin = newxmin;
> + LWLockRelease(ProcArrayLock);
>
> ProcArrayLock should be taken with LW_EXCLUSIVE since the shared
> variable is changed. No?
No, shared is sufficient for setting xmin, as we do in
GetSnapshotData().
> What about exposing the feedback xid and epoch in pg_stat_replication?
> It's useful when we investigate which standby unexpectedly prevents
> VACUUM on the primary.
This raises the questions "What is the xmin of all the normal backends?"
and "What is the xmin of prepared transactions?" as well. I wasn't sure
that we should expose that information for walsenders when we don't do
it for everybody else. If we do, it would require major sections in the
docs explaining it all, etc.
> It seems too aggressive to calculate the oldest xmin and return it for
> each WAL write and flush on the standby. I think this because calculating
> the oldest xmin is not a light operation, especially when there are many
> concurrent backends. How about feeding back the xmin only when the
> interval has passed?
You may be correct. Some rearrangement following performance tuning is
likely, though I've tried not to pre-guess the tuning.
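
To make the suggestion concrete, here is a rough sketch of interval-based
throttling: the oldest xmin is computed and sent only if enough time has
passed since the last feedback. Nothing here is taken from the patch. The
FeedbackThrottle struct and feedback_due() are hypothetical names, and I
am assuming the interval would come from something like
wal_receiver_status_interval, with zero meaning "never send".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-walreceiver state; not a structure from the patch. */
typedef struct FeedbackThrottle
{
    int64_t last_sent_ms;   /* time of last feedback in ms, -1 if never sent */
    int64_t interval_ms;    /* minimum gap between feedbacks, 0 disables them */
} FeedbackThrottle;

/*
 * Returns true if feedback should be sent now, and records the send time.
 * The caller would compute the oldest xmin only when this returns true,
 * so the expensive calculation is skipped on most WAL writes and flushes.
 */
static bool
feedback_due(FeedbackThrottle *t, int64_t now_ms)
{
    assert(t->interval_ms >= 0);

    if (t->interval_ms == 0)
        return false;           /* feedback disabled */

    if (t->last_sent_ms >= 0 && now_ms - t->last_sent_ms < t->interval_ms)
        return false;           /* interval has not passed yet */

    t->last_sent_ms = now_ms;
    return true;
}
```

Whether this is worth doing depends on measurements we do not have yet,
so treat it as one possible arrangement rather than a proposal.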
--
 Simon Riggs http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services