Simon Riggs wrote:
> On Sat, 2010-05-15 at 19:30 +0300, Heikki Linnakangas wrote:
>> Doesn't feel right to me either. If you want to expose the
>> keepalive-time to queries, it should be a separate field, something like
>> lastMasterKeepaliveTime and a pg_last_master_keepalive() function to
>> read it.
>
> That wouldn't be good because then you couldn't easily monitor the
> delay? You'd have to run two different functions depending on the state
> of replication (for which we would need yet another function). Users
> would just wrap that back up into a single function.
What exactly is the user trying to monitor? If it's "how far behind is
the standby", the difference between pg_current_xlog_insert_location()
in the master and pg_last_xlog_replay_location() in the standby seems
more robust and well-defined to me. It's a measure of XLOG location (ie.
bytes) instead of time, but time is a complicated concept.
Also note that as the patch stands, if you receive a keep-alive from the
master at point X, it doesn't mean that the standby is fully up-to-date.
It's possible that the walsender just finished sending a huge batch of
accumulated WAL, say 1 GB, and it took 1 hour for the batch to be sent.
During that time, a lot more WAL has accumulated, yet walsender sends a
keep-alive with the current timestamp.
In general, the purpose of a keep-alive is to keep the connection alive,
but you're trying to accomplish something else too, and I don't fully
understand what it is.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com