Re: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers. - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
Date
Msg-id CA+TgmoboGjfNhqmUBUdWWAFq+rtraCOJ9_UPYzXs0iEvASa9xA@mail.gmail.com
In response to Re: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
List pgsql-hackers
On Thu, May 31, 2012 at 11:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm ... first question is do we actually care whether the clocks are
> synced to the millisecond level, ie what is it you'd do differently
> if you know that the master and slave clocks are synced more closely
> than you can measure at the protocol level.
>
> But if there is a reason to care, perhaps we could have a setting that
> says "we're using NTP, so trust the clocks to be synced"?  What I object
> to is assuming that without any evidence, or being unable to operate
> correctly in an environment where it's not true.

In general, we are happy to leave to the operating system - or some
other operating service - those tasks which are best handled in that
way.  I don't understand why this should be an exception.  If we're
not going to implement a filesystem inside PostgreSQL - which is
actually relatively closely related to our core mission as a data
store - then why the heck do we want to implement time
synchronization?  If this were an easy problem I wouldn't care, but
it's not.  The solution Simon has implemented here, besides being
vulnerable to network jitter that can't be eliminated without
reimplementing some sort of complex NTP-like protocol inside the
backend, won't work with log shipping, which is why (or part of why?)
Simon proposed keepalive files to allow this information to be passed
through the archive.  To me, this is massive over-engineering.  I'll
support the keepalive feature if it's the only way to get you to agree
to adding the capabilities we need to be competitive with other
replication solutions - but that's about the only redeeming value it
has IMV.

Now, mind you, I am not saying that we should randomly and arbitrarily
make ourselves vulnerable to clock skew when there is a reasonable
alternative design.  For example, you were able to come up with a way
to make max_standby_delay work sensibly without having to compare
master and slave timestamps, and that's good.  But in cases where no
such design exists - and a time-based notion of replication delay
seems to be one of those cases - I don't see any real merit in
reinventing the wheel, especially since it seems likely that the wheel
is going to be dodecagonal.  Aside from network jitter and the need
for archive keepalives, suppose the two machines really do have clocks
that are an hour off from each other.  And the master system is really
busy so the slave runs about a minute behind.  We detect the time skew
and correct for it, so the replication delay shows up correctly.  Life
is good.  But then the system administrator notices that there's a
problem and fires up ntpd to fix it.  Our keepalive system will now
notice and decide that the "replication transfer latency" is now 0 s
instead of +/- 3600 s.  However, we're replaying records from a minute
ago, before the time change, so now for the next minute our
replication delay is either 61 minutes or -59 minutes, depending on
the direction of the skew, and then it goes back to normal.  Not the
end of the world, but weird.  It's the sort of thing that we probably
won't even try to document, because it'll affect very few people, but
anyone who is affected will have to understand the system pretty
deeply to understand what's gone wrong.  IME, users hate that.
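
The arithmetic in the scenario above can be sketched roughly as follows.
This is only an illustrative model, not PostgreSQL code: the function
names, the idea of a single "estimated offset" taken from the latest
keepalive, and the choice of t = 1000 s as the moment ntpd corrects the
master clock are all assumptions made for the example.  The standby
clock is treated as the true time, and the master clock is off by
skew_seconds until the fix.

```python
def corrected_delay(wal_stamp_master, replay_time_standby, estimated_offset):
    """Delay estimate after removing the currently estimated clock offset.

    estimated_offset is the (hypothetical) keepalive-derived value of
    master clock minus standby clock at the moment of replay.
    """
    return replay_time_standby - (wal_stamp_master - estimated_offset)


def simulate(skew_seconds, lag_seconds=60, fix_time=1000):
    """Return the reported delay for WAL generated before, around, and after the fix."""

    def master_offset(true_time):
        # Master clock is wrong by skew_seconds until ntpd corrects it at fix_time.
        return skew_seconds if true_time < fix_time else 0

    results = {}
    for label, generated_at in (
        ("before_fix", fix_time - 500),    # generated and replayed before the fix
        ("during_window", fix_time - 30),  # generated before, replayed after the fix
        ("after_window", fix_time + 500),  # generated and replayed after the fix
    ):
        replayed_at = generated_at + lag_seconds
        # Timestamp written into the WAL uses the master clock at generation time.
        wal_stamp = generated_at + master_offset(generated_at)
        # Offset estimate reflects the master clock at replay time, not generation time.
        results[label] = corrected_delay(
            wal_stamp, replayed_at, master_offset(replayed_at)
        )
    return results


if __name__ == "__main__":
    for skew in (3600, -3600):
        minutes = {k: v / 60 for k, v in simulate(skew).items()}
        print(skew, minutes)
```

Under these assumptions the reported delay is one minute before and
after the fix, but for records generated before the fix and replayed
after it the value is -59 minutes or 61 minutes, depending on the sign
of the skew.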

On the other hand, if we simply say "PostgreSQL computes the
replication delay by subtracting the time at which the WAL was
generated, as recorded on the master, from the time at which it is
replayed by the slave" then, hey, we still have a wart, but it's
pretty clear what the wart is and how to fix it, and we can easily
document that.  Again, if we could get rid of the failure modes and
make this really water-tight, I think I'd be in favor of that, but it
seems to me that we are in the process of expending a lot of energy
and an even larger amount of calendar time to create a system that
will misbehave in numerous subtle ways instead of one straightforward
one.  I don't see that as a good trade.
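
For comparison, the plain-subtraction definition quoted above could be
modelled like this.  Again this is a hedged sketch rather than anything
from the PostgreSQL source: the function names and the sample
timestamps are made up for illustration, and the standby clock is
assumed to be the true time.

```python
def naive_replication_delay(wal_generated_master_clock, replayed_standby_clock):
    """Replay time on the standby minus generation time recorded on the master."""
    return replayed_standby_clock - wal_generated_master_clock


def observed_delays(true_lag, master_skew, generation_times):
    """Reported delay for each record, given a constant master clock skew."""
    delays = []
    for t in generation_times:
        wal_stamp = t + master_skew  # master clock reading at generation
        replayed_at = t + true_lag   # standby clock reading at replay
        delays.append(naive_replication_delay(wal_stamp, replayed_at))
    return delays


if __name__ == "__main__":
    print(observed_delays(60, 0, [0, 100, 200]))     # clocks in sync
    print(observed_delays(60, 3600, [0, 100, 200]))  # master one hour ahead
```

In this model the error is constant and equal to the skew, so it is
easy to describe: if the clocks are off by an hour, the reported delay
is off by an hour until the clocks are fixed.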

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

