On Sun, Jul 5, 2015 at 3:16 PM, Andres Freund <andres@anarazel.de> wrote:
>>On the other hand, in the common case, by the time we perform a
>>restartpoint, we're consistent: I think the main exception to that is
>>if we do a base backup that spans multiple checkpoints. I think that
>>in the new location, the chances that the legacy truncation is trying
>>to read inconsistent data is probably higher.
>
> The primary problem isn't that we truncate too early, it's that we delay truncation on the standby in comparison to
> the primary by a considerable amount. All the while continuing to replay multi creations.
>
> I don't see the difference wrt. consistency right now, but I don't have access to the code right now. I mean we
> *have* to do something while inconsistent. A start/stop backup can easily span a day or four.

So, where are we with this patch?

In my opinion, we ought to do something about master and 9.5 before
beta, so that we're not doing *yet another* major release with unfixed
multixact bugs. Let's make the relevant truncation changes in master
and 9.5 and bump the WAL page magic, so that a 9.5alpha standby can't
be used with a 9.5beta master. Then, we don't need any of this legacy
truncation stuff at all, and 9.5 is hopefully in a much better state
than 9.4 and 9.3.

Now, that still potentially leaves 9.4 and 9.3 users hanging out to
dry. But we don't have a tremendous number of those people clamoring
about this, and if we get 9.5+ correct, then we can go and change the
logic in 9.4 and 9.3 later when, and if, we are confident that's the
right thing to do. I am still not altogether convinced that it's a
good idea, nor am I altogether convinced that this code is right.
Perhaps it is, and if we reach consensus on it, fine. But regardless of
that, we should not send a third major release to beta with the
current broken system unless there is really no viable alternative.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company