Hi Christophe,
On 2/25/19 7:24 PM, Christophe Pettus wrote:
>
>
>> On Feb 25, 2019, at 08:55, Stephen Frost <sfrost@snowman.net> wrote:
>>
>> I honestly do doubt that they have had the same experiences that I have
>> had
>
> Well, I guarantee you that no two people on this list have had identical experiences. :) I certainly have been
> bitten by the problems with the current system. But the resistance to major version upgrades is *huge*, and I'm
> strongly biased against anything that will make that harder. I'm not sure I'm communicating how big a problem telling
> many large installations, "If you move to v12/13/etc., you will have to change your backup system" is going to be.
I honestly think you are underestimating how bad this can be.
The prevailing wisdom is that it's unfortunate that these backup_labels
get left around but they can be removed with scripting, so no big deal.
After that the cluster will start.
But -- if you are too aggressive about removing the backup_label and
accidentally do it before a real restore from backup, then you have a
corrupt cluster. Totally silent, but definitely corrupt. You'll
probably only see it when you start getting consistency errors from the
indexes, if ever. Page checksums won't catch it either unless you are
*lucky* enough to have a torn page.
Erroneous scripting of this kind can also affect backups that were made
with the non-exclusive method, since a restored backup looks the same
either way.
fsync() is the major corruption issue we are facing right now but that
doesn't mean there aren't other sources of corruption we should be
thinking about. I've thought about this one a lot and it scares me.
I've worked on ways to make it better, but all of them break something
and involve compromises that are nearly as severe as removing exclusive
backups entirely.
Regards,
--
-David
david@pgmasters.net