Re: backup manifests - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: backup manifests
Date
Msg-id 20200327195512.GG13712@tamriel.snowman.net
In response to Re: backup manifests  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: backup manifests  (David Steele <david@pgmasters.net>)
List pgsql-hackers
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Mar 26, 2020 at 4:44 PM Stephen Frost <sfrost@snowman.net> wrote:
> > Is it actually possible, today, in PG, to have a 4GB WAL record?
> > Judging this based on the WAL record size doesn't seem quite right.
>
> I'm not sure. I mean, most records are quite small, but I think if you
> set REPLICA IDENTITY FULL on a table with a bunch of very wide columns
> (and also wal_level=logical) it can get really big. I haven't tested
> to figure out just how big it can get. (If I have a table with lots of
> almost-1GB-blobs in it, does it work without logical replication and
> fail with logical replication? I don't know, but I doubt a WAL record
> >4GB is possible, because it seems unlikely that the code has a way to
> cope with that struct field overflowing.)

Interesting..  Well, topic for another thread, but I'd say if we believe
that's possible then we might want to consider whether crc32c is still a
good choice there.

> > Again, I'm not against having a checksum algorithm as an option.  I'm not
> > saying that it must be SHA512 as the default.
>
> I think that what we have seen so far is that all of the SHA-n
> algorithms that PostgreSQL supports are about equally slow, so it
> doesn't really matter which one you pick there from a performance
> point of view. If you're not saying it has to be SHA-512 but you do
> want it to be SHA-256, I don't think that really fixes anything. Using
> CRC-32C does fix the performance issue, but I don't think you like
> that, either. We could default to having no checksums at all, or even
> no manifest at all, but I didn't get the impression that David, at
> least, wanted to go that way, and I don't like it either. It's not the
> world's best feature, but I think it's good enough to justify enabling
> it by default. So I'm not sure we have any options here that will
> satisfy you.

I do like having a manifest by default.  At this point it's pretty clear
that we've just got a fundamental disagreement that more words aren't
going to fix.  I'd rather we play it safe and use a sha256 hash and
accept that it's going to be slower by default, and then give users an
option to make it go faster if they want (though I'd much rather that
alternative be a 64-bit CRC than a 32-bit one).
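
To make the trade-off concrete, here is a rough Python sketch that computes
both kinds of checksum over the same bytes.  It is only an illustration and
not the code from the patch: the table-driven crc32c() and the
file_checksums() helper are hypothetical names written for this example, and
the only thing assumed about CRC-32C is the standard Castagnoli polynomial.

```python
import hashlib

# Reflected form of the Castagnoli polynomial used by CRC-32C.
CRC32C_POLY = 0x82F63B78


def _make_table():
    """Precompute the 256-entry lookup table for byte-at-a-time CRC-32C."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Return the 32-bit CRC-32C of data (illustrative, not optimized)."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def file_checksums(data: bytes) -> dict:
    """Hypothetical helper: both checksums for one file's contents."""
    return {
        "CRC32C": format(crc32c(data), "08x"),      # 32 bits, 8 hex digits
        "SHA256": hashlib.sha256(data).hexdigest(),  # 256 bits, 64 hex digits
    }


if __name__ == "__main__":
    for name, value in file_checksums(b"123456789").items():
        print(name, value)
```

The CRC is cheap and catches accidental corruption, but it is only 32 bits
wide; the SHA-256 digest is 256 bits and also resists deliberate tampering,
at a higher CPU cost.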

Andres seems to agree with you.  I'm not sure where David sits on this
specific question.

Thanks,

Stephen
