Re: Online enabling of checksums - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Online enabling of checksums
Date
Msg-id CABUevEx6NXFy4o8vaWMrb35a7Bj3M_=kMX2Rx3N6ZHR9RDhF=w@mail.gmail.com
In response to Re: Online enabling of checksums  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sat, Feb 24, 2018 at 1:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Feb 22, 2018 at 3:28 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I would prefer that yes. But having to re-read 9TB is still significantly
> better than not being able to turn on checksums at all (state today). And
> adding a catalog column for it will carry the cost of the migration
> *forever*, both for clusters that never have checksums and those that had it
> from the beginning.
>
> Accepting that the process will start over (but only read, not re-write, the
> blocks that have already been processed) in case of a crash does
> significantly simplify the process, and reduce the long-term cost of it in
> the form of entries in the catalogs. Since this is a one-time operation (or
> for many people, a zero-time operation), paying that cost that one time is
> probably better than paying a much smaller cost but constantly.

That's not totally illogical, but to be honest I'm kinda surprised
that you're approaching it that way.  I would have thought that
relchecksums and datchecksums columns would have been a sort of
automatic design choice for this feature.  The thing to keep in mind
is that nobody's going to notice the overhead of adding those columns
in practice, but someone will surely notice the pain that comes from
having to restart the whole operation.  You're talking about trading
an effectively invisible overhead for a very noticeable operational
problem.

Is it really that invisible? Given how much we argue over adding single counters to the stats system, I'm not sure it's quite that low.

We did consider doing it on a per-table basis as well. But that is also an overhead that has to be paid forever, whereas the risk of having to read the database files more than once (because it'd only have to read them on the second pass, not write anything) is a one-off cost. And all those who initialized with checksums in the first place don't have to pay any overhead at all in the current design.

I very strongly doubt it's a "very noticeable operational problem". People don't restart their databases very often... Let's say it takes 2-3 weeks to complete a run on a fairly large database. How many such large databases actually restart that frequently? I'm not sure I know of any. And the only effect is that you have to start the process over (but read-only for the part you have already done). It's certainly not ideal, but I don't agree it's in any way a "very noticeable problem".
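The restart behavior described above can be sketched in a few lines. This is a simplified model, not the actual patch: the names, the CRC stand-in for PostgreSQL's page checksum, and the in-memory "pages" are all hypothetical. The point it illustrates is the cost asymmetry: after a crash, a rerun re-reads every page but only rewrites the ones that were never checksummed.

```python
import zlib

def page_checksum(data: bytes) -> int:
    # Hypothetical stand-in for the real page checksum algorithm.
    # Never returns 0, so 0 can mean "no checksum set yet".
    return (zlib.crc32(data) & 0xFFFF) or 1

def enable_checksums(pages, crash_after=None):
    """Walk every page; write a checksum only where one is missing.

    Returns (reads, writes) so the cost of a restart is visible:
    a second pass re-reads everything but rewrites nothing that
    was already done.
    """
    reads = writes = 0
    for i, page in enumerate(pages):
        if crash_after is not None and i >= crash_after:
            break  # simulate a crash partway through the run
        reads += 1
        if page["checksum"] == 0:                     # not yet checksummed
            page["checksum"] = page_checksum(page["data"])
            writes += 1                               # dirtied page must be written
    return reads, writes

pages = [{"data": bytes([i]) * 8, "checksum": 0} for i in range(10)]
print(enable_checksums(pages, crash_after=5))  # first run crashes: (5, 5)
print(enable_checksums(pages))                 # restart: (10, 5) - full re-read, half the writes
```

The second run reads all 10 pages again but writes only the 5 that the crashed run never reached, which is the "read-only for the part you have already done" behavior.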

The other point is that this keeps the code a lot simpler. That is good both for having any chance of finishing it and getting it into 11 (it can then be improved upon, to add for example incremental support in 12, or something like that). And of course, simpler code means less overhead in the form of maintenance and effects on other parts of the system down the road.

--
