On Sat, Feb 24, 2018 at 10:49 PM, Andres Freund <andres@anarazel.de> wrote:
On 2018-02-24 22:45:09 +0100, Magnus Hagander wrote:
> Is it really that invisible? Given how much we argue over adding single
> counters to the stats system, I'm not sure it's quite that low.
That appears to be entirely unrelated. The stats stuff is expensive because we currently have to essentially write out the stats for *all* tables in a database once a counter is updated. And those counters are obviously constantly updated. Thus the overhead of adding one column is essentially multiplied by the number of tables in the system. Whereas here it's a single column that can be updated on a per-row basis, which is barely ever going to be written to.
Am I missing something?
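As a rough illustration of the scaling argument above, here is a minimal sketch of the two cost shapes. It is not PostgreSQL code: the function names, the 8-byte counter size, and the 1-byte column size are all hypothetical assumptions chosen only to show how the numbers grow.

```python
# Hypothetical cost model, NOT PostgreSQL code; every size here is an assumption.

def stats_file_bytes(num_tables: int, counters_per_table: int,
                     bytes_per_counter: int = 8) -> int:
    """Bytes rewritten when stats for all tables in a database are written out."""
    return num_tables * counters_per_table * bytes_per_counter


def extra_bytes_for_new_counter(num_tables: int,
                                bytes_per_counter: int = 8) -> int:
    """Additional bytes per full write-out after adding one counter per table."""
    return stats_file_bytes(num_tables, 1, bytes_per_counter)


def catalog_row_update_bytes(column_bytes: int = 1) -> int:
    """Extra bytes touched when the new column of a single catalog row changes."""
    return column_bytes


if __name__ == "__main__":
    # The stats cost grows with the number of tables and recurs on every write-out.
    print(extra_bytes_for_new_counter(10_000))
    # The per-row column cost is independent of the number of tables.
    print(catalog_row_update_bytes())
```

Under these assumed sizes, the extra stats cost scales linearly with the table count on every write-out, while the per-row column cost stays constant and is incurred only when that row is actually updated.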
It's probably at least partially unrelated, you are right. I may have misread our reluctance to add more values there as a general reluctance to add more values to central columns.
> We did consider doing it on a per-table basis as well. But this is also an
> overhead that has to be paid forever, whereas the risk of having to read
> the database files more than once (because it'd only have to read them on
> the second pass, not write anything) is a one-off operation. And all
> those that have initialized with checksums in the first place don't have to
> pay any overhead at all in the current design.
Why does it have to be paid forever?
The size of the pg_class row would be there forever. Granted, it's not that big an overhead given that there are already plenty of columns there. But the point being you can never remove that column, and it will be there for users who never even considered running without checksums. It's certainly not a large overhead, but it's also not zero.
> I very strongly doubt it's a "very noticeable operational problem". People
> don't restart their databases very often... Let's say it takes 2-3 weeks to
> complete a run in a fairly large database. How many such large databases
> actually restart that frequently? I'm not sure I know of any. And the only
> effect of it is you have to start the process over (but read-only for the
> part you have already done). It's certainly not ideal, but I don't agree
> it's in any form a "very noticeable problem".
I definitely know large databases that fail over more frequently than that.
I would argue that they have bigger issues than enabling checksums... By far.