On Thu, Feb 22, 2018 at 3:28 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I would prefer that yes. But having to re-read 9TB is still significantly
> better than not being able to turn on checksums at all (state today). And
> adding a catalog column for it will carry the cost of the migration
> *forever*, both for clusters that never have checksums and those that had it
> from the beginning.
>
> Accepting that the process will start over (but only read, not re-write, the
> blocks that have already been processed) in case of a crash does
> significantly simplify the process, and reduce the long-term cost of it in
> the form of entries in the catalogs. Since this is a one-time operation (or
> for many people, a zero-time operation), paying that cost that one time is
> probably better than paying a much smaller cost but constantly.

That's not totally illogical, but to be honest I'm kinda surprised
that you're approaching it that way. I would have thought that
relchecksums and datchecksums columns would have been a sort of
automatic design choice for this feature. The thing to keep in mind
is that nobody's going to notice the overhead of adding those columns
in practice, but someone will surely notice the pain that comes from
having to restart the whole operation. You're talking about trading
an effectively invisible overhead for a very noticeable operational
problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company