On Wed, Mar 24, 2021 at 11:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> On reflection, though, I wonder if we've made pg_dump do the right
> thing anyway. There is a strong case to be made for the idea that
> when dumping from a pre-14 server, it should emit
> SET default_toast_compression = 'pglz';
> rather than omitting any mention of the variable, which is what
> I made it do in aa25d1089. If we changed that, I think all these
> diffs would go away. Am I right in thinking that what's being
> compared here is new pg_dump's dump from old server versus new
> pg_dump's dump from new server?
>
> The "strong case" goes like this: initdb a v14 cluster, change
> default_toast_compression to lz4 in its postgresql.conf, then
> try to pg_upgrade from an old server. If the dump script doesn't
> set default_toast_compression = 'pglz' then the upgrade will
> do the wrong thing because all the tables will be recreated with
> a different behavior than they had before. IIUC, this wouldn't
> result in broken data, but it still seems to me to be undesirable.
> dump/restore ought to do its best to preserve the old DB state,
> unless you explicitly tell it --no-toast-compression or the like.
This feels a bit like letting the tail wag the dog, because one might
reasonably guess that the user's intention in such a case was to
switch to using LZ4, and we've subverted that intention by deciding
that we know better. I wouldn't blame someone for thinking that using
--no-toast-compression with a pre-v14 server ought to have no effect,
but with your proposal here, it would. Furthermore, IIUC, the user has
no way of passing --no-toast-compression through to pg_upgrade, so
they're just going to have to do the upgrade and then fix everything
manually afterward to the state that they intended to have all along.
Now, on the other hand, if they wanted to make practically any other
kind of change while upgrading, they'd have to do something like that
anyway, so I guess this is no worse.
But also ... aren't we just doing this to work around a test case that
isn't especially good in the first place? Counting the number of lines
in the diff between A and B is an extremely crude proxy for "they're
similar enough that we probably haven't broken anything."
--
Robert Haas
EDB: http://www.enterprisedb.com