On Thu, 2025-02-27 at 22:42 -0500, Greg Sabino Mullane wrote:
> I know I'm coming late to this, but I would like us to rethink having
> statistics dumped by default. I was caught by this today, as I was
> doing two dumps in a row, but the output changed between runs solely
> because the stats got updated. It got me thinking about all the use
> cases of pg_dump I've seen over the years. I think this has the
> potential to cause a lot of problems for things like automated
> scripts.

Can you expand on some of those cases?

There are some good reasons to make dumping stats the default:

* The argument here[1] seemed compelling: pg_dump has always dumped
everything by default, so not doing so for stats could be surprising.
* When dumping into the custom format, we'd almost certainly want to
include the stats so you can decide later whether to restore them or
not.
* For most of the cases I'm aware of, if you encounter a diff related
to stats, it would be obvious what the problem is and the fix would be
easy. I can imagine cases where it might not be easy, but I can't
recall any, so if you can, it would be helpful to list them.

So we will need to weigh the costs and benefits.

Unless there's a consensus to change it, I'm inclined to keep it the
default at least into beta, so that we can get feedback from users and
make a more informed decision.

(Aside: I assume everyone here agrees that pg_upgrade should transfer
the stats by default.)

Regards,
Jeff Davis

[1] https://www.postgresql.org/message-id/3228677.1713844341%40sss.pgh.pa.us