Hi,

On 2025-03-06 14:51:26 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2025-03-06 13:47:34 -0500, Tom Lane wrote:
> >> ... I wonder if we could just rip out pg_upgrade's support
> >> for DB-level parallelism, which is not terribly pretty anyway, and
> >> simply pass the -j switch straight to pg_dump and pg_restore.
>
> > I don't think that'd work well, right now pg_dump only handles a single
> > database (pg_dumpall doesn't yet support -Fc) *and* pg_dump is still serial
> > for the bulk of the work that pg_upgrade cares about.
> > I think the only parallelism that'd actually happen for pg_upgrade would be
> > dumping of large objects?
>
> Uh ... the entire point here is that we'd be trying to parallelize its
> dumping of stats, no? Most DBs will have enough of those to be
> interesting, I should think.

Well, we added concurrent pg_dump runs to pg_upgrade for a reason,
presumably. Before stats got dumped, there wasn't any benefit from pg_dump-level
parallelism unless large objects were used. Presumably we validated that there
*is* a gain from running pg_dump on multiple databases concurrently.

There are many systems with hundreds of databases; removing all parallelism
for those from pg_upgrade would likely hurt way more than what we can gain
here.

Greetings,

Andres Freund