On Thu, Mar 06, 2025 at 03:20:16PM -0500, Andres Freund wrote:
> There are many systems with hundreds of databases, removing all parallelism
> for those from pg_upgrade would likely hurt way more than what we can gain
> here.
I just did a quick test on a freshly analyzed database with 1,000 sequences
and 10,000 tables with 1,000 rows and 2 unique constraints apiece.
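For anyone who wants to reproduce the test, something along these lines
generates an equivalent schema (the object names and the \gexec approach
here are just illustrative; any way of creating the same counts will do):

~/pgdata$ psql postgres <<'EOF'
-- 1,000 sequences
SELECT format('CREATE SEQUENCE seq_%s;', g)
  FROM generate_series(1, 1000) g \gexec
-- 10,000 tables, each with 1,000 rows and 2 unique constraints
SELECT format('CREATE TABLE t_%1$s (a int UNIQUE, b int UNIQUE);
               INSERT INTO t_%1$s SELECT i, i FROM generate_series(1, 1000) i;', g)
  FROM generate_series(1, 10000) g \gexec
ANALYZE;
EOF
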
~/pgdata$ time pg_dump postgres --no-data --binary-upgrade > /dev/null
0.29s user 0.09s system 21% cpu 1.777 total
~/pgdata$ time pg_dump postgres --no-data --no-statistics --binary-upgrade > /dev/null
0.14s user 0.02s system 25% cpu 0.603 total
So about 1.174 seconds of that goes to statistics (1.777 - 0.603). Even if
we do all sorts of work to make dumping statistics really fast, dumping 8
such databases in succession would still take upwards of 4.8 seconds
(8 x ~0.6 seconds of non-statistics work apiece). Even with the current
code, dumping 8 in parallel would probably take closer to 2 seconds, and I
bet reducing the number of statistics queries could drive it below 1
second. Granted, I'm waving my hands vigorously with those last two
estimates.
That being said, I do think in-database parallelism would be useful in some
cases. I frequently hear about problems with huge numbers of large objects
on a cluster with one big database. But that's probably less likely than
the many-database case.
--
nathan