Re: Statistics Import and Export - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Statistics Import and Export
Msg-id Z8nn1Kbo4j7fo-ik@nathan
In response to Re: Statistics Import and Export  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Mar 06, 2025 at 01:04:55PM -0500, Andres Freund wrote:
> To be clear, I think this is a very important improvement that most people
> should use.

+1

> I just don't think it's quite there yet.

I agree that we should continue working on the performance/memory stuff.

> 1) It's a difference of seconds in the regression database, which has a few
>    hundred tables, few columns, very little data and thus small stats. In a
>    database with a lot of tables and columns with complicated datatypes the
>    difference will be far larger.
> 
>    And in contrast to analyzing the database in parallel, the pg_dump/restore
>    work to restore stats afaict happens single-threaded for each database.

Yeah, I did a lot of work in v18 to rein in pg_dump --binary-upgrade
runtime, and I'm a bit worried that this will undo much of that.  Obviously
it's going to increase runtime by some amount, which is acceptable, but it
needs to be within reason.  I'm optimistic this is within reach for v18 by
reducing the number of queries.

> I care about the memory usage effects because I've seen plenty systems where
> pg_statistics is many gigabytes (after toast compression!), and I am really
> worried that pg_dump having all the serialized strings in memory will cause a
> lot of previously working pg_dump invocations and pg_upgrades to fail. That'd
> also be a really bad experience.

I think it is entirely warranted to consider these cases.  IME cases of "a
million tables" or "a million sequences" are far more common than you might
think.

-- 
nathan