Re: Statistics Import and Export - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Statistics Import and Export
Msg-id CADkLM=eSK39M41v=OGH=9MAE0WiQZgYB54Oj3VwMCPSoPWyctQ@mail.gmail.com
In response to Re: Statistics Import and Export  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers


> > Pardon my inexperience, but aren't the ArchiveEntry records needed right up
> > until the program's run?
>
> s/the/the end of the/?

yes
 
> > If there's value in freeing them, why isn't it being done already? What
> > other thing would consume this freed memory?
>
> I'm not saying that they can be freed, they can't right now. My point is just
> that we *already* keep all the stats in memory, so the fact that fetching all
> stats in a single query would also require keeping them in memory is not an
> issue.

That's true in cases where we're not filtering schemas or tables. We fetch the pg_class stats as part of getTables, but those are small, and they're not part of the query in question.
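
For concreteness, the relation-level stats in question are just a few counters per table. A rough sketch of reading them straight out of pg_class (purely illustrative, not the actual getTables query):

    -- Relation-level stats: a handful of numbers per relation, so keeping
    -- them in memory for every table is cheap compared to pg_stats rows.
    SELECT c.oid, c.relname, c.relpages, c.reltuples, c.relallvisible
      FROM pg_catalog.pg_class c
     WHERE c.relkind IN ('r', 'p', 'm');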

Fetching all the pg_stats for a database when we only want one table could be a nasty performance regression. We also can't just filter on the OIDs of the tables we want, because those tables can have expression indexes, so the OID filter would get complicated quickly.
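
To illustrate why, here is a hypothetical sketch over pg_statistic (keyed by starelid); the real dump query works differently, and $1 stands in for the array of wanted table OIDs:

    -- Attribute stats for the chosen tables alone are not enough:
    -- expression-index statistics are stored under the index's OID,
    -- so the filter has to chase pg_index as well.
    SELECT s.*
      FROM pg_catalog.pg_statistic s
     WHERE s.starelid = ANY ($1)                      -- the tables we want
        OR s.starelid IN (SELECT i.indexrelid         -- plus their indexes
                            FROM pg_catalog.pg_index i
                           WHERE i.indrelid = ANY ($1));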
 
> But TBH, I do wonder how much the current memory usage of the statistics
> dump/restore support is going to bite us. In some cases this will dramatically
> increase pg_dump/pg_upgrade's memory usage, my tests were with tiny amounts of
> data and very simple scalar datatypes and you already could see a substantial
> increase.  With something like postgis or even just a lot of jsonb columns
> this is going to be way worse.

Yes, it will cost us in pg_dump, but it will spare customers some long ANALYZE operations.
