Re: analyze-in-stages post upgrade questions - Mailing list pgsql-general

From Adrian Klaver
Subject Re: analyze-in-stages post upgrade questions
Date
Msg-id 6add2a9a-7cf2-4d1b-8f3e-2e26a7ebe883@aklaver.com
Whole thread Raw
In response to analyze-in-stages post upgrade questions  ("Zechman, Derek S" <Derek.S.Zechman@snapon.com>)
List pgsql-general
On 6/27/25 06:35, Zechman, Derek S wrote:
> We recently performed an upgrade from pg14 (14.18) to pg16 (16.9) and 
> performed the analyze-in-stages post upgrade.  It has been noticed that 
> some plans changed to use hash joins instead of nested loops.  Further 
> investigation found it was because the parent table of partitioned 
> tables did not have stats.  After running an ANALYZE on the parent 
> tables we got similar plan an execution times as before.
> 
> I have two questions
> 
> 1 - Why does analyze-in-stages not analyze the parent tables?
> 
> 2 – What happens if we do not run analyze-in-stages post upgrade and 
> just run an analyze?

It is spelled out in the docs:

https://www.postgresql.org/docs/current/pgupgrade.html

Emphasis added

"Using vacuumdb --all --analyze-only can efficiently generate such 
statistics, and the use of --jobs can speed it up. Option 
--analyze-in-stages can be used to generate **minimal statistics** 
quickly. If vacuum_cost_delay is set to a non-zero value, this can be 
overridden to speed up statistics generation using PGOPTIONS, e.g., 
PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb ...."

and from here:

https://www.postgresql.org/docs/current/app-vacuumdb.html

"--analyze-in-stages

     Only calculate statistics for use by the optimizer (no vacuum), 
like --analyze-only. Run three stages of analyze; the first stage uses 
the lowest possible statistics target (see default_statistics_target) to 
produce usable statistics faster, and subsequent stages build the full 
statistics.

     This option is only useful to analyze a database that currently has 
no statistics or has wholly incorrect ones, such as if it is newly 
populated from a restored dump or by pg_upgrade. Be aware that running 
with this option in a database with existing statistics may cause the 
query optimizer choices to become transiently worse due to the low 
statistics targets of the early stages.
"

> 
> Thanks,
> 
> Sean
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




pgsql-general by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: Re: Figure out nullability of query parameters
Next
From: raphi
Date:
Subject: Re: password rules