Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 02/01/2019 20:47, Jesper Pedersen wrote:
>
>> Well, that really depends. The user passed -j to pg_upgrade in order for
>> the upgrade to happen faster, so maybe they would expect, as I would,
>> that the ANALYZE phase would happen in parallel too.
>
> pg_upgrade -j reduces the *downtime* caused by pg_upgrade during the
> upgrade process. Also, during said downtime, nothing else is happening,
> so you can use all the resources of the machine.
>
> Once the system is back up, you don't necessarily want to use all the
> resources. The analyze script is specifically written to run while
> production traffic is active. If you just want to run the analyze as
> fast as possible, you can just run vacuumdb -j directly, without using
> the script.

Peter, I'm skeptical here.

I would permit connections to a freshly pg_upgraded DB, before any
analyze is known to have finished, only in the most trivial cases. At my
site, however, *trivial* systems are a small minority.

In fact, our automated upgrade workflow uses our home-built parallel
analyzer, which predates vacuumdb -j. Apps are not allowed into the DB
until a fast first pass has been done.

We run it in 2 phases...

    $all preceding upgrade steps w/system locked out
    analyze-lite (reduced stats target)
    open DB for application traffic
    analyze-full
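
For sites without a home-built analyzer, the two analyze phases above
could be approximated with stock vacuumdb. This is only a rough sketch,
not the tooling described above: JOBS, LITE_TARGET, DRY_RUN and the
function names are made up for illustration, and connection settings are
assumed to come from the usual PG* environment variables.

```shell
#!/bin/sh
# Hypothetical knobs, not part of pg_upgrade or vacuumdb.
JOBS="${JOBS:-4}"                 # parallel connections for vacuumdb -j
LITE_TARGET="${LITE_TARGET:-10}"  # reduced default_statistics_target
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the command instead of running it

# Print the command line in dry-run mode, otherwise execute it via env
# so that a leading VAR=value argument becomes an environment variable.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        env "$@"
    fi
}

# Phase 1: fast pass over every database at a reduced stats target,
# run while applications are still locked out.
analyze_lite() {
    run PGOPTIONS="-c default_statistics_target=$LITE_TARGET" \
        vacuumdb --all --analyze-only --jobs="$JOBS"
}

# Phase 2: full pass at the configured stats target,
# run after the DB has been opened for application traffic.
analyze_full() {
    run vacuumdb --all --analyze-only --jobs="$JOBS"
}
```

Calling analyze_lite, opening the DB, then calling analyze_full mirrors
the order above. Note that per-column statistics targets set with ALTER
TABLE are not affected by the session-level setting in phase 1.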

Of course we increase downtime by disallowing app traffic until
analyze-lite finishes; however, the assumption is that many queries would
be too slow to attempt without full analyze coverage, albeit at a
reduced stats target.

IMO this is certainly a case where no one size fits all, so perhaps a
separate --analyze-jobs option :-)

FWIW

Thanks

> Moreover, it's not clear that pg_upgrade and vacuumdb are bound the same
> way, so it's not a given that the same -j number should be used.
>
> Perhaps more documentation would be useful here.
--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net