Re: optimizing pg_upgrade's once-in-each-database steps - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: optimizing pg_upgrade's once-in-each-database steps
Date
Msg-id ZoMHsBasBy8LqkEU@nathan
Whole thread Raw
In response to Re: optimizing pg_upgrade's once-in-each-database steps  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: optimizing pg_upgrade's once-in-each-database steps
List pgsql-hackers
I figured I'd post what I have so far since this thread hasn't been updated
in a while.  The attached patches are still "proof-of-concept grade," but
they are at least moving in the right direction (IMHO).  The variable
naming is still not great, and they are woefully undercommented, among
other things.

0001 introduces a new API for registering callbacks and running them in
parallel on all databases in the cluster.  This new system manages a set of
"slots" that follow a simple state machine to asynchronously establish a
connection and run the queries.  It uses system() to wait for these
asynchronous tasks to complete.  Users of this API only need to provide two
callbacks: one to return the query that should be run on each database and
another to process the results of that query.  If multiple queries are
required for each database, users can provide multiple sets of callbacks.

The other patches change several of the existing tasks to use this new API.
With these patches applied, I see the following differences in the output
of 'pg_upgrade | ts -i' for a cluster with 1k empty databases:

    WITHOUT PATCH

    00:00:19 Checking database user is the install user                    ok
    00:00:02 Checking for subscription state                               ok
    00:00:06 Adding ".old" suffix to old global/pg_control                 ok
    00:00:04 Checking for extension updates                                ok

    WITH PATCHES (--jobs 1)

    00:00:10 Checking database user is the install user                    ok
    00:00:02 Checking for subscription state                               ok
    00:00:07 Adding ".old" suffix to old global/pg_control                 ok
    00:00:05 Checking for extension updates                                ok

    WITH PATCHES (--jobs 4)

    00:00:06 Checking database user is the install user                    ok
    00:00:00 Checking for subscription state                               ok
    00:00:02 Adding ".old" suffix to old global/pg_control                 ok
    00:00:01 Checking for extension updates                                ok

Note that the "Checking database user is the install user" time also
includes the call to get_db_rel_and_slot_infos() on the old cluster as well
as the call to get_loadable_libraries() on the old cluster.  I believe the
improvement with the patches with just one job is due to the consolidation
of the queries into one database connection (presently,
get_db_rel_and_slot_infos() creates 3 connections per database for some
upgrades).  Similarly, the "Adding \".old\" suffix to old
global/pg_control" time includes the call to get_db_rel_and_slot_infos() on
the new cluster.

There are several remaining places where we could use this new API to speed
up upgrades.  For example, I haven't attempted to use it for the data type
checks yet, and that tends to eat up a sizable chunk of time when there are
many databases.

On Thu, May 16, 2024 at 08:24:08PM -0500, Nathan Bossart wrote:
> On Thu, May 16, 2024 at 05:09:55PM -0700, Jeff Davis wrote:
>> Also, did you consider connecting once to each database and running
>> many queries? Most of those seem like just checks.
> 
> This was the idea behind 347758b.  It may be possible to do more along
> these lines.  IMO parallelizing will still be useful even if we do combine
> more of the steps.

My current thinking is that any possible further consolidation should
happen as part of a follow-up effort to parallelization.  I'm cautiously
optimistic that the parallelization work will make the consolidation easier
since it moves things to rigidly-defined callback functions.

A separate piece of off-list feedback from Michael Paquier is that this new
parallel system might be something we can teach the ParallelSlot code used
by bin/scripts/ to do.  I've yet to look too deeply into this, but I
suspect that it will be difficult to combine the two.  For example, the
ParallelSlot system doesn't seem well-suited for the kind of
run-once-in-each-database tasks required by pg_upgrade, and the error
handling is probably little different, too.  However, it's still worth a
closer look, and I'm interested in folks' opinions on the subject.

-- 
nathan

Attachment

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: Built-in CTYPE provider
Next
From: "Joel Jacobson"
Date:
Subject: Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.