Re: optimizing pg_upgrade's once-in-each-database steps - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: optimizing pg_upgrade's once-in-each-database steps
Msg-id ZqwA_qngQM25FrjK@nathan
In response to Re: optimizing pg_upgrade's once-in-each-database steps  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: optimizing pg_upgrade's once-in-each-database steps
List pgsql-hackers
On Thu, Aug 01, 2024 at 12:44:35PM -0500, Nathan Bossart wrote:
> On Wed, Jul 31, 2024 at 10:55:33PM +0100, Ilya Gladyshev wrote:
>> I like your idea of parallelizing these checks with async libpq API, thanks
>> for working on it. The patch doesn't apply cleanly on master anymore, but
>> I've rebased locally and taken it for a quick spin with a pg16 instance of
>> 1000 empty databases. Didn't see any regressions with -j 1, there's some
>> speedup with -j 8 (33 sec vs 8 sec for these checks).
> 
> Thanks for taking a look.  I'm hoping to do a round of polishing before
> posting a rebased patch set soon.
> 
>> One thing that I noticed that could be improved is we could start a new
>> connection right away after having run all query callbacks for the current
>> connection in process_slot, instead of just returning and establishing the
>> new connection only on the next iteration of the loop in async_task_run
>> after potentially sleeping on select.
> 
> Yeah, we could just recursively call process_slot() right after freeing the
> slot.  That'd at least allow us to avoid the spinning behavior as we run
> out of databases to process, if nothing else.

Here is a new patch set.  Besides rebasing, I've added the recursive call
to process_slot() mentioned in the quoted text, and I've added quite a bit
of commentary to async.c.
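
For readers without the patch handy, here is a rough, self-contained
sketch of the idea behind the recursive call.  It is not the code in
async.c: there is no libpq in it, and every name other than process_slot()
(Slot, Task, run_task, the slot states, the "recurse" flag) is made up for
illustration.  A slot that finishes its last query is freed and then handed
straight back to process_slot(), so it picks up the next database in the
same pass instead of waiting for the next iteration of the driver loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot states; the real patch tracks libpq connection state. */
typedef enum { SLOT_FREE, SLOT_CONNECTING, SLOT_RUNNING } SlotState;

typedef struct
{
    SlotState   state;
    int         db;             /* database being processed, -1 if none */
    int         steps_left;     /* queries still to run on this database */
} Slot;

typedef struct
{
    int         ndbs;           /* total number of databases */
    int         next_db;        /* next database not yet assigned to a slot */
    int         done;           /* databases fully processed */
    int         queries_per_db; /* must be >= 1 */
    bool        recurse;        /* reuse a freed slot immediately? */
} Task;

/* Advance one slot by a single step of its (simulated) state machine. */
static void
process_slot(Task *task, Slot *slot)
{
    switch (slot->state)
    {
        case SLOT_FREE:
            if (task->next_db >= task->ndbs)
                return;         /* nothing left to start; slot stays free */
            slot->db = task->next_db++;
            slot->steps_left = task->queries_per_db;
            slot->state = SLOT_CONNECTING;
            return;

        case SLOT_CONNECTING:
            slot->state = SLOT_RUNNING; /* pretend the connection came up */
            return;

        case SLOT_RUNNING:
            if (--slot->steps_left > 0)
                return;         /* more queries to run on this database */
            task->done++;
            slot->db = -1;
            slot->state = SLOT_FREE;

            /*
             * Start the next connection right away rather than on the next
             * pass.  The recursion is at most one level deep, because the
             * SLOT_FREE branch always returns.
             */
            if (task->recurse)
                process_slot(task, slot);
            return;
    }
}

#define MAX_SLOTS 64

/* Drive all slots until every database is done; return the pass count. */
static int
run_task(int ndbs, int nslots, int queries_per_db, bool recurse)
{
    Task        task = {ndbs, 0, 0, queries_per_db, recurse};
    Slot        slots[MAX_SLOTS];
    int         passes = 0;

    assert(nslots >= 1 && nslots <= MAX_SLOTS && queries_per_db >= 1);
    for (int i = 0; i < nslots; i++)
        slots[i] = (Slot) {SLOT_FREE, -1, 0};

    while (task.done < task.ndbs)
    {
        /* The real loop would wait on select() here before each pass. */
        for (int i = 0; i < nslots; i++)
            process_slot(&task, &slots[i]);
        passes++;
    }
    return passes;
}
```

In this toy model, run_task(3, 1, 1, true) needs fewer passes than
run_task(3, 1, 1, false), since each freed slot no longer sits idle for one
pass.  How much that matters in practice depends on how long the real loop
sleeps between passes.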

-- 
nathan
