Re: optimizing pg_upgrade's once-in-each-database steps - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: optimizing pg_upgrade's once-in-each-database steps
Date
Msg-id ZprY4e23vR95Nc51@nathan
Whole thread Raw
In response to Re: optimizing pg_upgrade's once-in-each-database steps  (Daniel Gustafsson <daniel@yesql.se>)
Responses Re: optimizing pg_upgrade's once-in-each-database steps
List pgsql-hackers
On Thu, Jul 18, 2024 at 09:57:23AM +0200, Daniel Gustafsson wrote:
>> On 17 Jul 2024, at 23:32, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> On Wed, Jul 17, 2024 at 11:16:59PM +0200, Daniel Gustafsson wrote:
> 
>>> +static void
>>> +dispatch_query(const ClusterInfo *cluster, AsyncSlot *slot,
>>>    ....
>>> +   pg_free(query);
>>> +}
>>> 
>>> A minor point, perhaps fueled by me not having played around much with this
>>> patchset.  It seems a bit odd that dispatch_query is responsible for freeing
>>> the query from the get_query callback.  I would have expected the output from
>>> AsyncTaskGetQueryCB to be stored in AsyncTask and released by async_task_free.
>> 
>> I don't see any problem with doing it the way you suggest.

Actually, I do see a problem.  If we do it this way, we'll have to store a
string per database somewhere, which seems unnecessary.

However, while looking into this, I noticed that only one get_query
callback (get_db_subscription_count()) actually customizes the generated
query using information in the provided DbInfo.  AFAICT we can do this
particular step without running a query in each database, as I mentioned
elsewhere [0].  That should speed things up a bit and allow us to simplify
the AsyncTask code.

With that, if we are willing to assume that a given get_query callback will
generate the same string for all databases (and I think we should), we can
run the callback once and save the string in the step for dispatch_query()
to use.  This would look more like what you suggested in the quoted text.

[0] https://postgr.es/m/ZprQJv_TxccN3tkr%40nathan

-- 
nathan



pgsql-hackers by date:

Previous
From: Paul George
Date:
Subject: Re: behavior of GROUP BY with VOLATILE expressions
Next
From: Jeff Davis
Date:
Subject: Re: Statistics Import and Export