Re: Inefficiency in parallel pg_restore with many tables - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Inefficiency in parallel pg_restore with many tables
Date
Msg-id 20230722231941.GA2020225@nathanxps13
In response to Re: Inefficiency in parallel pg_restore with many tables  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Thu, Jul 20, 2023 at 12:06:44PM -0700, Nathan Bossart wrote:
> Here is a work-in-progress patch set for converting ready_list to a
> priority queue.  On my machine, Tom's 100k-table example [0] takes 11.5
> minutes without these patches and 1.5 minutes with them.
> 
> One item that requires more thought is binaryheap's use of Datum.  AFAICT
> the Datum definitions live in postgres.h and aren't available to frontend
> code.  I think we'll either need to move the Datum definitions to c.h or to
> adjust binaryheap to use "void *".

In v3, I moved the Datum definitions to c.h.  I first tried modifying
binaryheap to use "int" or "void *" instead, but that ended up requiring
some rather invasive changes in backend code, not to mention any extensions
that happen to be using it.  I also looked into moving the definitions to a
separate datumdefs.h header that postgres.h would include, but that felt
awkward because 1) postgres.h clearly states that it is intended for things
"that never escape the backend" and 2) the definitions seem relatively
inexpensive.  However, I think the latter option is still viable, so I'm
fine with switching to it if folks think that is a better approach.
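
For readers unfamiliar with the trade-off, here is a minimal, standalone
sketch of what a heap storing "void *" elements with a caller-supplied
comparator could look like in frontend-safe C.  It is illustrative only: the
demo_heap_* names are hypothetical, and it is neither the code in the
attached patches nor PostgreSQL's actual binaryheap API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not PostgreSQL's binaryheap API. */
typedef int (*demo_heap_cmp) (const void *a, const void *b);

typedef struct demo_heap
{
    size_t        size;      /* number of elements currently stored */
    size_t        capacity;  /* number of allocated slots */
    demo_heap_cmp cmp;       /* > 0 means "a has higher priority than b" */
    void        **nodes;     /* elements are opaque pointers, not Datums */
} demo_heap;

/* Allocate a fixed-capacity heap; returns NULL if allocation fails. */
static demo_heap *
demo_heap_create(size_t capacity, demo_heap_cmp cmp)
{
    demo_heap  *h = malloc(sizeof(demo_heap));

    if (h == NULL)
        return NULL;
    h->nodes = malloc(capacity * sizeof(void *));
    if (h->nodes == NULL)
    {
        free(h);
        return NULL;
    }
    h->size = 0;
    h->capacity = capacity;
    h->cmp = cmp;
    return h;
}

static void
demo_heap_swap(demo_heap *h, size_t a, size_t b)
{
    void       *tmp = h->nodes[a];

    h->nodes[a] = h->nodes[b];
    h->nodes[b] = tmp;
}

/* Insert an item in O(log n); returns 0 on success, -1 if the heap is full. */
static int
demo_heap_push(demo_heap *h, void *item)
{
    size_t      i;

    if (h->size >= h->capacity)
        return -1;
    i = h->size++;
    h->nodes[i] = item;

    /* Sift up until the parent has equal or higher priority. */
    while (i > 0)
    {
        size_t      parent = (i - 1) / 2;

        if (h->cmp(h->nodes[i], h->nodes[parent]) <= 0)
            break;
        demo_heap_swap(h, i, parent);
        i = parent;
    }
    return 0;
}

/* Remove and return the highest-priority item in O(log n). */
static void *
demo_heap_pop(demo_heap *h)
{
    void       *top;
    size_t      i = 0;

    assert(h->size > 0);
    top = h->nodes[0];
    h->nodes[0] = h->nodes[--h->size];

    /* Sift down: swap with the higher-priority child until order holds. */
    for (;;)
    {
        size_t      left = 2 * i + 1;
        size_t      right = left + 1;
        size_t      best = i;

        if (left < h->size && h->cmp(h->nodes[left], h->nodes[best]) > 0)
            best = left;
        if (right < h->size && h->cmp(h->nodes[right], h->nodes[best]) > 0)
            best = right;
        if (best == i)
            break;
        demo_heap_swap(h, i, best);
        i = best;
    }
    return top;
}

static void
demo_heap_free(demo_heap *h)
{
    free(h->nodes);
    free(h);
}

/* Example comparator: larger int means higher priority. */
static int
demo_int_cmp(const void *a, const void *b)
{
    int         x = *(const int *) a;
    int         y = *(const int *) b;

    return (x > y) - (x < y);
}
```

A heap like this makes each "take the best ready item" step logarithmic
rather than a linear scan or re-sort of the whole list, which is the general
reason a priority queue helps with very large ready lists.  The sketch
sidesteps Datum entirely by storing pointers, at the cost of changing the
element type that existing callers would see.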

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
