
Re: pg_dump: optimize dumpFunc() - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: pg_dump: optimize dumpFunc()
Date	August 2, 2024 05:33:45
Msg-id	3903370.1722576825@sss.pgh.pa.us
In response to	pg_dump: optimize dumpFunc() (Nathan Bossart <nathandbossart@gmail.com>)
Responses	Re: pg_dump: optimize dumpFunc()
List	pgsql-hackers


Nathan Bossart <nathandbossart@gmail.com> writes:
> I've recently committed some optimizations for dumping sequences and
> pg_class information (commits 68e9629, bd15b7d, and 2329cad), and I noticed
> that we are also executing a query per function in pg_dump.  Commit be85727
> optimized this by preparing the query ahead of time, but I found that we
> can improve performance further by gathering all the relevant data in a
> single query.  Here are the results I see for a database with 10k simple
> functions with and without the attached patch:

I'm a bit concerned about this on two grounds:

1. Is it a win for DBs with not so many functions?

2. On the other end of the scale, if you've got a *boatload* of
functions, what does it do to pg_dump's memory requirements?
I'm recalling my days at Salesforce, where they had quite a few
thousand pl/pgsql functions totalling very many megabytes of source
text.  (Don't recall precise numbers offhand, and they'd be obsolete
by now even if I did.)

I'm not sure that the results you're showing justify taking any
risk here.

            regards, tom lane
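
[Editorial illustration, not part of the original message.] The second concern can be made concrete with a rough model. The sketch below assumes the single-query approach keeps every function's source text in memory at once, while the per-function approach holds only one source at a time. The function names and the workload (5,000 functions of 4 kB each) are hypothetical and are not measurements of pg_dump; the model also ignores libpq result overhead and every other column in the query.

```python
def peak_bytes_per_function_query(source_sizes):
    """Peak source-text bytes held when each function is fetched on its own.

    Assumption: the previous result is freed before the next query runs,
    so only the largest single source counts toward the peak.
    """
    return max(source_sizes, default=0)


def peak_bytes_single_query(source_sizes):
    """Peak source-text bytes held when one query returns every function.

    Assumption: the whole result set stays in memory until all functions
    have been dumped, so the peak is the sum of all sources.
    """
    return sum(source_sizes)


# Hypothetical workload: 5,000 functions with 4 kB of source text each.
sizes = [4096] * 5000

print(peak_bytes_per_function_query(sizes))  # 4096
print(peak_bytes_single_query(sizes))        # 20480000
```

Under these assumptions the peak grows linearly with the number of functions in the single-query case, which is why the total size of function source text matters more than the function count alone.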
