Thread: pgsql: Improve performance of dumpSequenceData().

From: Nathan Bossart
Date: 31 July 2024, 15:13:58

Improve performance of dumpSequenceData().

As one might guess, this function dumps the sequence data.  It is
called once per sequence, and each such call executes a query to
retrieve the relevant data for a single sequence.  This can cause
pg_dump to take significantly longer, especially when there are
many sequences.

This commit improves the performance of this function by gathering
all the sequence data with a single query at the beginning of
pg_dump.  This information is stored in a sorted array that
dumpSequenceData() can bsearch() for what it needs.  This follows an
approach similar to that of previous commits that introduced sorted
arrays for role information, pg_class information, and sequence metadata.
As with those commits, this patch will cause pg_dump to use more
memory, but that isn't expected to be too egregious.
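
The sorted-array lookup described above can be sketched roughly as
follows.  This is a minimal illustration, not the code from the commit:
the struct SeqDataSketch, its fields, and the helper names are all
hypothetical, and the only assumption carried over from the message is
that entries are sorted once and then located with bsearch().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-sequence record; field names are illustrative only. */
typedef struct SeqDataSketch
{
    uint32_t    oid;            /* sequence OID, used as the sort/search key */
    int64_t     last_value;     /* value that would be dumped */
    int         is_called;      /* nonzero if the sequence has been used */
} SeqDataSketch;

/* Three-way comparison on OID, usable by both qsort() and bsearch(). */
int
seq_data_cmp(const void *a, const void *b)
{
    uint32_t    oa = ((const SeqDataSketch *) a)->oid;
    uint32_t    ob = ((const SeqDataSketch *) b)->oid;

    return (oa > ob) - (oa < ob);
}

/* Sort once, right after the single bulk query has filled the array. */
void
sort_seq_data(SeqDataSketch *items, size_t n)
{
    assert(items != NULL || n == 0);
    if (n > 1)
        qsort(items, n, sizeof(SeqDataSketch), seq_data_cmp);
}

/* Per-sequence lookup: O(log n) instead of one query per sequence. */
const SeqDataSketch *
find_seq_data(const SeqDataSketch *items, size_t n, uint32_t oid)
{
    SeqDataSketch key = {0};

    if (n == 0)
        return NULL;
    key.oid = oid;
    return bsearch(&key, items, n, sizeof(SeqDataSketch), seq_data_cmp);
}
```

The memory cost mentioned above corresponds to keeping one such record
per sequence alive for the duration of the dump, in exchange for
replacing N round trips with one query plus N binary searches.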

Note that we use the brand new function pg_sequence_read_tuple() in
the query that gathers all sequence data, so we must continue to
use the preexisting query-per-sequence approach for versions older
than 18.
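
The version gate described above might look something like the
following sketch.  The enum, macro, and function names are hypothetical
and not taken from pg_dump; the only assumptions are that the server
version is available as an integer in the usual PostgreSQL numbering
(180000 for version 18) and that older servers fall back to the
query-per-sequence path.

```c
/* First server version assumed to support the bulk query (PostgreSQL 18). */
#define SKETCH_MIN_BULK_VERSION 180000

/* Hypothetical strategy selector; not an actual pg_dump type. */
typedef enum SeqFetchStrategy
{
    SEQ_FETCH_PER_SEQUENCE,     /* one query per sequence (older servers) */
    SEQ_FETCH_BULK              /* single query up front, then bsearch() */
} SeqFetchStrategy;

/* Pick the strategy from the numeric server version. */
SeqFetchStrategy
choose_seq_fetch_strategy(int remote_version)
{
    if (remote_version < SKETCH_MIN_BULK_VERSION)
        return SEQ_FETCH_PER_SEQUENCE;
    return SEQ_FETCH_BULK;
}
```

Under these assumptions, dumping from a version 17 server behaves as
before, and only version 18 and later benefit from the single query.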

Reviewed-by: Euler Taveira, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/bd15b7db489deadb2d9af7f21d16a6ed4a09465b

Modified Files
--------------
src/bin/pg_dump/pg_dump.c | 81 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 63 insertions(+), 18 deletions(-)