Re: improve performance of pg_dump with many sequences - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: improve performance of pg_dump with many sequences
Date
Msg-id Zpbnz4P13W7WICj-@nathan
Whole thread Raw
In response to Re: improve performance of pg_dump with many sequences  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: improve performance of pg_dump with many sequences
List pgsql-hackers
On Thu, Jul 11, 2024 at 09:09:17PM -0500, Nathan Bossart wrote:
> On second thought, maybe we should just limit this improvement to the minor
> releases with the fix so that we _can_ get rid of the workaround.  Or we
> could use the hacky workaround only for versions with the bug.

Here is a new version of the patch set.  The main differences are 1) we no
longer gather the sequence data for schema-only dumps and 2) 0003 uses a
simplified query for dumps on v18 and newer.  I considered also using a
slightly simplified query for dumps on versions with the
unlogged-sequences-on-standbys fix, but I felt that wasn't worth the extra
code.

Unfortunately, I've also discovered a problem with 0003.
pg_sequence_last_value() returns NULL when is_called is false, in which
case we assume last_value == seqstart, which is, sadly, bogus due to
commands like ALTER SEQUENCE [RE]START WITH.  AFAICT there isn't an easy
way around this.  We could either create a giant query that gathers the
information from all sequences in the database, or we could introduce a new
function in v18 that returns everything we need (which would only help for
upgrades _from_ v18).  Assuming I'm not missing a better option, I think
the latter is the better choice, and I still think it's worth doing even
though it probably won't help anyone for ~2.5 years.

-- 
nathan

Attachment

pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: PG_TEST_EXTRA and meson
Next
From: Nathan Bossart
Date:
Subject: Re: [PATCH] Refactor pqformat.{c,h} and protocol.h