Thread: improve performance of pg_dump with many sequences

improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
Similar to 'pg_dump --binary-upgrade' [0], we can speed up pg_dump with
many sequences by gathering the required information in a single query
instead of two queries per sequence.  The attached patches are
works-in-progress, but here are the results I see on my machine for
'pg_dump --schema-only --binary-upgrade' with a million sequences:

          HEAD : 6m22.809s
           [0] : 1m54.701s
[0] + attached : 0m38.233s

I'm not sure I have all the details correct in 0003, and we might want to
separate the table into two tables which are only populated when the
relevant section is dumped.  Furthermore, the query in 0003 is a bit goofy
because it needs to dance around a bug reported elsewhere [1].

[0] https://postgr.es/m/20240418041712.GA3441570%40nathanxps13
[1] https://postgr.es/m/20240501005730.GA594666%40nathanxps13

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: improve performance of pg_dump with many sequences

From
"Euler Taveira"
Date:
On Tue, Jul 9, 2024, at 4:11 PM, Nathan Bossart wrote:
rebased

Nice improvement. The numbers for a realistic scenario (10k sequences) are

for i in `seq 1 10000`; do echo "CREATE SEQUENCE s$i;"; done > /tmp/s.sql

master:
real 0m1,141s
user 0m0,056s
sys 0m0,147s

patched:
real 0m0,410s
user 0m0,045s
sys 0m0,103s

You are changing internal representation from char to int64. Is the main goal to
validate catalog data? What if there is a new sequence data type whose
representation is not an integer?

This code path is adding zero byte to the last position of the fixed string. I
suggest that the zero byte is added to the position after the string length.

Assert(strlen(PQgetvalue(res, 0, 0)) < sizeof(seqtype));
strncpy(seqtype, PQgetvalue(res, 0, 0), sizeof(seqtype));
seqtype[sizeof(seqtype) - 1] = '\0';

Something like

l = strlen(PQgetvalue(res, 0, 0));
Assert(l < sizeof(seqtype));
strncpy(seqtype, PQgetvalue(res, 0, 0), l);
seqtype[l] = '\0';

Another suggestion is to use a constant for seqtype

char seqtype[MAX_SEQNAME_LEN];

and simplify the expression:

size_t      seqtype_sz = sizeof(((SequenceItem *) 0)->seqtype);

If you are not planning to apply 0003, make sure you fix collectSequences() to
avoid versions less than 10. Move this part to 0002.

@@ -17233,11 +17235,24 @@ collectSequences(Archive *fout)                       
    PGresult   *res;                                                            
    const char *query;                                                          
                                                                                
+   if (fout->remoteVersion < 100000)                                           
+       return;                                                                 


Since you apply a fix for pg_sequence_last_value function, you can simplify the
query in 0003. CASE is not required.

I repeated the same test but not applying 0003.

patched (0001 and 0002):
real 0m0,290s
user 0m0,038s
sys 0m0,104s

I'm not sure if 0003 is worth. Maybe if you have another table like you
suggested.


--
Euler Taveira

Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
On Wed, Jul 10, 2024 at 05:08:56PM -0300, Euler Taveira wrote:
> Nice improvement. The numbers for a realistic scenario (10k sequences) are

Thanks for taking a look!

> You are changing internal representation from char to int64. Is the main goal to
> validate catalog data? What if there is a new sequence data type whose
> representation is not an integer?

IIRC 0001 was primarily intended to reduce the amount of memory needed for
the sorted table.  Regarding a new sequence data type, I'm assuming we'll
have much bigger fish to fry if we do that (e.g., pg_sequence uses int8 for
the values), and I'd hope that adjusting this code wouldn't be too
difficult, anyway.

> This code path is adding zero byte to the last position of the fixed string. I
> suggest that the zero byte is added to the position after the string length.

I'm not following why that would be a better approach.  strncpy() will add
a NUL to the end of the string unless it doesn't fit in the buffer, in
which case we'll add our own via "seqtype[sizeof(seqtype) - 1] = '\0'".
Furthermore, the compiler can determine the position where the NUL should
be placed, whereas placing it at the end of the copied string requires a
runtime strlen().

> l = strlen(PQgetvalue(res, 0, 0));
> Assert(l < sizeof(seqtype));
> strncpy(seqtype, PQgetvalue(res, 0, 0), l);
> seqtype[l] = '\0';

I think the strncpy() should really be limited to the size of the seqtype
buffer.  IMHO an Assert is not sufficient.

> If you are not planning to apply 0003, make sure you fix collectSequences() to
> avoid versions less than 10. Move this part to 0002.

Yeah, no need to create the table if we aren't going to use it.

> Since you apply a fix for pg_sequence_last_value function, you can simplify the
> query in 0003. CASE is not required.

Unfortunately, I think we have to keep this workaround since older minor
releases of PostgreSQL don't have the fix.

> patched (0001 and 0002):
> real 0m0,290s
> user 0m0,038s
> sys 0m0,104s
> 
> I'm not sure if 0003 is worth. Maybe if you have another table like you
> suggested.

What pg_dump command did you test here?  Did you dump the sequence data, or
was this --schema-only?

-- 
nathan



Re: improve performance of pg_dump with many sequences

From
"Euler Taveira"
Date:
On Wed, Jul 10, 2024, at 7:05 PM, Nathan Bossart wrote:
I'm not following why that would be a better approach.  strncpy() will add
a NUL to the end of the string unless it doesn't fit in the buffer, in
which case we'll add our own via "seqtype[sizeof(seqtype) - 1] = '\0'".
Furthermore, the compiler can determine the position where the NUL should
be placed, whereas placing it at the end of the copied string requires a
runtime strlen().

Nevermind, you are copying the whole buffer (n = sizeof(seqtype)).

Unfortunately, I think we have to keep this workaround since older minor
releases of PostgreSQL don't have the fix.

Hmm. Right.

What pg_dump command did you test here?  Did you dump the sequence data, or
was this --schema-only?

time pg_dump -f - -s -d postgres


--
Euler Taveira

Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
On Wed, Jul 10, 2024 at 11:52:33PM -0300, Euler Taveira wrote:
> On Wed, Jul 10, 2024, at 7:05 PM, Nathan Bossart wrote:
>> Unfortunately, I think we have to keep this workaround since older minor
>> releases of PostgreSQL don't have the fix.
> 
> Hmm. Right.

On second thought, maybe we should just limit this improvement to the minor
releases with the fix so that we _can_ get rid of the workaround.  Or we
could use the hacky workaround only for versions with the bug.

-- 
nathan



Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
On Thu, Jul 11, 2024 at 09:09:17PM -0500, Nathan Bossart wrote:
> On second thought, maybe we should just limit this improvement to the minor
> releases with the fix so that we _can_ get rid of the workaround.  Or we
> could use the hacky workaround only for versions with the bug.

Here is a new version of the patch set.  The main differences are 1) we no
longer gather the sequence data for schema-only dumps and 2) 0003 uses a
simplified query for dumps on v18 and newer.  I considered also using a
slightly simplified query for dumps on versions with the
unlogged-sequences-on-standbys fix, but I felt that wasn't worth the extra
code.

Unfortunately, I've also discovered a problem with 0003.
pg_sequence_last_value() returns NULL when is_called is false, in which
case we assume last_value == seqstart, which is, sadly, bogus due to
commands like ALTER SEQUENCE [RE]START WITH.  AFAICT there isn't an easy
way around this.  We could either create a giant query that gathers the
information from all sequences in the database, or we could introduce a new
function in v18 that returns everything we need (which would only help for
upgrades _from_ v18).  Assuming I'm not missing a better option, I think
the latter is the better choice, and I still think it's worth doing even
though it probably won't help anyone for ~2.5 years.

-- 
nathan

Attachment

Re: improve performance of pg_dump with many sequences

From
Michael Paquier
Date:
On Tue, Jul 16, 2024 at 04:36:15PM -0500, Nathan Bossart wrote:
> Unfortunately, I've also discovered a problem with 0003.
> pg_sequence_last_value() returns NULL when is_called is false, in which
> case we assume last_value == seqstart, which is, sadly, bogus due to
> commands like ALTER SEQUENCE [RE]START WITH.  AFAICT there isn't an easy
> way around this.  We could either create a giant query that gathers the
> information from all sequences in the database, or we could introduce a new
> function in v18 that returns everything we need (which would only help for
> upgrades _from_ v18).  Assuming I'm not missing a better option, I think
> the latter is the better choice, and I still think it's worth doing even
> though it probably won't help anyone for ~2.5 years.

Yeah, I have bumped on the same issue.  In the long term, I also think
that we'd better have pg_sequence_last_value() return a row with
is_called and the value scanned.  As you say, it won't help except
when upgrading from versions of Postgres that are at least to v18,
assuming that this change gets in the tree, but that would be much
better in the long term and time flies fast.

See 0001 as of this area:
https://www.postgresql.org/message-id/ZnPIUPMmp5TzBPC2%40paquier.xyz
--
Michael

Attachment

Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
On Wed, Jul 17, 2024 at 11:30:04AM +0900, Michael Paquier wrote:
> Yeah, I have bumped on the same issue.  In the long term, I also think
> that we'd better have pg_sequence_last_value() return a row with
> is_called and the value scanned.  As you say, it won't help except
> when upgrading from versions of Postgres that are at least to v18,
> assuming that this change gets in the tree, but that would be much
> better in the long term and time flies fast.

AFAICT pg_sequence_last_value() is basically an undocumented internal
function only really intended for use by the pg_sequences system view, so
changing the function like this for v18 might not be out of the question.
Otherwise, I think we'd have to create a strikingly similar function with
slightly different behavior, which would be a bizarre place to end up.

-- 
nathan



Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
On Tue, Jul 16, 2024 at 10:23:08PM -0500, Nathan Bossart wrote:
> On Wed, Jul 17, 2024 at 11:30:04AM +0900, Michael Paquier wrote:
>> Yeah, I have bumped on the same issue.  In the long term, I also think
>> that we'd better have pg_sequence_last_value() return a row with
>> is_called and the value scanned.  As you say, it won't help except
>> when upgrading from versions of Postgres that are at least to v18,
>> assuming that this change gets in the tree, but that would be much
>> better in the long term and time flies fast.
> 
> AFAICT pg_sequence_last_value() is basically an undocumented internal
> function only really intended for use by the pg_sequences system view, so
> changing the function like this for v18 might not be out of the question.
> Otherwise, I think we'd have to create a strikingly similar function with
> slightly different behavior, which would be a bizarre place to end up.

On second thought, I worry that this change might needlessly complicate the
pg_sequences system view.  Maybe we should just add a
pg_sequence_get_tuple() function that returns everything in
FormData_pg_sequence_data for a given sequence OID...

-- 
nathan



Re: improve performance of pg_dump with many sequences

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> On second thought, I worry that this change might needlessly complicate the
> pg_sequences system view.  Maybe we should just add a
> pg_sequence_get_tuple() function that returns everything in
> FormData_pg_sequence_data for a given sequence OID...

Uh ... why do we need a function, rather than just

select * from pg_sequence

?

            regards, tom lane



Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
On Wed, Jul 17, 2024 at 02:59:26PM -0400, Tom Lane wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
>> On second thought, I worry that this change might needlessly complicate the
>> pg_sequences system view.  Maybe we should just add a
>> pg_sequence_get_tuple() function that returns everything in
>> FormData_pg_sequence_data for a given sequence OID...
> 
> Uh ... why do we need a function, rather than just
> 
> select * from pg_sequence

We can use that for dumpSequence(), but dumpSequenceData() requires
information from the sequence tuple itself.  Right now, we query each
sequence relation individually for that data, and I'm trying to find a way
to cut down on those round trips.

-- 
nathan



Re: improve performance of pg_dump with many sequences

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> On Wed, Jul 17, 2024 at 02:59:26PM -0400, Tom Lane wrote:
>> Uh ... why do we need a function, rather than just
>> select * from pg_sequence

> We can use that for dumpSequence(), but dumpSequenceData() requires
> information from the sequence tuple itself.  Right now, we query each
> sequence relation individually for that data, and I'm trying to find a way
> to cut down on those round trips.

Ah, I confused FormData_pg_sequence_data with FormData_pg_sequence.
Sorry for the noise.

            regards, tom lane



Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
Here is an attempt at adding a new function that returns the sequence tuple
and using that to avoid querying each sequence relation individually in
dumpSequenceData().

If we instead wanted to change pg_sequence_last_value() to return both
is_called and last_value, I think we could modify the pg_sequences system
view to use a LATERAL subquery, i.e.,

    SELECT
        ...
        CASE
            WHEN L.is_called THEN L.last_value
            ELSE NULL
        END AS last_value
    FROM pg_sequence S
        ...
        JOIN LATERAL pg_sequence_last_value(S.seqrelid) L ON true
    ...

That doesn't seem so bad, and it'd avoid an extra pg_proc entry, but it
would probably break anything that calls pg_sequence_last_value() directly.
Thoughts?

-- 
nathan

Attachment

Re: improve performance of pg_dump with many sequences

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> Here is an attempt at adding a new function that returns the sequence tuple
> and using that to avoid querying each sequence relation individually in
> dumpSequenceData().

Didn't read the patch yet, but ...

> If we instead wanted to change pg_sequence_last_value() to return both
> is_called and last_value, I think we could modify the pg_sequences system
> view to use a LATERAL subquery, i.e.,
> ...
> That doesn't seem so bad, and it'd avoid an extra pg_proc entry, but it
> would probably break anything that calls pg_sequence_last_value() directly.
> Thoughts?

... one more pg_proc entry is pretty cheap.  I think we should leave
pg_sequence_last_value alone.  We don't know if anyone is depending
on it, and de-optimizing the pg_sequences view doesn't seem like a
win either.

... okay, I lied, I looked at the patch.  Why are you testing

+    if (pg_class_aclcheck(relid, GetUserId(), ACL_SELECT | ACL_USAGE) == ACLCHECK_OK &&

?  This is a substitute for a SELECT from the sequence and it seems
like it ought to demand exactly the same privilege as SELECT.
(If you want to get more technical, USAGE allows nextval() which
gives strictly less information than what this exposes; that's why
we're here after all.)  So there is a difference in the privilege
levels, which is another reason for not combining this with
pg_sequence_last_value.

            regards, tom lane



Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
On Wed, Jul 17, 2024 at 11:58:21PM -0400, Tom Lane wrote:
> ... okay, I lied, I looked at the patch.  Why are you testing
> 
> +    if (pg_class_aclcheck(relid, GetUserId(), ACL_SELECT | ACL_USAGE) == ACLCHECK_OK &&
> 
> ?  This is a substitute for a SELECT from the sequence and it seems
> like it ought to demand exactly the same privilege as SELECT.
> (If you want to get more technical, USAGE allows nextval() which
> gives strictly less information than what this exposes; that's why
> we're here after all.)  So there is a difference in the privilege
> levels, which is another reason for not combining this with
> pg_sequence_last_value.

Oh, that's a good point.  I wrongly assumed the privilege checks would be
the same as pg_sequence_last_value().  I fixed this in v5.

I also polished the rest of the patches a bit.  Among other things, I
created an enum for the sequence data types to avoid the hacky strncpy()
stuff, which was causing weird CI failures [0].

[0] https://cirrus-ci.com/task/4614801962303488

-- 
nathan

Attachment

Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
I fixed a compiler warning on Windows in v6 of the patch set.  Sorry for
the noise.

-- 
nathan

Attachment

Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
I ran Euler's tests again on the v6 patch set.

    for i in `seq 1 10000`; do psql postgres -c "CREATE SEQUENCE s$i;"; done
    time pg_dump -f - -s -d postgres > /dev/null

    HEAD:        0.607s
    0001 + 0002: 0.094s
    all patches: 0.094s

Barring additional feedback, I am planning to commit these patches early
next week.

-- 
nathan



Re: improve performance of pg_dump with many sequences

From
Nathan Bossart
Date:
Committed.

-- 
nathan