Thread: some more pg_dump refactoring

some more pg_dump refactoring

From
Peter Eisentraut
Date:
Here is a patch to reorganize dumpFunc() and dumpAgg() in pg_dump, 
similar to daa9fe8a5264a3f192efa5ddee8fb011ad9da365.  Instead of 
repeating the almost same large query in each version branch, use one
query and add a few columns to the SELECT list depending on the
version.  This saves a lot of duplication.

I have tested this with various old versions of PostgreSQL I had 
available, but a bit more random testing with old versions would be welcome.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: some more pg_dump refactoring

From
Daniel Gustafsson
Date:
> On 23 Jun 2020, at 14:57, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>
> Here is a patch to reorganize dumpFunc() and dumpAgg() in pg_dump, similar to
daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Instead of repeating the almost same large query in each version branch, use
one
> query and add a few columns to the SELECT list depending on the
> version.  This saves a lot of duplication.
>
> I have tested this with various old versions of PostgreSQL I had available, but a bit more random testing with old
versionswould be welcome. 

+1 from reading the patch and lightly testing it with the older versions I had
handy, and another big +1 on the refactoring to remove the duplication.

cheers ./daniel


Re: some more pg_dump refactoring

From
Fabien COELHO
Date:
Hallo Peter,

My 0.02 €:

Patch applies cleanly, compiles, make check and pg_dump tap tests ok. The 
refactoring is a definite improvements.

You changed the query strings to use "\n" instead of " ". I would not have 
changed that, because it departs from the style around, and I do not think 
it improves readability at the C code level.

I tried to check manually and randomly that the same query is built for 
the same version, although my check may have been partial, especially on 
the aggregate query which does not comment about what is changed between 
versions, and my eyes are not very good at diffing.

I've notice that some attributes are given systematic replacements (eg 
proparallel), removing the need to test for presence afterwards. This 
looks fine to me.

However, on version < 8.4, ISTM that funcargs and funciargs are always 
added: is this voluntary?.

Would it make sense to accumulate in the other direction, older to newer, 
so that new attributes are added at the end of the select?

Should array_to_string be pg_catalog.array_to_string? All other calls seem 
to have an explicit schema.

I'm fine with inlining most PQfnumber calls.

I do not have old versions at hand for testing.

> Here is a patch to reorganize dumpFunc() and dumpAgg() in pg_dump, similar to 
> daa9fe8a5264a3f192efa5ddee8fb011ad9da365.  Instead of repeating the almost 
> same large query in each version branch, use one
> query and add a few columns to the SELECT list depending on the
> version.  This saves a lot of duplication.
>
> I have tested this with various old versions of PostgreSQL I had available, 
> but a bit more random testing with old versions would be welcome.

-- 
Fabien.

Re: some more pg_dump refactoring

From
Peter Eisentraut
Date:
On 2020-06-25 08:58, Fabien COELHO wrote:
> You changed the query strings to use "\n" instead of " ". I would not have
> changed that, because it departs from the style around, and I do not think
> it improves readability at the C code level.

This was the style that was introduced by 
daa9fe8a5264a3f192efa5ddee8fb011ad9da365.

> However, on version < 8.4, ISTM that funcargs and funciargs are always
> added: is this voluntary?.

That was a mistake.

> Would it make sense to accumulate in the other direction, older to newer,
> so that new attributes are added at the end of the select?

I think that could make sense, but the current style was introduced by 
daa9fe8a5264a3f192efa5ddee8fb011ad9da365.  Should we revise that?

> Should array_to_string be pg_catalog.array_to_string? All other calls seem
> to have an explicit schema.

It's not handled fully consistently in pg_dump.  But my understanding is 
that this is no longer necessary because pg_dump explicitly sets the 
search path before running.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: some more pg_dump refactoring

From
Fabien COELHO
Date:
Hello,

>> You changed the query strings to use "\n" instead of " ". I would not have
>> changed that, because it departs from the style around, and I do not think
>> it improves readability at the C code level.
>
> This was the style that was introduced by 
> daa9fe8a5264a3f192efa5ddee8fb011ad9da365.

Yep. This does not imply that it is better, or worst. Currently it is not 
consistent within the file, and there are only few instances, so maybe it 
could be discussed anyway.

After giving it some thought, I'd say that at least I'd like the query to 
be easy to read when dumped. This is not incompatible with some embedded 
eol, on the contrary, but ISTM that it could keep some indentation as 
well, which would be kind-of a middle ground. For readability, I'd also 
consider turning keywords to upper case. Maybe it could look like:

   "SELECT\n"
   "  somefield,\n"
   "  someotherfiled,\n"
   ...
   "FROM some_table\n"
   "JOIN ... ON ...\n" ...

All this is highly debatable, so ignore if you feel like it.

>> Would it make sense to accumulate in the other direction, older to newer,
>> so that new attributes are added at the end of the select?
>
> I think that could make sense, but the current style was introduced by 
> daa9fe8a5264a3f192efa5ddee8fb011ad9da365.  Should we revise that?

It seems to me more logical to do it while you're at it, but you are the 
one writing the patches:-)

>> Should array_to_string be pg_catalog.array_to_string? All other calls seem
>> to have an explicit schema.
>
> It's not handled fully consistently in pg_dump.  But my understanding is that 
> this is no longer necessary because pg_dump explicitly sets the search path 
> before running.

Dunno. It does not look consistent with a mix because the wary reviewer 
think that there may be a potential bug:-) ISTM that explicit is better 
than implicit in this context: not relying on search path would allow to 
test the query independently, and anyway what you see is what you get.

Otherwise: v2 patch applies cleanly, compiles, global make check ok, 
pg_dump tap ok.

"progargnames" is added in both branches of an if, which looks awkward. 
I'd suggest maybe to add it once unconditionnaly.

I cannot test easily on older versions.

-- 
Fabien.



Re: some more pg_dump refactoring

From
Peter Eisentraut
Date:
On 2020-06-30 16:56, Fabien COELHO wrote:
>>> Would it make sense to accumulate in the other direction, older to newer,
>>> so that new attributes are added at the end of the select?
>> I think that could make sense, but the current style was introduced by
>> daa9fe8a5264a3f192efa5ddee8fb011ad9da365.  Should we revise that?
> It seems to me more logical to do it while you're at it, but you are the
> one writing the patches:-)

What do you think about this patch to reorganize the existing code from 
that old commit?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: some more pg_dump refactoring

From
Fabien COELHO
Date:
Hallo Peter,

>>>> Would it make sense to accumulate in the other direction, older to newer,
>>>> so that new attributes are added at the end of the select?
>>> I think that could make sense, but the current style was introduced by
>>> daa9fe8a5264a3f192efa5ddee8fb011ad9da365.  Should we revise that?
>> It seems to me more logical to do it while you're at it, but you are the
>> one writing the patches:-)
>
> What do you think about this patch to reorganize the existing code from that 
> old commit?

I think it is a definite further improvement.

Patch applies cleanly, compiles, global & pg_dump tap test ok, looks ok to 
me.

-- 
Fabien.



Re: some more pg_dump refactoring

From
Peter Eisentraut
Date:
On 2020-07-08 06:42, Fabien COELHO wrote:
>> What do you think about this patch to reorganize the existing code from that
>> old commit?
> 
> I think it is a definite further improvement.
> 
> Patch applies cleanly, compiles, global & pg_dump tap test ok, looks ok to
> me.

Thanks.  I have committed that, and attached is my original patch 
adjusted to this newer style.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: some more pg_dump refactoring

From
Alvaro Herrera
Date:
On 2020-Jul-09, Peter Eisentraut wrote:

> On 2020-07-08 06:42, Fabien COELHO wrote:
> > > What do you think about this patch to reorganize the existing code from that
> > > old commit?
> > 
> > I think it is a definite further improvement.
> > 
> > Patch applies cleanly, compiles, global & pg_dump tap test ok, looks ok to
> > me.
> 
> Thanks.  I have committed that, and attached is my original patch adjusted
> to this newer style.

Thanks, I too prefer the style where queries are split in lines instead
of a single very long one, at least when looking at log_statement=all
lines generated by pg_dump runs.  It's not a *huge* usability
improvement, but I see no reason to make things worse.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: some more pg_dump refactoring

From
Peter Eisentraut
Date:
On 2020-07-09 16:14, Peter Eisentraut wrote:
> On 2020-07-08 06:42, Fabien COELHO wrote:
>>> What do you think about this patch to reorganize the existing code from that
>>> old commit?
>>
>> I think it is a definite further improvement.
>>
>> Patch applies cleanly, compiles, global & pg_dump tap test ok, looks ok to
>> me.
> 
> Thanks.  I have committed that, and attached is my original patch
> adjusted to this newer style.

committed

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services