Thread: some more pg_dump refactoring
Here is a patch to reorganize dumpFunc() and dumpAgg() in pg_dump, similar to daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Instead of repeating the almost same large query in each version branch, use one query and add a few columns to the SELECT list depending on the version. This saves a lot of duplication. I have tested this with various old versions of PostgreSQL I had available, but a bit more random testing with old versions would be welcome. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
> On 23 Jun 2020, at 14:57, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > Here is a patch to reorganize dumpFunc() and dumpAgg() in pg_dump, similar to daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Instead of repeating the almost same large query in each version branch, use one > query and add a few columns to the SELECT list depending on the > version. This saves a lot of duplication. > > I have tested this with various old versions of PostgreSQL I had available, but a bit more random testing with old versionswould be welcome. +1 from reading the patch and lightly testing it with the older versions I had handy, and another big +1 on the refactoring to remove the duplication. cheers ./daniel
Hallo Peter, My 0.02 €: Patch applies cleanly, compiles, make check and pg_dump tap tests ok. The refactoring is a definite improvements. You changed the query strings to use "\n" instead of " ". I would not have changed that, because it departs from the style around, and I do not think it improves readability at the C code level. I tried to check manually and randomly that the same query is built for the same version, although my check may have been partial, especially on the aggregate query which does not comment about what is changed between versions, and my eyes are not very good at diffing. I've notice that some attributes are given systematic replacements (eg proparallel), removing the need to test for presence afterwards. This looks fine to me. However, on version < 8.4, ISTM that funcargs and funciargs are always added: is this voluntary?. Would it make sense to accumulate in the other direction, older to newer, so that new attributes are added at the end of the select? Should array_to_string be pg_catalog.array_to_string? All other calls seem to have an explicit schema. I'm fine with inlining most PQfnumber calls. I do not have old versions at hand for testing. > Here is a patch to reorganize dumpFunc() and dumpAgg() in pg_dump, similar to > daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Instead of repeating the almost > same large query in each version branch, use one > query and add a few columns to the SELECT list depending on the > version. This saves a lot of duplication. > > I have tested this with various old versions of PostgreSQL I had available, > but a bit more random testing with old versions would be welcome. -- Fabien.
On 2020-06-25 08:58, Fabien COELHO wrote: > You changed the query strings to use "\n" instead of " ". I would not have > changed that, because it departs from the style around, and I do not think > it improves readability at the C code level. This was the style that was introduced by daa9fe8a5264a3f192efa5ddee8fb011ad9da365. > However, on version < 8.4, ISTM that funcargs and funciargs are always > added: is this voluntary?. That was a mistake. > Would it make sense to accumulate in the other direction, older to newer, > so that new attributes are added at the end of the select? I think that could make sense, but the current style was introduced by daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Should we revise that? > Should array_to_string be pg_catalog.array_to_string? All other calls seem > to have an explicit schema. It's not handled fully consistently in pg_dump. But my understanding is that this is no longer necessary because pg_dump explicitly sets the search path before running. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hello, >> You changed the query strings to use "\n" instead of " ". I would not have >> changed that, because it departs from the style around, and I do not think >> it improves readability at the C code level. > > This was the style that was introduced by > daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Yep. This does not imply that it is better, or worst. Currently it is not consistent within the file, and there are only few instances, so maybe it could be discussed anyway. After giving it some thought, I'd say that at least I'd like the query to be easy to read when dumped. This is not incompatible with some embedded eol, on the contrary, but ISTM that it could keep some indentation as well, which would be kind-of a middle ground. For readability, I'd also consider turning keywords to upper case. Maybe it could look like: "SELECT\n" " somefield,\n" " someotherfiled,\n" ... "FROM some_table\n" "JOIN ... ON ...\n" ... All this is highly debatable, so ignore if you feel like it. >> Would it make sense to accumulate in the other direction, older to newer, >> so that new attributes are added at the end of the select? > > I think that could make sense, but the current style was introduced by > daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Should we revise that? It seems to me more logical to do it while you're at it, but you are the one writing the patches:-) >> Should array_to_string be pg_catalog.array_to_string? All other calls seem >> to have an explicit schema. > > It's not handled fully consistently in pg_dump. But my understanding is that > this is no longer necessary because pg_dump explicitly sets the search path > before running. Dunno. It does not look consistent with a mix because the wary reviewer think that there may be a potential bug:-) ISTM that explicit is better than implicit in this context: not relying on search path would allow to test the query independently, and anyway what you see is what you get. Otherwise: v2 patch applies cleanly, compiles, global make check ok, pg_dump tap ok. "progargnames" is added in both branches of an if, which looks awkward. I'd suggest maybe to add it once unconditionnaly. I cannot test easily on older versions. -- Fabien.
On 2020-06-30 16:56, Fabien COELHO wrote: >>> Would it make sense to accumulate in the other direction, older to newer, >>> so that new attributes are added at the end of the select? >> I think that could make sense, but the current style was introduced by >> daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Should we revise that? > It seems to me more logical to do it while you're at it, but you are the > one writing the patches:-) What do you think about this patch to reorganize the existing code from that old commit? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hallo Peter, >>>> Would it make sense to accumulate in the other direction, older to newer, >>>> so that new attributes are added at the end of the select? >>> I think that could make sense, but the current style was introduced by >>> daa9fe8a5264a3f192efa5ddee8fb011ad9da365. Should we revise that? >> It seems to me more logical to do it while you're at it, but you are the >> one writing the patches:-) > > What do you think about this patch to reorganize the existing code from that > old commit? I think it is a definite further improvement. Patch applies cleanly, compiles, global & pg_dump tap test ok, looks ok to me. -- Fabien.
On 2020-07-08 06:42, Fabien COELHO wrote: >> What do you think about this patch to reorganize the existing code from that >> old commit? > > I think it is a definite further improvement. > > Patch applies cleanly, compiles, global & pg_dump tap test ok, looks ok to > me. Thanks. I have committed that, and attached is my original patch adjusted to this newer style. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 2020-Jul-09, Peter Eisentraut wrote: > On 2020-07-08 06:42, Fabien COELHO wrote: > > > What do you think about this patch to reorganize the existing code from that > > > old commit? > > > > I think it is a definite further improvement. > > > > Patch applies cleanly, compiles, global & pg_dump tap test ok, looks ok to > > me. > > Thanks. I have committed that, and attached is my original patch adjusted > to this newer style. Thanks, I too prefer the style where queries are split in lines instead of a single very long one, at least when looking at log_statement=all lines generated by pg_dump runs. It's not a *huge* usability improvement, but I see no reason to make things worse. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020-07-09 16:14, Peter Eisentraut wrote: > On 2020-07-08 06:42, Fabien COELHO wrote: >>> What do you think about this patch to reorganize the existing code from that >>> old commit? >> >> I think it is a definite further improvement. >> >> Patch applies cleanly, compiles, global & pg_dump tap test ok, looks ok to >> me. > > Thanks. I have committed that, and attached is my original patch > adjusted to this newer style. committed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services