Re: [HACKERS] Custom compression methods (buildfarm xupgrade) - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: [HACKERS] Custom compression methods (buildfarm xupgrade)
Date
Msg-id 20210328231433.GI15100@telsasoft.com
Whole thread Raw
In response to Re: [HACKERS] Custom compression methods  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Sun, Mar 28, 2021 at 04:48:29PM -0400, Andrew Dunstan wrote:
> Nothing is hidden here - the diffs are reported, see for example
>
<https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2021-03-28%2015%3A37%3A07&stg=xversion-upgrade-REL9_4_STABLE-HEAD>
> What we're comparing here is target pg_dumpall against the original
> source vs target pg_dumpall against the upgraded source.

The command being run is:

https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm#L610
        system( "diff -I '^-- ' -u $upgrade_loc/origin-$oversion.sql "
                  . "$upgrade_loc/converted-$oversion-to-$this_branch.sql "
                  . "> $upgrade_loc/dumpdiff-$oversion 2>&1");
...
    my $difflines = `wc -l < $upgrade_loc/dumpdiff-$oversion`;

where -I means: --ignore-matching-lines=RE

I think wc -l should actually be grep -c '^[-+]'
otherwise context lines count for as much as diff lines.
You could write that with diff -U0 |wc -l, except the context is useful to
humans.

With some more effort, the number of lines of diff can be very small, allowing
a smaller fudge factor.  

For upgrade from v10:
time make -C src/bin/pg_upgrade check oldsrc=`pwd`/10 oldbindir=`pwd`/10/tmp_install/usr/local/pgsql/bin

$ diff -u src/bin/pg_upgrade/tmp_check/dump1.sql src/bin/pg_upgrade/tmp_check/dump2.sql |wc -l
622

Without context:
$ diff -u src/bin/pg_upgrade/tmp_check/dump1.sql src/bin/pg_upgrade/tmp_check/dump2.sql |grep -c '^[-+]'
142

Without comments:
$ diff -I '^-- ' -u src/bin/pg_upgrade/tmp_check/dump1.sql src/bin/pg_upgrade/tmp_check/dump2.sql |grep -c '^[-+]'
130

Without SET default stuff:
diff -I '^$' -I "SET default_table_access_method = heap;" -I "^SET default_toast_compression = 'pglz';$" -I '^-- ' -u
/home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump1.sql
/home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump2.sql|less |grep -c '^[-+]'
 
117

Without trigger function call noise:
diff -I "^CREATE TRIGGER [_[:alnum:]]\+ .* FOR EACH \(ROW\|STATEMENT\) EXECUTE \(PROCEDURE\|FUNCTION\)" -I '^$' -I "SET
default_table_access_method= heap;" -I "^SET default_toast_compression = 'pglz';$" -I '^-- ' -u
/home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump1.sql
/home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump2.sql|grep -c '^[-+]'
 
11

Maybe it's important not to totally ignore that, and instead perhaps clean up
the known/accepted changes like s/FUNCTION/PROCEDURE/:

</home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump2.sql sed '/^CREATE TRIGGER/s/FUNCTION/PROCEDURE/' |diff
-I'^$' -I "SET default_table_access_method = heap;" -I "^SET default_toast_compression = 'pglz';$" -I '^-- ' -u
/home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump1.sql- |grep -c '^[-+]'
 
11

It seems weird that we don't quote "heap" but we quote tablespaces and not
toast compression methods.

-- 
Justin



pgsql-hackers by date:

Previous
From: Zi Yi Xu
Date:
Subject: Pgsql Google Summer of Code
Next
From: Kohei KaiGai
Date:
Subject: Re: TRUNCATE on foreign table