Thread: [BUG] pg_upgrade test fails from older versions.
Hello!

Found that the pg_upgrade test is broken for upgrades from older versions. This happened for two reasons.

1) In 7b378237a the format of "aclitem" changed, so an upgrade from <=15 fails with the error:
"Your installation contains the "aclitem" data type in user tables.
The internal format of "aclitem" changed in PostgreSQL version 16
so this cluster cannot currently be upgraded... "
Tried to fix it by changing the column type in upgrade_adapt.sql. Please see the patch attached.

2) In 60684dd83 and b5d63824 there are two changes in the set of specific privileges. The thing is that the privileges.sql test contains a REVOKE DELETE command, which becomes a pair of REVOKE ALL and a GRANT of all specific privileges except DELETE in the resulting dump. Therefore, any change in the set of specific privileges will lead to a non-zero dump diff. To avoid this, I propose to replace any specific GRANT and REVOKE in the resulting dumps with ALL. This is also done in the patch attached.

Would be glad of any remarks.

With best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
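To illustrate the dump behaviour described in 2), here is a minimal sketch with invented object names (nothing below is taken from privileges.sql itself): pg_dump does not reproduce a single-privilege REVOKE literally, it dumps the resulting ACL as a blanket REVOKE ALL followed by a GRANT of every remaining specific privilege, so adding a new privilege kind to the server changes the dump text even though the test did not change.

CREATE TABLE priv_demo (id int);            -- hypothetical table
GRANT ALL ON priv_demo TO PUBLIC;
REVOKE DELETE ON priv_demo FROM PUBLIC;

-- A dump of this table then contains something like (the exact privilege
-- list depends on the server version, which is the point here):
--   REVOKE ALL ON TABLE public.priv_demo FROM PUBLIC;
--   GRANT SELECT, INSERT, UPDATE, REFERENCES, TRIGGER, TRUNCATE
--     ON TABLE public.priv_demo TO PUBLIC;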
"Anton A. Melnikov" <aamelnikov@inbox.ru> writes: > 2) In 60684dd83 and b5d63824 there are two changes in the set of specific privileges. > The thing is that in the privileges.sql test there is REVOKE DELETE command > which becomes pair of REVOKE ALL and GRANT all specific privileges except DELETE > in the result dump. Therefore, any change in the set of specific privileges will lead to > a non-zero dumps diff. > To avoid this, i propose to replace any specific GRANT and REVOKE in the result dumps with ALL. > This also made in the patch attached. Isn't that likely to mask actual bugs? regards, tom lane
On Mon, Dec 19, 2022 at 03:50:19AM +0300, Anton A. Melnikov wrote:
> +-- The internal format of "aclitem" changed in PostgreSQL version 16
> +-- so replace it with text type
> +\if :oldpgversion_le15
> +DO $$
> +DECLARE
> +    change_aclitem_type TEXT;
> +BEGIN
> +    FOR change_aclitem_type IN
> +        SELECT 'ALTER TABLE ' || table_schema || '.' ||
> +               table_name || ' ALTER COLUMN ' ||
> +               column_name || ' SET DATA TYPE text;'
> +               AS change_aclitem_type
> +        FROM information_schema.columns
> +        WHERE data_type = 'aclitem' and table_schema != 'pg_catalog'
> +    LOOP
> +        EXECUTE change_aclitem_type;
> +    END LOOP;
> +END;
> +$$;
> +\endif

This is forgetting about materialized views, which is something that pg_upgrade would also complain about when checking for relations with aclitems.  As far as I can see, the only place in the main regression test suite where we have an aclitem attribute is tab_core_types, for HEAD and the stable branches, so it would be enough to do this change.  Anyway, wouldn't it be better to use the same conditions as the WITH OIDS queries a few lines above, at least for consistency?

Note that check_for_data_types_usage() checks for tables, matviews and indexes.
--
Michael
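For reference, a rough sketch of the kind of catalog query this suggests, covering the same relation kinds as check_for_data_types_usage() (tables, materialized views and indexes) and written against pg_class/pg_attribute like the WITH OIDS block mentioned above; this is only an illustration, not the query from any of the patches:

SELECT c.oid::regclass AS rel, a.attname AS col
  FROM pg_class c
  JOIN pg_attribute a ON a.attrelid = c.oid
  WHERE c.relkind IN ('r', 'm', 'i')
    AND a.atttypid = 'aclitem'::regtype
    AND c.relname !~ '^pg_'
  ORDER BY 1;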
On Sun, Dec 18, 2022 at 08:56:48PM -0500, Tom Lane wrote:
> "Anton A. Melnikov" <aamelnikov@inbox.ru> writes:
>> 2) In 60684dd83 and b5d63824 there are two changes in the set of specific privileges.
>> The thing is that the privileges.sql test contains a REVOKE DELETE command,
>> which becomes a pair of REVOKE ALL and a GRANT of all specific privileges except DELETE
>> in the resulting dump. Therefore, any change in the set of specific privileges will lead to
>> a non-zero dump diff.
>> To avoid this, I propose to replace any specific GRANT and REVOKE in the resulting dumps with ALL.
>> This is also done in the patch attached.
>
> Isn't that likely to mask actual bugs?

+    # Replace specific privilegies with ALL
+    $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx;

Yes, this would silence some diffs in the dumps taken from the old and the new clusters.  It seems to me that this is one of the things where the original dumps had better be tweaked, as this does not cause a hard failure when running pg_upgrade.

While thinking about that, an extra idea popped into my mind, as it may be interesting to be able to filter out some of the diffs in some contexts.  So what about adding in 002_pg_upgrade.pl a small-ish hook in the shape of a new environment variable pointing to a file that adds some custom filtering rules?
--
Michael
Hello!

Divided the patch into two parts: the first part refers to the modification of the old dump, while the second one relates to dump filtering.

1) v2-0001-Remove-aclitem-from-old-dump.patch

On 19.12.2022 06:10, Michael Paquier wrote:
> This is forgetting about materialized views, which is something that
> pg_upgrade would also complain about when checking for relations with
> aclitems.  As far as I can see, the only place in the main regression
> test suite where we have an aclitem attribute is tab_core_types, for
> HEAD and the stable branches, so it would be enough to do this
> change.  Anyway, wouldn't it be better to use the same conditions as
> the WITH OIDS queries a few lines above, at least for consistency?
>
> Note that check_for_data_types_usage() checks for tables, matviews and
> indexes.

Found that 'ALTER ... ALTER COLUMN SET DATA TYPE text' is not applicable to materialized views and indexes, and neither is DROP COLUMN. So I couldn't do anything better than drop them in the old dump if they contain at least one column of 'aclitem' type.

I've tested this script with:

CREATE TABLE acltable AS SELECT 1 AS int, 'postgres=a/postgres'::aclitem AS aclitem;
CREATE MATERIALIZED VIEW aclmview AS SELECT 'postgres=a/postgres'::aclitem AS aclitem;
CREATE INDEX aclindex ON acltable (int) INCLUDE (aclitem);

performed in the regression database before creating the old dump.

The only thing is that I haven't been able to find a case where an 'aclitem' column would be preserved in an index after this type was replaced in the parent table. So passing relkind = 'i' is probably redundant. If it is possible to find such a case, it would be very interesting.

Also made the replacement logic for 'aclitem' in the tables closer to the script above that removes OID columns. In that script, likewise found that ALTER TABLE ... SET WITHOUT OIDS is not applicable to materialized views, and ALTER MATERIALIZED VIEW doesn't support a WITHOUT OIDS clause. Besides, I couldn't find any legal way to create a materialized view with OIDs in version 11 or lower. The CREATE MATERIALIZED VIEW command doesn't support a WITH OIDS or (OIDS) clause, and neither does ALTER MATERIALIZED VIEW, as mentioned above. Even with the GUC default_with_oids = true:

CREATE TABLE oidtable AS SELECT 1 AS int;
CREATE MATERIALIZED VIEW oidmv AS SELECT * FROM oidtable;

give:

postgres=# SELECT oid::regclass::text FROM pg_class WHERE relname !~ '^pg_' AND relhasoids;
   oid
----------
 oidtable
(1 row)

So I suggest excluding the check of materialized views from this DO block. Would be grateful for remarks if I didn't consider some cases.

2) v2-0002-Additional-dumps-filtering.patch

On 19.12.2022 06:16, Michael Paquier wrote:
>
> While thinking about that, an extra idea popped into my mind, as it may
> be interesting to be able to filter out some of the diffs in some
> contexts.  So what about adding in 002_pg_upgrade.pl a small-ish hook
> in the shape of a new environment variable pointing to a file that adds
> some custom filtering rules?

Yes. Made a hook that allows processing an external text file with additional filtering rules, plus an example of such a file. Please take a look at it.

With the best wishes,
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
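One way to see why relkind = 'i' is probably redundant, using the objects created in the message above: altering the table column's type rebuilds the dependent index, so the included index column should not keep the old type. A quick check (not part of the patch, and only a sketch of the expected behaviour):

ALTER TABLE acltable ALTER COLUMN aclitem SET DATA TYPE text;
SELECT atttypid::regtype
  FROM pg_attribute
  WHERE attrelid = 'aclindex'::regclass AND attname = 'aclitem';
-- expected to return "text", i.e. the index column follows the table column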
On Thu, Dec 22, 2022 at 09:59:18AM +0300, Anton A. Melnikov wrote:
> 2) v2-0002-Additional-dumps-filtering.patch

+    # Replace specific privilegies with ALL
+    $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx;

This should not be in 0002, I guess..

> Yes. Made a hook that allows processing an external text file with additional
> filtering rules, plus an example of such a file. Please take a look at it.
>
> With the best wishes,

Hmm.  0001 does a direct check on aclitem as a data type used in an attribute, but misses anything related to arrays, domains or even composite types, not to mention that we'd miss uses of aclitems in index expressions.  That's basically the kind of thing check_for_data_types_usage() does.  I am not sure that it is a good idea to provide limited coverage if we do that for matviews and indexes, and the complexity induced in upgrade_adapt.sql is not really appealing either.

For now, I have fixed the most pressing part for tables to match with the buildfarm code that just drops the aclitem column, rather than doing that for all the relations that could have one.  The part on WITH OIDS has been addressed in its own commit down to v12, removing the handling for matviews but adding one for foreign tables, where the operation is supported.
--
Michael
On Fri, Dec 23, 2022 at 11:42:39AM +0900, Michael Paquier wrote:
> Hmm.  0001 does a direct check on aclitem as data type used in an
> attribute,
> For now, I have fixed the most pressing part for tables to match with
> the buildfarm

+DO $$
+  DECLARE
+    rec text;
+    col text;
+  BEGIN
+    FOR rec in
+      SELECT oid::regclass::text
+        FROM pg_class
+        WHERE relname !~ '^pg_'
+          AND relkind IN ('r')
+        ORDER BY 1
+    LOOP
+      FOR col in SELECT attname FROM pg_attribute
+          WHERE attrelid::regclass::text = rec
+            AND atttypid = 'aclitem'::regtype
+      LOOP
+        EXECUTE 'ALTER TABLE ' || quote_ident(rec) || ' ALTER COLUMN ' ||
+          quote_ident(col) || ' SET DATA TYPE text';
+      END LOOP;
+    END LOOP;
+  END; $$;

This will do a seq scan around pg_attribute for each relation (currently ~600)...  Here, that takes a few seconds in a debug build, and I guess it'll be more painful when running under valgrind/discard_caches/antiquated hardware/etc.

This would do a single seqscan:

SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT',
              attrelid::regclass, attname)
  FROM pg_attribute
  WHERE atttypid = 'aclitem'::regtype;  -- AND ...
\gexec

--
Justin
On Thu, Dec 22, 2022 at 09:27:24PM -0600, Justin Pryzby wrote:
> This would do a single seqscan:
> SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT',
>               attrelid::regclass, attname)
>   FROM pg_attribute
>   WHERE atttypid = 'aclitem'::regtype;  -- AND ...
> \gexec

FWIW, I find the use of a FOR loop with a DO block much cleaner to follow in this context, so something like the attached would be able to group the two queries and address your point on O(N^2).  Do you like that?
--
Michael
Hello!

On 23.12.2022 06:27, Justin Pryzby wrote:
>
> This would do a single seqscan:
> SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT',
>               attrelid::regclass, attname)
>   FROM pg_attribute
>   WHERE atttypid = 'aclitem'::regtype;  -- AND ...
> \gexec
>

Touched a bit on how long it takes to execute the different kinds of queries on my PC. For each measurement, the server was restarted with a freshly copied regression database.

1)
DO $$
DECLARE
    change_aclitem_type TEXT;
BEGIN
    FOR change_aclitem_type IN
        SELECT 'ALTER TABLE ' || table_schema || '.' ||
               table_name || ' ALTER COLUMN ' ||
               column_name || ' SET DATA TYPE text;'
               AS change_aclitem_type
        FROM information_schema.columns
        WHERE data_type = 'aclitem' and table_schema != 'pg_catalog'
    LOOP
        EXECUTE change_aclitem_type;
    END LOOP;
END;
$$;

2)
DO $$
DECLARE
    rec text;
    col text;
BEGIN
    FOR rec in
        SELECT oid::regclass::text
        FROM pg_class
        WHERE relname !~ '^pg_'
            AND relkind IN ('r')
        ORDER BY 1
    LOOP
        FOR col in SELECT attname FROM pg_attribute
            WHERE attrelid::regclass::text = rec
                AND atttypid = 'aclitem'::regtype
        LOOP
            EXECUTE 'ALTER TABLE ' || quote_ident(rec) || ' ALTER COLUMN ' ||
                quote_ident(col) || ' SET DATA TYPE text';
        END LOOP;
    END LOOP;
END;
$$;

3)
SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT',
              attrelid::regclass, attname)
FROM pg_attribute
WHERE atttypid = 'aclitem'::regtype;
\gexec

4) The same as 3) but in a DO block:
DO $$
DECLARE
    change_aclitem_type TEXT;
BEGIN
    FOR change_aclitem_type IN
        SELECT 'ALTER TABLE ' || attrelid::regclass ||
               ' ALTER COLUMN ' || attname || ' TYPE TEXT;'
               AS change_aclitem_type
        FROM pg_attribute
        WHERE atttypid = 'aclitem'::regtype
    LOOP
        EXECUTE change_aclitem_type;
    END LOOP;
END;
$$;

Average execution time over three runs:
 _____________________________________
| N of query:   |  1 |    2 |  3 |  4 |
|_______________|____|______|____|____|
| Avg time, ms: | 58 | 1076 | 51 | 33 |
|_______________|____|______|____|____|

Raw results are in timing.txt.

Best wishes,
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Sorry, didn't get to see the last letter!

On 23.12.2022 11:51, Michael Paquier wrote:
>
> FWIW, I find the use of a FOR loop with a DO block much cleaner to
> follow in this context, so something like the attached would be able
> to group the two queries and address your point on O(N^2).  Do you
> like that?
> --
> Michael

The query:

DO $$
DECLARE
    rec record;
BEGIN
    FOR rec in
        SELECT oid::regclass::text AS rel, attname AS col
        FROM pg_class c, pg_attribute a
        WHERE c.relname !~ '^pg_'
            AND c.relkind IN ('r')
            AND a.attrelid = c.oid
            AND a.atttypid = 'aclitem'::regtype
        ORDER BY 1
    LOOP
        EXECUTE 'ALTER TABLE ' || quote_ident(rec.rel) || ' ALTER COLUMN ' ||
            quote_ident(rec.col) || ' SET DATA TYPE text';
    END LOOP;
END;
$$;

gives an average time of 36 ms under the same conditions.

With the best wishes!
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, Dec 23, 2022 at 05:51:28PM +0900, Michael Paquier wrote:
> On Thu, Dec 22, 2022 at 09:27:24PM -0600, Justin Pryzby wrote:
>> This would do a single seqscan:
>> SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT',
>>               attrelid::regclass, attname)
>>   FROM pg_attribute
>>   WHERE atttypid = 'aclitem'::regtype;  -- AND ...
>> \gexec
>
> FWIW, I find the use of a FOR loop with a DO block much cleaner to
> follow in this context, so something like the attached would be able
> to group the two queries and address your point on O(N^2).  Do you
> like that?

LGTM.  Thanks.
--
Justin
On Fri, Dec 23, 2022 at 10:39:25AM -0600, Justin Pryzby wrote:
> On Fri, Dec 23, 2022 at 05:51:28PM +0900, Michael Paquier wrote:
>> FWIW, I find the use of a FOR loop with a DO block much cleaner to
>> follow in this context, so something like the attached would be able
>> to group the two queries and address your point on O(N^2).  Do you
>> like that?
>
> LGTM.  Thanks.

I am a bit busy for the next few days, but I may be able to get that done next Monday.
--
Michael
On Fri, Dec 23, 2022 at 10:39:25AM -0600, Justin Pryzby wrote:
> LGTM.  Thanks.

Done as of d3c0cc4.
--
Michael
On Fri, Dec 23, 2022 at 12:43:00PM +0300, Anton A. Melnikov wrote:
> Sorry, didn't get to see the last letter!

No worries, the result is the same :)

I was looking at 0002 to add a callback to provide custom filtering rules.

+    my @ext_filter = split('\/', $_);

Are you sure that enforcing a separation with a slash is a good idea?  What if the filters include directory paths or characters that are escaped, for example?

Rather than introducing a filter.regex, I would tend to just document that in the README with a small example.  I have been considering a few alternatives while making this useful in most cases, but my mind always comes back to the simplest thing: just read each line of the file, chomp it and apply the pattern to the log file..
--
Michael
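A rough sketch of what "read each line, chomp it and apply the pattern" could look like in 002_pg_upgrade.pl; the environment variable name and the remove-what-matches rule format are assumptions made for illustration, not what the committed patch does:

	# Hypothetical hook: if this (invented) environment variable points to a
	# file, treat each non-comment line as a regular expression whose matches
	# are stripped from the dump contents before the diff.
	if (defined $ENV{MY_EXTRA_DUMP_FILTERS})
	{
		open my $fh, '<', $ENV{MY_EXTRA_DUMP_FILTERS}
		  or die "could not open filter file: $!";
		while (my $rule = <$fh>)
		{
			chomp $rule;
			next if $rule =~ /^\s*(#|$)/;    # skip comments and empty lines
			$dump_contents =~ s/$rule//mg;   # drop whatever the rule matches
		}
		close $fh;
	}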
Hello!

On 23.12.2022 05:42, Michael Paquier wrote:
> On Thu, Dec 22, 2022 at 09:59:18AM +0300, Anton A. Melnikov wrote:
>> 2) v2-0002-Additional-dumps-filtering.patch
>
> +    # Replace specific privilegies with ALL
> +    $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx;
> This should not be in 0002, I guess..

Made a separate patch for it: v3-0001-Fix-dumps-filtering.patch

On 26.12.2022 05:52, Michael Paquier wrote:
> On Fri, Dec 23, 2022 at 12:43:00PM +0300, Anton A. Melnikov wrote:
> I was looking at 0002 to add a callback to provide custom filtering
> rules.
>
> +    my @ext_filter = split('\/', $_);
> Are you sure that enforcing a separation with a slash is a good idea?
> What if the filters include directory paths or characters that are
> escaped, for example?
>
> Rather than introducing a filter.regex, I would tend to just document
> that in the README with a small example.  I have been considering a
> few alternatives while making this useful in most cases, but my mind
> always comes back to the simplest thing: just read each line of the
> file, chomp it and apply the pattern to the log file..

Thanks for your attention!

Yes, indeed, that will be really simpler. Made it in v3-0002-Add-external-dumps-filtering.patch.

With the best wishes,
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Mon, Dec 26, 2022 at 09:22:08AM +0300, Anton A. Melnikov wrote:
> Made a separate patch for it: v3-0001-Fix-dumps-filtering.patch

Well, the thing about this part is that it is not needed: the same can be achieved with 0002 in place.

> Yes, indeed, that will be really simpler.
> Made it in v3-0002-Add-external-dumps-filtering.patch.

I have fixed a few things in the patch, like switching the step skipping comments to a regexp, adding one step to ignore empty lines, applying proper indentation, and fixing comments here and there (TESTING was incorrect, btw).

It is worth noting that perlcritic was complaining here, as eval is getting used with a string.  I have spent a few days looking at that, and I really want a maximum of flexibility in the rules that can be applied, so I have put in a "no critic" rule, which is fine by me as this extra file is something owned by the user and it would apply only to cross-version upgrades.

So it looks like we are now done here..  With all these pieces in place in the tests, I don't see why it would not be possible to automate the cross-version tests of pg_upgrade.
--
Michael
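To make the perlcritic remark concrete, a hedged sketch of applying a user-supplied rule through a string eval with a "no critic" annotation; $dump_contents is the variable already used by the test above, while @filter_rules and the rule syntax are assumptions, not the committed code:

	foreach my $rule (@filter_rules)    # e.g. lines read from the user's file
	{
		# Each rule is a full Perl substitution, such as
		#   s/^(GRANT|REVOKE)\s\S+\s/$1ALL /mg
		# String eval keeps the rules maximally flexible, hence the exception.
		eval "\$dump_contents =~ $rule";    ## no critic (ProhibitStringyEval)
	}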
Hello!

On 27.12.2022 08:44, Michael Paquier wrote:
>
> It is worth noting that perlcritic was complaining here, as eval is
> getting used with a string.  I have spent a few days looking at that,
> and I really want a maximum of flexibility in the rules that can be
> applied, so I have put in a "no critic" rule, which is fine by me as
> this extra file is something owned by the user and it would apply only
> to cross-version upgrades.

I think it's a very smart decision. Thank you very much!

> So it looks like we are now done here..  With all these pieces in
> place in the tests, I don't see why it would not be possible to
> automate the cross-version tests of pg_upgrade.

I've checked the cross-version upgrade test from 9.5+ to the current master and found no problems, up to the accuracy of dump filtering. To automate the cross-version tests, one has to write additional filtering rules in the external files. I would like to try to realize this, preferably in a separate thread.

If there are no other considerations, could you close the corresponding record in the January CF, please?

With the best wishes!
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Tue, Dec 27, 2022 at 03:26:10PM +0300, Anton A. Melnikov wrote:
> I would like to try to realize this, preferably in a separate thread.

I don't think that this should be added into the tree, but if you have per-version filtering rules, of course feel free to publish them to the lists.  I am sure that this could be helpful for others.

> If there are no other considerations, could you close the corresponding
> record in the January CF, please?

Indeed, now marked as committed.
--
Michael
On 27.12.2022 16:50, Michael Paquier wrote:
>> If there are no other considerations, could you close the corresponding
>> record in the January CF, please?
>
> Indeed, now marked as committed.

Thanks a lot! Merry Christmas!
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company