Thread: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Hi, hackers!
- Apply patch 0002
- Set autovaccum = off in postgresql.conf
create table tbl2 (a varchar(50), b varchar(150), c bytea, d varchar(50));
create unique index idx on tbl2(a,b);
insert into tbl2 select i::text::varchar, i::text::varchar, i::text::bytea, i::text::varchar from generate_series(0,3) as i, generate_series(0,10000) as x;
--
It seems that if btree index with a unique constraint is corrupted by duplicates, amcheck now can not catch this. Reindex becomes impossible as it throws an error but otherwise the index will let the user know that it is corrupted, and amcheck will tell that the index is clean. So I'd like to propose a short patch to improve amcheck for checking the unique constraint. It will output tid's of tuples that are duplicated in the index (i.e. more than one tid for the same index key is visible) and the user can easily investigate and delete corresponding table entries.
0001 - is the actual patch, and
0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name "idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates. It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend a protocol similar to the following:
- Set autovaccum = off in postgresql.conf
create table tbl2 (a varchar(50), b varchar(150), c bytea, d varchar(50));
create unique index idx on tbl2(a,b);
insert into tbl2 select i::text::varchar, i::text::varchar, i::text::bytea, i::text::varchar from generate_series(0,3) as i, generate_series(0,10000) as x;
So we'll have a generous amount of duplicates in the table and index. Then:
create extension amcheck;
select bt_index_check('idx', true);
There will be output about each pair of duplicates: index tid's, position in a posting list (if the index item is deduplicated) and table tid's.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 218 and posting 219 (point to heap tid=(126,93) and tid=(126,94)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 219 and posting 220 (point to heap tid=(126,94) and tid=(126,95)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 219 and posting 220 (point to heap tid=(126,94) and tid=(126,95)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.
Things to notice in the test output:
- cross-page duplicates (when some of them on the one index page and the other are on the next). (Under patch 0002 they are marked by an additional message "INFO: cross page equal keys" to catch them among the other)
- If we delete table entries corresponding to some duplicates in between the other duplicates (vacuum should be off), the message for this entry should disappear but the other duplicates should be detected as previous.
delete from tbl2 where ctid::text='(126,94)';
select bt_index_check('idx', true);
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 218 and posting 220 (point to heap tid=(126,93) and tid=(126,95)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.
Caveat: if the first entry on the next index page has a key equal to the key on a previous page AND all heap tid's corresponding to this entry are invisible, currently cross-page check can not detect unique constraint violation between previous index page entry and 2nd, 3d and next current index page entries. In this case, there would be a message that recommends doing VACUUM to remove the invisible entries from the index and repeat the check. (Generally, it is recommended to do vacuum before the check, but for the testing purpose I'd recommend turning it off to check the detection of visible-invisible-visible duplicates scenarios)
Your feedback is very much welcome, as usual.
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
On Mon, 8 Feb 2021, 14:46 Pavel Borisov <pashkin.elfe@gmail.com wrote:
Hi, hackers!It seems that if btree index with a unique constraint is corrupted by duplicates, amcheck now can not catch this. Reindex becomes impossible as it throws an error but otherwise the index will let the user know that it is corrupted, and amcheck will tell that the index is clean. So I'd like to propose a short patch to improve amcheck for checking the unique constraint. It will output tid's of tuples that are duplicated in the index (i.e. more than one tid for the same index key is visible) and the user can easily investigate and delete corresponding table entries.0001 - is the actual patch, and0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name "idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates. It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend a protocol similar to the following:- Apply patch 0002
- Set autovaccum = off in postgresql.conf
create table tbl2 (a varchar(50), b varchar(150), c bytea, d varchar(50));
create unique index idx on tbl2(a,b);
insert into tbl2 select i::text::varchar, i::text::varchar, i::text::bytea, i::text::varchar from generate_series(0,3) as i, generate_series(0,10000) as x;
So we'll have a generous amount of duplicates in the table and index. Then:create extension amcheck;select bt_index_check('idx', true);There will be output about each pair of duplicates: index tid's, position in a posting list (if the index item is deduplicated) and table tid's.WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 218 and posting 219 (point to heap tid=(126,93) and tid=(126,94)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 219 and posting 220 (point to heap tid=(126,94) and tid=(126,95)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.Things to notice in the test output:- cross-page duplicates (when some of them on the one index page and the other are on the next). (Under patch 0002 they are marked by an additional message "INFO: cross page equal keys" to catch them among the other)- If we delete table entries corresponding to some duplicates in between the other duplicates (vacuum should be off), the message for this entry should disappear but the other duplicates should be detected as previous.delete from tbl2 where ctid::text='(126,94)';select bt_index_check('idx', true);WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 218 and posting 220 (point to heap tid=(126,93) and tid=(126,95)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING: index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.Caveat: if the first entry on the next index page has a key equal to the key on a previous page AND all heap tid's corresponding to this entry are invisible, currently cross-page check can not detect unique constraint violation between previous index page entry and 2nd, 3d and next current index page entries. In this case, there would be a message that recommends doing VACUUM to remove the invisible entries from the index and repeat the check. (Generally, it is recommended to do vacuum before the check, but for the testing purpose I'd recommend turning it off to check the detection of visible-invisible-visible duplicates scenarios)Your feedback is very much welcome, as usual.--
There was typo, I mean, initially"...index will NOT let the user know that it is corrupted..."
> On Feb 8, 2021, at 2:46 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > 0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name"idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates.It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend aprotocol similar to the following: > > - Apply patch 0002 > - Set autovaccum = off in postgresql.conf Thanks Pavel and Anastasia for working on this! Updating pg_catalog directly is ugly, but the following seems a simpler way to set up a regression test than having to recompile. What do you think? CREATE TABLE junk (t text); CREATE UNIQUE INDEX junk_idx ON junk USING btree (t); INSERT INTO junk (t) VALUES ('fee'), ('fi'), ('fo'), ('fum'); UPDATE pg_catalog.pg_index SET indisunique = false WHERE indrelid = (SELECT oid FROM pg_catalog.pg_class WHERE relname = 'junk'); INSERT INTO junk (t) VALUES ('fee'), ('fi'), ('fo'), ('fum'); UPDATE pg_catalog.pg_index SET indisunique = true WHERE indrelid = (SELECT oid FROM pg_catalog.pg_class WHERE relname = 'junk'); SELECT * FROM junk; t ----- fee fi fo fum fee fi fo fum (8 rows) \d junk Table "public.junk" Column | Type | Collation | Nullable | Default --------+------+-----------+----------+--------- t | text | | | Indexes: "junk_idx" UNIQUE, btree (t) \d junk_idx Index "public.junk_idx" Column | Type | Key? | Definition --------+------+------+------------ t | text | yes | t unique, btree, for table "public.junk" — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
вт, 9 февр. 2021 г. в 01:46, Mark Dilger <mark.dilger@enterprisedb.com>:
> On Feb 8, 2021, at 2:46 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> 0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name "idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates. It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend a protocol similar to the following:
>
> - Apply patch 0002
> - Set autovaccum = off in postgresql.conf
Thanks Pavel and Anastasia for working on this!
Updating pg_catalog directly is ugly, but the following seems a simpler way to set up a regression test than having to recompile. What do you think?
Very nice idea, thanks!
I've made a regression test based on it. PFA v.2 of a patch. Now it doesn't need anything external for testing.
--
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
To make tests stable I also removed lsn output under warning level. PFA v3 of a patch
Attachment
Hi,
Minor comment:
+ if (errflag == true)
+ ereport(ERROR,
+ ereport(ERROR,
I think 'if (errflag)' should suffice.
Cheers
On Tue, Feb 9, 2021 at 10:44 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
вт, 9 февр. 2021 г. в 01:46, Mark Dilger <mark.dilger@enterprisedb.com>:
> On Feb 8, 2021, at 2:46 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> 0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name "idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates. It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend a protocol similar to the following:
>
> - Apply patch 0002
> - Set autovaccum = off in postgresql.conf
Thanks Pavel and Anastasia for working on this!
Updating pg_catalog directly is ugly, but the following seems a simpler way to set up a regression test than having to recompile. What do you think?Very nice idea, thanks!I've made a regression test based on it. PFA v.2 of a patch. Now it doesn't need anything external for testing.--
On 3/15/21 11:11 AM, Mark Dilger wrote: > >> On Mar 2, 2021, at 6:08 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: >> >> I completely agree that checking uniqueness requires looking at the heap, but I don't agree that every caller of bt_index_checkon an index wants that particular check to be performed. There are multiple ways in which an index might becorrupt, and Peter wrote the code to only check some of them by default, with options to expand the checks to other things. This is why heapallindexed is optional. If you don't want to pay the price of checking all entries in the heap againstthe btree, you don't have to. >> >> I've got the idea and revised the patch accordingly. Thanks! >> Pfa v4 of a patch. I've added an optional argument to allow uniqueness checks for the unique indexes. >> Also, I added a test variant to make them work on 32-bit systems. Unfortunately, converting the regression test to TAPwould be a pain for me. Hope it can be used now as a 2-variant regression test for 32 and 64 bit systems. >> >> Thank you for your consideration! >> >> -- >> Best regards, >> Pavel Borisov >> >> Postgres Professional: http://postgrespro.com >> <v4-0001-Make-amcheck-checking-UNIQUE-constraint-for-btree.patch> > > Looking over v4, here are my review comments... This patch appears to need some work and has not been updated in several weeks, so marking Returned with Feedback. Please submit to the next CF when you have a new patch. Regards, -- -David david@pgmasters.net
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
>> I completely agree that checking uniqueness requires looking at the heap, but I don't agree that every caller of bt_index_check on an index wants that particular check to be performed. There are multiple ways in which an index might be corrupt, and Peter wrote the code to only check some of them by default, with options to expand the checks to other things. This is why heapallindexed is optional. If you don't want to pay the price of checking all entries in the heap against the btree, you don't have to.
>>
>> I've got the idea and revised the patch accordingly. Thanks!
>> Pfa v4 of a patch. I've added an optional argument to allow uniqueness checks for the unique indexes.
>> Also, I added a test variant to make them work on 32-bit systems. Unfortunately, converting the regression test to TAP would be a pain for me. Hope it can be used now as a 2-variant regression test for 32 and 64 bit systems.
>>
>> Thank you for your consideration!
>>
>> --
>> Best regards,
>> Pavel Borisov
>>
>> Postgres Professional: http://postgrespro.com
>> <v4-0001-Make-amcheck-checking-UNIQUE-constraint-for-btree.patch>
>
> Looking over v4, here are my review comments...
Mark and Peter, big thanks for your ideas!
I had little time to work on this feature until recently, but finally, I've elaborated v5 patch (PFA)
It contains the following improvements, most of which are based on your consideration:
- Amcheck tests are reworked into TAP-tests with "break the warranty" by comparison function changes in the opclass instead of pg_index update. Mark, again thanks for the sample!
- Added new --checkunique option into pg_amcheck.
- Added documentation and tests for amcheck and pg_amcheck new functions.
- Results are output at ERROR log level like it is done in the other amcheck tests.
- Rare case of inability to check due to the first entry on a leaf page being both: (1) equal to the last one on the previous page and (2) deleted in the heap, is demoted to DEBUG1 log level. In this, I folowed Peter's consideration that amcheck should do its best to check, but can not always verify everything. The case is expected to be extremely rare.
- Made pages connectivity based on btpo_next (corrected a bug in the code, I found meanwhile)
- If snapshot is already taken in heapallindexed case, reuse it for unique constraint check.
The patch is pgindented and rebased on the current PG master code.
I'd like to re-attach the patch v5 to the upcoming CF if you don't mind.
I also want to add that some customers face index uniqueness violations more often than I've expected, and I hope this check could also help some other PostgreSQL customers.
Your further considerations are welcome as always!
Attachment
> On Dec 20, 2021, at 7:37 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > The patch is pgindented and rebased on the current PG master code. Thank you, Pavel. The tests in check_btree.sql no longer create a bttest_unique table, so the DROP TABLE is surplusage: +DROP TABLE bttest_unique; +ERROR: table "bttest_unique" does not exist The changes in pg_amcheck.c to pass the new checkunique parameter will likely need to be based on a amcheck version check. The implementation of prepare_btree_command() in pg_amcheck.c should be kept compatible with older versions of amcheck,because it connects to remote servers and you can't know in advance that the remote servers are as up-to-date asthe machine where pg_amcheck is installed. I'm thinking specifically about this change: @@ -871,7 +877,8 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn) if (opts.parent_check) appendPQExpBuffer(sql, "SELECT %s.bt_index_parent_check(" - "index := c.oid, heapallindexed := %s, rootdescend := %s)" + "index := c.oid, heapallindexed := %s, rootdescend := %s, " + "checkunique := %s)" "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i " "WHERE c.oid = %u " "AND c.oid = i.indexrelid " If the user calls pg_amcheck with --checkunique, and one or more remote servers have an amcheck version < 1.4, at a minimumyou'll need to avoid calling bt_index_parent_check with that parameter, and probably also you'll either need to raisea warning or perhaps an error telling the user that such a check cannot be performed. You've forgotten to include contrib/amcheck/amcheck--1.3--1.4.sql in the v5 patch, resulting in a failed install: /usr/bin/install -c -m 644 ./amcheck--1.3--1.4.sql ./amcheck--1.2--1.3.sql ./amcheck--1.1--1.2.sql ./amcheck--1.0--1.1.sql./amcheck--1.0.sql '/Users/mark.dilger/hydra/unique_review.5/tmp_install/Users/mark.dilger/pgtest/test_install/share/postgresql/extension/' install: ./amcheck--1.3--1.4.sql: No such file or directory make[2]: *** [install] Error 71 make[1]: *** [checkprep] Error 2 Using the one from the v4 patch fixes the problem. Please include this file in v6, to simplify the review process. The changes to t/005_opclass_damage.pl look ok. The creation of a new table for the new test seems unnecessary, but onlyproblematic in that it makes the test slightly longer to read. I recommend changing the test to use the same table thatthe prior test uses, but that is just a recommendation, not a requirement. You should add coverage for --checkunique to t/003_check.pl. You should add coverage for multiple PostgreSQL::Test::Cluster instances running differing versions of amcheck, perhaps someon version 1.3 and some on version 1.4. Then test that the --checkunique option works adequately. I have not reviewed the changes to verify_nbtree.c. I'll leave that to Peter. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On Dec 22, 2021, at 12:01 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > Thank you, Mark! > > In v6 (PFA) I've made the changes on your advice i.e. > > - pg_amcheck with --checkunique option will ignore uniqueness check (with a warning) if amcheck version in a db is <1.4and doesn't support the feature. Ok. + int vmaj = 0, + vmin = 0, + vrev = 0; + const char *amcheck_version = pstrdup(PQgetvalue(result, 0, 1)); + + sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev); The pstrdup is unnecessary but harmless. > - fixed unnecessary drop table in regression Ok. > - use the existing table for uniqueness check in 005_opclass_damage.pl It appears you still create a new table, bttest_unique, rather than using the existing table int4tbl. That's fine. > - added tests into 003_check.pl . It is only smoke test that just verifies new functions. + +$node->command_checks_all( + [ + @cmd, '-s', 's1', '-i', 't1_btree', '--parent-check', + '--checkunique', 'db1' + ], + 2, + [$index_missing_relation_fork_re], + [$no_output_re], + 'pg_amcheck smoke test --parent-check'); + +$node->command_checks_all( + [ + @cmd, '-s', 's1', '-i', 't1_btree', '--heapallindexed', + '--rootdescend', '--checkunique', 'db1' + ], + 2, + [$index_missing_relation_fork_re], + [$no_output_re], + 'pg_amcheck smoke test --heapallindexed --rootdescend'); + +$node->command_checks_all( + [ @cmd, '--checkunique', '-d', 'db1', '-d', 'db2', '-d', 'db3', '-S', 's*' ], + 0, [$no_output_re], [$no_output_re], + 'pg_amcheck excluding all corrupt schemas'); + You have borrowed the existing tests but forgot to change their names. (The last line of each check is the test name, suchas 'pg_amcheck smoke test --parent-check'.) Please make each test name unique. > - added test contrib/amcheck/t/004_verify_nbtree_unique.pl it is more extensive test based on opclass damage which wasintended to be main test for amcheck, but which I've forgotten to add to commit in v5. > 005_opclass_damage.pl test, which you've seen in v5 is largely based on first part of 004_verify_nbtree_unique.pl (withthe later calling pg_amcheck, and the former calling bt_index_check(), bt_index_parent_check() ) Ok. > - added forgotten upgrade script amcheck--1.3--1.4.sql (from v4) Ok. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
The pstrdup is unnecessary but harmless.
> - use the existing table for uniqueness check in 005_opclass_damage.pl
It appears you still create a new table, bttest_unique, rather than using the existing table int4tbl. That's fine.
> - added tests into 003_check.pl . It is only smoke test that just verifies new functions.
You have borrowed the existing tests but forgot to change their names. (The last line of each check is the test name, such as 'pg_amcheck smoke test --parent-check'.) Please make each test name unique.
Thanks for your review! Fixed all these remaining things from patch v6.
PFA v7 patch.
---
Best regards,
Maxim Orlov.
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Julien Rouhaud
Date:
Hi, On Fri, Dec 24, 2021 at 2:06 AM Максим Орлов <orlovmg@gmail.com> wrote: > > Thanks for your review! Fixed all these remaining things from patch v6. > PFA v7 patch. The cfbot reports that you have mixed declarations and code (https://cirrus-ci.com/task/6407449413419008): [17:21:26.926] pg_amcheck.c: In function ‘main’: [17:21:26.926] pg_amcheck.c:634:4: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] [17:21:26.926] 634 | int vmaj = 0, [17:21:26.926] | ^~~
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
The cfbot reports that you have mixed declarations and code
(https://cirrus-ci.com/task/6407449413419008):
[17:21:26.926] pg_amcheck.c: In function ‘main’:
[17:21:26.926] pg_amcheck.c:634:4: error: ISO C90 forbids mixed
declarations and code [-Werror=declaration-after-statement]
[17:21:26.926] 634 | int vmaj = 0,
[17:21:26.926] | ^~~
Corrected this, thanks!
Also added more comments on this part of the code.
PFA v8 of a patch
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
By the way I've forgotten to add one part of my code into the CF patch related to the treatment of NULL values in checking btree unique constraints.
--
PFA v9 of a patch with this minor code and tests additions.
Attachment
I've updated the patch due to recent changes by Daniel Gustafsson (549ec201d6132b7).
Best regards,
Maxim Orlov.
Attachment
This patch was broken by d16773cdc86210493a2874cb0cf93f3883fcda73 "Add macros in hash and btree AMs to get the special area of their pages" If it's really just a few macros it should be easy enough to merge but it would be good to do a rebase given the number of other commits since February anyways. On Mon, 21 Feb 2022 at 09:14, Maxim Orlov <orlovmg@gmail.com> wrote: > > I've updated the patch due to recent changes by Daniel Gustafsson (549ec201d6132b7). > > -- > Best regards, > Maxim Orlov. -- greg
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
This patch was broken by d16773cdc86210493a2874cb0cf93f3883fcda73 "Add
macros in hash and btree AMs to get the special area of their pages"
If it's really just a few macros it should be easy enough to merge but
it would be good to do a rebase given the number of other commits
since February anyways.
Rebased, thanks!
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
v11 patch do not apply due to recent code changes.
--
Rebased. PFA v12.
Please feel free to check and discuss it.
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
CFbot says v12 patch does not apply.
Rebased. PFA v13.
Your reviews are very much welcome!
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Aleksander Alekseev
Date:
Hi Pavel, > Rebased. PFA v13. > Your reviews are very much welcome! I noticed that this patch is in "Needs Review" state and it has been stuck for some time now, so I decided to take a look. ``` +SELECT bt_index_parent_check('bttest_a_idx', true, true, true); +SELECT bt_index_parent_check('bttest_b_idx', true, false, true); `` 1. This "true, false, true" sequence is difficult to read. I suggest we use named arguments here. 2. I believe there are some minor issues with the comments. E.g. instead of: - First key on next page is same - Make values 768 and 769 looks equal I would write: - The first key on the next page is the same - Make values 768 and 769 look equal There are many little errors like these. ``` +# Copyright (c) 2021, PostgreSQL Global Development Group ``` 3. Oh no. The copyright has expired! ``` + <literal>true</literal>. When <parameter>checkunique</parameter> + is <literal>true</literal> <function>bt_index_check</function> will ``` 4. This piece of documentation was copy-pasted between two functions without change of the function name. Other than that to me the patch looks in pretty good shape. Here is v14 where I fixed my own nitpicks, with the permission of Pavel given offlist. If no one sees any other defects I'm going to change the status of the patch to "Ready to Committer" in a short time. -- Best regards, Aleksander Alekseev
Attachment
Hi! I would make two cosmetic changes. 1. I suggest replace description of function bt_report_duplicate() from ``` /* * Prepare and print an error message for unique constrain violation in * a btree index under WARNING level. Also set a flag to report ERROR * at the end of the check. */ ``` to ``` /* * Prepare an error message for unique constrain violation in * a btree index and report ERROR. */ ``` 2. I think will be better to change test 004_verify_nbtree_unique.pl - replace ``` use Test::More tests => 6; ``` to ``` use Test::More; ... done_testing(); ``` (same as in the other three tests). -- With best regards, Dmitry Koval Postgres Professional: http://postgrespro.com
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Karina Litskevich
Date:
Hi,
I also would like to suggest a cosmetic change.
In v15 a new field checkunique is added after heapallindexed and before no_btree_expansion fields in struct definition, but in initialisation it is added after no_btree_expansion:
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -102,6 +102,7 @@ typedef struct AmcheckOptions
bool parent_check;
bool rootdescend;
bool heapallindexed;
+ bool checkunique;
/* heap and btree hybrid option */
bool no_btree_expansion;
@@ -132,7 +133,8 @@ static AmcheckOptions opts = {
.parent_check = false,
.rootdescend = false,
.heapallindexed = false,
- .no_btree_expansion = false
+ .no_btree_expansion = false,
+ .checkunique = false
};
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -102,6 +102,7 @@ typedef struct AmcheckOptions
bool parent_check;
bool rootdescend;
bool heapallindexed;
+ bool checkunique;
/* heap and btree hybrid option */
bool no_btree_expansion;
@@ -132,7 +133,8 @@ static AmcheckOptions opts = {
.parent_check = false,
.rootdescend = false,
.heapallindexed = false,
- .no_btree_expansion = false
+ .no_btree_expansion = false,
+ .checkunique = false
};
I suggest to add checkunique field between heapallindexed and no_btree_expansion fields in initialisation as well as in definition:
@@ -132,6 +133,7 @@ static AmcheckOptions opts = {
.parent_check = false,
.rootdescend = false,
.heapallindexed = false,
+ .checkunique = false,
.no_btree_expansion = false
};
.parent_check = false,
.rootdescend = false,
.heapallindexed = false,
+ .checkunique = false,
.no_btree_expansion = false
};
--
Best regards,
Litskevich Karina
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Andres Freund
Date:
Hi, On 2022-05-20 17:46:38 +0400, Pavel Borisov wrote: > CFbot says v12 patch does not apply. > Rebased. PFA v13. > Your reviews are very much welcome! Due to the merge of the meson based build this patch needs to be adjusted: https://cirrus-ci.com/build/6350479973154816 Looks like you need to add amcheck--1.3--1.4.sql to the list of files to be installed and t/004_verify_nbtree_unique.pl to the tests. Greetings, Andres Freund
On Thu, 22 Sept 2022 at 18:13, Andres Freund <andres@anarazel.de> wrote:
Due to the merge of the meson based build this patch needs to be
adjusted: https://cirrus-ci.com/build/6350479973154816
Looks like you need to add amcheck--1.3--1.4.sql to the list of files to be
installed and t/004_verify_nbtree_unique.pl to the tests.
Greetings,
Andres Freund
Thanks! Fixed.
--
Best regards,
Maxim Orlov.
Attachment
Hi!
I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without any significant code changes and CF bot how happy again.
I'm going to move it to RfC, could I? If not, please tell why.
I'm going to move it to RfC, could I? If not, please tell why.
Best regards,
Maxim Orlov.
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Aleksander Alekseev
Date:
Hi hackers, > I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without anysignificant code changes and CF bot how happy again. > > I'm going to move it to RfC, could I? If not, please tell why. I restored the "Ready for Committer" state. I don't think it's a good practice to change the state every time the patch has a slight conflict or something. This is not helpful at all. Such things happen quite regularly and typically are fixed in a couple of days. -- Best regards, Aleksander Alekseev
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev <aleksander@timescale.com> wrote: > > I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without anysignificant code changes and CF bot how happy again. > > > > I'm going to move it to RfC, could I? If not, please tell why. > > I restored the "Ready for Committer" state. I don't think it's a good > practice to change the state every time the patch has a slight > conflict or something. This is not helpful at all. Such things happen > quite regularly and typically are fixed in a couple of days. This patch seems useful to me. I went through the thread, it seems that all the critics are addressed. I've rebased this patch. Also, I've run perltidy for tests, split long errmsg() into errmsg(), errdetail() and errhint(), and do other minor enchantments. I think this patch is ready to go. I'm going to push it if there are no objections. ------ Regards, Alexander Korotkov
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Alexander! On Wed, 25 Oct 2023 at 00:13, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev > <aleksander@timescale.com> wrote: > > > I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done withoutany significant code changes and CF bot how happy again. > > > > > > I'm going to move it to RfC, could I? If not, please tell why. > > > > I restored the "Ready for Committer" state. I don't think it's a good > > practice to change the state every time the patch has a slight > > conflict or something. This is not helpful at all. Such things happen > > quite regularly and typically are fixed in a couple of days. > > This patch seems useful to me. I went through the thread, it seems > that all the critics are addressed. > > I've rebased this patch. Also, I've run perltidy for tests, split > long errmsg() into errmsg(), errdetail() and errhint(), and do other > minor enchantments. > > I think this patch is ready to go. I'm going to push it if there are > no objections. > > ------ > Regards, > Alexander Korotkov It's very good that this long-standing patch is finally committed. Thanks a lot! Regards, Pavel Borisov
On Mon, Oct 30, 2023 at 11:29:04AM +0400, Pavel Borisov wrote: > On Wed, 25 Oct 2023 at 00:13, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > I think this patch is ready to go. I'm going to push it if there are > > no objections. > It's very good that this long-standing patch is finally committed. Thanks a lot! Agreed. I gave this feature (commit 5ae2087) a try. Thanks for implementing it. Could I get your input on two topics? ==== 1. Cross-page comparison at "first key on the next page" only Cross-page comparisons got this discussion upthread: On Tue, Mar 02, 2021 at 07:10:32PM -0800, Peter Geoghegan wrote: > On Mon, Feb 8, 2021 at 2:46 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > Caveat: if the first entry on the next index page has a key equal to the key on a previous page AND all heap tid's correspondingto this entry are invisible, currently cross-page check can not detect unique constraint violation between previousindex page entry and 2nd, 3d and next current index page entries. In this case, there would be a message that recommendsdoing VACUUM to remove the invisible entries from the index and repeat the check. (Generally, it is recommendedto do vacuum before the check, but for the testing purpose I'd recommend turning it off to check the detectionof visible-invisible-visible duplicates scenarios) > You're going to have to "couple" buffer locks in the style of > _bt_check_unique() (as well as keeping a buffer lock on "the first > leaf page a duplicate might be on" throughout) if you need the test to > work reliably. The amcheck feature has no lock coupling at its "first key on the next page" check. I think that's fine, because amcheck takes one snapshot at the beginning and looks for pairs of visible-to-that-snapshot heap tuples with the same scan key. _bt_check_unique(), unlike amcheck, must catch concurrent inserts. If amcheck "checkunique" wanted to detect duplicates that would appear when all transactions commit, it would need lock coupling. (I'm not suggesting it do that.) Do you see a problem with the lack of lock coupling at "first key on the next page"? > But why bother with that? The tool doesn't have to be > 100% perfect at detecting corruption (nothing can be), and it's rather > unlikely that it will matter for this test. A simple test that doesn't > handle cross-page duplicates is still going to be very effective. I agree, but perhaps the "first key on the next page" code is more complex than general-case code would be. If the lack of lock coupling is fine, then I think memory context lifecycle is the only obstacle making index page boundaries special. Are there factors beyond that? We already have state->lowkey kept across pages via MemoryContextAlloc(). Similar lines of code could preserve the scan key for checkunique, making the "first key on the next page" code unnecessary. ==== 2. Raises runtime by 476% despite no dead tuples I used the following to create a table larger than RAM, 17GB table and 10GB index on a system with 12GB RAM: \set count 500000000 begin; set maintenance_work_mem = '1GB'; set client_min_messages = debug1; -- debug2 is per-block spam create temp table t as select n from generate_series(1,:count) t(n); create unique index t_idx on t(n); \dt+ t \di+ t_idx create extension amcheck; select bt_index_check('t_idx', heapallindexed => false, checkunique => false); select bt_index_check('t_idx', heapallindexed => false, checkunique => true); Adding checkunique raised runtime from 58s to 276s, because it checks visibility for every heap tuple. It could do the heap fetch and visibility check lazily, when the index yields two heap TIDs for one scan key. That should give zero visibility checks for this particular test case, and it doesn't add visibility checks to bloated-table cases. Pseudo-code: /*--- * scan_key is the last uniqueness-relevant scan key observed as * bt_check_level_from_leftmost() moves right to traverse the leaf level. * Will be NULL if the next tuple can't be the second tuple of a * uniqueness violation, because any of the following apply: * - we're evaluating the first leaf tuple of the entire index * - last scan key had anynullkeys (never forms a uniqueness violation w/ * any other scan key) */ scan_key = NULL; /* * scan_key_known_visible==true indicates that scan_key_heap_tid is the * last _visible_ heap TID observed for scan_key. Otherwise, * scan_key_heap_tid is the last heap TID observed for scan_key, and we've * not yet checked its visibility. */ bool scan_key_known_visible; scan_key_heap_tid; foreach itup (leftmost_leaf_level_tup .. rightmost_leaf_level_tup) { if (itup.anynullkeys) scan_key = NULL; else if (scan_key != NULL && _bt_compare(scan_key, itup.key) == 0 && (scan_key_known_visible || (scan_key_known_visible = visible(scan_key_heap_tid)))) { if (visible(itup.tid)) elog(ERROR, "duplicate in unique index"); } else { /* * No prior uniqueness-relevant key, or key changed, or we just * learned scan_key_heap_tid was invisible. Make itup the * standard by which we judge future index tuples as we move * right. */ scan_key = itup.key; scan_key_known_visible = false; scan_key_heap_tid = itup.tid; } }
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Peter Geoghegan
Date:
On Sun, Mar 24, 2024 at 10:03 PM Noah Misch <noah@leadboat.com> wrote: > > You're going to have to "couple" buffer locks in the style of > > _bt_check_unique() (as well as keeping a buffer lock on "the first > > leaf page a duplicate might be on" throughout) if you need the test to > > work reliably. > > The amcheck feature has no lock coupling at its "first key on the next page" > check. I think that's fine, because amcheck takes one snapshot at the > beginning and looks for pairs of visible-to-that-snapshot heap tuples with the > same scan key. _bt_check_unique(), unlike amcheck, must catch concurrent > inserts. If amcheck "checkunique" wanted to detect duplicates that would > appear when all transactions commit, it would need lock coupling. (I'm not > suggesting it do that.) Do you see a problem with the lack of lock coupling > at "first key on the next page"? Practically speaking, no, I see no problems. > I agree, but perhaps the "first key on the next page" code is more complex > than general-case code would be. If the lack of lock coupling is fine, then I > think memory context lifecycle is the only obstacle making index page > boundaries special. Are there factors beyond that? I believe that my concern back in 2021 was that the general complexity of cross-page checking was unlikely to be worth it. Note that nbtsplitloc.c is *maximally* aggressive about avoiding split points that fall within some group of duplicates, so with a unique index it should be very rare. Admittedly, I was probably thinking about the complexity of adding a bunch of code just to be able to check uniqueness across page boundaries. I did mention lock coupling by name, but that was more of a catch-all term for the problems in this area. > We already have > state->lowkey kept across pages via MemoryContextAlloc(). Similar lines of > code could preserve the scan key for checkunique, making the "first key on the > next page" code unnecessary. I suspect that I was overly focussed on the index structure itself back when I made these remarks. I might not have considered that just using an MVCC snapshot for the TIDs makes the whole process safe, though that now seems quite obvious. Separately, I now see that the committed patch just reuses the code that has long been used to check that things are in the correct order across page boundaries: this is the bt_right_page_check_scankey check, which existed in the very earliest versions of amcheck. So while I agree that we could just keep the original scan key (from the last item on every leaf page), and then make the check at the start of the next page instead (as opposed to making it at the end of the previous leaf page, which is how it works now), it's not obvious that that would be a good trade-off, all things considered. It might still be a little better that way around, overall, but you're not just talking about changing the recently committed checkunique patch (I think). You're also talking about restructuring the long established bt_right_page_check_scankey check (otherwise, what's the point?). I'm not categorically opposed to that, but it's not as if it'll allow you to throw out a bunch of code -- AFAICT that proposal doesn't have that clear advantage going for it. The race condition that is described at great length in bt_right_page_check_scankey isn't ever going to be a problem for the recently committed checkunique patch (as you more or less pointed out yourself), but obviously it is still a concern for the cross-page order check. In summary, the old bt_right_page_check_scankey check is strictly concerned with the consistency of a physical data structure (the index itself), whereas the new checkunique check makes sure that the logical content of the database is consistent (the index, the heap, and all associated transaction status metadata have to be consistent). That means that the concerns that are described at length in bt_right_page_check_scankey (nor anything like those concerns) don't apply to the new checkunique check. We agree on all that, I think. But it's less clear that that presents us with an opportunity to simplify this patch. > Adding checkunique raised runtime from 58s to 276s, because it checks > visibility for every heap tuple. It could do the heap fetch and visibility > check lazily, when the index yields two heap TIDs for one scan key. That > should give zero visibility checks for this particular test case, and it > doesn't add visibility checks to bloated-table cases. The added runtime that you report seems quite excessive to me. I'm really surprised that the code doesn't manage to avoid visibility checks in the absence of duplicates that might both have TIDs considered visible. Lazy visibility checking seems almost essential, and not just a nice-to-have optimization. It seems like the implication of everything that you said about refactoring/moving the check was that doing so would enable this optimization (at least an implementation along the lines of your pseudo code). If that was what you intended, then it's not obvious to me why it is relevant. What, if anything, does it have to do with making the new checkunique visibility checks happen lazily? -- Peter Geoghegan
On Mon, Mar 25, 2024 at 12:03:10PM -0400, Peter Geoghegan wrote: > On Sun, Mar 24, 2024 at 10:03 PM Noah Misch <noah@leadboat.com> wrote: > Separately, I now see that the committed patch just reuses the code > that has long been used to check that things are in the correct order > across page boundaries: this is the bt_right_page_check_scankey check, > which existed in the very earliest versions of amcheck. So while I > agree that we could just keep the original scan key (from the last > item on every leaf page), and then make the check at the start of the > next page instead (as opposed to making it at the end of the previous > leaf page, which is how it works now), it's not obvious that that > would be a good trade-off, all things considered. > > It might still be a little better that way around, overall, but you're > not just talking about changing the recently committed checkunique > patch (I think). You're also talking about restructuring the long > established bt_right_page_check_scankey check (otherwise, what's the > point?). I'm not categorically opposed to that, but it's not as if I wasn't thinking about changing the pre-v17 bt_right_page_check_scankey() code. I got interested in this area when I saw the interaction of the new "first key on the next page" logic with bt_right_page_check_scankey(). The patch made bt_right_page_check_scankey() pass back rightfirstoffset. The new code then does palloc_btree_page() and PageGetItem() with that offset, which bt_right_page_check_scankey() had already done. That smelled like a misplaced distribution of responsibility. For a time, I suspected the new code should move down into bt_right_page_check_scankey(). Then I transitioned to thinking checkunique didn't need new code for the page boundary. > it'll allow you to throw out a bunch of code -- AFAICT that proposal > doesn't have that clear advantage going for it. The race condition > that is described at great length in bt_right_page_check_scankey isn't > ever going to be a problem for the recently committed checkunique > patch (as you more or less pointed out yourself), but obviously it is > still a concern for the cross-page order check. > > In summary, the old bt_right_page_check_scankey check is strictly > concerned with the consistency of a physical data structure (the index > itself), whereas the new checkunique check makes sure that the logical > content of the database is consistent (the index, the heap, and all > associated transaction status metadata have to be consistent). That > means that the concerns that are described at length in > bt_right_page_check_scankey (nor anything like those concerns) don't > apply to the new checkunique check. We agree on all that, I think. But > it's less clear that that presents us with an opportunity to simplify > this patch. See above for why I anticipated a simplification opportunity with respect to new-in-v17 code. Still, it may not pan out. > > Adding checkunique raised runtime from 58s to 276s, because it checks Side note: my last email incorrectly described that as "raises runtime by 476%". It should have said "by 376%" or "by a factor of 4.76". > > visibility for every heap tuple. It could do the heap fetch and visibility > > check lazily, when the index yields two heap TIDs for one scan key. That > > should give zero visibility checks for this particular test case, and it > > doesn't add visibility checks to bloated-table cases. > It seems like the implication of everything that you said about > refactoring/moving the check was that doing so would enable this > optimization (at least an implementation along the lines of your > pseudo code). If that was what you intended, then it's not obvious to > me why it is relevant. What, if anything, does it have to do with > making the new checkunique visibility checks happen lazily? Their connection is just being the two big-picture topics I found in post-commit review. Decisions about the cross-page check are indeed separable from decisions about lazy vs. eager visibility checks. Thanks, nm
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Peter Geoghegan
Date:
On Mon, Mar 25, 2024 at 2:24 PM Noah Misch <noah@leadboat.com> wrote: > I wasn't thinking about changing the pre-v17 bt_right_page_check_scankey() > code. I got interested in this area when I saw the interaction of the new > "first key on the next page" logic with bt_right_page_check_scankey(). The > patch made bt_right_page_check_scankey() pass back rightfirstoffset. The new > code then does palloc_btree_page() and PageGetItem() with that offset, which > bt_right_page_check_scankey() had already done. That smelled like a misplaced > distribution of responsibility. For a time, I suspected the new code should > move down into bt_right_page_check_scankey(). Then I transitioned to thinking > checkunique didn't need new code for the page boundary. Ah, I see. Somehow I missed this point when I recently took a fresh look at the committed patch. I did notice (I meant to point out) that I have concerns about this part of the new uniqueness check code: " if (P_IGNORE(topaque) || !P_ISLEAF(topaque)) break; " My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be required. If the page in question isn't a leaf page, then the index must be corrupt (or the page deletion recycle safety/drain technique thing is buggy). The " !P_ISLEAF(topaque)" part of the check is either superfluous or something that ought to be reported as corruption -- it's not a legal/expected state. Separately, I dislike the way the target block changes within bt_target_page_check(). The general idea behind verify_nbtree.c's target block is that every block becomes the target exactly once, in a clearly defined place. All corruption (in the index structure itself) is formally considered to be a problem with that particular target block. I want to be able to clearly distinguish between the target and target's right sibling here, to explain my concerns, but they're kinda both the target, so that's a lot harder than it should be. (Admittedly directly blaming the target block has always been a little bit arbitrary, at least in certain cases, but even there it provides structure that makes things much easier to describe unambiguously.) -- Peter Geoghegan
On Fri, Mar 29, 2024 at 02:17:08PM -0400, Peter Geoghegan wrote: > On Mon, Mar 25, 2024 at 2:24 PM Noah Misch <noah@leadboat.com> wrote: > > I wasn't thinking about changing the pre-v17 bt_right_page_check_scankey() > > code. I got interested in this area when I saw the interaction of the new > > "first key on the next page" logic with bt_right_page_check_scankey(). The > > patch made bt_right_page_check_scankey() pass back rightfirstoffset. The new > > code then does palloc_btree_page() and PageGetItem() with that offset, which > > bt_right_page_check_scankey() had already done. That smelled like a misplaced > > distribution of responsibility. For a time, I suspected the new code should > > move down into bt_right_page_check_scankey(). Then I transitioned to thinking > > checkunique didn't need new code for the page boundary. > I did notice (I meant to point out) that I have concerns about this > part of the new uniqueness check code: > > " > if (P_IGNORE(topaque) || !P_ISLEAF(topaque)) > break; > " > > My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be > required. If the page in question isn't a leaf page, then the index > must be corrupt (or the page deletion recycle safety/drain technique > thing is buggy). The " !P_ISLEAF(topaque)" part of the check is either > superfluous or something that ought to be reported as corruption -- > it's not a legal/expected state. Good point. > Separately, I dislike the way the target block changes within > bt_target_page_check(). The general idea behind verify_nbtree.c's > target block is that every block becomes the target exactly once, in a > clearly defined place. Agreed.
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Peter Eisentraut
Date:
On 24.10.23 22:13, Alexander Korotkov wrote: > On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev > <aleksander@timescale.com> wrote: >>> I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without anysignificant code changes and CF bot how happy again. >>> >>> I'm going to move it to RfC, could I? If not, please tell why. >> >> I restored the "Ready for Committer" state. I don't think it's a good >> practice to change the state every time the patch has a slight >> conflict or something. This is not helpful at all. Such things happen >> quite regularly and typically are fixed in a couple of days. > > This patch seems useful to me. I went through the thread, it seems > that all the critics are addressed. > > I've rebased this patch. Also, I've run perltidy for tests, split > long errmsg() into errmsg(), errdetail() and errhint(), and do other > minor enchantments. > > I think this patch is ready to go. I'm going to push it if there are > no objections. I just found the new pg_amcheck option --checkunique in PG17-to-be. Could we rename this to --check-unique? Seems friendlier. Maybe also rename the bt_index_check function argument to check_unique.
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
I did notice (I meant to point out) that I have concerns about this
part of the new uniqueness check code:
"
if (P_IGNORE(topaque) || !P_ISLEAF(topaque))
break;
"
My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be
required
I agree. But I didn't see the need to check uniqueness constraints violations in internal pages. Furthermore, it doesn't mean only a violation of constraint, but a major index corruption. I agree that checking and reporting this type of corruption separately is a possible thing.
Separately, I dislike the way the target block changes within
bt_target_page_check(). The general idea behind verify_nbtree.c's
target block is that every block becomes the target exactly once, in a
clearly defined place. All corruption (in the index structure itself)
is formally considered to be a problem with that particular target
block. I want to be able to clearly distinguish between the target and
target's right sibling here, to explain my concerns, but they're kinda
both the target, so that's a lot harder than it should be. (Admittedly
directly blaming the target block has always been a little bit
arbitrary, at least in certain cases, but even there it provides
structure that makes things much easier to describe unambiguously.)
The possible way to load the target block only once is to get rid of the cross-page uniqueness violation check. I introduced it to catch more possible cases of uniqueness violations. Though they are expected to be extremely rare, and anyway the algorithm doesn't get any warranty, just does its best to catch what is possible. I don't object to this change.
Regards,
Pavel.
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Wed, Apr 17, 2024 at 6:41 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote: >> I did notice (I meant to point out) that I have concerns about this >> part of the new uniqueness check code: >> " >> if (P_IGNORE(topaque) || !P_ISLEAF(topaque)) >> break; >> " >> >> My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be >> required > > I agree. But I didn't see the need to check uniqueness constraints violations in internal pages. Furthermore, it doesn'tmean only a violation of constraint, but a major index corruption. I agree that checking and reporting this type ofcorruption separately is a possible thing. I think we could just throw an error in case of an unexpected internal page. It doesn't seem reasonable to continue the check with this type of corruption detected. If the tree linkage is corrupted we may enter an endless loop or something. >> Separately, I dislike the way the target block changes within >> bt_target_page_check(). The general idea behind verify_nbtree.c's >> target block is that every block becomes the target exactly once, in a >> clearly defined place. All corruption (in the index structure itself) >> is formally considered to be a problem with that particular target >> block. I want to be able to clearly distinguish between the target and >> target's right sibling here, to explain my concerns, but they're kinda >> both the target, so that's a lot harder than it should be. (Admittedly >> directly blaming the target block has always been a little bit >> arbitrary, at least in certain cases, but even there it provides >> structure that makes things much easier to describe unambiguously.) > > The possible way to load the target block only once is to get rid of the cross-page uniqueness violation check. I introducedit to catch more possible cases of uniqueness violations. Though they are expected to be extremely rare, and anywaythe algorithm doesn't get any warranty, just does its best to catch what is possible. I don't object to this change. I think we could probably just avoid setting state->target during cross-page check. Just save that into a local variable and pass as an argument where needed. Skipping the visibility checks for "only one tuple for scan key" case looks like very valuable optimization [1]. I also think we should wrap lVis_* variables into struct. That would make the way we pass them to functions more elegant. Links. 1. https://www.postgresql.org/message-id/20240325020323.fd.nmisch%40google.com ------ Regards, Alexander Korotkov
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Wed, Apr 17, 2024 at 9:38 AM Peter Eisentraut <peter@eisentraut.org> wrote: > On 24.10.23 22:13, Alexander Korotkov wrote: > > On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev > > <aleksander@timescale.com> wrote: > >>> I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done withoutany significant code changes and CF bot how happy again. > >>> > >>> I'm going to move it to RfC, could I? If not, please tell why. > >> > >> I restored the "Ready for Committer" state. I don't think it's a good > >> practice to change the state every time the patch has a slight > >> conflict or something. This is not helpful at all. Such things happen > >> quite regularly and typically are fixed in a couple of days. > > > > This patch seems useful to me. I went through the thread, it seems > > that all the critics are addressed. > > > > I've rebased this patch. Also, I've run perltidy for tests, split > > long errmsg() into errmsg(), errdetail() and errhint(), and do other > > minor enchantments. > > > > I think this patch is ready to go. I'm going to push it if there are > > no objections. > > I just found the new pg_amcheck option --checkunique in PG17-to-be. > Could we rename this to --check-unique? Seems friendlier. Maybe also > rename the bt_index_check function argument to check_unique. +1 from me Let's do so if nobody objects. ------ Regards, Alexander Korotkov
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, hackers!
On Wed, 24 Apr 2024 at 13:58, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Wed, Apr 17, 2024 at 9:38 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 24.10.23 22:13, Alexander Korotkov wrote:
> > On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev
> > <aleksander@timescale.com> wrote:
> >>> I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without any significant code changes and CF bot how happy again.
> >>>
> >>> I'm going to move it to RfC, could I? If not, please tell why.
> >>
> >> I restored the "Ready for Committer" state. I don't think it's a good
> >> practice to change the state every time the patch has a slight
> >> conflict or something. This is not helpful at all. Such things happen
> >> quite regularly and typically are fixed in a couple of days.
> >
> > This patch seems useful to me. I went through the thread, it seems
> > that all the critics are addressed.
> >
> > I've rebased this patch. Also, I've run perltidy for tests, split
> > long errmsg() into errmsg(), errdetail() and errhint(), and do other
> > minor enchantments.
> >
> > I think this patch is ready to go. I'm going to push it if there are
> > no objections.
>
> I just found the new pg_amcheck option --checkunique in PG17-to-be.
> Could we rename this to --check-unique? Seems friendlier. Maybe also
> rename the bt_index_check function argument to check_unique.
+1 from me
Let's do so if nobody objects.
Thank you very much for your input in this thread!
See the patches based on the proposals in the attachment:
0001: Optimize speed by avoiding heap visibility checking for different non-deduplicated index tuples as proposed by Noah Misch
Speed measurements on my laptop using the exact method recommended by Noah upthread:
Current master branch: checkunique off: 144s, checkunique on: 419s
With patch 0001: checkunique off: 141s, checkunique on: 171s
0002: Use structure to store and transfer info about last visible heap entry (code refactoring) as proposed by Alexander Korotkov
0003: Don't load rightpage into BtreeCheckState (code refactoring) as proposed by Peter Geoghegan
Loading of right page for cross-page unique constraint check in the same way as in bt_right_page_check_scankey()
0004: Report error when next page to a leaf is not a leaf as proposed by Peter Geoghegan
I think it's a very improbable condition and this check might be not necessary, but it's right and safe to break check and report error.
0005: Rename checkunique parameter to more user friendly as proposed by Peter Eisentraut and Alexander Korotkov
Again many thanks for the useful proposals!
Regards,
Pavel Borisov,
Supabase
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Karina Litskevich
Date:
Hi, hackers!
On Thu, Apr 25, 2024 at 4:00 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
0005: Rename checkunique parameter to more user friendly as proposed by Peter Eisentraut and Alexander Korotkov
I'm not sure renaming checkunique is a good idea. Other arguments of
bt_index_check and bt_index_parent_check functions (heapallindexed and
rootdescend) don't have underscore character in them. Corresponding
pg_amcheck options (--heapallindexed and --rootdescend) are also written
in one piece. check_unique and --check-unique stand out. Making arguments
and options in different styles doesn't seem user friendly to me.
Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Karina!
On Thu, 25 Apr 2024 at 17:44, Karina Litskevich <litskevichkarina@gmail.com> wrote:
Hi, hackers!On Thu, Apr 25, 2024 at 4:00 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:0005: Rename checkunique parameter to more user friendly as proposed by Peter Eisentraut and Alexander Korotkov
I'm not sure renaming checkunique is a good idea. Other arguments of
bt_index_check and bt_index_parent_check functions (heapallindexed and
rootdescend) don't have underscore character in them. Corresponding
pg_amcheck options (--heapallindexed and --rootdescend) are also written
in one piece. check_unique and --check-unique stand out. Making arguments
and options in different styles doesn't seem user friendly to me.
I did it under the consensus of Peter Eisentraut and Alexander Korotkov.
The pro for renaming is more user-friendly naming, I also agree.
The cons is that we already have both styles: "non-user friendly" heapallindexed and rootdescend and "user-friendly" parent-check.
I'm ready to go with consensus in this matter. It's also not yet too late to make it unique-check (instead of check-unique) to be better in style with parent-check.
Kind regards,
Pavel
On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote: > 0001: Optimize speed by avoiding heap visibility checking for different > non-deduplicated index tuples as proposed by Noah Misch > > Speed measurements on my laptop using the exact method recommended by Noah > upthread: > Current master branch: checkunique off: 144s, checkunique on: 419s > With patch 0001: checkunique off: 141s, checkunique on: 171s Where is the CPU time going to make it still be 21% slower w/ checkunique on? It's a great improvement vs. current master, but I don't have an obvious explanation for the remaining +21%.
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
Hi Noah, On Wed, May 1, 2024 at 5:24 AM Noah Misch <noah@leadboat.com> wrote: > On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote: > > 0001: Optimize speed by avoiding heap visibility checking for different > > non-deduplicated index tuples as proposed by Noah Misch > > > > Speed measurements on my laptop using the exact method recommended by Noah > > upthread: > > Current master branch: checkunique off: 144s, checkunique on: 419s > > With patch 0001: checkunique off: 141s, checkunique on: 171s > > Where is the CPU time going to make it still be 21% slower w/ checkunique on? > It's a great improvement vs. current master, but I don't have an obvious > explanation for the remaining +21%. I think there is at least extra index tuples comparison. ------ Regards, Alexander Korotkov
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Wed, May 1, 2024 at 5:26 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Wed, May 1, 2024 at 5:24 AM Noah Misch <noah@leadboat.com> wrote: > > On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote: > > > 0001: Optimize speed by avoiding heap visibility checking for different > > > non-deduplicated index tuples as proposed by Noah Misch > > > > > > Speed measurements on my laptop using the exact method recommended by Noah > > > upthread: > > > Current master branch: checkunique off: 144s, checkunique on: 419s > > > With patch 0001: checkunique off: 141s, checkunique on: 171s > > > > Where is the CPU time going to make it still be 21% slower w/ checkunique on? > > It's a great improvement vs. current master, but I don't have an obvious > > explanation for the remaining +21%. > > I think there is at least extra index tuples comparison. The revised patchset is attached. I applied cosmetical changes. I'm going to push it if no objections. I don't post the patch with rename of new option. It doesn't seem there is a consensus. I must admit that keeping all the options in the same naming convention makes sense. ------ Regards, Alexander Korotkov Supabase
Attachment
Alexander Korotkov <aekorotkov@gmail.com> writes: > The revised patchset is attached. I applied cosmetical changes. I'm > going to push it if no objections. Is this really suitable material to be pushing post-feature-freeze? It doesn't look like it's fixing any new-in-v17 issues. regards, tom lane
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Tom!
On Fri, 10 May 2024, 04:43 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> The revised patchset is attached. I applied cosmetical changes. I'm
> going to push it if no objections.
Is this really suitable material to be pushing post-feature-freeze?
It doesn't look like it's fixing any new-in-v17 issues.
regards, tom lane
I think these patches are nice-to-have optimizations and refactorings to make code look better. They are not necessary for the main feature. They don't fix any bugs. But they were requested in the thread, and make sense in my opinion.
I really don't know what's the policy of applying code improvements other than bugfixes post feature-freeze. IMO they are safe to be appiled to v17, but they also could be added later.
Regards,
Pavel Borisov
Supabase
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alexander Korotkov <aekorotkov@gmail.com> writes: > > The revised patchset is attached. I applied cosmetical changes. I'm > > going to push it if no objections. > > Is this really suitable material to be pushing post-feature-freeze? > It doesn't look like it's fixing any new-in-v17 issues. These are code improvements to the 5ae2087202, which answer critics in the thread. 0001 comprises an optimization, but it's rather small and simple. 0002 and 0003 contain refactoring. 0004 contains better error reporting. For me this looks like pretty similar to what others commit post-FF, isn't it? ------ Regards, Alexander Korotkov Supabase
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Alexander!
On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > The revised patchset is attached. I applied cosmetical changes. I'm
> > going to push it if no objections.
>
> Is this really suitable material to be pushing post-feature-freeze?
> It doesn't look like it's fixing any new-in-v17 issues.
These are code improvements to the 5ae2087202, which answer critics in
the thread. 0001 comprises an optimization, but it's rather small and
simple. 0002 and 0003 contain refactoring. 0004 contains better
error reporting. For me this looks like pretty similar to what others
commit post-FF, isn't it?
I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages.
In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.
Patch v2-0003 doesn't contain credits and a discussion link. All other patches do.
Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO they could be pushed post-FF.
Regards,
Pavel.
> On May 10, 2024, at 5:10 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > Hi, Alexander! > > On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Alexander Korotkov <aekorotkov@gmail.com> writes: > > > The revised patchset is attached. I applied cosmetical changes. I'm > > > going to push it if no objections. > > > > Is this really suitable material to be pushing post-feature-freeze? > > It doesn't look like it's fixing any new-in-v17 issues. > > These are code improvements to the 5ae2087202, which answer critics in > the thread. 0001 comprises an optimization, but it's rather small and > simple. 0002 and 0003 contain refactoring. 0004 contains better > error reporting. For me this looks like pretty similar to what others > commit post-FF, isn't it? > I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages. > In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior. > > Patch v2-0003 doesn't contain credits and a discussion link. All other patches do. > > Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO theycould be pushed post-FF. v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness or valueof the improvement, but it seems out of scope post feature freeze. The patch seems to postpone uniqueness checkinguntil later in the scan than what the prior version did, and that kind of change could require more analysis thanwe have time for at this point in the release cycle. v2-0002 does appear to just be refactoring. I don't care for a small portion of that patch, but I doubt it violates thepost feature freeze rules. In particular: + BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL}; v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also. There is already a member ofthe BtreeCheckState called "target" and a memory context in that struct called "targetcontext". That context is used toallocate pages "state->target", "rightpage", "child" and "page", but not "metapage". Perhaps "targetcontext" is a poorchoice of name? "notmetacontext" is a terrible name, but closer to describing the purpose of the memory context. Careto propose something sensible? Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when entering /* * * Downlink check * * * Additional check of child items iff this is an internal page and * caller holds a ShareLock. This happens for every downlink (item) * in target excluding the negative-infinity downlink (again, this is * because it has no useful value to compare). */ if (!P_ISLEAF(topaque) && state->readonly) bt_child_check(state, skey, offset); and thereafter. Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target getsprocessed in the downlink check and thereafter. This is either introducing a bug, or fixing one, but the commit messageis totally ambiguous about whether this is a bugfix or a code cleanup or something else? I think this kind of patchshould have a super clear commit message about what it thinks it is doing. v2-0004 guards against a real threat, and is reasonable post feature freeze — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Mark!
On Fri, 10 May 2024, 21:35 Mark Dilger, <mark.dilger@enterprisedb.com> wrote:
> On May 10, 2024, at 5:10 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> Hi, Alexander!
>
> On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <aekorotkov@gmail.com> writes:
> > > The revised patchset is attached. I applied cosmetical changes. I'm
> > > going to push it if no objections.
> >
> > Is this really suitable material to be pushing post-feature-freeze?
> > It doesn't look like it's fixing any new-in-v17 issues.
>
> These are code improvements to the 5ae2087202, which answer critics in
> the thread. 0001 comprises an optimization, but it's rather small and
> simple. 0002 and 0003 contain refactoring. 0004 contains better
> error reporting. For me this looks like pretty similar to what others
> commit post-FF, isn't it?
> I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages.
> In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.
>
> Patch v2-0003 doesn't contain credits and a discussion link. All other patches do.
>
> Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO they could be pushed post-FF.
v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness or value of the improvement, but it seems out of scope post feature freeze. The patch seems to postpone uniqueness checking until later in the scan than what the prior version did, and that kind of change could require more analysis than we have time for at this point in the release cycle.
v2-0002 does appear to just be refactoring. I don't care for a small portion of that patch, but I doubt it violates the post feature freeze rules. In particular:
+ BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL};
v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also. There is already a member of the BtreeCheckState called "target" and a memory context in that struct called "targetcontext". That context is used to allocate pages "state->target", "rightpage", "child" and "page", but not "metapage". Perhaps "targetcontext" is a poor choice of name? "notmetacontext" is a terrible name, but closer to describing the purpose of the memory context. Care to propose something sensible?
Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when entering
/*
* * Downlink check *
*
* Additional check of child items iff this is an internal page and
* caller holds a ShareLock. This happens for every downlink (item)
* in target excluding the negative-infinity downlink (again, this is
* because it has no useful value to compare).
*/
if (!P_ISLEAF(topaque) && state->readonly)
bt_child_check(state, skey, offset);
and thereafter. Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target gets processed in the downlink check and thereafter. This is either introducing a bug, or fixing one, but the commit message is totally ambiguous about whether this is a bugfix or a code cleanup or something else? I think this kind of patch should have a super clear commit message about what it thinks it is doing.
v2-0004 guards against a real threat, and is reasonable post feature freeze
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
IMO 0003 doesn't introduce nor fixes a bug. It loads rightpage into a local variable, rather that to a BtreeCheckState that can have another users of state->target afterb uniqueness check in the future, but don't have now. So the original patch is correct, and the goal of this refactoring is to untie rightpage fron state structure as it's used only transiently for cross-page unuque check. It's the same style as already used bt_right_page_check_scankey() that loads rightpage into a local variable.
For 0002 I doubt I understand your actual positiob. Could you explain what it violates or doesn't violate?
Best regards,
Pavel.
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
On Fri, 10 May 2024, 22:42 Pavel Borisov, <pashkin.elfe@gmail.com> wrote:
Hi, Mark!On Fri, 10 May 2024, 21:35 Mark Dilger, <mark.dilger@enterprisedb.com> wrote:
> On May 10, 2024, at 5:10 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> Hi, Alexander!
>
> On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <aekorotkov@gmail.com> writes:
> > > The revised patchset is attached. I applied cosmetical changes. I'm
> > > going to push it if no objections.
> >
> > Is this really suitable material to be pushing post-feature-freeze?
> > It doesn't look like it's fixing any new-in-v17 issues.
>
> These are code improvements to the 5ae2087202, which answer critics in
> the thread. 0001 comprises an optimization, but it's rather small and
> simple. 0002 and 0003 contain refactoring. 0004 contains better
> error reporting. For me this looks like pretty similar to what others
> commit post-FF, isn't it?
> I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages.
> In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.
>
> Patch v2-0003 doesn't contain credits and a discussion link. All other patches do.
>
> Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO they could be pushed post-FF.
v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness or value of the improvement, but it seems out of scope post feature freeze. The patch seems to postpone uniqueness checking until later in the scan than what the prior version did, and that kind of change could require more analysis than we have time for at this point in the release cycle.
v2-0002 does appear to just be refactoring. I don't care for a small portion of that patch, but I doubt it violates the post feature freeze rules. In particular:
+ BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL};
v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also. There is already a member of the BtreeCheckState called "target" and a memory context in that struct called "targetcontext". That context is used to allocate pages "state->target", "rightpage", "child" and "page", but not "metapage". Perhaps "targetcontext" is a poor choice of name? "notmetacontext" is a terrible name, but closer to describing the purpose of the memory context. Care to propose something sensible?
Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when entering
/*
* * Downlink check *
*
* Additional check of child items iff this is an internal page and
* caller holds a ShareLock. This happens for every downlink (item)
* in target excluding the negative-infinity downlink (again, this is
* because it has no useful value to compare).
*/
if (!P_ISLEAF(topaque) && state->readonly)
bt_child_check(state, skey, offset);
and thereafter. Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target gets processed in the downlink check and thereafter. This is either introducing a bug, or fixing one, but the commit message is totally ambiguous about whether this is a bugfix or a code cleanup or something else? I think this kind of patch should have a super clear commit message about what it thinks it is doing.
v2-0004 guards against a real threat, and is reasonable post feature freeze
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL CompanyIMO 0003 doesn't introduce nor fixes a bug. It loads rightpage into a local variable, rather that to a BtreeCheckState that can have another users of state->target afterb uniqueness check in the future, but don't have now. So the original patch is correct, and the goal of this refactoring is to untie rightpage fron state structure as it's used only transiently for cross-page unuque check. It's the same style as already used bt_right_page_check_scankey() that loads rightpage into a local variable.For 0002 I doubt I understand your actual positiob. Could you explain what it violates or doesn't violate?
Please forgive many typos in the previous message, I wrote from phone.
Pavel.
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Fri, May 10, 2024 at 8:35 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > On May 10, 2024, at 5:10 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > Alexander Korotkov <aekorotkov@gmail.com> writes: > > > > The revised patchset is attached. I applied cosmetical changes. I'm > > > > going to push it if no objections. > > > > > > Is this really suitable material to be pushing post-feature-freeze? > > > It doesn't look like it's fixing any new-in-v17 issues. > > > > These are code improvements to the 5ae2087202, which answer critics in > > the thread. 0001 comprises an optimization, but it's rather small and > > simple. 0002 and 0003 contain refactoring. 0004 contains better > > error reporting. For me this looks like pretty similar to what others > > commit post-FF, isn't it? > > I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages. > > In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior. > > > > Patch v2-0003 doesn't contain credits and a discussion link. All other patches do. > > > > Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO theycould be pushed post-FF. > > v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness orvalue of the improvement, but it seems out of scope post feature freeze. The patch seems to postpone uniqueness checkinguntil later in the scan than what the prior version did, and that kind of change could require more analysis thanwe have time for at this point in the release cycle. Formally this could be classified as algorithmic change and probably should be postponed to the next release. But that's quite local optimization, which just postpones a function call within the same iteration of loop. And the effect is huge. Probably we could allow this post-FF in the sake of quality release, given it's very local change with a huge effect. > v2-0002 does appear to just be refactoring. I don't care for a small portion of that patch, but I doubt it violates thepost feature freeze rules. In particular: > > + BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL}; I don't understand what is the problem with this line and post feature freeze rules. Please, explain it more. > v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also. There is already a memberof the BtreeCheckState called "target" and a memory context in that struct called "targetcontext". That context isused to allocate pages "state->target", "rightpage", "child" and "page", but not "metapage". Perhaps "targetcontext" isa poor choice of name? "notmetacontext" is a terrible name, but closer to describing the purpose of the memory context. Care to propose something sensible? > > Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when entering > > /* > * * Downlink check * > * > * Additional check of child items iff this is an internal page and > * caller holds a ShareLock. This happens for every downlink (item) > * in target excluding the negative-infinity downlink (again, this is > * because it has no useful value to compare). > */ > if (!P_ISLEAF(topaque) && state->readonly) > bt_child_check(state, skey, offset); > > and thereafter. Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target getsprocessed in the downlink check and thereafter. This is either introducing a bug, or fixing one, but the commit messageis totally ambiguous about whether this is a bugfix or a code cleanup or something else? I think this kind of patchshould have a super clear commit message about what it thinks it is doing. The only bt_target_page_check() caller is bt_check_level_from_leftmost(), which overrides state->target in the next iteration anyway. I think the patch is just refactoring to eliminate the confusion pointer by Peter Geoghegan upthread. 0002 and 0003 don't address any bugs, but It would be very nice to accept them, because it would simplify future backpatching in this area. > v2-0004 guards against a real threat, and is reasonable post feature freeze Ok. ------ Regards, Alexander Korotkov Supabase
> On May 10, 2024, at 11:42 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > IMO 0003 doesn't introduce nor fixes a bug. It loads rightpage into a local variable, rather that to a BtreeCheckStatethat can have another users of state->target afterb uniqueness check in the future, but don't have now. Sothe original patch is correct, and the goal of this refactoring is to untie rightpage fron state structure as it's usedonly transiently for cross-page unuque check. It's the same style as already used bt_right_page_check_scankey() thatloads rightpage into a local variable. Well, you can put an Assert(false) dead in the middle of the code we're discussing and all the regression tests still pass,so I'd argue the change is getting zero test coverage. This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target",which the old implementation most certainly did. That means that after returning from bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what itwould have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 linesfurther down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch,wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ? Having a regression test that actually touches this code would go a fair way towards helping the analysis. > For 0002 I doubt I understand your actual positiob. Could you explain what it violates or doesn't violate? v2-0002 is does not violate the post feature freeze restriction on new features so far as I can tell, but I just don't carefor the variable initialization because it doesn't name the fields. If anybody refactored the struct they might notnotice that the need to reorder this initialization, and depending on various factors including compiler flags they mightnot get an error. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > The only bt_target_page_check() caller is > bt_check_level_from_leftmost(), which overrides state->target in the > next iteration anyway. I think the patch is just refactoring to > eliminate the confusion pointer by Peter Geoghegan upthread. I find your argument unconvincing. After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in thenext iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963. I'm left with four possibilities: 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're makingdoesn't matter. 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing tohappen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten where you arenow using "rightpage" for the value. 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003has broken that. 4) It's been broken all along and your patch just changes from wrong to wrong. If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you arenot documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or theother. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Assertsdoesn't help to investigate the question. If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixesshould be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylisticconsequences. If (3) is true, then I'm complaining that the patch is flat busted. If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed. Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage. For reference, I said something similar earlier today in another email to this thread: This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target",which the old implementation most certainly did. That means that after returning from bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what itwould have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 linesfurther down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch,wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Mark Dilger <mark.dilger@enterprisedb.com> writes: > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage. Indeed. If we have no regression tests that reach this code, it's folly to touch it at all, but most especially so post-feature-freeze. I think the *first* order of business ought to be to create some test cases that reach this area. Perhaps they'll be too expensive to incorporate in our regular regression tests, but we could still use them to investigate Mark's concerns. regards, tom lane
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Sat, May 11, 2024 at 4:13 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > The only bt_target_page_check() caller is > > bt_check_level_from_leftmost(), which overrides state->target in the > > next iteration anyway. I think the patch is just refactoring to > > eliminate the confusion pointer by Peter Geoghegan upthread. > > I find your argument unconvincing. > > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target inthe next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963. > > I'm left with four possibilities: > > > 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're makingdoesn't matter. > > 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thingto happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten whereyou are now using "rightpage" for the value. > > 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", butv2-0003 has broken that. > > 4) It's been broken all along and your patch just changes from wrong to wrong. > > > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that youare not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way orthe other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so addingAsserts doesn't help to investigate the question. > > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixesshould be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylisticconsequences. > > If (3) is true, then I'm complaining that the patch is flat busted. > > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed. > > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage. > > > For reference, I said something similar earlier today in another email to this thread: > > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target",which the old implementation most certainly did. That means that after returning from bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what itwould have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 linesfurther down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch,wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ? Thank you for your analysis. I'm inclined to believe in 2, but not yet completely sure. It's really pity that our tests don't cover this. I'm investigating this area. ------ Regards, Alexander Korotkov
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Sat, May 11, 2024 at 4:13 AM Mark Dilger > <mark.dilger@enterprisedb.com> wrote: > > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > > The only bt_target_page_check() caller is > > > bt_check_level_from_leftmost(), which overrides state->target in the > > > next iteration anyway. I think the patch is just refactoring to > > > eliminate the confusion pointer by Peter Geoghegan upthread. > > > > I find your argument unconvincing. > > > > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target inthe next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963. > > > > I'm left with four possibilities: > > > > > > 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're makingdoesn't matter. > > > > 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thingto happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten whereyou are now using "rightpage" for the value. > > > > 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", butv2-0003 has broken that. > > > > 4) It's been broken all along and your patch just changes from wrong to wrong. > > > > > > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that youare not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way orthe other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so addingAsserts doesn't help to investigate the question. > > > > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixesshould be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylisticconsequences. > > > > If (3) is true, then I'm complaining that the patch is flat busted. > > > > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed. > > > > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage. > > > > > > For reference, I said something similar earlier today in another email to this thread: > > > > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target",which the old implementation most certainly did. That means that after returning from bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what itwould have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 linesfurther down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch,wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ? > > Thank you for your analysis. I'm inclined to believe in 2, but not > yet completely sure. It's really pity that our tests don't cover > this. I'm investigating this area. It seems that I got to the bottom of this. Changing BtreeCheckState.target for a cross-page unique constraint check is wrong, but that happens only for leaf pages. After that BtreeCheckState.target is only used for setting the low key. The low key is only used for non-leaf pages. So, that didn't lead to any visible bug. I've revised the commit message to reflect this. So, the picture for the patches is the following now. 0001 – optimization, but rather simple and giving huge effect 0002 – refactoring 0003 – fix for the bug 0004 – better error reporting ------ Regards, Alexander Korotkov
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Alexander!
On Mon, 13 May 2024 at 05:42, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > The only bt_target_page_check() caller is
> > > bt_check_level_from_leftmost(), which overrides state->target in the
> > > next iteration anyway. I think the patch is just refactoring to
> > > eliminate the confusion pointer by Peter Geoghegan upthread.
> >
> > I find your argument unconvincing.
> >
> > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963.
> >
> > I'm left with four possibilities:
> >
> >
> > 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.
> >
> > 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.
> >
> > 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.
> >
> > 4) It's been broken all along and your patch just changes from wrong to wrong.
> >
> >
> > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you are not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.
> >
> > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylistic consequences.
> >
> > If (3) is true, then I'm complaining that the patch is flat busted.
> >
> > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.
> >
> > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
> >
> >
> > For reference, I said something similar earlier today in another email to this thread:
> >
> > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did. That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ?
>
> Thank you for your analysis. I'm inclined to believe in 2, but not
> yet completely sure. It's really pity that our tests don't cover
> this. I'm investigating this area.
It seems that I got to the bottom of this. Changing
BtreeCheckState.target for a cross-page unique constraint check is
wrong, but that happens only for leaf pages. After that
BtreeCheckState.target is only used for setting the low key. The low
key is only used for non-leaf pages. So, that didn't lead to any
visible bug. I've revised the commit message to reflect this.
- when the unique check is on, state->target was reassigned only for the leaf pages (under P_ISLEAF(topaque) in bt_target_page_check).
- in this level (leaf) in bt_check_level_from_leftmost() this value of state->target was used to get state->lowkey. Then it was reset (in the next iteration of do loop in in bt_check_level_from_leftmost()
- state->lowkey lives until the end of pages level (leaf) iteration cycle. Then, low-key is reset (state->lowkey = NULL in the end of bt_check_level_from_leftmost())
- state->lowkey is used only in bt_child_check/bt_child_highkey_check. Both are called only from non-leaf pages iteration cycles (under P_ISLEAF(topaque))
- Also there is a check (rightblock_number != P_NONE) in before getting rightpage into state->target in bt_target_page_check() that ensures us that rightpage indeed exists and getting this (unused) lowkey in bt_check_level_from_leftmost will not invoke any page reading errors.
I'm pretty sure that there was no bug in this, not just the bug was hidden.
Indeed re-assigning state->target in leaf page iteration for cross-page unique check was not beautiful, and Peter pointed out this. In my opinion the patch 0003 is a pure code refactoring.
As for the cross-page check regression/TAP testing, this test had problems since the btree page layout is not fixed (especially it's different on 32-bit arch). I had a variant for testing cross-page check when the test was yet regression one upthread for both 32/64 bit architectures. I remember it was decided not to include it due to complications and low impact for testing the corner case of very rare cross-page duplicates. (There were also suggestions to drop cross-page duplicates check at all, which I didn't agree 2 years ago, but still it can make sense)
Separately, I propose to avoid getting state->lowkey for leaf pages at all as it's unused. PFA is a simple patch for this. (I don't add it to the current patch set as I believe it has nothing to do with UNIQUE constraint check, rather it improves the previous btree amcheck code)
Best regards,
Pavel Borisov,
Supabase
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
On Mon, 13 May 2024 at 15:55, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
Hi, Alexander!On Mon, 13 May 2024 at 05:42, Alexander Korotkov <aekorotkov@gmail.com> wrote:On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > The only bt_target_page_check() caller is
> > > bt_check_level_from_leftmost(), which overrides state->target in the
> > > next iteration anyway. I think the patch is just refactoring to
> > > eliminate the confusion pointer by Peter Geoghegan upthread.
> >
> > I find your argument unconvincing.
> >
> > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963.
> >
> > I'm left with four possibilities:
> >
> >
> > 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.
> >
> > 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.
> >
> > 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.
> >
> > 4) It's been broken all along and your patch just changes from wrong to wrong.
> >
> >
> > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you are not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.
> >
> > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylistic consequences.
> >
> > If (3) is true, then I'm complaining that the patch is flat busted.
> >
> > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.
> >
> > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
> >
> >
> > For reference, I said something similar earlier today in another email to this thread:
> >
> > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did. That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ?
>
> Thank you for your analysis. I'm inclined to believe in 2, but not
> yet completely sure. It's really pity that our tests don't cover
> this. I'm investigating this area.
It seems that I got to the bottom of this. Changing
BtreeCheckState.target for a cross-page unique constraint check is
wrong, but that happens only for leaf pages. After that
BtreeCheckState.target is only used for setting the low key. The low
key is only used for non-leaf pages. So, that didn't lead to any
visible bug. I've revised the commit message to reflect this.I agree with your analysis regarding state->target:
- when the unique check is on, state->target was reassigned only for the leaf pages (under P_ISLEAF(topaque) in bt_target_page_check).
- in this level (leaf) in bt_check_level_from_leftmost() this value of state->target was used to get state->lowkey. Then it was reset (in the next iteration of do loop in in bt_check_level_from_leftmost()
- state->lowkey lives until the end of pages level (leaf) iteration cycle. Then, low-key is reset (state->lowkey = NULL in the end of bt_check_level_from_leftmost())
- state->lowkey is used only in bt_child_check/bt_child_highkey_check. Both are called only from non-leaf pages iteration cycles (under P_ISLEAF(topaque))
- Also there is a check (rightblock_number != P_NONE) in before getting rightpage into state->target in bt_target_page_check() that ensures us that rightpage indeed exists and getting this (unused) lowkey in bt_check_level_from_leftmost will not invoke any page reading errors.I'm pretty sure that there was no bug in this, not just the bug was hidden.Indeed re-assigning state->target in leaf page iteration for cross-page unique check was not beautiful, and Peter pointed out this. In my opinion the patch 0003 is a pure code refactoring.As for the cross-page check regression/TAP testing, this test had problems since the btree page layout is not fixed (especially it's different on 32-bit arch). I had a variant for testing cross-page check when the test was yet regression one upthread for both 32/64 bit architectures. I remember it was decided not to include it due to complications and low impact for testing the corner case of very rare cross-page duplicates. (There were also suggestions to drop cross-page duplicates check at all, which I didn't agree 2 years ago, but still it can make sense)Separately, I propose to avoid getting state->lowkey for leaf pages at all as it's unused. PFA is a simple patch for this. (I don't add it to the current patch set as I believe it has nothing to do with UNIQUE constraint check, rather it improves the previous btree amcheck code)
A correction of a typo in previous message:
non-leaf pages iteration cycles (under !P_ISLEAF(topaque)) -> non-leaf pages iteration cycles (under !P_ISLEAF(topaque))
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
A correction of a typo in previous message:
non-leaf pages iteration cycles (under P_ISLEAF(topaque)) -> non-leaf pages iteration cycles (under !P_ISLEAF(topaque))
On Mon, 13 May 2024 at 16:19, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
On Mon, 13 May 2024 at 15:55, Pavel Borisov <pashkin.elfe@gmail.com> wrote:Hi, Alexander!On Mon, 13 May 2024 at 05:42, Alexander Korotkov <aekorotkov@gmail.com> wrote:On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > The only bt_target_page_check() caller is
> > > bt_check_level_from_leftmost(), which overrides state->target in the
> > > next iteration anyway. I think the patch is just refactoring to
> > > eliminate the confusion pointer by Peter Geoghegan upthread.
> >
> > I find your argument unconvincing.
> >
> > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963.
> >
> > I'm left with four possibilities:
> >
> >
> > 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.
> >
> > 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.
> >
> > 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.
> >
> > 4) It's been broken all along and your patch just changes from wrong to wrong.
> >
> >
> > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you are not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.
> >
> > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylistic consequences.
> >
> > If (3) is true, then I'm complaining that the patch is flat busted.
> >
> > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.
> >
> > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
> >
> >
> > For reference, I said something similar earlier today in another email to this thread:
> >
> > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did. That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ?
>
> Thank you for your analysis. I'm inclined to believe in 2, but not
> yet completely sure. It's really pity that our tests don't cover
> this. I'm investigating this area.
It seems that I got to the bottom of this. Changing
BtreeCheckState.target for a cross-page unique constraint check is
wrong, but that happens only for leaf pages. After that
BtreeCheckState.target is only used for setting the low key. The low
key is only used for non-leaf pages. So, that didn't lead to any
visible bug. I've revised the commit message to reflect this.I agree with your analysis regarding state->target:
- when the unique check is on, state->target was reassigned only for the leaf pages (under P_ISLEAF(topaque) in bt_target_page_check).
- in this level (leaf) in bt_check_level_from_leftmost() this value of state->target was used to get state->lowkey. Then it was reset (in the next iteration of do loop in in bt_check_level_from_leftmost()
- state->lowkey lives until the end of pages level (leaf) iteration cycle. Then, low-key is reset (state->lowkey = NULL in the end of bt_check_level_from_leftmost())
- state->lowkey is used only in bt_child_check/bt_child_highkey_check. Both are called only from non-leaf pages iteration cycles (under P_ISLEAF(topaque))
- Also there is a check (rightblock_number != P_NONE) in before getting rightpage into state->target in bt_target_page_check() that ensures us that rightpage indeed exists and getting this (unused) lowkey in bt_check_level_from_leftmost will not invoke any page reading errors.I'm pretty sure that there was no bug in this, not just the bug was hidden.Indeed re-assigning state->target in leaf page iteration for cross-page unique check was not beautiful, and Peter pointed out this. In my opinion the patch 0003 is a pure code refactoring.As for the cross-page check regression/TAP testing, this test had problems since the btree page layout is not fixed (especially it's different on 32-bit arch). I had a variant for testing cross-page check when the test was yet regression one upthread for both 32/64 bit architectures. I remember it was decided not to include it due to complications and low impact for testing the corner case of very rare cross-page duplicates. (There were also suggestions to drop cross-page duplicates check at all, which I didn't agree 2 years ago, but still it can make sense)Separately, I propose to avoid getting state->lowkey for leaf pages at all as it's unused. PFA is a simple patch for this. (I don't add it to the current patch set as I believe it has nothing to do with UNIQUE constraint check, rather it improves the previous btree amcheck code)A correction of a typo in previous message:non-leaf pages iteration cycles (under !P_ISLEAF(topaque)) -> non-leaf pages iteration cycles (under !P_ISLEAF(topaque))
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Mon, May 13, 2024 at 4:42 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov > <aekorotkov@gmail.com> wrote: > > On Sat, May 11, 2024 at 4:13 AM Mark Dilger > > <mark.dilger@enterprisedb.com> wrote: > > > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > > > The only bt_target_page_check() caller is > > > > bt_check_level_from_leftmost(), which overrides state->target in the > > > > next iteration anyway. I think the patch is just refactoring to > > > > eliminate the confusion pointer by Peter Geoghegan upthread. > > > > > > I find your argument unconvincing. > > > > > > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->targetin the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referencedby state->target. See line 963. > > > > > > I'm left with four possibilities: > > > > > > > > > 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're makingdoesn't matter. > > > > > > 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thingto happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten whereyou are now using "rightpage" for the value. > > > > > > 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage",but v2-0003 has broken that. > > > > > > 4) It's been broken all along and your patch just changes from wrong to wrong. > > > > > > > > > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and thatyou are not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the codeis. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one wayor the other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so addingAsserts doesn't help to investigate the question. > > > > > > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bugfixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylisticconsequences. > > > > > > If (3) is true, then I'm complaining that the patch is flat busted. > > > > > > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that areneeded. > > > > > > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage. > > > > > > > > > For reference, I said something similar earlier today in another email to this thread: > > > > > > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target",which the old implementation most certainly did. That means that after returning from bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what itwould have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 linesfurther down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch,wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ? > > > > Thank you for your analysis. I'm inclined to believe in 2, but not > > yet completely sure. It's really pity that our tests don't cover > > this. I'm investigating this area. > > It seems that I got to the bottom of this. Changing > BtreeCheckState.target for a cross-page unique constraint check is > wrong, but that happens only for leaf pages. After that > BtreeCheckState.target is only used for setting the low key. The low > key is only used for non-leaf pages. So, that didn't lead to any > visible bug. I've revised the commit message to reflect this. > > So, the picture for the patches is the following now. > 0001 – optimization, but rather simple and giving huge effect > 0002 – refactoring > 0003 – fix for the bug > 0004 – better error reporting I think the thread contains enough motivation on why 0002, 0003 and 0004 are material for post-FF. They are fixes and refactoring for new-in-v17 feature. I'm going to push them if no objections. Regarding 0001, I'd like to ask Tom and Mark if they find convincing that given that optimization is small, simple and giving huge effect, it could be pushed post-FF? Otherwise, this could wait for v18. ------ Regards, Alexander Korotkov Supabase
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Alexander!
On Fri, 17 May 2024 at 14:11, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, May 13, 2024 at 4:42 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
> <aekorotkov@gmail.com> wrote:
> > On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> > <mark.dilger@enterprisedb.com> wrote:
> > > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > > The only bt_target_page_check() caller is
> > > > bt_check_level_from_leftmost(), which overrides state->target in the
> > > > next iteration anyway. I think the patch is just refactoring to
> > > > eliminate the confusion pointer by Peter Geoghegan upthread.
> > >
> > > I find your argument unconvincing.
> > >
> > > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963.
> > >
> > > I'm left with four possibilities:
> > >
> > >
> > > 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.
> > >
> > > 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.
> > >
> > > 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.
> > >
> > > 4) It's been broken all along and your patch just changes from wrong to wrong.
> > >
> > >
> > > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you are not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.
> > >
> > > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylistic consequences.
> > >
> > > If (3) is true, then I'm complaining that the patch is flat busted.
> > >
> > > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.
> > >
> > > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
> > >
> > >
> > > For reference, I said something similar earlier today in another email to this thread:
> > >
> > > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did. That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ?
> >
> > Thank you for your analysis. I'm inclined to believe in 2, but not
> > yet completely sure. It's really pity that our tests don't cover
> > this. I'm investigating this area.
>
> It seems that I got to the bottom of this. Changing
> BtreeCheckState.target for a cross-page unique constraint check is
> wrong, but that happens only for leaf pages. After that
> BtreeCheckState.target is only used for setting the low key. The low
> key is only used for non-leaf pages. So, that didn't lead to any
> visible bug. I've revised the commit message to reflect this.
>
> So, the picture for the patches is the following now.
> 0001 – optimization, but rather simple and giving huge effect
> 0002 – refactoring
> 0003 – fix for the bug
> 0004 – better error reporting
I think the thread contains enough motivation on why 0002, 0003 and
0004 are material for post-FF. They are fixes and refactoring for
new-in-v17 feature. I'm going to push them if no objections.
Regarding 0001, I'd like to ask Tom and Mark if they find convincing
that given that optimization is small, simple and giving huge effect,
it could be pushed post-FF? Otherwise, this could wait for v18.
In my view, patches 0002-0004 are worth pushing.
0001 is ready in my view. But I see no problem pushing it into v18 regarding that this optimization could be not eligible for post-FF. I don't know the criteria for this just let's be safe about it.
Regards,
Pavel Borisov
> On May 17, 2024, at 3:11 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Mon, May 13, 2024 at 4:42 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: >> On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov >> <aekorotkov@gmail.com> wrote: >>> On Sat, May 11, 2024 at 4:13 AM Mark Dilger >>> <mark.dilger@enterprisedb.com> wrote: >>>>> On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote: >>>>> The only bt_target_page_check() caller is >>>>> bt_check_level_from_leftmost(), which overrides state->target in the >>>>> next iteration anyway. I think the patch is just refactoring to >>>>> eliminate the confusion pointer by Peter Geoghegan upthread. >>>> >>>> I find your argument unconvincing. >>>> >>>> After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->targetin the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referencedby state->target. See line 963. >>>> >>>> I'm left with four possibilities: >>>> >>>> >>>> 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're makingdoesn't matter. >>>> >>>> 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thingto happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten whereyou are now using "rightpage" for the value. >>>> >>>> 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", butv2-0003 has broken that. >>>> >>>> 4) It's been broken all along and your patch just changes from wrong to wrong. >>>> >>>> >>>> If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and thatyou are not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the codeis. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one wayor the other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so addingAsserts doesn't help to investigate the question. >>>> >>>> If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixesshould be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylisticconsequences. >>>> >>>> If (3) is true, then I'm complaining that the patch is flat busted. >>>> >>>> If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that areneeded. >>>> >>>> Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage. >>>> >>>> >>>> For reference, I said something similar earlier today in another email to this thread: >>>> >>>> This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target",which the old implementation most certainly did. That means that after returning from bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what itwould have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 linesfurther down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch,wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ? >>> >>> Thank you for your analysis. I'm inclined to believe in 2, but not >>> yet completely sure. It's really pity that our tests don't cover >>> this. I'm investigating this area. >> >> It seems that I got to the bottom of this. Changing >> BtreeCheckState.target for a cross-page unique constraint check is >> wrong, but that happens only for leaf pages. After that >> BtreeCheckState.target is only used for setting the low key. The low >> key is only used for non-leaf pages. So, that didn't lead to any >> visible bug. I've revised the commit message to reflect this. >> >> So, the picture for the patches is the following now. >> 0001 – optimization, but rather simple and giving huge effect >> 0002 – refactoring >> 0003 – fix for the bug >> 0004 – better error reporting > > I think the thread contains enough motivation on why 0002, 0003 and > 0004 are material for post-FF. They are fixes and refactoring for > new-in-v17 feature. I'm going to push them if no objections. > > Regarding 0001, I'd like to ask Tom and Mark if they find convincing > that given that optimization is small, simple and giving huge effect, > it could be pushed post-FF? Otherwise, this could wait for v18. I won't pretend to be part of the Release Management Team. Perhaps Tom wishes to respond. I wrote a TAP test to check the uniqueness checker. bt_index_check() sometimes fails to detect a corruption. This is trueboth before and after applying v3-0001. The bt_index_parent_check() seems to always detect the corruption created bythe TAP test. Likewise, this is true both before and after applying v3-0001. The documentation in https://www.postgresql.org/docs/devel/amcheck.html#AMCHECK-FUNCTIONS is ambiguous: "bt_index_check does not verify invariants that span child/parent relationships, but will verify the presence of all heaptuples as index tuples within the index when heapallindexed is true. When checkunique is true bt_index_check will checkthat no more than one among duplicate entries in unique index is visible. When a routine, lightweight test for corruptionis required in a live production environment, using bt_index_check often provides the best trade-off between thoroughnessof verification and limiting the impact on application performance and availability." The second sentence, "When checkunique is true bt_index_check will check that no more than one among duplicate entries inunique index is visible." is not strictly true, as it won't check if the violation spans a page boundary. That's impliedby the surrounding sentences, but I'm not sure a reader can be trusted to know which way to interpret how "checkunique"works. Clarification is needed. The attached TAP test is not intended for commit. I am only including it here because you might want to use the TAP testas a starting point for creating and testing for new kinds of corruption. Beware the test intentionally includes aninfinite loop, which is helpful for a developer examining the code, but not at all appropriate otherwise. It loads allblocks of the index into memory each loop, which could be made more efficient if we wanted this to be part of the corecodebase. I just threw it together this morning. It's not polished, documented, checked for portability, or otherwiseproduction quality. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Mark!
The documentation in https://www.postgresql.org/docs/devel/amcheck.html#AMCHECK-FUNCTIONS is ambiguous:
"bt_index_check does not verify invariants that span child/parent relationships, but will verify the presence of all heap tuples as index tuples within the index when heapallindexed is true. When checkunique is true bt_index_check will check that no more than one among duplicate entries in unique index is visible. When a routine, lightweight test for corruption is required in a live production environment, using bt_index_check often provides the best trade-off between thoroughness of verification and limiting the impact on application performance and availability."
The second sentence, "When checkunique is true bt_index_check will check that no more than one among duplicate entries in unique index is visible." is not strictly true, as it won't check if the violation spans a page boundary.
Amcheck with checkunique option does check uniqueness violation between pages. But it doesn't warranty detection of cross page uniqueness violations in extremely rare cases when the first equal index entry on the next page corresponds to tuple that is not visible (e.g. dead). In this, I followed the Peter's notion [1] that checking across a number of dead equal entries that could theoretically span even across many pages is an unneeded code complication and amcheck is not a tool that provides any warranty when checking an index.
I'm not against docs modification in any way that clarifies its exact usage and limitations.
Kind regards,
Pavel Borisov
> On May 17, 2024, at 11:51 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > Amcheck with checkunique option does check uniqueness violation between pages. But it doesn't warranty detection of crosspage uniqueness violations in extremely rare cases when the first equal index entry on the next page corresponds totuple that is not visible (e.g. dead). In this, I followed the Peter's notion [1] that checking across a number of deadequal entries that could theoretically span even across many pages is an unneeded code complication and amcheck is nota tool that provides any warranty when checking an index. This confuses me a bit. The regression test creates a table and index but never performs any DELETE nor any UPDATE operations,so none of the index entries should be dead. If I am understanding you correct, I'd be forced to conclude thatthe uniqueness checking code is broken. Can you take a look? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On May 17, 2024, at 12:10 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote: > >> Amcheck with checkunique option does check uniqueness violation between pages. But it doesn't warranty detection of crosspage uniqueness violations in extremely rare cases when the first equal index entry on the next page corresponds totuple that is not visible (e.g. dead). In this, I followed the Peter's notion [1] that checking across a number of deadequal entries that could theoretically span even across many pages is an unneeded code complication and amcheck is nota tool that provides any warranty when checking an index. > > This confuses me a bit. The regression test creates a table and index but never performs any DELETE nor any UPDATE operations,so none of the index entries should be dead. If I am understanding you correct, I'd be forced to conclude thatthe uniqueness checking code is broken. Can you take a look? On further review, the test was not anticipating the error message "high key invariant violated for index". That wasn'tseen in calls to bt_index_parent_check(), but appears as one of the errors from bt_index_check(). I am rerunning thetest now.... — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Mark!
On Fri, 17 May 2024 at 23:10, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> On May 17, 2024, at 11:51 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> Amcheck with checkunique option does check uniqueness violation between pages. But it doesn't warranty detection of cross page uniqueness violations in extremely rare cases when the first equal index entry on the next page corresponds to tuple that is not visible (e.g. dead). In this, I followed the Peter's notion [1] that checking across a number of dead equal entries that could theoretically span even across many pages is an unneeded code complication and amcheck is not a tool that provides any warranty when checking an index.
This confuses me a bit. The regression test creates a table and index but never performs any DELETE nor any UPDATE operations, so none of the index entries should be dead. If I am understanding you correct, I'd be forced to conclude that the uniqueness checking code is broken. Can you take a look?
At the first glance it's not clear to me:
- why your test creates cross-page unique constraint violations?
- how do you know they are not detected?
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Peter Geoghegan
Date:
On Fri, May 17, 2024 at 3:42 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > On further review, the test was not anticipating the error message "high key invariant violated for index". That wasn'tseen in calls to bt_index_parent_check(), but appears as one of the errors from bt_index_check(). I am rerunning thetest now.... Many different parts of the B-Tree code will fight against allowing duplicates of the same value to span multiple leaf pages -- this is especially true for unique indexes. For example, nbtsplitloc.c has a variety of strategies that will prevent choosing a split point that necessitates including a distinguishing heap TID in the new high key. In other words, nbtsplitloc.c is very aggressive about picking a split point between (rather than within) groups of duplicates. Of course it's still *possible* for a unique index to have multiple leaf pages containing the same individual value. The regression tests do have coverage for certain relevant code paths (e.g., there is coverage for code paths only hit when _bt_check_unique has to go to the page to the right). This is only the case because I went out of my way to make sure of it, by adding tests that allow a huge number of version duplicates to accumulate within a unique index. (The "move right" _bt_check_unique branches had zero test coverage for a year or two.) Just how important it is that amcheck covers cases where the version duplicates span multiple leaf pages is of course debatable -- it's always better to be more thorough, when practical. But it's certainly something that needs to be assessed based on the merits. -- Peter Geoghegan
> On May 17, 2024, at 12:42 PM, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > At the first glance it's not clear to me: > - why your test creates cross-page unique constraint violations? To see if they are detected. > - how do you know they are not detected? It appears that they are detected. At least, rerunning the test after adjusting the expected output, I no longer see problems. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On May 17, 2024, at 1:00 PM, Peter Geoghegan <pg@bowt.ie> wrote: > > Many different parts of the B-Tree code will fight against allowing > duplicates of the same value to span multiple leaf pages -- this is > especially true for unique indexes. The quick-and-dirty TAP test I wrote this morning is intended to introduce duplicates across page boundaries, not to testfor ones that got there by normal database activity. In other words, the TAP test forcibly corrupts the index by changinga value on one side of a boundary to be equal to the value on the other side of the boundary. Prior to the corruptingaction the values were all unique. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Peter Geoghegan
Date:
On Fri, May 17, 2024 at 4:10 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote: > The quick-and-dirty TAP test I wrote this morning is intended to introduce duplicates across page boundaries, not to testfor ones that got there by normal database activity. In other words, the TAP test forcibly corrupts the index by changinga value on one side of a boundary to be equal to the value on the other side of the boundary. Prior to the corruptingaction the values were all unique. I understood that. I was just pointing out that an index that looks even somewhat like that is already quite unnatural. -- Peter Geoghegan
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Pavel Borisov
Date:
Hi, Mark!
> At the first glance it's not clear to me:
> - why your test creates cross-page unique constraint violations?
To see if they are detected.
> - how do you know they are not detected?
It appears that they are detected. At least, rerunning the test after adjusting the expected output, I no longer see problems.
I understand your point. It was unclear how it modified the index so that only unique constraint check between pages should have failed with other checks passed.
Anyway, thanks for your testing and efforts! I'm happy that the test now passes and confirms that amcheck feature works as intended.
Kind regards,
Pavel Borisov
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Fri, May 17, 2024 at 1:11 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > I think the thread contains enough motivation on why 0002, 0003 and > 0004 are material for post-FF. They are fixes and refactoring for > new-in-v17 feature. I'm going to push them if no objections. > > Regarding 0001, I'd like to ask Tom and Mark if they find convincing > that given that optimization is small, simple and giving huge effect, > it could be pushed post-FF? Otherwise, this could wait for v18. The revised version of 0001 unique checking optimization is attached. I'm going to push this to v18 if no objections. ------ Regards, Alexander Korotkov Supabase
Attachment
On Fri, Jul 26, 2024 at 8:10 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > The revised version of 0001 unique checking optimization is attached. > I'm going to push this to v18 if no objections. I have no reason to specifically object to pushing this into 18, but I would like to point out that you're posting here about this but failed to reply to the "64-bit pg_notify page numbers truncated to 32-bit", an open item that was assigned to you but which, since you didn't respond, was eventually fixed by commits from Michael Paquier. I know it's easy to lose track of the open items list and I sometimes forget to check it myself, but it's rather important to stay on top of any open items that get assigned to you. Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
From
Alexander Korotkov
Date:
On Fri, Jul 26, 2024 at 5:38 PM Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jul 26, 2024 at 8:10 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > The revised version of 0001 unique checking optimization is attached. > > I'm going to push this to v18 if no objections. > > I have no reason to specifically object to pushing this into 18, but I > would like to point out that you're posting here about this but failed > to reply to the "64-bit pg_notify page numbers truncated to 32-bit", > an open item that was assigned to you but which, since you didn't > respond, was eventually fixed by commits from Michael Paquier. > > I know it's easy to lose track of the open items list and I sometimes > forget to check it myself, but it's rather important to stay on top of > any open items that get assigned to you. Yes, it's a pity I miss this open item on me. Besides putting ashes on my head, I think I could pay more attention on other open items. ------ Regards, Alexander Korotkov Supabase