Thread: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

[PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
Hi, hackers!

It seems that if btree index with a unique constraint is corrupted by duplicates, amcheck now can not catch this. Reindex becomes impossible as it throws an error but otherwise the index will let the user know that it is corrupted, and amcheck will tell that the index is clean. So I'd like to propose a short patch to improve amcheck for checking the unique constraint. It will output tid's of tuples that are duplicated in the index (i.e. more than one tid for the same index key is visible) and the user can easily investigate and delete corresponding table entries.

0001 - is the actual patch, and 
0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name "idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates. It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend a protocol similar to the following:

- Apply patch 0002
- Set autovaccum = off in postgresql.conf

create table tbl2 (a varchar(50), b varchar(150), c bytea, d varchar(50));
create unique index idx on tbl2(a,b);
insert into tbl2 select i::text::varchar, i::text::varchar, i::text::bytea, i::text::varchar from generate_series(0,3) as i, generate_series(0,10000) as x;

 
So we'll have a generous amount of duplicates in the table and index. Then:

create extension amcheck;
select bt_index_check('idx', true);

There will be output about each pair of duplicates: index tid's, position in a posting list (if the index item is deduplicated) and table tid's.

WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 218 and posting 219 (point to heap tid=(126,93) and tid=(126,94)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 219 and posting 220 (point to heap tid=(126,94) and tid=(126,95)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.

Things to notice in the test output:
- cross-page duplicates (when some of them on the one index page and the other are on the next). (Under patch 0002 they are marked by an additional message "INFO:  cross page equal keys" to catch them among the other)

- If we delete table entries corresponding to some duplicates in between the other duplicates (vacuum should be off), the message for this entry should disappear but the other duplicates should be detected as previous.
delete from tbl2 where ctid::text='(126,94)';
select bt_index_check('idx', true);

WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 218 and posting 220 (point to heap tid=(126,93) and tid=(126,95)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.

Caveat: if the first entry on the next index page has a key equal to the key on a previous page AND all heap tid's corresponding to this entry are invisible, currently cross-page check can not detect unique constraint violation between previous index page entry and 2nd, 3d and next current index page entries. In this case, there would be a message that recommends doing VACUUM to remove the invisible entries from the index and repeat the check. (Generally, it is recommended to do vacuum before the check, but for the testing purpose I'd recommend turning it off to check the detection of visible-invisible-visible duplicates scenarios)

Your feedback is very much welcome, as usual.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:


On Mon, 8 Feb 2021, 14:46 Pavel Borisov <pashkin.elfe@gmail.com wrote:
Hi, hackers!

It seems that if btree index with a unique constraint is corrupted by duplicates, amcheck now can not catch this. Reindex becomes impossible as it throws an error but otherwise the index will let the user know that it is corrupted, and amcheck will tell that the index is clean. So I'd like to propose a short patch to improve amcheck for checking the unique constraint. It will output tid's of tuples that are duplicated in the index (i.e. more than one tid for the same index key is visible) and the user can easily investigate and delete corresponding table entries.

0001 - is the actual patch, and 
0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name "idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates. It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend a protocol similar to the following:

- Apply patch 0002
- Set autovaccum = off in postgresql.conf

create table tbl2 (a varchar(50), b varchar(150), c bytea, d varchar(50));
create unique index idx on tbl2(a,b);
insert into tbl2 select i::text::varchar, i::text::varchar, i::text::bytea, i::text::varchar from generate_series(0,3) as i, generate_series(0,10000) as x;

 
So we'll have a generous amount of duplicates in the table and index. Then:

create extension amcheck;
select bt_index_check('idx', true);

There will be output about each pair of duplicates: index tid's, position in a posting list (if the index item is deduplicated) and table tid's.

WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 218 and posting 219 (point to heap tid=(126,93) and tid=(126,94)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 219 and posting 220 (point to heap tid=(126,94) and tid=(126,95)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.

Things to notice in the test output:
- cross-page duplicates (when some of them on the one index page and the other are on the next). (Under patch 0002 they are marked by an additional message "INFO:  cross page equal keys" to catch them among the other)

- If we delete table entries corresponding to some duplicates in between the other duplicates (vacuum should be off), the message for this entry should disappear but the other duplicates should be detected as previous.
delete from tbl2 where ctid::text='(126,94)';
select bt_index_check('idx', true);

WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 218 and posting 220 (point to heap tid=(126,93) and tid=(126,95)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 220 and posting 221 (point to heap tid=(126,95) and tid=(126,96)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,6) posting 221 and tid=(26,7) posting 0 (point to heap tid=(126,96) and tid=(126,97)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,7) posting 0 and posting 1 (point to heap tid=(126,97) and tid=(126,98)) page lsn=0/1B3D420.
WARNING:  index uniqueness is violated for index "idx": Index tid=(26,7) posting 1 and posting 2 (point to heap tid=(126,98) and tid=(126,99)) page lsn=0/1B3D420.

Caveat: if the first entry on the next index page has a key equal to the key on a previous page AND all heap tid's corresponding to this entry are invisible, currently cross-page check can not detect unique constraint violation between previous index page entry and 2nd, 3d and next current index page entries. In this case, there would be a message that recommends doing VACUUM to remove the invisible entries from the index and repeat the check. (Generally, it is recommended to do vacuum before the check, but for the testing purpose I'd recommend turning it off to check the detection of visible-invisible-visible duplicates scenarios)

Your feedback is very much welcome, as usual.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

There was typo, I mean, initially"...index will NOT let the user know that it is corrupted..."

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Mark Dilger
Date:

> On Feb 8, 2021, at 2:46 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> 0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact
name"idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these
duplicates.It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend
aprotocol similar to the following: 
>
> - Apply patch 0002
> - Set autovaccum = off in postgresql.conf

Thanks Pavel and Anastasia for working on this!

Updating pg_catalog directly is ugly, but the following seems a simpler way to set up a regression test than having to
recompile. What do you think? 

CREATE TABLE junk (t text);
CREATE UNIQUE INDEX junk_idx ON junk USING btree (t);
INSERT INTO junk (t) VALUES ('fee'), ('fi'), ('fo'), ('fum');
UPDATE pg_catalog.pg_index
    SET indisunique = false
    WHERE indrelid = (SELECT oid FROM pg_catalog.pg_class WHERE relname = 'junk');
INSERT INTO junk (t) VALUES ('fee'), ('fi'), ('fo'), ('fum');
UPDATE pg_catalog.pg_index
    SET indisunique = true
    WHERE indrelid = (SELECT oid FROM pg_catalog.pg_class WHERE relname = 'junk');
SELECT * FROM junk;
  t
-----
 fee
 fi
 fo
 fum
 fee
 fi
 fo
 fum
(8 rows)

\d junk
              Table "public.junk"
 Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
 t      | text |           |          |
Indexes:
    "junk_idx" UNIQUE, btree (t)

\d junk_idx
      Index "public.junk_idx"
 Column | Type | Key? | Definition
--------+------+------+------------
 t      | text | yes  | t
unique, btree, for table "public.junk"

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
вт, 9 февр. 2021 г. в 01:46, Mark Dilger <mark.dilger@enterprisedb.com>:


> On Feb 8, 2021, at 2:46 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> 0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name "idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates. It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend a protocol similar to the following:
>
> - Apply patch 0002
> - Set autovaccum = off in postgresql.conf

Thanks Pavel and Anastasia for working on this!

Updating pg_catalog directly is ugly, but the following seems a simpler way to set up a regression test than having to recompile.  What do you think?

Very nice idea, thanks!
I've made a regression test based on it. PFA v.2 of a patch. Now it doesn't need anything external for testing.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
To make tests stable I also removed lsn output under warning level. PFA v3 of a patch

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Zhihong Yu
Date:
Hi,
Minor comment:

+   if (errflag == true)
+       ereport(ERROR,

I think 'if (errflag)' should suffice.

Cheers

On Tue, Feb 9, 2021 at 10:44 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
вт, 9 февр. 2021 г. в 01:46, Mark Dilger <mark.dilger@enterprisedb.com>:


> On Feb 8, 2021, at 2:46 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> 0002 - is a temporary hack for testing. It will allow inserting duplicates in a table even if an index with the exact name "idx" has a unique constraint (generally it is prohibited to insert). Then a new amcheck will tell us about these duplicates. It's pity but testing can not be done automatically, as it needs a core recompile. For testing I'd recommend a protocol similar to the following:
>
> - Apply patch 0002
> - Set autovaccum = off in postgresql.conf

Thanks Pavel and Anastasia for working on this!

Updating pg_catalog directly is ugly, but the following seems a simpler way to set up a regression test than having to recompile.  What do you think?

Very nice idea, thanks!
I've made a regression test based on it. PFA v.2 of a patch. Now it doesn't need anything external for testing.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
David Steele
Date:
On 3/15/21 11:11 AM, Mark Dilger wrote:
> 
>> On Mar 2, 2021, at 6:08 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>>
>> I completely agree that checking uniqueness requires looking at the heap, but I don't agree that every caller of
bt_index_checkon an index wants that particular check to be performed.  There are multiple ways in which an index might
becorrupt, and Peter wrote the code to only check some of them by default, with options to expand the checks to other
things. This is why heapallindexed is optional.  If you don't want to pay the price of checking all entries in the heap
againstthe btree, you don't have to.
 
>>
>> I've got the idea and revised the patch accordingly. Thanks!
>> Pfa v4 of a patch. I've added an optional argument to allow uniqueness checks for the unique indexes.
>> Also, I added a test variant to make them work on 32-bit systems. Unfortunately, converting the regression test to
TAPwould be a pain for me. Hope it can be used now as a 2-variant regression test for 32 and 64 bit systems.
 
>>
>> Thank you for your consideration!
>>
>> -- 
>> Best regards,
>> Pavel Borisov
>>
>> Postgres Professional: http://postgrespro.com
>> <v4-0001-Make-amcheck-checking-UNIQUE-constraint-for-btree.patch>
> 
> Looking over v4, here are my review comments...

This patch appears to need some work and has not been updated in several 
weeks, so marking Returned with Feedback.

Please submit to the next CF when you have a new patch.

Regards,
-- 
-David
david@pgmasters.net



Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
>> I completely agree that checking uniqueness requires looking at the heap, but I don't agree that every caller of bt_index_check on an index wants that particular check to be performed.  There are multiple ways in which an index might be corrupt, and Peter wrote the code to only check some of them by default, with options to expand the checks to other things.  This is why heapallindexed is optional.  If you don't want to pay the price of checking all entries in the heap against the btree, you don't have to.
>>
>> I've got the idea and revised the patch accordingly. Thanks!
>> Pfa v4 of a patch. I've added an optional argument to allow uniqueness checks for the unique indexes.
>> Also, I added a test variant to make them work on 32-bit systems. Unfortunately, converting the regression test to TAP would be a pain for me. Hope it can be used now as a 2-variant regression test for 32 and 64 bit systems.
>>
>> Thank you for your consideration!
>>
>> --
>> Best regards,
>> Pavel Borisov
>>
>> Postgres Professional: http://postgrespro.com
>> <v4-0001-Make-amcheck-checking-UNIQUE-constraint-for-btree.patch>
>
> Looking over v4, here are my review comments...

Mark and Peter, big thanks for your ideas!

I had little time to work on this feature until recently, but finally, I've elaborated v5 patch (PFA) 
It contains the following improvements, most of which are based on your consideration:

- Amcheck tests are reworked into TAP-tests with "break the warranty" by comparison function changes in the opclass instead of pg_index update. Mark, again thanks for the sample!
- Added new --checkunique option into pg_amcheck.
- Added documentation and tests for amcheck and pg_amcheck new functions.
- Results are output at ERROR log level like it is done in the other amcheck tests.
- Rare case of inability to check due to the first entry on a leaf page being both: (1) equal to the last one on the previous page and (2) deleted in the heap, is demoted to DEBUG1 log level. In this, I folowed Peter's consideration that amcheck should do its best to check, but can not always verify everything. The case is expected to be extremely rare.
- Made pages connectivity based on btpo_next (corrected a bug in the code, I found meanwhile)
- If snapshot is already taken in heapallindexed case, reuse it for unique constraint check.

The patch is pgindented and rebased on the current PG master code.
I'd like  to re-attach the patch v5 to the upcoming CF if you don't mind.

I also want to add that some customers face index uniqueness violations more often than I've expected, and I hope this check could also help some other PostgreSQL customers.

Your further considerations are welcome as always! 
--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Mark Dilger
Date:

> On Dec 20, 2021, at 7:37 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> The patch is pgindented and rebased on the current PG master code.

Thank you, Pavel.


The tests in check_btree.sql no longer create a bttest_unique table, so the DROP TABLE is surplusage:

+DROP TABLE bttest_unique;
+ERROR:  table "bttest_unique" does not exist


The changes in pg_amcheck.c to pass the new checkunique parameter will likely need to be based on a amcheck version
check. The implementation of prepare_btree_command() in pg_amcheck.c should be kept compatible with older versions of
amcheck,because it connects to remote servers and you can't know in advance that the remote servers are as up-to-date
asthe machine where pg_amcheck is installed.  I'm thinking specifically about this change: 

@@ -871,7 +877,8 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
    if (opts.parent_check)
        appendPQExpBuffer(sql,
                          "SELECT %s.bt_index_parent_check("
-                         "index := c.oid, heapallindexed := %s, rootdescend := %s)"
+                         "index := c.oid, heapallindexed := %s, rootdescend := %s, "
+                         "checkunique := %s)"
                          "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
                          "WHERE c.oid = %u "
                          "AND c.oid = i.indexrelid "

If the user calls pg_amcheck with --checkunique, and one or more remote servers have an amcheck version < 1.4, at a
minimumyou'll need to avoid calling bt_index_parent_check with that parameter, and probably also you'll either need to
raisea warning or perhaps an error telling the user that such a check cannot be performed. 


You've forgotten to include contrib/amcheck/amcheck--1.3--1.4.sql in the v5 patch, resulting in a failed install:

/usr/bin/install -c -m 644 ./amcheck--1.3--1.4.sql ./amcheck--1.2--1.3.sql ./amcheck--1.1--1.2.sql
./amcheck--1.0--1.1.sql./amcheck--1.0.sql
'/Users/mark.dilger/hydra/unique_review.5/tmp_install/Users/mark.dilger/pgtest/test_install/share/postgresql/extension/'
install: ./amcheck--1.3--1.4.sql: No such file or directory
make[2]: *** [install] Error 71
make[1]: *** [checkprep] Error 2

Using the one from the v4 patch fixes the problem.  Please include this file in v6, to simplify the review process.


The changes to t/005_opclass_damage.pl look ok.  The creation of a new table for the new test seems unnecessary, but
onlyproblematic in that it makes the test slightly longer to read.  I recommend changing the test to use the same table
thatthe prior test uses, but that is just a recommendation, not a requirement. 

You should add coverage for --checkunique to t/003_check.pl.

You should add coverage for multiple PostgreSQL::Test::Cluster instances running differing versions of amcheck, perhaps
someon version 1.3 and some on version 1.4.  Then test that the --checkunique option works adequately. 


I have not reviewed the changes to verify_nbtree.c.  I'll leave that to Peter.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Mark Dilger
Date:

> On Dec 22, 2021, at 12:01 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> Thank you, Mark!
>
> In v6 (PFA) I've made the changes on your advice i.e.
>
> - pg_amcheck with --checkunique option will ignore uniqueness check (with a warning) if amcheck version in a db is
<1.4and doesn't support the feature. 

Ok.

+           int         vmaj = 0,
+                       vmin = 0,
+                       vrev = 0;
+           const char *amcheck_version = pstrdup(PQgetvalue(result, 0, 1));
+
+           sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);

The pstrdup is unnecessary but harmless.

> - fixed unnecessary drop table in regression

Ok.

> - use the existing table for uniqueness check in 005_opclass_damage.pl

It appears you still create a new table, bttest_unique, rather than using the existing table int4tbl.  That's fine.

> - added tests into 003_check.pl . It is only smoke test that just verifies new functions.

+
+$node->command_checks_all(
+   [
+       @cmd, '-s', 's1', '-i', 't1_btree', '--parent-check',
+       '--checkunique', 'db1'
+   ],
+   2,
+   [$index_missing_relation_fork_re],
+   [$no_output_re],
+   'pg_amcheck smoke test --parent-check');
+
+$node->command_checks_all(
+   [
+       @cmd, '-s', 's1', '-i', 't1_btree', '--heapallindexed',
+       '--rootdescend', '--checkunique',  'db1'
+   ],
+   2,
+   [$index_missing_relation_fork_re],
+   [$no_output_re],
+   'pg_amcheck smoke test --heapallindexed --rootdescend');
+
+$node->command_checks_all(
+   [ @cmd, '--checkunique', '-d', 'db1', '-d', 'db2', '-d', 'db3', '-S', 's*' ],
+   0, [$no_output_re], [$no_output_re],
+   'pg_amcheck excluding all corrupt schemas');
+

You have borrowed the existing tests but forgot to change their names.  (The last line of each check is the test name,
suchas 'pg_amcheck smoke test --parent-check'.)  Please make each test name unique. 

> - added test contrib/amcheck/t/004_verify_nbtree_unique.pl it is more extensive test based on opclass damage which
wasintended to be main test for amcheck, but which I've forgotten to add to commit in v5. 
> 005_opclass_damage.pl test, which you've seen in v5 is largely based on first part of 004_verify_nbtree_unique.pl
(withthe later calling pg_amcheck, and the former calling bt_index_check(), bt_index_parent_check() ) 

Ok.

> - added forgotten upgrade script amcheck--1.3--1.4.sql (from v4)

Ok.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Максим Орлов
Date:
The pstrdup is unnecessary but harmless.

> - use the existing table for uniqueness check in 005_opclass_damage.pl

It appears you still create a new table, bttest_unique, rather than using the existing table int4tbl.  That's fine.

> - added tests into 003_check.pl . It is only smoke test that just verifies new functions.

You have borrowed the existing tests but forgot to change their names.  (The last line of each check is the test name, such as 'pg_amcheck smoke test --parent-check'.)  Please make each test name unique.

Thanks for your review! Fixed all these remaining things from patch v6.
PFA v7 patch.

---
Best regards,
Maxim Orlov.
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Julien Rouhaud
Date:
Hi,

On Fri, Dec 24, 2021 at 2:06 AM Максим Орлов <orlovmg@gmail.com> wrote:
>
> Thanks for your review! Fixed all these remaining things from patch v6.
> PFA v7 patch.

The cfbot reports that you have mixed declarations and code
(https://cirrus-ci.com/task/6407449413419008):

[17:21:26.926] pg_amcheck.c: In function ‘main’:
[17:21:26.926] pg_amcheck.c:634:4: error: ISO C90 forbids mixed
declarations and code [-Werror=declaration-after-statement]
[17:21:26.926] 634 | int vmaj = 0,
[17:21:26.926] | ^~~



Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
The cfbot reports that you have mixed declarations and code
(https://cirrus-ci.com/task/6407449413419008):

[17:21:26.926] pg_amcheck.c: In function ‘main’:
[17:21:26.926] pg_amcheck.c:634:4: error: ISO C90 forbids mixed
declarations and code [-Werror=declaration-after-statement]
[17:21:26.926] 634 | int vmaj = 0,
[17:21:26.926] | ^~~

Corrected this, thanks!
Also added more comments on this part of the code.
PFA v8 of a patch

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
By the way I've forgotten to add one part of my code into the CF patch related to the treatment of NULL values in checking btree unique constraints.
PFA v9 of a patch with this minor code and tests additions.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Maxim Orlov
Date:
I've updated the patch due to recent changes by Daniel Gustafsson (549ec201d6132b7).

--
Best regards,
Maxim Orlov.
Attachment
This patch was broken by d16773cdc86210493a2874cb0cf93f3883fcda73 "Add
macros in hash and btree AMs to get the special area of their pages"

If it's really just a few macros it should be easy enough to merge but
it would be good to do a rebase given the number of other commits
since February anyways.

On Mon, 21 Feb 2022 at 09:14, Maxim Orlov <orlovmg@gmail.com> wrote:
>
> I've updated the patch due to recent changes by Daniel Gustafsson (549ec201d6132b7).
>
> --
> Best regards,
> Maxim Orlov.



-- 
greg



Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
This patch was broken by d16773cdc86210493a2874cb0cf93f3883fcda73 "Add
macros in hash and btree AMs to get the special area of their pages"

If it's really just a few macros it should be easy enough to merge but
it would be good to do a rebase given the number of other commits
since February anyways.

Rebased, thanks!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
v11 patch do not apply due to recent code changes.
Rebased. PFA v12.

Please feel free to check and discuss it.


--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
CFbot says v12 patch does not apply.
Rebased. PFA v13.
Your reviews are very much welcome!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Aleksander Alekseev
Date:
Hi Pavel,

> Rebased. PFA v13.
> Your reviews are very much welcome!

I noticed that this patch is in "Needs Review" state and it has been
stuck for some time now, so I decided to take a look.

```
+SELECT bt_index_parent_check('bttest_a_idx', true, true, true);
+SELECT bt_index_parent_check('bttest_b_idx', true, false, true);
``

1. This "true, false, true" sequence is difficult to read. I suggest
we use named arguments here.

2. I believe there are some minor issues with the comments. E.g. instead of:

- First key on next page is same
- Make values 768 and 769 looks equal

I would write:

- The first key on the next page is the same
- Make values 768 and 769 look equal

There are many little errors like these.

```
+# Copyright (c) 2021, PostgreSQL Global Development Group
```

3. Oh no. The copyright has expired!

```
+      <literal>true</literal>.  When <parameter>checkunique</parameter>
+      is <literal>true</literal> <function>bt_index_check</function> will
```

4. This piece of documentation was copy-pasted between two functions
without change of the function name.

Other than that to me the patch looks in pretty good shape. Here is
v14 where I fixed my own nitpicks, with the permission of Pavel given
offlist.

If no one sees any other defects I'm going to change the status of the
patch to "Ready to Committer" in a short time.

--
Best regards,
Aleksander Alekseev

Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Dmitry Koval
Date:
Hi!

I would make two cosmetic changes.

1. I suggest replace description of function bt_report_duplicate() from
```
/*
  * Prepare and print an error message for unique constrain violation in
  * a btree index under WARNING level. Also set a flag to report ERROR
  * at the end of the check.
  */
```
to
```
/*
  * Prepare an error message for unique constrain violation in
  * a btree index and report ERROR.
  */
```

2. I think will be better to change test 004_verify_nbtree_unique.pl - 
replace
```
use Test::More tests => 6;
```
to
```
use Test::More;
...
done_testing();
```
(same as in the other three tests).

-- 
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Karina Litskevich
Date:
Hi,

I also would like to suggest a cosmetic change.
In v15 a new field checkunique is added after heapallindexed and before no_btree_expansion fields in struct definition, but in initialisation it is added after no_btree_expansion:

--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -102,6 +102,7 @@ typedef struct AmcheckOptions
  bool parent_check;
  bool rootdescend;
  bool heapallindexed;
+ bool checkunique;
 
  /* heap and btree hybrid option */
  bool no_btree_expansion;
@@ -132,7 +133,8 @@ static AmcheckOptions opts = {
  .parent_check = false,
  .rootdescend = false,
  .heapallindexed = false,
- .no_btree_expansion = false
+ .no_btree_expansion = false,
+ .checkunique = false
 };

I suggest to add checkunique field between heapallindexed and no_btree_expansion fields in initialisation as well as in definition:

@@ -132,6 +133,7 @@ static AmcheckOptions opts = {
  .parent_check = false,
  .rootdescend = false,
  .heapallindexed = false,
+ .checkunique = false,
  .no_btree_expansion = false
 };

--
Best regards,
Litskevich Karina
Postgres Professional: http://postgrespro.com/
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Andres Freund
Date:
Hi,

On 2022-05-20 17:46:38 +0400, Pavel Borisov wrote:
> CFbot says v12 patch does not apply.
> Rebased. PFA v13.
> Your reviews are very much welcome!

Due to the merge of the meson based build this patch needs to be
adjusted: https://cirrus-ci.com/build/6350479973154816

Looks like you need to add amcheck--1.3--1.4.sql to the list of files to be
installed and t/004_verify_nbtree_unique.pl to the tests.

Greetings,

Andres Freund



Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Maxim Orlov
Date:


On Thu, 22 Sept 2022 at 18:13, Andres Freund <andres@anarazel.de> wrote:
Due to the merge of the meson based build this patch needs to be
adjusted: https://cirrus-ci.com/build/6350479973154816

Looks like you need to add amcheck--1.3--1.4.sql to the list of files to be
installed and t/004_verify_nbtree_unique.pl to the tests.

Greetings,

Andres Freund

Thanks! Fixed.

--
Best regards,
Maxim Orlov.
Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Maxim Orlov
Date:
Hi!

I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without any significant code changes and CF bot how happy again.

I'm going to move it to RfC, could I? If not, please tell why.

--
Best regards,
Maxim Orlov.

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Aleksander Alekseev
Date:
Hi hackers,

> I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without
anysignificant code changes and CF bot how happy again.
 
>
> I'm going to move it to RfC, could I? If not, please tell why.

I restored the "Ready for Committer" state. I don't think it's a good
practice to change the state every time the patch has a slight
conflict or something. This is not helpful at all. Such things happen
quite regularly and typically are fixed in a couple of days.

-- 
Best regards,
Aleksander Alekseev



Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Alexander Korotkov
Date:
On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:
> > I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without
anysignificant code changes and CF bot how happy again. 
> >
> > I'm going to move it to RfC, could I? If not, please tell why.
>
> I restored the "Ready for Committer" state. I don't think it's a good
> practice to change the state every time the patch has a slight
> conflict or something. This is not helpful at all. Such things happen
> quite regularly and typically are fixed in a couple of days.

This patch seems useful to me.  I went through the thread, it seems
that all the critics are addressed.

I've rebased this patch.   Also, I've run perltidy for tests, split
long errmsg() into errmsg(), errdetail() and errhint(), and do other
minor enchantments.

I think this patch is ready to go.  I'm going to push it if there are
no objections.

------
Regards,
Alexander Korotkov

Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Pavel Borisov
Date:
Hi, Alexander!


On Wed, 25 Oct 2023 at 00:13, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev
> <aleksander@timescale.com> wrote:
> > > I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done
withoutany significant code changes and CF bot how happy again. 
> > >
> > > I'm going to move it to RfC, could I? If not, please tell why.
> >
> > I restored the "Ready for Committer" state. I don't think it's a good
> > practice to change the state every time the patch has a slight
> > conflict or something. This is not helpful at all. Such things happen
> > quite regularly and typically are fixed in a couple of days.
>
> This patch seems useful to me.  I went through the thread, it seems
> that all the critics are addressed.
>
> I've rebased this patch.   Also, I've run perltidy for tests, split
> long errmsg() into errmsg(), errdetail() and errhint(), and do other
> minor enchantments.
>
> I think this patch is ready to go.  I'm going to push it if there are
> no objections.
>
> ------
> Regards,
> Alexander Korotkov

It's very good that this long-standing patch is finally committed. Thanks a lot!

Regards,
Pavel Borisov



On Mon, Oct 30, 2023 at 11:29:04AM +0400, Pavel Borisov wrote:
> On Wed, 25 Oct 2023 at 00:13, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > I think this patch is ready to go.  I'm going to push it if there are
> > no objections.

> It's very good that this long-standing patch is finally committed. Thanks a lot!

Agreed.  I gave this feature (commit 5ae2087) a try.  Thanks for implementing
it.  Could I get your input on two topics?


==== 1. Cross-page comparison at "first key on the next page" only

Cross-page comparisons got this discussion upthread:

On Tue, Mar 02, 2021 at 07:10:32PM -0800, Peter Geoghegan wrote:
> On Mon, Feb 8, 2021 at 2:46 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > Caveat: if the first entry on the next index page has a key equal to the key on a previous page AND all heap tid's
correspondingto this entry are invisible, currently cross-page check can not detect unique constraint violation between
previousindex page entry and 2nd, 3d and next current index page entries. In this case, there would be a message that
recommendsdoing VACUUM to remove the invisible entries from the index and repeat the check. (Generally, it is
recommendedto do vacuum before the check, but for the testing purpose I'd recommend turning it off to check the
detectionof visible-invisible-visible duplicates scenarios)
 

> You're going to have to "couple" buffer locks in the style of
> _bt_check_unique() (as well as keeping a buffer lock on "the first
> leaf page a duplicate might be on" throughout) if you need the test to
> work reliably.

The amcheck feature has no lock coupling at its "first key on the next page"
check.  I think that's fine, because amcheck takes one snapshot at the
beginning and looks for pairs of visible-to-that-snapshot heap tuples with the
same scan key.  _bt_check_unique(), unlike amcheck, must catch concurrent
inserts.  If amcheck "checkunique" wanted to detect duplicates that would
appear when all transactions commit, it would need lock coupling.  (I'm not
suggesting it do that.)  Do you see a problem with the lack of lock coupling
at "first key on the next page"?

> But why bother with that? The tool doesn't have to be
> 100% perfect at detecting corruption (nothing can be), and it's rather
> unlikely that it will matter for this test. A simple test that doesn't
> handle cross-page duplicates is still going to be very effective.

I agree, but perhaps the "first key on the next page" code is more complex
than general-case code would be.  If the lack of lock coupling is fine, then I
think memory context lifecycle is the only obstacle making index page
boundaries special.  Are there factors beyond that?  We already have
state->lowkey kept across pages via MemoryContextAlloc().  Similar lines of
code could preserve the scan key for checkunique, making the "first key on the
next page" code unnecessary.


==== 2. Raises runtime by 476% despite no dead tuples

I used the following to create a table larger than RAM, 17GB table and 10GB
index on a system with 12GB RAM:

\set count 500000000
begin;
set maintenance_work_mem = '1GB';
set client_min_messages = debug1;  -- debug2 is per-block spam
create temp table t as select n from generate_series(1,:count) t(n);
create unique index t_idx on t(n);
\dt+ t
\di+ t_idx
create extension amcheck;
select bt_index_check('t_idx', heapallindexed => false, checkunique => false);
select bt_index_check('t_idx', heapallindexed => false, checkunique => true);

Adding checkunique raised runtime from 58s to 276s, because it checks
visibility for every heap tuple.  It could do the heap fetch and visibility
check lazily, when the index yields two heap TIDs for one scan key.  That
should give zero visibility checks for this particular test case, and it
doesn't add visibility checks to bloated-table cases.  Pseudo-code:

    /*---
     * scan_key is the last uniqueness-relevant scan key observed as
     * bt_check_level_from_leftmost() moves right to traverse the leaf level.
     * Will be NULL if the next tuple can't be the second tuple of a
     * uniqueness violation, because any of the following apply:
     * - we're evaluating the first leaf tuple of the entire index
     * - last scan key had anynullkeys (never forms a uniqueness violation w/
     *   any other scan key)
     */
    scan_key = NULL;
    /*
     * scan_key_known_visible==true indicates that scan_key_heap_tid is the
     * last _visible_ heap TID observed for scan_key.  Otherwise,
     * scan_key_heap_tid is the last heap TID observed for scan_key, and we've
     * not yet checked its visibility.
     */
    bool scan_key_known_visible;
    scan_key_heap_tid;
    foreach itup (leftmost_leaf_level_tup .. rightmost_leaf_level_tup) {
        if (itup.anynullkeys)
            scan_key = NULL;
        else if (scan_key != NULL &&
                 _bt_compare(scan_key, itup.key) == 0 &&
                 (scan_key_known_visible ||
                  (scan_key_known_visible = visible(scan_key_heap_tid))))
        {
            if (visible(itup.tid))
                elog(ERROR, "duplicate in unique index");
        }
        else
        {
            /*
             * No prior uniqueness-relevant key, or key changed, or we just
             * learned scan_key_heap_tid was invisible.  Make itup the
             * standard by which we judge future index tuples as we move
             * right.
             */
            scan_key = itup.key;
            scan_key_known_visible = false;
            scan_key_heap_tid = itup.tid;
        }
    }



On Sun, Mar 24, 2024 at 10:03 PM Noah Misch <noah@leadboat.com> wrote:
> > You're going to have to "couple" buffer locks in the style of
> > _bt_check_unique() (as well as keeping a buffer lock on "the first
> > leaf page a duplicate might be on" throughout) if you need the test to
> > work reliably.
>
> The amcheck feature has no lock coupling at its "first key on the next page"
> check.  I think that's fine, because amcheck takes one snapshot at the
> beginning and looks for pairs of visible-to-that-snapshot heap tuples with the
> same scan key.  _bt_check_unique(), unlike amcheck, must catch concurrent
> inserts.  If amcheck "checkunique" wanted to detect duplicates that would
> appear when all transactions commit, it would need lock coupling.  (I'm not
> suggesting it do that.)  Do you see a problem with the lack of lock coupling
> at "first key on the next page"?

Practically speaking, no, I see no problems.

> I agree, but perhaps the "first key on the next page" code is more complex
> than general-case code would be.  If the lack of lock coupling is fine, then I
> think memory context lifecycle is the only obstacle making index page
> boundaries special.  Are there factors beyond that?

I believe that my concern back in 2021 was that the general complexity
of cross-page checking was unlikely to be worth it. Note that
nbtsplitloc.c is *maximally* aggressive about avoiding split points
that fall within some group of duplicates, so with a unique index it
should be very rare.

Admittedly, I was probably thinking about the complexity of adding a
bunch of code just to be able to check uniqueness across page
boundaries. I did mention lock coupling by name, but that was more of
a catch-all term for the problems in this area.

> We already have
> state->lowkey kept across pages via MemoryContextAlloc().  Similar lines of
> code could preserve the scan key for checkunique, making the "first key on the
> next page" code unnecessary.

I suspect that I was overly focussed on the index structure itself
back when I made these remarks. I might not have considered that just
using an MVCC snapshot for the TIDs makes the whole process safe,
though that now seems quite obvious.

Separately, I now see that the committed patch just reuses the code
that has long been used to check that things are in the correct order
across page boundaries: this is the bt_right_page_check_scankey check,
which existed in the very earliest versions of amcheck. So while I
agree that we could just keep the original scan key (from the last
item on every leaf page), and then make the check at the start of the
next page instead (as opposed to making it at the end of the previous
leaf page, which is how it works now), it's not obvious that that
would be a good trade-off, all things considered.

It might still be a little better that way around, overall, but you're
not just talking about changing the recently committed checkunique
patch (I think). You're also talking about restructuring the long
established bt_right_page_check_scankey check (otherwise, what's the
point?). I'm not categorically opposed to that, but it's not as if
it'll allow you to throw out a bunch of code -- AFAICT that proposal
doesn't have that clear advantage going for it. The race condition
that is described at great length in bt_right_page_check_scankey isn't
ever going to be a problem for the recently committed checkunique
patch (as you more or less pointed out yourself), but obviously it is
still a concern for the cross-page order check.

In summary, the old bt_right_page_check_scankey check is strictly
concerned with the consistency of a physical data structure (the index
itself), whereas the new checkunique check makes sure that the logical
content of the database is consistent (the index, the heap, and all
associated transaction status metadata have to be consistent). That
means that the concerns that are described at length in
bt_right_page_check_scankey (nor anything like those concerns) don't
apply to the new checkunique check. We agree on all that, I think. But
it's less clear that that presents us with an opportunity to simplify
this patch.

> Adding checkunique raised runtime from 58s to 276s, because it checks
> visibility for every heap tuple.  It could do the heap fetch and visibility
> check lazily, when the index yields two heap TIDs for one scan key.  That
> should give zero visibility checks for this particular test case, and it
> doesn't add visibility checks to bloated-table cases.

The added runtime that you report seems quite excessive to me. I'm
really surprised that the code doesn't manage to avoid visibility
checks in the absence of duplicates that might both have TIDs
considered visible. Lazy visibility checking seems almost essential,
and not just a nice-to-have optimization.

It seems like the implication of everything that you said about
refactoring/moving the check was that doing so would enable this
optimization (at least an implementation along the lines of your
pseudo code). If that was what you intended, then it's not obvious to
me why it is relevant. What, if anything, does it have to do with
making the new checkunique visibility checks happen lazily?

--
Peter Geoghegan



On Mon, Mar 25, 2024 at 12:03:10PM -0400, Peter Geoghegan wrote:
> On Sun, Mar 24, 2024 at 10:03 PM Noah Misch <noah@leadboat.com> wrote:

> Separately, I now see that the committed patch just reuses the code
> that has long been used to check that things are in the correct order
> across page boundaries: this is the bt_right_page_check_scankey check,
> which existed in the very earliest versions of amcheck. So while I
> agree that we could just keep the original scan key (from the last
> item on every leaf page), and then make the check at the start of the
> next page instead (as opposed to making it at the end of the previous
> leaf page, which is how it works now), it's not obvious that that
> would be a good trade-off, all things considered.
> 
> It might still be a little better that way around, overall, but you're
> not just talking about changing the recently committed checkunique
> patch (I think). You're also talking about restructuring the long
> established bt_right_page_check_scankey check (otherwise, what's the
> point?). I'm not categorically opposed to that, but it's not as if

I wasn't thinking about changing the pre-v17 bt_right_page_check_scankey()
code.  I got interested in this area when I saw the interaction of the new
"first key on the next page" logic with bt_right_page_check_scankey().  The
patch made bt_right_page_check_scankey() pass back rightfirstoffset.  The new
code then does palloc_btree_page() and PageGetItem() with that offset, which
bt_right_page_check_scankey() had already done.  That smelled like a misplaced
distribution of responsibility.  For a time, I suspected the new code should
move down into bt_right_page_check_scankey().  Then I transitioned to thinking
checkunique didn't need new code for the page boundary.

> it'll allow you to throw out a bunch of code -- AFAICT that proposal
> doesn't have that clear advantage going for it. The race condition
> that is described at great length in bt_right_page_check_scankey isn't
> ever going to be a problem for the recently committed checkunique
> patch (as you more or less pointed out yourself), but obviously it is
> still a concern for the cross-page order check.
> 
> In summary, the old bt_right_page_check_scankey check is strictly
> concerned with the consistency of a physical data structure (the index
> itself), whereas the new checkunique check makes sure that the logical
> content of the database is consistent (the index, the heap, and all
> associated transaction status metadata have to be consistent). That
> means that the concerns that are described at length in
> bt_right_page_check_scankey (nor anything like those concerns) don't
> apply to the new checkunique check. We agree on all that, I think. But
> it's less clear that that presents us with an opportunity to simplify
> this patch.

See above for why I anticipated a simplification opportunity with respect to
new-in-v17 code.  Still, it may not pan out.

> > Adding checkunique raised runtime from 58s to 276s, because it checks

Side note: my last email incorrectly described that as "raises runtime by
476%".  It should have said "by 376%" or "by a factor of 4.76".

> > visibility for every heap tuple.  It could do the heap fetch and visibility
> > check lazily, when the index yields two heap TIDs for one scan key.  That
> > should give zero visibility checks for this particular test case, and it
> > doesn't add visibility checks to bloated-table cases.

> It seems like the implication of everything that you said about
> refactoring/moving the check was that doing so would enable this
> optimization (at least an implementation along the lines of your
> pseudo code). If that was what you intended, then it's not obvious to
> me why it is relevant. What, if anything, does it have to do with
> making the new checkunique visibility checks happen lazily?

Their connection is just being the two big-picture topics I found in
post-commit review.  Decisions about the cross-page check are indeed separable
from decisions about lazy vs. eager visibility checks.

Thanks,
nm



On Mon, Mar 25, 2024 at 2:24 PM Noah Misch <noah@leadboat.com> wrote:
> I wasn't thinking about changing the pre-v17 bt_right_page_check_scankey()
> code.  I got interested in this area when I saw the interaction of the new
> "first key on the next page" logic with bt_right_page_check_scankey().  The
> patch made bt_right_page_check_scankey() pass back rightfirstoffset.  The new
> code then does palloc_btree_page() and PageGetItem() with that offset, which
> bt_right_page_check_scankey() had already done.  That smelled like a misplaced
> distribution of responsibility.  For a time, I suspected the new code should
> move down into bt_right_page_check_scankey().  Then I transitioned to thinking
> checkunique didn't need new code for the page boundary.

Ah, I see. Somehow I missed this point when I recently took a fresh
look at the committed patch.

 I did notice (I meant to point out) that I have concerns about this
part of the new uniqueness check code:

"
if (P_IGNORE(topaque) || !P_ISLEAF(topaque))
    break;
"

My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be
required. If the page in question isn't a leaf page, then the index
must be corrupt (or the page deletion recycle safety/drain technique
thing is buggy). The " !P_ISLEAF(topaque)" part of the check is either
superfluous or something that ought to be reported as corruption --
it's not a legal/expected state.

Separately, I dislike the way the target block changes within
bt_target_page_check(). The general idea behind verify_nbtree.c's
target block is that every block becomes the target exactly once, in a
clearly defined place. All corruption (in the index structure itself)
is formally considered to be a problem with that particular target
block. I want to be able to clearly distinguish between the target and
target's right sibling here, to explain my concerns, but they're kinda
both the target, so that's a lot harder than it should be. (Admittedly
directly blaming the target block has always been a little bit
arbitrary, at least in certain cases, but even there it provides
structure that makes things much easier to describe unambiguously.)

--
Peter Geoghegan



On Fri, Mar 29, 2024 at 02:17:08PM -0400, Peter Geoghegan wrote:
> On Mon, Mar 25, 2024 at 2:24 PM Noah Misch <noah@leadboat.com> wrote:
> > I wasn't thinking about changing the pre-v17 bt_right_page_check_scankey()
> > code.  I got interested in this area when I saw the interaction of the new
> > "first key on the next page" logic with bt_right_page_check_scankey().  The
> > patch made bt_right_page_check_scankey() pass back rightfirstoffset.  The new
> > code then does palloc_btree_page() and PageGetItem() with that offset, which
> > bt_right_page_check_scankey() had already done.  That smelled like a misplaced
> > distribution of responsibility.  For a time, I suspected the new code should
> > move down into bt_right_page_check_scankey().  Then I transitioned to thinking
> > checkunique didn't need new code for the page boundary.

>  I did notice (I meant to point out) that I have concerns about this
> part of the new uniqueness check code:
> 
> "
> if (P_IGNORE(topaque) || !P_ISLEAF(topaque))
>     break;
> "
> 
> My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be
> required. If the page in question isn't a leaf page, then the index
> must be corrupt (or the page deletion recycle safety/drain technique
> thing is buggy). The " !P_ISLEAF(topaque)" part of the check is either
> superfluous or something that ought to be reported as corruption --
> it's not a legal/expected state.

Good point.

> Separately, I dislike the way the target block changes within
> bt_target_page_check(). The general idea behind verify_nbtree.c's
> target block is that every block becomes the target exactly once, in a
> clearly defined place.

Agreed.



On 24.10.23 22:13, Alexander Korotkov wrote:
> On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev
> <aleksander@timescale.com> wrote:
>>> I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without
anysignificant code changes and CF bot how happy again.
 
>>>
>>> I'm going to move it to RfC, could I? If not, please tell why.
>>
>> I restored the "Ready for Committer" state. I don't think it's a good
>> practice to change the state every time the patch has a slight
>> conflict or something. This is not helpful at all. Such things happen
>> quite regularly and typically are fixed in a couple of days.
> 
> This patch seems useful to me.  I went through the thread, it seems
> that all the critics are addressed.
> 
> I've rebased this patch.   Also, I've run perltidy for tests, split
> long errmsg() into errmsg(), errdetail() and errhint(), and do other
> minor enchantments.
> 
> I think this patch is ready to go.  I'm going to push it if there are
> no objections.

I just found the new pg_amcheck option --checkunique in PG17-to-be. 
Could we rename this to --check-unique?  Seems friendlier.  Maybe also 
rename the bt_index_check function argument to check_unique.




 I did notice (I meant to point out) that I have concerns about this
part of the new uniqueness check code:
"
if (P_IGNORE(topaque) || !P_ISLEAF(topaque))
    break;
"
My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be
required
I agree. But I didn't see the need to check uniqueness constraints violations in internal pages. Furthermore, it doesn't mean only a violation of constraint, but a major index corruption. I agree that checking and reporting this type of corruption separately is a possible thing.

 
Separately, I dislike the way the target block changes within
bt_target_page_check(). The general idea behind verify_nbtree.c's
target block is that every block becomes the target exactly once, in a
clearly defined place. All corruption (in the index structure itself)
is formally considered to be a problem with that particular target
block. I want to be able to clearly distinguish between the target and
target's right sibling here, to explain my concerns, but they're kinda
both the target, so that's a lot harder than it should be. (Admittedly
directly blaming the target block has always been a little bit
arbitrary, at least in certain cases, but even there it provides
structure that makes things much easier to describe unambiguously.)
 
The possible way to load the target block only once is to get rid of the cross-page uniqueness violation check. I introduced it to catch more possible cases of uniqueness violations. Though they are expected to be extremely rare, and anyway the algorithm doesn't get any warranty, just does its best to catch what is possible. I don't object to this change.

Regards,
Pavel.

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Alexander Korotkov
Date:
On Wed, Apr 17, 2024 at 6:41 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>>  I did notice (I meant to point out) that I have concerns about this
>> part of the new uniqueness check code:
>> "
>> if (P_IGNORE(topaque) || !P_ISLEAF(topaque))
>>     break;
>> "
>>
>> My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be
>> required
>
> I agree. But I didn't see the need to check uniqueness constraints violations in internal pages. Furthermore, it
doesn'tmean only a violation of constraint, but a major index corruption. I agree that checking and reporting this type
ofcorruption separately is a possible thing. 

I think we could just throw an error in case of an unexpected internal
page.  It doesn't seem reasonable to continue the check with this type
of corruption detected.  If the tree linkage is corrupted we may enter
an endless loop or something.

>> Separately, I dislike the way the target block changes within
>> bt_target_page_check(). The general idea behind verify_nbtree.c's
>> target block is that every block becomes the target exactly once, in a
>> clearly defined place. All corruption (in the index structure itself)
>> is formally considered to be a problem with that particular target
>> block. I want to be able to clearly distinguish between the target and
>> target's right sibling here, to explain my concerns, but they're kinda
>> both the target, so that's a lot harder than it should be. (Admittedly
>> directly blaming the target block has always been a little bit
>> arbitrary, at least in certain cases, but even there it provides
>> structure that makes things much easier to describe unambiguously.)
>
> The possible way to load the target block only once is to get rid of the cross-page uniqueness violation check. I
introducedit to catch more possible cases of uniqueness violations. Though they are expected to be extremely rare, and
anywaythe algorithm doesn't get any warranty, just does its best to catch what is possible. I don't object to this
change.

I think we could probably just avoid setting state->target during
cross-page check.  Just save that into a local variable and pass as an
argument where needed.

Skipping the visibility checks for "only one tuple for scan key" case
looks like very valuable optimization [1].

I also think we should wrap lVis_* variables into struct.  That would
make the way we pass them to functions more elegant.

Links.
1. https://www.postgresql.org/message-id/20240325020323.fd.nmisch%40google.com

------
Regards,
Alexander Korotkov



Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Alexander Korotkov
Date:
On Wed, Apr 17, 2024 at 9:38 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 24.10.23 22:13, Alexander Korotkov wrote:
> > On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev
> > <aleksander@timescale.com> wrote:
> >>> I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done
withoutany significant code changes and CF bot how happy again. 
> >>>
> >>> I'm going to move it to RfC, could I? If not, please tell why.
> >>
> >> I restored the "Ready for Committer" state. I don't think it's a good
> >> practice to change the state every time the patch has a slight
> >> conflict or something. This is not helpful at all. Such things happen
> >> quite regularly and typically are fixed in a couple of days.
> >
> > This patch seems useful to me.  I went through the thread, it seems
> > that all the critics are addressed.
> >
> > I've rebased this patch.   Also, I've run perltidy for tests, split
> > long errmsg() into errmsg(), errdetail() and errhint(), and do other
> > minor enchantments.
> >
> > I think this patch is ready to go.  I'm going to push it if there are
> > no objections.
>
> I just found the new pg_amcheck option --checkunique in PG17-to-be.
> Could we rename this to --check-unique?  Seems friendlier.  Maybe also
> rename the bt_index_check function argument to check_unique.

+1 from me
Let's do so if nobody objects.

------
Regards,
Alexander Korotkov



Hi, hackers!

On Wed, 24 Apr 2024 at 13:58, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Wed, Apr 17, 2024 at 9:38 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 24.10.23 22:13, Alexander Korotkov wrote:
> > On Wed, Sep 28, 2022 at 11:44 AM Aleksander Alekseev
> > <aleksander@timescale.com> wrote:
> >>> I think, this patch was marked as "Waiting on Author", probably, by mistake. Since recent changes were done without any significant code changes and CF bot how happy again.
> >>>
> >>> I'm going to move it to RfC, could I? If not, please tell why.
> >>
> >> I restored the "Ready for Committer" state. I don't think it's a good
> >> practice to change the state every time the patch has a slight
> >> conflict or something. This is not helpful at all. Such things happen
> >> quite regularly and typically are fixed in a couple of days.
> >
> > This patch seems useful to me.  I went through the thread, it seems
> > that all the critics are addressed.
> >
> > I've rebased this patch.   Also, I've run perltidy for tests, split
> > long errmsg() into errmsg(), errdetail() and errhint(), and do other
> > minor enchantments.
> >
> > I think this patch is ready to go.  I'm going to push it if there are
> > no objections.
>
> I just found the new pg_amcheck option --checkunique in PG17-to-be.
> Could we rename this to --check-unique?  Seems friendlier.  Maybe also
> rename the bt_index_check function argument to check_unique.

+1 from me
Let's do so if nobody objects.

Thank you very much for your input in this thread!

See the patches based on the proposals in the attachment:

0001: Optimize speed by avoiding heap visibility checking for different non-deduplicated index tuples as proposed by Noah Misch

Speed measurements on my laptop using the exact method recommended by Noah upthread:
Current master branch: checkunique off: 144s, checkunique on: 419s
With patch 0001: checkunique off: 141s, checkunique on: 171s

0002: Use structure to store and transfer info about last visible heap entry (code refactoring) as proposed by Alexander Korotkov

0003: Don't load rightpage into BtreeCheckState (code refactoring) as proposed by Peter Geoghegan

Loading of right page for cross-page unique constraint check in the same way as in bt_right_page_check_scankey() 

0004: Report error when next page to a leaf is not a leaf as proposed by Peter Geoghegan

I think it's a very improbable condition and this check might be not necessary, but it's right and safe to break check and report error.

0005: Rename checkunique parameter to more user friendly as proposed by Peter Eisentraut and Alexander Korotkov

Again many thanks for the useful proposals!

Regards,
Pavel Borisov,
Supabase

Attachment

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From
Karina Litskevich
Date:
Hi, hackers!

On Thu, Apr 25, 2024 at 4:00 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
0005: Rename checkunique parameter to more user friendly as proposed by Peter Eisentraut and Alexander Korotkov

I'm not sure renaming checkunique is a good idea. Other arguments of
bt_index_check and bt_index_parent_check functions (heapallindexed and
rootdescend) don't have underscore character in them. Corresponding
pg_amcheck options (--heapallindexed and --rootdescend) are also written
in one piece. check_unique and --check-unique stand out. Making arguments
and options in different styles doesn't seem user friendly to me.

Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/ 
Hi, Karina!

On Thu, 25 Apr 2024 at 17:44, Karina Litskevich <litskevichkarina@gmail.com> wrote:
Hi, hackers!

On Thu, Apr 25, 2024 at 4:00 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
0005: Rename checkunique parameter to more user friendly as proposed by Peter Eisentraut and Alexander Korotkov

I'm not sure renaming checkunique is a good idea. Other arguments of
bt_index_check and bt_index_parent_check functions (heapallindexed and
rootdescend) don't have underscore character in them. Corresponding
pg_amcheck options (--heapallindexed and --rootdescend) are also written
in one piece. check_unique and --check-unique stand out. Making arguments
and options in different styles doesn't seem user friendly to me.

I did it under the consensus of Peter Eisentraut and Alexander Korotkov. 
The pro for renaming is more user-friendly naming, I also agree. 
The cons is that we already have both styles: "non-user friendly" heapallindexed and rootdescend and "user-friendly" parent-check.

I'm ready to go with consensus in this matter. It's also not yet too late to make it unique-check (instead of check-unique) to be better in style with parent-check.

Kind regards,
Pavel
On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote:
> 0001: Optimize speed by avoiding heap visibility checking for different
> non-deduplicated index tuples as proposed by Noah Misch
> 
> Speed measurements on my laptop using the exact method recommended by Noah
> upthread:
> Current master branch: checkunique off: 144s, checkunique on: 419s
> With patch 0001: checkunique off: 141s, checkunique on: 171s

Where is the CPU time going to make it still be 21% slower w/ checkunique on?
It's a great improvement vs. current master, but I don't have an obvious
explanation for the remaining +21%.



Hi Noah,

On Wed, May 1, 2024 at 5:24 AM Noah Misch <noah@leadboat.com> wrote:
> On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote:
> > 0001: Optimize speed by avoiding heap visibility checking for different
> > non-deduplicated index tuples as proposed by Noah Misch
> >
> > Speed measurements on my laptop using the exact method recommended by Noah
> > upthread:
> > Current master branch: checkunique off: 144s, checkunique on: 419s
> > With patch 0001: checkunique off: 141s, checkunique on: 171s
>
> Where is the CPU time going to make it still be 21% slower w/ checkunique on?
> It's a great improvement vs. current master, but I don't have an obvious
> explanation for the remaining +21%.

I think there is at least extra index tuples comparison.

------
Regards,
Alexander Korotkov



On Wed, May 1, 2024 at 5:26 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Wed, May 1, 2024 at 5:24 AM Noah Misch <noah@leadboat.com> wrote:
> > On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote:
> > > 0001: Optimize speed by avoiding heap visibility checking for different
> > > non-deduplicated index tuples as proposed by Noah Misch
> > >
> > > Speed measurements on my laptop using the exact method recommended by Noah
> > > upthread:
> > > Current master branch: checkunique off: 144s, checkunique on: 419s
> > > With patch 0001: checkunique off: 141s, checkunique on: 171s
> >
> > Where is the CPU time going to make it still be 21% slower w/ checkunique on?
> > It's a great improvement vs. current master, but I don't have an obvious
> > explanation for the remaining +21%.
>
> I think there is at least extra index tuples comparison.

The revised patchset is attached.  I applied cosmetical changes.  I'm
going to push it if no objections.

I don't post the patch with rename of new option.  It doesn't seem
there is a consensus.  I must admit that keeping all the options in
the same naming convention makes sense.

------
Regards,
Alexander Korotkov
Supabase

Attachment
Alexander Korotkov <aekorotkov@gmail.com> writes:
> The revised patchset is attached.  I applied cosmetical changes.  I'm
> going to push it if no objections.

Is this really suitable material to be pushing post-feature-freeze?
It doesn't look like it's fixing any new-in-v17 issues.

            regards, tom lane



Hi, Tom!

On Fri, 10 May 2024, 04:43 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> The revised patchset is attached.  I applied cosmetical changes.  I'm
> going to push it if no objections.

Is this really suitable material to be pushing post-feature-freeze?
It doesn't look like it's fixing any new-in-v17 issues.

                        regards, tom lane

I think these patches are nice-to-have optimizations and refactorings to make code look better. They are not necessary for the main feature. They don't fix any bugs. But they were requested in the thread, and make sense in my opinion. 

I really don't know what's the policy of applying code improvements other than bugfixes post feature-freeze. IMO they are safe to be appiled to v17, but they also could be added later.

Regards, 
Pavel Borisov
Supabase
On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > The revised patchset is attached.  I applied cosmetical changes.  I'm
> > going to push it if no objections.
>
> Is this really suitable material to be pushing post-feature-freeze?
> It doesn't look like it's fixing any new-in-v17 issues.

These are code improvements to the 5ae2087202, which answer critics in
the thread.  0001 comprises an optimization, but it's rather small and
simple.  0002 and 0003 contain refactoring.  0004 contains better
error reporting.  For me this looks like pretty similar to what others
commit post-FF, isn't it?

------
Regards,
Alexander Korotkov
Supabase



Hi, Alexander!

On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > The revised patchset is attached.  I applied cosmetical changes.  I'm
> > going to push it if no objections.
>
> Is this really suitable material to be pushing post-feature-freeze?
> It doesn't look like it's fixing any new-in-v17 issues.

These are code improvements to the 5ae2087202, which answer critics in
the thread.  0001 comprises an optimization, but it's rather small and
simple.  0002 and 0003 contain refactoring.  0004 contains better
error reporting.  For me this looks like pretty similar to what others
commit post-FF, isn't it?
I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages.
In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.

Patch v2-0003 doesn't contain credits and a discussion link. All other patches do. 

Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO they could be pushed post-FF.

Regards,
Pavel.


> On May 10, 2024, at 5:10 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> Hi, Alexander!
>
> On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <aekorotkov@gmail.com> writes:
> > > The revised patchset is attached.  I applied cosmetical changes.  I'm
> > > going to push it if no objections.
> >
> > Is this really suitable material to be pushing post-feature-freeze?
> > It doesn't look like it's fixing any new-in-v17 issues.
>
> These are code improvements to the 5ae2087202, which answer critics in
> the thread.  0001 comprises an optimization, but it's rather small and
> simple.  0002 and 0003 contain refactoring.  0004 contains better
> error reporting.  For me this looks like pretty similar to what others
> commit post-FF, isn't it?
> I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages.
> In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.
>
> Patch v2-0003 doesn't contain credits and a discussion link. All other patches do.
>
> Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO
theycould be pushed post-FF. 

v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness or
valueof the improvement, but it seems out of scope post feature freeze.  The patch seems to postpone uniqueness
checkinguntil later in the scan than what the prior version did, and that kind of change could require more analysis
thanwe have time for at this point in the release cycle. 


v2-0002 does appear to just be refactoring.  I don't care for a small portion of that patch, but I doubt it violates
thepost feature freeze rules.  In particular: 

  +   BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL};


v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also.  There is already a member
ofthe BtreeCheckState called "target" and a memory context in that struct called "targetcontext".  That context is used
toallocate pages "state->target", "rightpage", "child" and "page", but not "metapage".  Perhaps "targetcontext" is a
poorchoice of name?  "notmetacontext" is a terrible name, but closer to describing the purpose of the memory context.
Careto propose something sensible? 

Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when
entering

        /*
         * * Downlink check *
         *
         * Additional check of child items iff this is an internal page and
         * caller holds a ShareLock.  This happens for every downlink (item)
         * in target excluding the negative-infinity downlink (again, this is
         * because it has no useful value to compare).
         */
        if (!P_ISLEAF(topaque) && state->readonly)
            bt_child_check(state, skey, offset);

and thereafter.  Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target
getsprocessed in the downlink check and thereafter.  This is either introducing a bug, or fixing one, but the commit
messageis totally ambiguous about whether this is a bugfix or a code cleanup or something else?  I think this kind of
patchshould have a super clear commit message about what it thinks it is doing. 


v2-0004 guards against a real threat, and is reasonable post feature freeze


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Hi, Mark!


On Fri, 10 May 2024, 21:35 Mark Dilger, <mark.dilger@enterprisedb.com> wrote:


> On May 10, 2024, at 5:10 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> Hi, Alexander!
>
> On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <aekorotkov@gmail.com> writes:
> > > The revised patchset is attached.  I applied cosmetical changes.  I'm
> > > going to push it if no objections.
> >
> > Is this really suitable material to be pushing post-feature-freeze?
> > It doesn't look like it's fixing any new-in-v17 issues.
>
> These are code improvements to the 5ae2087202, which answer critics in
> the thread.  0001 comprises an optimization, but it's rather small and
> simple.  0002 and 0003 contain refactoring.  0004 contains better
> error reporting.  For me this looks like pretty similar to what others
> commit post-FF, isn't it?
> I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages.
> In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.
>
> Patch v2-0003 doesn't contain credits and a discussion link. All other patches do.
>
> Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO they could be pushed post-FF.

v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness or value of the improvement, but it seems out of scope post feature freeze.  The patch seems to postpone uniqueness checking until later in the scan than what the prior version did, and that kind of change could require more analysis than we have time for at this point in the release cycle.


v2-0002 does appear to just be refactoring.  I don't care for a small portion of that patch, but I doubt it violates the post feature freeze rules.  In particular:

  +   BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL};


v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also.  There is already a member of the BtreeCheckState called "target" and a memory context in that struct called "targetcontext".  That context is used to allocate pages "state->target", "rightpage", "child" and "page", but not "metapage".  Perhaps "targetcontext" is a poor choice of name?  "notmetacontext" is a terrible name, but closer to describing the purpose of the memory context.  Care to propose something sensible?

Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when entering

        /*
         * * Downlink check *
         * 
         * Additional check of child items iff this is an internal page and
         * caller holds a ShareLock.  This happens for every downlink (item)
         * in target excluding the negative-infinity downlink (again, this is
         * because it has no useful value to compare).
         */
        if (!P_ISLEAF(topaque) && state->readonly)
            bt_child_check(state, skey, offset);

and thereafter.  Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target gets processed in the downlink check and thereafter.  This is either introducing a bug, or fixing one, but the commit message is totally ambiguous about whether this is a bugfix or a code cleanup or something else?  I think this kind of patch should have a super clear commit message about what it thinks it is doing.


v2-0004 guards against a real threat, and is reasonable post feature freeze



Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

IMO 0003 doesn't introduce nor fixes a bug. It loads rightpage into a local variable, rather that to a BtreeCheckState that can have another users of state->target afterb uniqueness check in the future, but don't have now. So the original patch is correct, and the goal of this refactoring is to untie rightpage fron state structure as it's used only transiently for cross-page unuque check. It's the same style as already used bt_right_page_check_scankey() that loads rightpage into a local variable.

For 0002 I doubt I understand your actual positiob. Could you explain what it violates or doesn't violate?

Best regards,
Pavel.


On Fri, 10 May 2024, 22:42 Pavel Borisov, <pashkin.elfe@gmail.com> wrote:
Hi, Mark!


On Fri, 10 May 2024, 21:35 Mark Dilger, <mark.dilger@enterprisedb.com> wrote:


> On May 10, 2024, at 5:10 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> Hi, Alexander!
>
> On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <aekorotkov@gmail.com> writes:
> > > The revised patchset is attached.  I applied cosmetical changes.  I'm
> > > going to push it if no objections.
> >
> > Is this really suitable material to be pushing post-feature-freeze?
> > It doesn't look like it's fixing any new-in-v17 issues.
>
> These are code improvements to the 5ae2087202, which answer critics in
> the thread.  0001 comprises an optimization, but it's rather small and
> simple.  0002 and 0003 contain refactoring.  0004 contains better
> error reporting.  For me this looks like pretty similar to what others
> commit post-FF, isn't it?
> I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages.
> In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.
>
> Patch v2-0003 doesn't contain credits and a discussion link. All other patches do.
>
> Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO they could be pushed post-FF.

v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness or value of the improvement, but it seems out of scope post feature freeze.  The patch seems to postpone uniqueness checking until later in the scan than what the prior version did, and that kind of change could require more analysis than we have time for at this point in the release cycle.


v2-0002 does appear to just be refactoring.  I don't care for a small portion of that patch, but I doubt it violates the post feature freeze rules.  In particular:

  +   BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL};


v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also.  There is already a member of the BtreeCheckState called "target" and a memory context in that struct called "targetcontext".  That context is used to allocate pages "state->target", "rightpage", "child" and "page", but not "metapage".  Perhaps "targetcontext" is a poor choice of name?  "notmetacontext" is a terrible name, but closer to describing the purpose of the memory context.  Care to propose something sensible?

Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when entering

        /*
         * * Downlink check *
         * 
         * Additional check of child items iff this is an internal page and
         * caller holds a ShareLock.  This happens for every downlink (item)
         * in target excluding the negative-infinity downlink (again, this is
         * because it has no useful value to compare).
         */
        if (!P_ISLEAF(topaque) && state->readonly)
            bt_child_check(state, skey, offset);

and thereafter.  Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target gets processed in the downlink check and thereafter.  This is either introducing a bug, or fixing one, but the commit message is totally ambiguous about whether this is a bugfix or a code cleanup or something else?  I think this kind of patch should have a super clear commit message about what it thinks it is doing.


v2-0004 guards against a real threat, and is reasonable post feature freeze



Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

IMO 0003 doesn't introduce nor fixes a bug. It loads rightpage into a local variable, rather that to a BtreeCheckState that can have another users of state->target afterb uniqueness check in the future, but don't have now. So the original patch is correct, and the goal of this refactoring is to untie rightpage fron state structure as it's used only transiently for cross-page unuque check. It's the same style as already used bt_right_page_check_scankey() that loads rightpage into a local variable.

For 0002 I doubt I understand your actual positiob. Could you explain what it violates or doesn't violate?
Please forgive many typos in the previous message, I wrote from phone.

Pavel.
On Fri, May 10, 2024 at 8:35 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> > On May 10, 2024, at 5:10 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > On Fri, 10 May 2024 at 12:39, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > On Fri, May 10, 2024 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Alexander Korotkov <aekorotkov@gmail.com> writes:
> > > > The revised patchset is attached.  I applied cosmetical changes.  I'm
> > > > going to push it if no objections.
> > >
> > > Is this really suitable material to be pushing post-feature-freeze?
> > > It doesn't look like it's fixing any new-in-v17 issues.
> >
> > These are code improvements to the 5ae2087202, which answer critics in
> > the thread.  0001 comprises an optimization, but it's rather small and
> > simple.  0002 and 0003 contain refactoring.  0004 contains better
> > error reporting.  For me this looks like pretty similar to what others
> > commit post-FF, isn't it?
> > I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages.
> > In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.
> >
> > Patch v2-0003 doesn't contain credits and a discussion link. All other patches do.
> >
> > Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO
theycould be pushed post-FF. 
>
> v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness
orvalue of the improvement, but it seems out of scope post feature freeze.  The patch seems to postpone uniqueness
checkinguntil later in the scan than what the prior version did, and that kind of change could require more analysis
thanwe have time for at this point in the release cycle. 

Formally this could be classified as algorithmic change and probably
should be postponed to the next release.  But that's quite local
optimization, which just postpones a function call within the same
iteration of loop.  And the effect is huge.  Probably we could allow
this post-FF in the sake of quality release, given it's very local
change with a huge effect.

> v2-0002 does appear to just be refactoring.  I don't care for a small portion of that patch, but I doubt it violates
thepost feature freeze rules.  In particular: 
>
>   +   BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL};

I don't understand what is the problem with this line and post feature
freeze rules.  Please, explain it more.

> v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also.  There is already a
memberof the BtreeCheckState called "target" and a memory context in that struct called "targetcontext".  That context
isused to allocate pages "state->target", "rightpage", "child" and "page", but not "metapage".  Perhaps "targetcontext"
isa poor choice of name?  "notmetacontext" is a terrible name, but closer to describing the purpose of the memory
context. Care to propose something sensible? 
>
> Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when
entering
>
>         /*
>          * * Downlink check *
>          *
>          * Additional check of child items iff this is an internal page and
>          * caller holds a ShareLock.  This happens for every downlink (item)
>          * in target excluding the negative-infinity downlink (again, this is
>          * because it has no useful value to compare).
>          */
>         if (!P_ISLEAF(topaque) && state->readonly)
>             bt_child_check(state, skey, offset);
>
> and thereafter.  Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target
getsprocessed in the downlink check and thereafter.  This is either introducing a bug, or fixing one, but the commit
messageis totally ambiguous about whether this is a bugfix or a code cleanup or something else?  I think this kind of
patchshould have a super clear commit message about what it thinks it is doing. 

The only bt_target_page_check() caller is
bt_check_level_from_leftmost(), which overrides state->target in the
next iteration anyway.  I think the patch is just refactoring to
eliminate the confusion pointer by Peter Geoghegan upthread.

0002 and 0003 don't address any bugs, but It would be very nice to
accept them, because it would simplify future backpatching in this
area.

> v2-0004 guards against a real threat, and is reasonable post feature freeze

Ok.

------
Regards,
Alexander Korotkov
Supabase




> On May 10, 2024, at 11:42 AM, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> IMO 0003 doesn't introduce nor fixes a bug. It loads rightpage into a local variable, rather that to a
BtreeCheckStatethat can have another users of state->target afterb uniqueness check in the future, but don't have now.
Sothe original patch is correct, and the goal of this refactoring is to untie rightpage fron state structure as it's
usedonly transiently for cross-page unuque check. It's the same style as already used bt_right_page_check_scankey()
thatloads rightpage into a local variable. 

Well, you can put an Assert(false) dead in the middle of the code we're discussing and all the regression tests still
pass,so I'd argue the change is getting zero test coverage. 

This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting
"state->target",which the old implementation most certainly did.  That means that after returning from
bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what
itwould have been prior to this patch.  Now, that'd be irrelevant if nobody goes on to consult that value, but just 44
linesfurther down in bt_check_level_from_leftmost() state->target is clearly used.  So the behavior at that point is
changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the
patch,wrong after the patch, or something else?  Is this a bug being introduced, being fixed, or ... ? 

Having a regression test that actually touches this code would go a fair way towards helping the analysis.

> For 0002 I doubt I understand your actual positiob. Could you explain what it violates or doesn't violate?

v2-0002 is does not violate the post feature freeze restriction on new features so far as I can tell, but I just don't
carefor the variable initialization because it doesn't name the fields.  If anybody refactored the struct they might
notnotice that the need to reorder this initialization, and depending on various factors including compiler flags they
mightnot get an error. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







> On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> The only bt_target_page_check() caller is
> bt_check_level_from_leftmost(), which overrides state->target in the
> next iteration anyway.  I think the patch is just refactoring to
> eliminate the confusion pointer by Peter Geoghegan upthread.

I find your argument unconvincing.

After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in
thenext iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by
state->target. See line 963. 

I'm left with four possibilities:


1)  bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same
iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're
makingdoesn't matter. 

2)  The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing
tohappen at what is now line 963.  The patch fixes the bug, because state->target no longer gets overwritten where you
arenow using "rightpage" for the value. 

3)  The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but
v2-0003has broken that.  

4)  It's been broken all along and your patch just changes from wrong to wrong.


If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you
arenot documenting it.  Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is.
I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or
theother.  I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding
Assertsdoesn't help to investigate the question. 

If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix.  Bug
fixesshould be clearly documented as such, otherwise future work might assume the commit can be reverted with only
stylisticconsequences. 

If (3) is true, then I'm complaining that the patch is flat busted.

If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are
needed.

Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.


For reference, I said something similar earlier today in another email to this thread:

This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting
"state->target",which the old implementation most certainly did.  That means that after returning from
bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what
itwould have been prior to this patch.  Now, that'd be irrelevant if nobody goes on to consult that value, but just 44
linesfurther down in bt_check_level_from_leftmost() state->target is clearly used.  So the behavior at that point is
changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the
patch,wrong after the patch, or something else?  Is this a bug being introduced, being fixed, or ... ? 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Mark Dilger <mark.dilger@enterprisedb.com> writes:
> Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.

Indeed.  If we have no regression tests that reach this code, it's
folly to touch it at all, but most especially so post-feature-freeze.

I think the *first* order of business ought to be to create some
test cases that reach this area.  Perhaps they'll be too expensive
to incorporate in our regular regression tests, but we could still
use them to investigate Mark's concerns.

            regards, tom lane



On Sat, May 11, 2024 at 4:13 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > The only bt_target_page_check() caller is
> > bt_check_level_from_leftmost(), which overrides state->target in the
> > next iteration anyway.  I think the patch is just refactoring to
> > eliminate the confusion pointer by Peter Geoghegan upthread.
>
> I find your argument unconvincing.
>
> After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target
inthe next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by
state->target. See line 963. 
>
> I'm left with four possibilities:
>
>
> 1)  bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same
iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're
makingdoesn't matter. 
>
> 2)  The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong
thingto happen at what is now line 963.  The patch fixes the bug, because state->target no longer gets overwritten
whereyou are now using "rightpage" for the value. 
>
> 3)  The code used to work, having set up state->target correctly in the place where you are now using "rightpage",
butv2-0003 has broken that. 
>
> 4)  It's been broken all along and your patch just changes from wrong to wrong.
>
>
> If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that
youare not documenting it.  Even with documentation of this interrelationship, I'd be unhappy with how brittle the code
is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way
orthe other.  I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so
addingAsserts doesn't help to investigate the question. 
>
> If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix.  Bug
fixesshould be clearly documented as such, otherwise future work might assume the commit can be reverted with only
stylisticconsequences. 
>
> If (3) is true, then I'm complaining that the patch is flat busted.
>
> If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are
needed.
>
> Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
>
>
> For reference, I said something similar earlier today in another email to this thread:
>
> This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting
"state->target",which the old implementation most certainly did.  That means that after returning from
bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what
itwould have been prior to this patch.  Now, that'd be irrelevant if nobody goes on to consult that value, but just 44
linesfurther down in bt_check_level_from_leftmost() state->target is clearly used.  So the behavior at that point is
changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the
patch,wrong after the patch, or something else?  Is this a bug being introduced, being fixed, or ... ? 

Thank you for your analysis.  I'm inclined to believe in 2, but not
yet completely sure.  It's really pity that our tests don't cover
this.  I'm investigating this area.

------
Regards,
Alexander Korotkov



On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > The only bt_target_page_check() caller is
> > > bt_check_level_from_leftmost(), which overrides state->target in the
> > > next iteration anyway.  I think the patch is just refactoring to
> > > eliminate the confusion pointer by Peter Geoghegan upthread.
> >
> > I find your argument unconvincing.
> >
> > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target
inthe next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by
state->target. See line 963. 
> >
> > I'm left with four possibilities:
> >
> >
> > 1)  bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same
iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're
makingdoesn't matter. 
> >
> > 2)  The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong
thingto happen at what is now line 963.  The patch fixes the bug, because state->target no longer gets overwritten
whereyou are now using "rightpage" for the value. 
> >
> > 3)  The code used to work, having set up state->target correctly in the place where you are now using "rightpage",
butv2-0003 has broken that. 
> >
> > 4)  It's been broken all along and your patch just changes from wrong to wrong.
> >
> >
> > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that
youare not documenting it.  Even with documentation of this interrelationship, I'd be unhappy with how brittle the code
is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way
orthe other.  I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so
addingAsserts doesn't help to investigate the question. 
> >
> > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix.  Bug
fixesshould be clearly documented as such, otherwise future work might assume the commit can be reverted with only
stylisticconsequences. 
> >
> > If (3) is true, then I'm complaining that the patch is flat busted.
> >
> > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are
needed.
> >
> > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
> >
> >
> > For reference, I said something similar earlier today in another email to this thread:
> >
> > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting
"state->target",which the old implementation most certainly did.  That means that after returning from
bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what
itwould have been prior to this patch.  Now, that'd be irrelevant if nobody goes on to consult that value, but just 44
linesfurther down in bt_check_level_from_leftmost() state->target is clearly used.  So the behavior at that point is
changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the
patch,wrong after the patch, or something else?  Is this a bug being introduced, being fixed, or ... ? 
>
> Thank you for your analysis.  I'm inclined to believe in 2, but not
> yet completely sure.  It's really pity that our tests don't cover
> this.  I'm investigating this area.

It seems that I got to the bottom of this.  Changing
BtreeCheckState.target for a cross-page unique constraint check is
wrong, but that happens only for leaf pages.  After that
BtreeCheckState.target is only used for setting the low key.  The low
key is only used for non-leaf pages.  So, that didn't lead to any
visible bug.  I've revised the commit message to reflect this.

So, the picture for the patches is the following now.
0001 – optimization, but rather simple and giving huge effect
0002 – refactoring
0003 – fix for the bug
0004 – better error reporting

------
Regards,
Alexander Korotkov

Attachment
Hi, Alexander!

On Mon, 13 May 2024 at 05:42, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > The only bt_target_page_check() caller is
> > > bt_check_level_from_leftmost(), which overrides state->target in the
> > > next iteration anyway.  I think the patch is just refactoring to
> > > eliminate the confusion pointer by Peter Geoghegan upthread.
> >
> > I find your argument unconvincing.
> >
> > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target.  See line 963.
> >
> > I'm left with four possibilities:
> >
> >
> > 1)  bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.
> >
> > 2)  The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963.  The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.
> >
> > 3)  The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.
> >
> > 4)  It's been broken all along and your patch just changes from wrong to wrong.
> >
> >
> > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you are not documenting it.  Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is.  I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other.  I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.
> >
> > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix.  Bug fixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylistic consequences.
> >
> > If (3) is true, then I'm complaining that the patch is flat busted.
> >
> > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.
> >
> > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
> >
> >
> > For reference, I said something similar earlier today in another email to this thread:
> >
> > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did.  That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch.  Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used.  So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else?  Is this a bug being introduced, being fixed, or ... ?
>
> Thank you for your analysis.  I'm inclined to believe in 2, but not
> yet completely sure.  It's really pity that our tests don't cover
> this.  I'm investigating this area.

It seems that I got to the bottom of this.  Changing
BtreeCheckState.target for a cross-page unique constraint check is
wrong, but that happens only for leaf pages.  After that
BtreeCheckState.target is only used for setting the low key.  The low
key is only used for non-leaf pages.  So, that didn't lead to any
visible bug.  I've revised the commit message to reflect this.
 
I agree with your analysis regarding state->target:
- when the unique check is on, state->target was reassigned only for the leaf pages (under P_ISLEAF(topaque) in bt_target_page_check).
- in this level (leaf) in bt_check_level_from_leftmost() this value of state->target was used to get state->lowkey. Then it was reset (in the next iteration of do loop in in bt_check_level_from_leftmost()
- state->lowkey lives until the end of pages level (leaf) iteration cycle. Then, low-key is reset (state->lowkey = NULL in the end of  bt_check_level_from_leftmost())
- state->lowkey is used only in bt_child_check/bt_child_highkey_check. Both are called only from non-leaf pages iteration cycles (under P_ISLEAF(topaque))
- Also there is a check (rightblock_number != P_NONE) in before getting rightpage into state->target in bt_target_page_check() that ensures us that rightpage indeed exists and getting this (unused) lowkey in bt_check_level_from_leftmost will not invoke any page reading errors.

I'm pretty sure that there was no bug in this, not just the bug was hidden.

Indeed re-assigning state->target in leaf page iteration for cross-page unique check was not beautiful, and Peter pointed out this. In my opinion the patch 0003 is a pure code refactoring. 

As for the cross-page check regression/TAP testing, this test had problems since the btree page layout is not fixed (especially it's different on 32-bit arch). I had a variant for testing cross-page check when the test was yet regression one upthread for both 32/64 bit architectures. I remember it was decided not to include it due to complications and low impact for testing the corner case of very rare cross-page duplicates. (There were also suggestions to drop cross-page duplicates check at all, which I didn't agree 2 years ago, but still it can make sense)

Separately, I propose to avoid getting state->lowkey for leaf pages at all as it's unused. PFA is a simple patch for this. (I don't add it to the current patch set as I believe it has nothing to do with UNIQUE constraint check, rather it improves the previous btree amcheck code)

Best regards,
Pavel Borisov,
Supabase

Attachment


On Mon, 13 May 2024 at 15:55, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
Hi, Alexander!

On Mon, 13 May 2024 at 05:42, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > The only bt_target_page_check() caller is
> > > bt_check_level_from_leftmost(), which overrides state->target in the
> > > next iteration anyway.  I think the patch is just refactoring to
> > > eliminate the confusion pointer by Peter Geoghegan upthread.
> >
> > I find your argument unconvincing.
> >
> > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target.  See line 963.
> >
> > I'm left with four possibilities:
> >
> >
> > 1)  bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.
> >
> > 2)  The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963.  The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.
> >
> > 3)  The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.
> >
> > 4)  It's been broken all along and your patch just changes from wrong to wrong.
> >
> >
> > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you are not documenting it.  Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is.  I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other.  I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.
> >
> > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix.  Bug fixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylistic consequences.
> >
> > If (3) is true, then I'm complaining that the patch is flat busted.
> >
> > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.
> >
> > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
> >
> >
> > For reference, I said something similar earlier today in another email to this thread:
> >
> > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did.  That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch.  Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used.  So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else?  Is this a bug being introduced, being fixed, or ... ?
>
> Thank you for your analysis.  I'm inclined to believe in 2, but not
> yet completely sure.  It's really pity that our tests don't cover
> this.  I'm investigating this area.

It seems that I got to the bottom of this.  Changing
BtreeCheckState.target for a cross-page unique constraint check is
wrong, but that happens only for leaf pages.  After that
BtreeCheckState.target is only used for setting the low key.  The low
key is only used for non-leaf pages.  So, that didn't lead to any
visible bug.  I've revised the commit message to reflect this.
 
I agree with your analysis regarding state->target:
- when the unique check is on, state->target was reassigned only for the leaf pages (under P_ISLEAF(topaque) in bt_target_page_check).
- in this level (leaf) in bt_check_level_from_leftmost() this value of state->target was used to get state->lowkey. Then it was reset (in the next iteration of do loop in in bt_check_level_from_leftmost()
- state->lowkey lives until the end of pages level (leaf) iteration cycle. Then, low-key is reset (state->lowkey = NULL in the end of  bt_check_level_from_leftmost())
- state->lowkey is used only in bt_child_check/bt_child_highkey_check. Both are called only from non-leaf pages iteration cycles (under P_ISLEAF(topaque))
- Also there is a check (rightblock_number != P_NONE) in before getting rightpage into state->target in bt_target_page_check() that ensures us that rightpage indeed exists and getting this (unused) lowkey in bt_check_level_from_leftmost will not invoke any page reading errors.

I'm pretty sure that there was no bug in this, not just the bug was hidden.

Indeed re-assigning state->target in leaf page iteration for cross-page unique check was not beautiful, and Peter pointed out this. In my opinion the patch 0003 is a pure code refactoring. 

As for the cross-page check regression/TAP testing, this test had problems since the btree page layout is not fixed (especially it's different on 32-bit arch). I had a variant for testing cross-page check when the test was yet regression one upthread for both 32/64 bit architectures. I remember it was decided not to include it due to complications and low impact for testing the corner case of very rare cross-page duplicates. (There were also suggestions to drop cross-page duplicates check at all, which I didn't agree 2 years ago, but still it can make sense)

Separately, I propose to avoid getting state->lowkey for leaf pages at all as it's unused. PFA is a simple patch for this. (I don't add it to the current patch set as I believe it has nothing to do with UNIQUE constraint check, rather it improves the previous btree amcheck code)

A correction of a typo in previous message:
non-leaf pages iteration cycles (under !P_ISLEAF(topaque)) -> non-leaf pages iteration cycles (under !P_ISLEAF(topaque)) 
A correction of a typo in previous message:
non-leaf pages iteration cycles (under P_ISLEAF(topaque)) -> non-leaf pages iteration cycles (under !P_ISLEAF(topaque))

On Mon, 13 May 2024 at 16:19, Pavel Borisov <pashkin.elfe@gmail.com> wrote:


On Mon, 13 May 2024 at 15:55, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
Hi, Alexander!

On Mon, 13 May 2024 at 05:42, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > The only bt_target_page_check() caller is
> > > bt_check_level_from_leftmost(), which overrides state->target in the
> > > next iteration anyway.  I think the patch is just refactoring to
> > > eliminate the confusion pointer by Peter Geoghegan upthread.
> >
> > I find your argument unconvincing.
> >
> > After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target.  See line 963.
> >
> > I'm left with four possibilities:
> >
> >
> > 1)  bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.
> >
> > 2)  The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963.  The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.
> >
> > 3)  The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.
> >
> > 4)  It's been broken all along and your patch just changes from wrong to wrong.
> >
> >
> > If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that you are not documenting it.  Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is.  I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other.  I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.
> >
> > If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix.  Bug fixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylistic consequences.
> >
> > If (3) is true, then I'm complaining that the patch is flat busted.
> >
> > If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.
> >
> > Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
> >
> >
> > For reference, I said something similar earlier today in another email to this thread:
> >
> > This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did.  That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch.  Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used.  So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else?  Is this a bug being introduced, being fixed, or ... ?
>
> Thank you for your analysis.  I'm inclined to believe in 2, but not
> yet completely sure.  It's really pity that our tests don't cover
> this.  I'm investigating this area.

It seems that I got to the bottom of this.  Changing
BtreeCheckState.target for a cross-page unique constraint check is
wrong, but that happens only for leaf pages.  After that
BtreeCheckState.target is only used for setting the low key.  The low
key is only used for non-leaf pages.  So, that didn't lead to any
visible bug.  I've revised the commit message to reflect this.
 
I agree with your analysis regarding state->target:
- when the unique check is on, state->target was reassigned only for the leaf pages (under P_ISLEAF(topaque) in bt_target_page_check).
- in this level (leaf) in bt_check_level_from_leftmost() this value of state->target was used to get state->lowkey. Then it was reset (in the next iteration of do loop in in bt_check_level_from_leftmost()
- state->lowkey lives until the end of pages level (leaf) iteration cycle. Then, low-key is reset (state->lowkey = NULL in the end of  bt_check_level_from_leftmost())
- state->lowkey is used only in bt_child_check/bt_child_highkey_check. Both are called only from non-leaf pages iteration cycles (under P_ISLEAF(topaque))
- Also there is a check (rightblock_number != P_NONE) in before getting rightpage into state->target in bt_target_page_check() that ensures us that rightpage indeed exists and getting this (unused) lowkey in bt_check_level_from_leftmost will not invoke any page reading errors.

I'm pretty sure that there was no bug in this, not just the bug was hidden.

Indeed re-assigning state->target in leaf page iteration for cross-page unique check was not beautiful, and Peter pointed out this. In my opinion the patch 0003 is a pure code refactoring. 

As for the cross-page check regression/TAP testing, this test had problems since the btree page layout is not fixed (especially it's different on 32-bit arch). I had a variant for testing cross-page check when the test was yet regression one upthread for both 32/64 bit architectures. I remember it was decided not to include it due to complications and low impact for testing the corner case of very rare cross-page duplicates. (There were also suggestions to drop cross-page duplicates check at all, which I didn't agree 2 years ago, but still it can make sense)

Separately, I propose to avoid getting state->lowkey for leaf pages at all as it's unused. PFA is a simple patch for this. (I don't add it to the current patch set as I believe it has nothing to do with UNIQUE constraint check, rather it improves the previous btree amcheck code)

A correction of a typo in previous message:
non-leaf pages iteration cycles (under !P_ISLEAF(topaque)) -> non-leaf pages iteration cycles (under !P_ISLEAF(topaque))