Re: Amcheck verification of GiST and GIN - Mailing list pgsql-hackers

From Jose Arthur Benetasso Villanova
Subject Re: Amcheck verification of GiST and GIN
Date
Msg-id 45772f21-f9ac-535a-5050-ff89c389cd5@benetasso.com
Whole thread Raw
In response to Re: Amcheck verification of GiST and GIN  (Andrew Borodin <amborodin86@gmail.com>)
Responses Re: Amcheck verification of GiST and GIN
List pgsql-hackers
Hello.

I reviewed this patch and I would like to share some comments.

It compiled with those 2 warnings:

verify_gin.c: In function 'gin_check_parent_keys_consistency':
verify_gin.c:481:38: warning: declaration of 'maxoff' shadows a previous 
local [-Wshadow=compatible-local]
   481 |                         OffsetNumber maxoff = 
PageGetMaxOffsetNumber(page);
       |                                      ^~~~~~
verify_gin.c:453:41: note: shadowed declaration is here
   453 |                                         maxoff;
       |                                         ^~~~~~
verify_gin.c:423:25: warning: unused variable 'heapallindexed' 
[-Wunused-variable]
   423 |         bool            heapallindexed = *((bool*)callback_state);
       |                         ^~~~~~~~~~~~~~


Also, I'm not sure about postgres' headers conventions, inside amcheck.h, 
there is "miscadmin.h" included, and inside verify_gin.c, verify_gist.h 
and verify_nbtree.c both amcheck.h and miscadmin.h are included.

About the documentation, the bt_index_parent_check has comments about the 
ShareLock and "SET client_min_messages = DEBUG1;", and both 
gist_index_parent_check and gin_index_parent_check lack it. verify_gin 
uses DEBUG3, I'm not sure if it is on purpose, but it would be nice to 
document it or put DEBUG1 to be consistent.

I lack enough context to do a deep review on the code, so in this area 
this patch needs more eyes.

I did the following test:

postgres=# create table teste (t text, tv tsvector);
CREATE TABLE
postgres=# insert into teste values ('hello', 'hello'::tsvector);
INSERT 0 1
postgres=# create index teste_tv on teste using gist(tv);
CREATE INDEX
postgres=# select pg_relation_filepath('teste_tv');
  pg_relation_filepath
----------------------
  base/5/16441
(1 row)

postgres=#
\q
$ bin/pg_ctl -D data -l log
waiting for server to shut down.... done
server stopped
$ okteta base/5/16441 # I couldn't figure out the dd syntax to change the 
1FE9 to '0'
$ bin/pg_ctl -D data -l log
waiting for server to start.... done
server started
$ bin/psql -U ze postgres
psql (16devel)
Type "help" for help.

postgres=# SET client_min_messages = DEBUG3;
SET
postgres=# select gist_index_parent_check('teste_tv'::regclass, true);
DEBUG:  verifying that tuples from index "teste_tv" are present in "teste"
ERROR:  heap tuple (0,1) from table "teste" lacks matching index tuple 
within index "teste_tv"
postgres=#

A simple index corruption in gin:

postgres=# CREATE TABLE "gin_check"("Column1" int[]);
CREATE TABLE
postgres=# insert into gin_check values (ARRAY[1]),(ARRAY[2]);
INSERT 0 2
postgres=# CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
CREATE INDEX
postgres=# select pg_relation_filepath('gin_check_idx');
  pg_relation_filepath
----------------------
  base/5/16453
(1 row)

postgres=#
\q
$ bin/pg_ctl -D data -l logfile stop
waiting for server to shut down.... done
server stopped
$ okteta data/base/5/16453 # edited some bits near 3FCC
$ bin/pg_ctl -D data -l logfile start
waiting for server to start.... done
server started
$ bin/psql -U ze postgres
psql (16devel)
Type "help" for help.

postgres=# SET client_min_messages = DEBUG3;
SET
postgres=# SELECT gin_index_parent_check('gin_check_idx', true);
ERROR:  number of items mismatch in GIN entry tuple, 49 in tuple header, 1 
decoded
postgres=#

There are more code paths to follow to check the entire code, and I had a 
hard time to corrupt the indices. Is there any automated code to corrupt 
index to test such code?


--
Jose Arthur Benetasso Villanova




pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Bug in row_number() optimization
Next
From: Julien Rouhaud
Date:
Subject: Re: Allow file inclusion in pg_hba and pg_ident files