Isn't gist_page_items broken for opclasses with compress methods? - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Isn't gist_page_items broken for opclasses with compress methods? |
Date | |
Msg-id | 70c9e416-e765-db10-b288-acc992678c20@enterprisedb.com Whole thread Raw |
List | pgsql-hackers |
Hi, I've been looking at a report of a crash an in-place upgrade [1], related to a GiST index, and I tried to use the new GiST support added to pageinspect in 756ab29124 (so it's in 14). My knowledge of GiST is somewhat limited, but after struggling with it for a while I wonder if it's actually correct for GiST indexes with "compress" methods. Consider a simple example with ltree: CREATE TABLE test (a ltree); INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy'); SELECT * FROM test; a ------------------------------------ Top.Collections.Pictures.Astronomy (1 row) CREATE INDEX ON test USING gist (a); so far everything seems fine, but let's use the pageinspect stuff. SELECT * FROM gist_page_items(get_raw_page('test_a_idx', 0), 'test_a_idx'); WARNING: problem in alloc set ExprContext: detected write past chunk end in block 0x1b1b110, chunk 0x1b1d2c0 WARNING: problem in alloc set ExprContext: bad single-chunk 0x1b1d358 in block 0x1b1b110 WARNING: problem in alloc set ExprContext: found inconsistent memory block 0x1b1b110 itemoffset | ctid | itemlen | dead | keys ------------+-------+---------+------+-------- 1 | (0,1) | 96 | f | (a)=() (1 row) Well, that's not great, right? I know pageinspect may misbehave if you supply incorrect info or for corrupted pages, but that's not what's happening here - the heap/index are fine, we have the right descriptor and all of that. Yet we corrupt memory, and generate bogus result (I mean, the keys clear are not empty). The simple truth is this simply assumes the GiST index simply stores the input data as is, and we can happily call output on the procedure, so we read stuff from the index and call ltree_out() on it. But that's nonsense, because what the index actually stores is ltree_gist, which is essentially "ltree" with extra 8B header. Which (of course) utterly confuses ltree_out, producing nonsense result. The comment before BuildIndexValueDescription(), which is used to print the keys, says this: * The passed-in values/nulls arrays are the "raw" input to the * index AM, e.g. results of FormIndexDatum --- this is not * necessarily what is stored in the index, but it's what the * user perceives to be stored. Which I think simply means it's fine to call BuildIndexValueDescription on the pass to index AMs, not on the stuff we read from the index. Because it's up to the AM to transform it arbitrarily. In the regression tests it seemingly works, but I believe that's merely due to using a Point. gist/point_ops defines gist_point_compress, but that merely builds BOX, which is two points next to each other. So we simply print the first one ... I've tried various other GiST opclasses (e.g. btree_git) and all of that prints just bogus values. Not sure what to do about this. BuildIndexValueDescription seems like the wrong place to deal with this, and it was intended for entirely different purpose (essentially just error messages). So perhaps pageinspect adding a variant that knows how to decompress the value? Or just not printing the keys, because the current state seems a bit useless and confusing. regards https://www.postgresql.org/message-id/17406-71e02820ae79bb40%40postgresql.org -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: