Isn't gist_page_items broken for opclasses with compress methods? - Mailing list pgsql-hackers

From Tomas Vondra
Subject Isn't gist_page_items broken for opclasses with compress methods?
Date
Msg-id 70c9e416-e765-db10-b288-acc992678c20@enterprisedb.com
Whole thread Raw
List pgsql-hackers
Hi,

I've been looking at a report of a crash an in-place upgrade [1],
related to a GiST index, and I tried to use the new GiST support added
to pageinspect in 756ab29124 (so it's in 14). My knowledge of GiST is
somewhat limited, but after struggling with it for a while I wonder if
it's actually correct for GiST indexes with "compress" methods.

Consider a simple example with ltree:

  CREATE TABLE test (a ltree);
  INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');

  SELECT * FROM test;

                   a
  ------------------------------------
   Top.Collections.Pictures.Astronomy
  (1 row)

  CREATE INDEX ON test USING gist (a);

so far everything seems fine, but let's use the pageinspect stuff.

  SELECT * FROM gist_page_items(get_raw_page('test_a_idx', 0),
                               'test_a_idx');

  WARNING:  problem in alloc set ExprContext: detected write past chunk
            end in block 0x1b1b110, chunk 0x1b1d2c0
  WARNING:  problem in alloc set ExprContext: bad single-chunk 0x1b1d358
            in block 0x1b1b110
  WARNING:  problem in alloc set ExprContext: found inconsistent memory
            block 0x1b1b110

   itemoffset | ctid  | itemlen | dead |  keys
  ------------+-------+---------+------+--------
            1 | (0,1) |      96 | f    | (a)=()
  (1 row)

Well, that's not great, right? I know pageinspect may misbehave if you
supply incorrect info or for corrupted pages, but that's not what's
happening here - the heap/index are fine, we have the right descriptor
and all of that. Yet we corrupt memory, and generate bogus result (I
mean, the keys clear are not empty).

The simple truth is this simply assumes the GiST index simply stores the
input data as is, and we can happily call output on the procedure, so we
read stuff from the index and call ltree_out() on it. But that's
nonsense, because what the index actually stores is ltree_gist, which is
essentially "ltree" with extra 8B header. Which (of course) utterly
confuses ltree_out, producing nonsense result.


The comment before BuildIndexValueDescription(), which is used to print
the keys, says this:

  * The passed-in values/nulls arrays are the "raw" input to the
  * index AM, e.g. results of FormIndexDatum --- this is not
  * necessarily what is stored in the index, but it's what the
  * user perceives to be stored.

Which I think simply means it's fine to call BuildIndexValueDescription
on the pass to index AMs, not on the stuff we read from the index.
Because it's up to the AM to transform it arbitrarily.


In the regression tests it seemingly works, but I believe that's merely
due to using a Point. gist/point_ops defines gist_point_compress, but
that merely builds BOX, which is two points next to each other. So we
simply print the first one ...

I've tried various other GiST opclasses (e.g. btree_git) and all of that
prints just bogus values.


Not sure what to do about this. BuildIndexValueDescription seems like
the wrong place to deal with this, and it was intended for entirely
different purpose (essentially just error messages).

So perhaps pageinspect adding a variant that knows how to decompress the
value? Or just not printing the keys, because the current state seems a
bit useless and confusing.


regards

https://www.postgresql.org/message-id/17406-71e02820ae79bb40%40postgresql.org

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: "Finnerty, Jim"
Date:
Subject: Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
Next
From: Bruce Momjian
Date:
Subject: Re: support for CREATE MODULE