Thread: spgist: palloc() negative size with smaller BLCKSZ

spgist: palloc() negative size with smaller BLCKSZ

From
Josh Kupershmidt
Date:
Hi all,

I noticed that configuring Postgres with a BLCKSZ smaller than default
was causing 'make check' give an interesting error on git head. You
should be able to see this with a simple:

  ./configure --enable-debug --enable-cassert --with-blocksize=4 &&
  make check

which, among a few other seemingly-minor sorting and EXPLAIN
differences, gives this nasty regression (for a 64-bit binary, built
on OS X 10.6):

--- /media/src/OSS/postgresql/src/test/regress/results/create_index.out    2012-06-25
18:49:00.000000000 -0700
***************
*** 79,84 ****
--- 79,85 ----
  INSERT INTO suffix_text_tbl VALUES ('P0123456789abcde');
  INSERT INTO suffix_text_tbl VALUES ('P0123456789abcdefF');
  CREATE INDEX sp_suff_ind ON suffix_text_tbl USING spgist (t);
+ ERROR:  invalid memory alloc request size 18446744073709551520
  --
  -- Test GiST and SP-GiST indexes
  --

Or, for a similar 32-bit build on a Debian machine, I get:
+ ERROR:  invalid memory alloc request size 4294967200

I believe this problem stems from this definition in spgtextproc.c:

#define SPGIST_MAX_PREFIX_LENGTH        (BLCKSZ - 256 * 16 - 100)

With a BLCKSZ of 4096, that comes to -100, which gets picked up here:

        commonLen = Min(commonLen, SPGIST_MAX_PREFIX_LENGTH);

and ultimately palloc'ed as a size. I'm not sure what'd be the right
way to fix this after reading the comment above
SPGIST_MAX_PREFIX_LENGTH, but thought I'd share.

Josh

Re: spgist: palloc() negative size with smaller BLCKSZ

From
Tom Lane
Date:
Josh Kupershmidt <schmiddy@gmail.com> writes:
> I noticed that configuring Postgres with a BLCKSZ smaller than default
> was causing 'make check' give an interesting error on git head.
> [ spgist regression test fails ]

Ugh.

> I'm not sure what'd be the right
> way to fix this after reading the comment above
> SPGIST_MAX_PREFIX_LENGTH, but thought I'd share.

I don't see any way for spgtextproc to support BLCKSZ of 4K or less
without some compromise of functionality.  We have a number of possible
ways we could compromise:

1. Just fail outright for small BLCKSZ.  Probably not very nice
considering the spgist regression test would then fall over.

2. Set SPGIST_MAX_PREFIX_LENGTH to something reasonable (maybe about
50 bytes or so) and just accept the possibility that it will fail if
an internal node gets too large due to too many distinct node labels.
This isn't really desirable either, but I suspect the probability of
failure in practice is pretty low, so it might be the right level of
effort considering use of non-8K BLCKSZ in the field is negligible.

3. Do something to reduce the number of node labels that are possible
when BLCKSZ is too small; for instance we could mask off the high order
bit so as to merge values that have the same next byte mod 128.  The
trouble with this is that spgtextproc currently uses the node label
literally as the next byte when reconstructing strings.  We'd have to
not use the node label for that, which would bloat entries slightly,
and perhaps worse would result in a significant difference in index
logic for large and small index pages.  I don't think I really want to
go there, especially not in the absence of a buildfarm machine testing
the small-page case.

I'm kind of leaning to #2, but anybody have another idea?

            regards, tom lane