GiST limits on contrib/cube with dimension > 100? - Mailing list pgsql-hackers

From Siarhei Siniak
Subject GiST limits on contrib/cube with dimension > 100?
Msg-id 631800776.720455.1560103520960@mail.yahoo.com
List pgsql-hackers
I've been using the cube extension recompiled with
#define MAX_DIM 256.

But with version 11.3 I'm getting the following error:
failed to add item to index page in <index_name>

There's a regression unit test in contrib/cube/expected/cube.out:

CREATE TABLE test_cube (c cube);
\copy test_cube from 'data/test_cube.data'
CREATE INDEX test_cube_ix ON test_cube USING gist (c);
SELECT * FROM test_cube WHERE c && '(3000,1000),(0,0)' ORDER BY c;

I created the GiST index in the same way, i.e. CREATE INDEX <index_name> ON <table_name> USING gist (<column_name>);

If MAX_DIM is set to 512, a btree index complains:
index row size 4112 exceeds maximum 2712 for index <index_name>
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

That's why 256 has been set.
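
The 4112 figure is consistent with a 512-dimensional point cube, going by the NDBOX layout quoted further down in this message. A minimal sketch of the arithmetic follows. The helper name is hypothetical, and the 8-byte index tuple header is an assumption that is not stated in the error message itself:

```c
#include <stddef.h>

/* Sizes assumed for this sketch; only the two NDBOX fields come from the
 * struct quoted in this message. */
enum {
    VARLENA_HEADER_BYTES = 4,     /* int32 vl_len_ */
    CUBE_HEADER_BYTES = 4,        /* unsigned int header */
    INDEX_TUPLE_HEADER_BYTES = 8  /* assumption: per-row index tuple header */
};

/* Hypothetical helper: estimated index row size for a point cube, i.e. a
 * cube whose point flag is set so that only one coordinate per dimension
 * is stored. Assumes an 8-byte double. */
static size_t point_cube_index_row_size(size_t ndims)
{
    return INDEX_TUPLE_HEADER_BYTES
         + VARLENA_HEADER_BYTES
         + CUBE_HEADER_BYTES
         + ndims * sizeof(double);
}
```

Under these assumptions, 512 dimensions give 8 + 4 + 4 + 512 * 8 = 4112 bytes, which matches the reported row size, while 256 dimensions give 2064 bytes, which is below the 2712-byte btree limit.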

But GiST gives no explanation for its error.

These are the places where the message might have been generated:
src/backend/access/gist/gist.c:418: elog(ERROR, "failed to add item to index page in \"%s\"", RelationGetRelationName(rel));
src/backend/access/gist/gist.c:540: elog(ERROR, "failed to add item to index page in \"%s\"",

The question is: what prevents setting MAX_DIM higher than 100 in a custom recompiled version of the cube extension?
In practice, the error messages are too cryptic.

contrib/cube/cube.c has the following GiST support functions:
/*
** GiST support methods
*/

PG_FUNCTION_INFO_V1(g_cube_consistent);
PG_FUNCTION_INFO_V1(g_cube_compress);
PG_FUNCTION_INFO_V1(g_cube_decompress);
PG_FUNCTION_INFO_V1(g_cube_penalty);
PG_FUNCTION_INFO_V1(g_cube_picksplit);
PG_FUNCTION_INFO_V1(g_cube_union);
PG_FUNCTION_INFO_V1(g_cube_same);
PG_FUNCTION_INFO_V1(g_cube_distance);

g_cube_compress has the following body:
    PG_RETURN_DATUM(PG_GETARG_DATUM(0));

Does it just return a void pointer to the underlying x array?
The cube data structure:
typedef struct NDBOX
{
    /* varlena header (do not touch directly!) */
    int32        vl_len_;

    /*----------
     * Header contains info about NDBOX. For binary compatibility with old
     * versions, it is defined as "unsigned int".
     *
     * Following information is stored:
     *
     *    bits 0-7  : number of cube dimensions;
     *    bits 8-30 : unused, initialize to zero;
     *    bit  31   : point flag. If set, the upper right coordinates are not
     *                stored, and are implicitly the same as the lower left
     *                coordinates.
     *----------
     */
    unsigned int header;

    /*
     * The lower left coordinates for each dimension come first, followed by
     * upper right coordinates unless the point flag is set.
     */
    double        x[FLEXIBLE_ARRAY_MEMBER];
} NDBOX;

Could the problem be that something exceeds a size limit when building or updating a GiST index for cubes with MAX_DIM > 100?
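
One way to explore this question is to estimate how many cube keys could fit on a single index page. The sketch below uses a simplified stand-in for NDBOX with the same field order as the struct above; all names in it are hypothetical. It assumes the default 8192-byte block size and ignores page and tuple headers, so the counts are optimistic upper bounds. Whether this is what actually triggers the GiST error is not established here.

```c
#include <stddef.h>

/* Simplified stand-in for NDBOX (hypothetical name, same field order). */
typedef struct SketchBox
{
    int          vl_len_;  /* varlena header */
    unsigned int header;   /* dimension count and point flag */
    double       x[];      /* coordinates */
} SketchBox;

/* Size of a point cube: one coordinate per dimension. */
static size_t sketch_point_size(size_t ndims)
{
    return offsetof(SketchBox, x) + sizeof(double) * ndims;
}

/* Size of a full box: lower-left plus upper-right coordinates. */
static size_t sketch_box_size(size_t ndims)
{
    return offsetof(SketchBox, x) + sizeof(double) * ndims * 2;
}

/* Upper bound on the number of keys per page, ignoring all page and
 * tuple overhead. */
static size_t sketch_keys_per_page(size_t key_bytes, size_t page_bytes)
{
    return page_bytes / key_bytes;
}
```

With these assumptions, a 100-dimension box takes 1608 bytes, so at most five fit in 8192 bytes, while a 256-dimension box takes 4104 bytes, so at most one fits. If GiST needs more than one key per page in order to split a page, that could explain why insertion fails once the dimension count grows, but that remains a guess based on sizes alone.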
 
