GIST and TOAST - Mailing list pgsql-hackers

From Gregory Stark
Subject GIST and TOAST
Date
Msg-id 87irdjr4w7.fsf@stark.xeocode.com
Whole thread Raw
List pgsql-hackers
I have a problem getting packed varlenas to work with GIST indexes. Namely,
the GIST code doesn't call pg_detoast_datum() enough. Instead of using the
DatumGetFoo() macros it uses DatumGetPointer() and calls PG_DETOAST_DATUM only
when it thinks it'll be necessary.

I've temporarily made the packed varlena code ignore user defined data types.
This isn't very satisfactory though since it means domains over things like
text don't get packed.

And in any case even with this in place it doesn't fix all the problems. The
Postgres regression tests pass but I can still trigger problems because the
GIST code doesn't reliably call detoast even on user data types they're given.

I think these are actual bugs. If you happened to provide a large enough datum
to the gist code it would cause the same problem I'm seeing. The packed
varlena patch just makes it easier to trigger.

I've fixed the problems I'm seeing with the following patch to _int_gist.c.
But I'm not sure it's correct. What I'm afraid of is that I'm not sure where
these functions are being called from and whether they expect to be leaking
memory. If they're expected to not leak memory then they're now leaking the
detoasted datum and that's a problem.

I'm also wondering if there aren't similar problems in the dozens of other
gist indexing modules...

I wouldn't mind a second pair of eyes on the _int_gist.c changes if there's
someone who can tell me whether any of these functions require a
PG_FREE_IF_COPY or equivalent.


Index: _int_gist.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/contrib/intarray/_int_gist.c,v
retrieving revision 1.16
diff -u -r1.16 _int_gist.c
--- _int_gist.c    4 Oct 2006 00:29:45 -0000    1.16
+++ _int_gist.c    2 Mar 2007 16:09:26 -0000
@@ -1,6 +1,6 @@#include "_int.h"
-#define GETENTRY(vec,pos) ((ArrayType *) DatumGetPointer((vec)->vector[(pos)].key))
+#define GETENTRY(vec,pos) (DatumGetArrayTypeP((vec)->vector[(pos)].key))/*** GiST support methods
@@ -39,7 +39,7 @@    if (strategy == BooleanSearchStrategy)    {        retval = execconsistent((QUERYTYPE *) query,
-                                (ArrayType *) DatumGetPointer(entry->key),
+                                DatumGetArrayTypeP(entry->key),                                GIST_LEAF(entry));
 pfree(query);
 
@@ -58,7 +58,7 @@    switch (strategy)    {        case RTOverlapStrategyNumber:
-            retval = inner_int_overlap((ArrayType *) DatumGetPointer(entry->key),
+            retval = inner_int_overlap(DatumGetArrayTypeP(entry->key),                                       query);
        break;        case RTSameStrategyNumber:
 
@@ -70,21 +70,21 @@                                    PointerGetDatum(&retval)                    );            else
-                retval = inner_int_contains((ArrayType *) DatumGetPointer(entry->key),
+                retval = inner_int_contains(DatumGetArrayTypeP(entry->key),
query);           break;        case RTContainsStrategyNumber:        case RTOldContainsStrategyNumber:
 
-            retval = inner_int_contains((ArrayType *) DatumGetPointer(entry->key),
+            retval = inner_int_contains(DatumGetArrayTypeP(entry->key),                                        query);
          break;        case RTContainedByStrategyNumber:        case RTOldContainedByStrategyNumber:            if
(GIST_LEAF(entry))               retval = inner_int_contains(query,
 
-                                  (ArrayType *) DatumGetPointer(entry->key));
+                                  DatumGetArrayTypeP(entry->key));            else
-                retval = inner_int_overlap((ArrayType *) DatumGetPointer(entry->key),
+                retval = inner_int_overlap(DatumGetArrayTypeP(entry->key),
query);           break;        default:
 
@@ -282,10 +282,10 @@    float        tmp1,                tmp2;
-    ud = inner_int_union((ArrayType *) DatumGetPointer(origentry->key),
-                         (ArrayType *) DatumGetPointer(newentry->key));
+    ud = inner_int_union(DatumGetArrayTypeP(origentry->key),
+                         DatumGetArrayTypeP(newentry->key));    rt__int_size(ud, &tmp1);
-    rt__int_size((ArrayType *) DatumGetPointer(origentry->key), &tmp2);
+    rt__int_size(DatumGetArrayTypeP(origentry->key), &tmp2);    *result = tmp1 - tmp2;    pfree(ud);
@@ -297,8 +297,8 @@Datumg_int_same(PG_FUNCTION_ARGS){
-    ArrayType  *a = (ArrayType *) PointerGetDatum(PG_GETARG_POINTER(0));
-    ArrayType  *b = (ArrayType *) PointerGetDatum(PG_GETARG_POINTER(1));
+    ArrayType  *a = PG_GETARG_ARRAYTYPE_P(0);
+    ArrayType  *b = PG_GETARG_ARRAYTYPE_P(1);    bool       *result = (bool *) PG_GETARG_POINTER(2);    int4        n
=ARRNELEMS(a);    int4       *da,
 


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: "Florian G. Pflug"
Date:
Subject: Re: UPSERT
Next
From: "Pavan Deolasee"
Date:
Subject: Re: HOT - preliminary results