Thread: [HACKERS] Unusable SP-GiST index

[HACKERS] Unusable SP-GiST index

From
Vik Fearing
Date:
While trying to find a case where spgist wins over btree for text, I
came across the following behavior which I would consider a bug:

CREATE TABLE texts (value text);
INSERT INTO texts SELECT repeat('a', (2^20)::integer);
CREATE INDEX ON texts USING spgist (value);
SET enable_seqscan = off;
TABLE texts;

That produces:

ERROR:  index row requires 12024 bytes, maximum size is 8191

It seems to me the index should not be allowed to be created if it won't
be usable.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Unusable SP-GiST index

From
Tom Lane
Date:
Vik Fearing <vik@2ndquadrant.fr> writes:
> While trying to find a case where spgist wins over btree for text, I
> came across the following behavior which I would consider a bug:

> CREATE TABLE texts (value text);
> INSERT INTO texts SELECT repeat('a', (2^20)::integer);
> CREATE INDEX ON texts USING spgist (value);
> SET enable_seqscan = off;
> TABLE texts;

> That produces:
> ERROR:  index row requires 12024 bytes, maximum size is 8191

Hmm ... it's not really SP-GiST's fault.  This query is trying to do
an index-only scan, and the API defined for that requires the index
to hand back an IndexTuple, which is of (very) limited size.
SP-GiST is capable of dealing with values much larger than one page,
but there's no way to hand them back through that API.

Maybe we should redefine the API as involving a TupleTableSlot that
the AM is supposed to fill --- basically, moving StoreIndexTuple
out of the common code in nodeIndexonlyscan.c and requiring the AM
to do that work.  The possible breakage of third-party code is a
bit annoying, but there can't be all that many third-party AMs
out there yet.
        regards, tom lane



Re: [HACKERS] Unusable SP-GiST index

From
Tom Lane
Date:
I wrote:
> Maybe we should redefine the API as involving a TupleTableSlot that
> the AM is supposed to fill --- basically, moving StoreIndexTuple
> out of the common code in nodeIndexonlyscan.c and requiring the AM
> to do that work.  The possible breakage of third-party code is a
> bit annoying, but there can't be all that many third-party AMs
> out there yet.

After looking a bit at gist and sp-gist, neither of them would find that
terribly convenient; they really want to create one blob of memory per
index entry so as to not complicate storage management too much.  But
they'd be fine with making that blob be a HeapTuple not IndexTuple.
So maybe the right approach is to expand the existing API to allow the
AM to return *either* a heap or index tuple; that could be made to not
be an API break.
        regards, tom lane



Re: [HACKERS] Unusable SP-GiST index

From
Tom Lane
Date:
I wrote:
> After looking a bit at gist and sp-gist, neither of them would find that
> terribly convenient; they really want to create one blob of memory per
> index entry so as to not complicate storage management too much.  But
> they'd be fine with making that blob be a HeapTuple not IndexTuple.
> So maybe the right approach is to expand the existing API to allow the
> AM to return *either* a heap or index tuple; that could be made to not
> be an API break.

Here's a draft patch along those lines.  With this approach, btree doesn't
need to be touched at all, since what it's returning certainly is an
IndexTuple anyway.  I fixed both SPGIST and GIST to use HeapTuple return
format.  It's not very clear to me whether GIST has a similar hazard with
very large return values, but it might, and it's simple enough to change.

            regards, tom lane

diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 40f201b..d4af010 100644
*** a/doc/src/sgml/indexam.sgml
--- b/doc/src/sgml/indexam.sgml
*************** amgettuple (IndexScanDesc scan,
*** 535,549 ****
    <para>
     If the index supports <link linkend="indexes-index-only-scans">index-only
     scans</link> (i.e., <function>amcanreturn</function> returns TRUE for it),
!    then on success the AM must also check
!    <literal>scan->xs_want_itup</>, and if that is true it must return
!    the original indexed data for the index entry, in the form of an
     <structname>IndexTuple</> pointer stored at <literal>scan->xs_itup</>,
!    with tuple descriptor <literal>scan->xs_itupdesc</>.
!    (Management of the data referenced by the pointer is the access method's
     responsibility.  The data must remain good at least until the next
     <function>amgettuple</>, <function>amrescan</>, or <function>amendscan</>
!    call for the scan.)
    </para>

    <para>
--- 535,553 ----
    <para>
     If the index supports <link linkend="indexes-index-only-scans">index-only
     scans</link> (i.e., <function>amcanreturn</function> returns TRUE for it),
!    then on success the AM must also check <literal>scan->xs_want_itup</>,
!    and if that is true it must return the originally indexed data for the
!    index entry.  The data can be returned in the form of an
     <structname>IndexTuple</> pointer stored at <literal>scan->xs_itup</>,
!    with tuple descriptor <literal>scan->xs_itupdesc</>; or in the form of
!    a <structname>HeapTuple</> pointer stored at <literal>scan->xs_hitup</>,
!    with tuple descriptor <literal>scan->xs_hitupdesc</>.  (The latter
!    format should be used when reconstructing data that might possibly not fit
!    into an IndexTuple.)  In either case,
!    management of the data referenced by the pointer is the access method's
     responsibility.  The data must remain good at least until the next
     <function>amgettuple</>, <function>amrescan</>, or <function>amendscan</>
!    call for the scan.
    </para>

    <para>
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index eea366b..122dc38 100644
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 441,452 ****
              so->pageData[so->nPageData].offnum = i;

              /*
!              * In an index-only scan, also fetch the data from the tuple.
               */
              if (scan->xs_want_itup)
              {
                  oldcxt = MemoryContextSwitchTo(so->pageDataCxt);
!                 so->pageData[so->nPageData].ftup =
                      gistFetchTuple(giststate, r, it);
                  MemoryContextSwitchTo(oldcxt);
              }
--- 441,453 ----
              so->pageData[so->nPageData].offnum = i;

              /*
!              * In an index-only scan, also fetch the data from the tuple.  The
!              * reconstructed tuples are stored in pageDataCxt.
               */
              if (scan->xs_want_itup)
              {
                  oldcxt = MemoryContextSwitchTo(so->pageDataCxt);
!                 so->pageData[so->nPageData].recontup =
                      gistFetchTuple(giststate, r, it);
                  MemoryContextSwitchTo(oldcxt);
              }
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 478,484 ****
                   * In an index-only scan, also fetch the data from the tuple.
                   */
                  if (scan->xs_want_itup)
!                     item->data.heap.ftup = gistFetchTuple(giststate, r, it);
              }
              else
              {
--- 479,485 ----
                   * In an index-only scan, also fetch the data from the tuple.
                   */
                  if (scan->xs_want_itup)
!                     item->data.heap.recontup = gistFetchTuple(giststate, r, it);
              }
              else
              {
*************** getNextNearest(IndexScanDesc scan)
*** 540,550 ****
      bool        res = false;
      int            i;

!     if (scan->xs_itup)
      {
          /* free previously returned tuple */
!         pfree(scan->xs_itup);
!         scan->xs_itup = NULL;
      }

      do
--- 541,551 ----
      bool        res = false;
      int            i;

!     if (scan->xs_hitup)
      {
          /* free previously returned tuple */
!         pfree(scan->xs_hitup);
!         scan->xs_hitup = NULL;
      }

      do
*************** getNextNearest(IndexScanDesc scan)
*** 601,607 ****

              /* in an index-only scan, also return the reconstructed tuple. */
              if (scan->xs_want_itup)
!                 scan->xs_itup = item->data.heap.ftup;
              res = true;
          }
          else
--- 602,608 ----

              /* in an index-only scan, also return the reconstructed tuple. */
              if (scan->xs_want_itup)
!                 scan->xs_hitup = item->data.heap.recontup;
              res = true;
          }
          else
*************** gistgettuple(IndexScanDesc scan, ScanDir
*** 685,691 ****

                  /* in an index-only scan, also return the reconstructed tuple */
                  if (scan->xs_want_itup)
!                     scan->xs_itup = so->pageData[so->curPageData].ftup;

                  so->curPageData++;

--- 686,692 ----

                  /* in an index-only scan, also return the reconstructed tuple */
                  if (scan->xs_want_itup)
!                     scan->xs_hitup = so->pageData[so->curPageData].recontup;

                  so->curPageData++;

diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 33b3889..81ff8fc 100644
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
*************** gistrescan(IndexScanDesc scan, ScanKey k
*** 155,161 ****
       * tuple descriptor to represent the returned index tuples and create a
       * memory context to hold them during the scan.
       */
!     if (scan->xs_want_itup && !scan->xs_itupdesc)
      {
          int            natts;
          int            attno;
--- 155,161 ----
       * tuple descriptor to represent the returned index tuples and create a
       * memory context to hold them during the scan.
       */
!     if (scan->xs_want_itup && !scan->xs_hitupdesc)
      {
          int            natts;
          int            attno;
*************** gistrescan(IndexScanDesc scan, ScanKey k
*** 174,181 ****
                                 scan->indexRelation->rd_opcintype[attno - 1],
                                 -1, 0);
          }
!         scan->xs_itupdesc = so->giststate->fetchTupdesc;

          so->pageDataCxt = AllocSetContextCreate(so->giststate->scanCxt,
                                                  "GiST page data context",
                                                  ALLOCSET_DEFAULT_SIZES);
--- 174,182 ----
                                 scan->indexRelation->rd_opcintype[attno - 1],
                                 -1, 0);
          }
!         scan->xs_hitupdesc = so->giststate->fetchTupdesc;

+         /* Also create a memory context that will hold the returned tuples */
          so->pageDataCxt = AllocSetContextCreate(so->giststate->scanCxt,
                                                  "GiST page data context",
                                                  ALLOCSET_DEFAULT_SIZES);
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index f92baed..75845ba 100644
*** a/src/backend/access/gist/gistutil.c
--- b/src/backend/access/gist/gistutil.c
*************** gistFetchAtt(GISTSTATE *giststate, int n
*** 624,632 ****

  /*
   * Fetch all keys in tuple.
!  * returns new IndexTuple that contains GISTENTRY with fetched data
   */
! IndexTuple
  gistFetchTuple(GISTSTATE *giststate, Relation r, IndexTuple tuple)
  {
      MemoryContext oldcxt = MemoryContextSwitchTo(giststate->tempCxt);
--- 624,632 ----

  /*
   * Fetch all keys in tuple.
!  * Returns a new HeapTuple containing the originally-indexed data.
   */
! HeapTuple
  gistFetchTuple(GISTSTATE *giststate, Relation r, IndexTuple tuple)
  {
      MemoryContext oldcxt = MemoryContextSwitchTo(giststate->tempCxt);
*************** gistFetchTuple(GISTSTATE *giststate, Rel
*** 660,666 ****
      }
      MemoryContextSwitchTo(oldcxt);

!     return index_form_tuple(giststate->fetchTupdesc, fetchatt, isnull);
  }

  float
--- 660,666 ----
      }
      MemoryContextSwitchTo(oldcxt);

!     return heap_form_tuple(giststate->fetchTupdesc, fetchatt, isnull);
  }

  float
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index c4a393f..3599476 100644
*** a/src/backend/access/index/genam.c
--- b/src/backend/access/index/genam.c
*************** RelationGetIndexScan(Relation indexRelat
*** 119,124 ****
--- 119,126 ----

      scan->xs_itup = NULL;
      scan->xs_itupdesc = NULL;
+     scan->xs_hitup = NULL;
+     scan->xs_hitupdesc = NULL;

      ItemPointerSetInvalid(&scan->xs_ctup.t_self);
      scan->xs_ctup.t_data = NULL;
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 4822af9..89d1971 100644
*** a/src/backend/access/index/indexam.c
--- b/src/backend/access/index/indexam.c
*************** index_getnext_tid(IndexScanDesc scan, Sc
*** 409,416 ****
      /*
       * The AM's amgettuple proc finds the next index entry matching the scan
       * keys, and puts the TID into scan->xs_ctup.t_self.  It should also set
!      * scan->xs_recheck and possibly scan->xs_itup, though we pay no attention
!      * to those fields here.
       */
      found = scan->indexRelation->rd_amroutine->amgettuple(scan, direction);

--- 409,416 ----
      /*
       * The AM's amgettuple proc finds the next index entry matching the scan
       * keys, and puts the TID into scan->xs_ctup.t_self.  It should also set
!      * scan->xs_recheck and possibly scan->xs_itup/scan->xs_hitup, though we
!      * pay no attention to those fields here.
       */
      found = scan->indexRelation->rd_amroutine->amgettuple(scan, direction);

diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 139d998..2d96c00 100644
*** a/src/backend/access/spgist/spgscan.c
--- b/src/backend/access/spgist/spgscan.c
*************** resetSpGistScanOpaque(SpGistScanOpaque s
*** 92,102 ****

      if (so->want_itup)
      {
!         /* Must pfree IndexTuples to avoid memory leak */
          int            i;

          for (i = 0; i < so->nPtrs; i++)
!             pfree(so->indexTups[i]);
      }
      so->iPtr = so->nPtrs = 0;
  }
--- 92,102 ----

      if (so->want_itup)
      {
!         /* Must pfree reconstructed tuples to avoid memory leak */
          int            i;

          for (i = 0; i < so->nPtrs; i++)
!             pfree(so->reconTups[i]);
      }
      so->iPtr = so->nPtrs = 0;
  }
*************** spgbeginscan(Relation rel, int keysz, in
*** 195,202 ****
                                          "SP-GiST search temporary context",
                                          ALLOCSET_DEFAULT_SIZES);

!     /* Set up indexTupDesc and xs_itupdesc in case it's an index-only scan */
!     so->indexTupDesc = scan->xs_itupdesc = RelationGetDescr(rel);

      scan->opaque = so;

--- 195,202 ----
                                          "SP-GiST search temporary context",
                                          ALLOCSET_DEFAULT_SIZES);

!     /* Set up indexTupDesc and xs_hitupdesc in case it's an index-only scan */
!     so->indexTupDesc = scan->xs_hitupdesc = RelationGetDescr(rel);

      scan->opaque = so;

*************** storeGettuple(SpGistScanOpaque so, ItemP
*** 591,602 ****
      if (so->want_itup)
      {
          /*
!          * Reconstruct desired IndexTuple.  We have to copy the datum out of
!          * the temp context anyway, so we may as well create the tuple here.
           */
!         so->indexTups[so->nPtrs] = index_form_tuple(so->indexTupDesc,
!                                                     &leafValue,
!                                                     &isnull);
      }
      so->nPtrs++;
  }
--- 591,602 ----
      if (so->want_itup)
      {
          /*
!          * Reconstruct index data.  We have to copy the datum out of the temp
!          * context anyway, so we may as well create the tuple here.
           */
!         so->reconTups[so->nPtrs] = heap_form_tuple(so->indexTupDesc,
!                                                    &leafValue,
!                                                    &isnull);
      }
      so->nPtrs++;
  }
*************** spggettuple(IndexScanDesc scan, ScanDire
*** 619,636 ****
              /* continuing to return tuples from a leaf page */
              scan->xs_ctup.t_self = so->heapPtrs[so->iPtr];
              scan->xs_recheck = so->recheck[so->iPtr];
!             scan->xs_itup = so->indexTups[so->iPtr];
              so->iPtr++;
              return true;
          }

          if (so->want_itup)
          {
!             /* Must pfree IndexTuples to avoid memory leak */
              int            i;

              for (i = 0; i < so->nPtrs; i++)
!                 pfree(so->indexTups[i]);
          }
          so->iPtr = so->nPtrs = 0;

--- 619,636 ----
              /* continuing to return tuples from a leaf page */
              scan->xs_ctup.t_self = so->heapPtrs[so->iPtr];
              scan->xs_recheck = so->recheck[so->iPtr];
!             scan->xs_hitup = so->reconTups[so->iPtr];
              so->iPtr++;
              return true;
          }

          if (so->want_itup)
          {
!             /* Must pfree reconstructed tuples to avoid memory leak */
              int            i;

              for (i = 0; i < so->nPtrs; i++)
!                 pfree(so->reconTups[i]);
          }
          so->iPtr = so->nPtrs = 0;

diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index ddef3a4..62bdeb3 100644
*** a/src/backend/executor/nodeIndexonlyscan.c
--- b/src/backend/executor/nodeIndexonlyscan.c
*************** IndexOnlyNext(IndexOnlyScanState *node)
*** 144,152 ****
          }

          /*
!          * Fill the scan tuple slot with data from the index.
           */
!         StoreIndexTuple(slot, scandesc->xs_itup, scandesc->xs_itupdesc);

          /*
           * If the index was lossy, we have to recheck the index quals.
--- 144,169 ----
          }

          /*
!          * Fill the scan tuple slot with data from the index.  This might be
!          * provided in either HeapTuple or IndexTuple format.  Conceivably an
!          * index AM might fill both fields, in which case we prefer the heap
!          * format, since it's probably a bit cheaper to fill a slot from.
           */
!         if (scandesc->xs_hitup)
!         {
!             /*
!              * We don't take the trouble to verify that the provided tuple has
!              * exactly the slot's format, but it seems worth doing a quick
!              * check on the number of fields.
!              */
!             Assert(slot->tts_tupleDescriptor->natts ==
!                    scandesc->xs_hitupdesc->natts);
!             ExecStoreTuple(scandesc->xs_hitup, slot, InvalidBuffer, false);
!         }
!         else if (scandesc->xs_itup)
!             StoreIndexTuple(slot, scandesc->xs_itup, scandesc->xs_itupdesc);
!         else
!             elog(ERROR, "no data returned for index-only scan");

          /*
           * If the index was lossy, we have to recheck the index quals.
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 72f52d4..6522336 100644
*** a/src/include/access/gist_private.h
--- b/src/include/access/gist_private.h
*************** typedef struct GISTSearchHeapItem
*** 120,126 ****
      ItemPointerData heapPtr;
      bool        recheck;        /* T if quals must be rechecked */
      bool        recheckDistances;        /* T if distances must be rechecked */
!     IndexTuple    ftup;            /* data fetched back from the index, used in
                                   * index-only scans */
      OffsetNumber offnum;        /* track offset in page to mark tuple as
                                   * LP_DEAD */
--- 120,126 ----
      ItemPointerData heapPtr;
      bool        recheck;        /* T if quals must be rechecked */
      bool        recheckDistances;        /* T if distances must be rechecked */
!     HeapTuple    recontup;        /* data reconstructed from the index, used in
                                   * index-only scans */
      OffsetNumber offnum;        /* track offset in page to mark tuple as
                                   * LP_DEAD */
*************** extern void gistMakeUnionItVec(GISTSTATE
*** 529,535 ****
  extern bool gistKeyIsEQ(GISTSTATE *giststate, int attno, Datum a, Datum b);
  extern void gistDeCompressAtt(GISTSTATE *giststate, Relation r, IndexTuple tuple, Page p,
                    OffsetNumber o, GISTENTRY *attdata, bool *isnull);
! extern IndexTuple gistFetchTuple(GISTSTATE *giststate, Relation r,
                 IndexTuple tuple);
  extern void gistMakeUnionKey(GISTSTATE *giststate, int attno,
                   GISTENTRY *entry1, bool isnull1,
--- 529,535 ----
  extern bool gistKeyIsEQ(GISTSTATE *giststate, int attno, Datum a, Datum b);
  extern void gistDeCompressAtt(GISTSTATE *giststate, Relation r, IndexTuple tuple, Page p,
                    OffsetNumber o, GISTENTRY *attdata, bool *isnull);
! extern HeapTuple gistFetchTuple(GISTSTATE *giststate, Relation r,
                 IndexTuple tuple);
  extern void gistMakeUnionKey(GISTSTATE *giststate, int attno,
                   GISTENTRY *entry1, bool isnull1,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index 8746045..8635f83 100644
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
*************** typedef struct IndexScanDescData
*** 103,111 ****
      /* index access method's private state */
      void       *opaque;            /* access-method-specific info */

!     /* in an index-only scan, this is valid after a successful amgettuple */
      IndexTuple    xs_itup;        /* index tuple returned by AM */
      TupleDesc    xs_itupdesc;    /* rowtype descriptor of xs_itup */

      /* xs_ctup/xs_cbuf/xs_recheck are valid after a successful index_getnext */
      HeapTupleData xs_ctup;        /* current heap tuple, if any */
--- 103,118 ----
      /* index access method's private state */
      void       *opaque;            /* access-method-specific info */

!     /*
!      * In an index-only scan, a successful amgettuple call must fill either
!      * xs_itup (and xs_itupdesc) or xs_hitup (and xs_hitupdesc) to provide the
!      * data returned by the scan.  It can fill both, in which case the heap
!      * format will be used.
!      */
      IndexTuple    xs_itup;        /* index tuple returned by AM */
      TupleDesc    xs_itupdesc;    /* rowtype descriptor of xs_itup */
+     HeapTuple    xs_hitup;        /* index data returned by AM, as HeapTuple */
+     TupleDesc    xs_hitupdesc;    /* rowtype descriptor of xs_hitup */

      /* xs_ctup/xs_cbuf/xs_recheck are valid after a successful index_getnext */
      HeapTupleData xs_ctup;        /* current heap tuple, if any */
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
index b2979a9..67d45e8 100644
*** a/src/include/access/spgist_private.h
--- b/src/include/access/spgist_private.h
*************** typedef struct SpGistScanOpaqueData
*** 159,165 ****
      int            iPtr;            /* index for scanning through same */
      ItemPointerData heapPtrs[MaxIndexTuplesPerPage];    /* TIDs from cur page */
      bool        recheck[MaxIndexTuplesPerPage]; /* their recheck flags */
!     IndexTuple    indexTups[MaxIndexTuplesPerPage];        /* reconstructed tuples */

      /*
       * Note: using MaxIndexTuplesPerPage above is a bit hokey since
--- 159,165 ----
      int            iPtr;            /* index for scanning through same */
      ItemPointerData heapPtrs[MaxIndexTuplesPerPage];    /* TIDs from cur page */
      bool        recheck[MaxIndexTuplesPerPage]; /* their recheck flags */
!     HeapTuple    reconTups[MaxIndexTuplesPerPage];        /* reconstructed tuples */

      /*
       * Note: using MaxIndexTuplesPerPage above is a bit hokey since

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers