Re: Should pg 11 use a lot more memory building an spgist index? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Should pg 11 use a lot more memory building an spgist index?
Date
Msg-id 20752.1540842517@sss.pgh.pa.us
Whole thread Raw
Responses Re: Should pg 11 use a lot more memory building an spgist index?  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Re: Should pg 11 use a lot more memory building an spgist index?  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
I wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> How about modifying SysScanDescData to have a memory context member,
>> which is created by systable_beginscan and destroyed by endscan?

> I think it would still have problems, in that it would affect in which
> context index_getnext delivers its output.  We could reasonably make
> these sorts of changes in places where the entire index_beginscan/
> index_getnext/index_endscan call structure is in one place, but that
> doesn't apply for the systable functions.

Actually, after looking more closely, index_getnext doesn't return a
palloc'd object anyway, or at least not one that the caller is responsible
for managing.  So maybe that could work.

I was confused about why the memory leak in Bruno's example is so much
larger in HEAD than v11; spgbeginscan does allocate more stuff than
before, but only if numberOfOrderBys > 0, which surely doesn't apply for
the exclusion-check code path.  Eventually I realized that the difference
is because commit 2a6368343 made SpGistScanOpaqueData a good deal larger
than it had been (10080 vs 6264 bytes, on my x86_64 box), so it was just
the failure to pfree that struct that accounted for the bigger leak.

There's another issue with 2a6368343, though: it added a couple of
fmgr_info_copy calls to spgbeginscan.  I don't understand why it'd be a
good idea to do that rather than using the FmgrInfos in the index's
relcache entry directly.  Making a copy every time risks a memory leak,
because spgendscan has no way to free any fn_extra data that gets
allocated by the called function; and by the same token it's inefficient,
because the function has no way to cache fn_extra data across queries.

If we consider only the leak aspect, this coding presents a reason why
we should try to change things as Alvaro suggests.  However, because of
the poor-caching aspect, I think this code is pretty broken anyway,
and once we fix it the issue is much less clear-cut.

(Looking around, it seems like we're very schizophrenic about whether
to copy index relcache support function FmgrInfos or just use them
directly.  But nbtree and hash seem to always do the latter, so I think
it's probably reasonable to standardize on that.)

Anyway, the attached proposed patch for HEAD makes Bruno's problem go
away, and it also fixes an unrelated leak introduced by 2a6368343
because it reallocates so->orderByTypes each time through spgrescan.
I think we should apply this, and suitable subsets in the back branches,
to fix the immediate problem.  Then we can work at leisure on a more
invasive HEAD-only patch, if anyone is excited about that.

            regards, tom lane

diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index a63fde2..c883ae9 100644
*** a/src/backend/access/spgist/spgscan.c
--- b/src/backend/access/spgist/spgscan.c
*************** spgbeginscan(Relation rel, int keysz, in
*** 294,303 ****
      /* Set up indexTupDesc and xs_hitupdesc in case it's an index-only scan */
      so->indexTupDesc = scan->xs_hitupdesc = RelationGetDescr(rel);

      if (scan->numberOfOrderBys > 0)
      {
!         so->zeroDistances = palloc(sizeof(double) * scan->numberOfOrderBys);
!         so->infDistances = palloc(sizeof(double) * scan->numberOfOrderBys);

          for (i = 0; i < scan->numberOfOrderBys; i++)
          {
--- 294,311 ----
      /* Set up indexTupDesc and xs_hitupdesc in case it's an index-only scan */
      so->indexTupDesc = scan->xs_hitupdesc = RelationGetDescr(rel);

+     /* Allocate various arrays needed for order-by scans */
      if (scan->numberOfOrderBys > 0)
      {
!         /* This will be filled in spgrescan, but allocate the space here */
!         so->orderByTypes = (Oid *)
!             palloc(sizeof(Oid) * scan->numberOfOrderBys);
!
!         /* These arrays have constant contents, so we can fill them now */
!         so->zeroDistances = (double *)
!             palloc(sizeof(double) * scan->numberOfOrderBys);
!         so->infDistances = (double *)
!             palloc(sizeof(double) * scan->numberOfOrderBys);

          for (i = 0; i < scan->numberOfOrderBys; i++)
          {
*************** spgbeginscan(Relation rel, int keysz, in
*** 305,313 ****
              so->infDistances[i] = get_float8_infinity();
          }

!         scan->xs_orderbyvals = palloc0(sizeof(Datum) * scan->numberOfOrderBys);
!         scan->xs_orderbynulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
!         memset(scan->xs_orderbynulls, true, sizeof(bool) * scan->numberOfOrderBys);
      }

      fmgr_info_copy(&so->innerConsistentFn,
--- 313,324 ----
              so->infDistances[i] = get_float8_infinity();
          }

!         scan->xs_orderbyvals = (Datum *)
!             palloc0(sizeof(Datum) * scan->numberOfOrderBys);
!         scan->xs_orderbynulls = (bool *)
!             palloc(sizeof(bool) * scan->numberOfOrderBys);
!         memset(scan->xs_orderbynulls, true,
!                sizeof(bool) * scan->numberOfOrderBys);
      }

      fmgr_info_copy(&so->innerConsistentFn,
*************** spgrescan(IndexScanDesc scan, ScanKey sc
*** 336,341 ****
--- 347,353 ----
          memmove(scan->keyData, scankey,
                  scan->numberOfKeys * sizeof(ScanKeyData));

+     /* initialize order-by data if needed */
      if (orderbys && scan->numberOfOrderBys > 0)
      {
          int            i;
*************** spgrescan(IndexScanDesc scan, ScanKey sc
*** 343,350 ****
          memmove(scan->orderByData, orderbys,
                  scan->numberOfOrderBys * sizeof(ScanKeyData));

-         so->orderByTypes = (Oid *) palloc(sizeof(Oid) * scan->numberOfOrderBys);
-
          for (i = 0; i < scan->numberOfOrderBys; i++)
          {
              ScanKey        skey = &scan->orderByData[i];
--- 355,360 ----
*************** spgendscan(IndexScanDesc scan)
*** 380,390 ****
--- 390,411 ----
      MemoryContextDelete(so->tempCxt);
      MemoryContextDelete(so->traversalCxt);

+     if (so->keyData)
+         pfree(so->keyData);
+
+     if (so->state.deadTupleStorage)
+         pfree(so->state.deadTupleStorage);
+
      if (scan->numberOfOrderBys > 0)
      {
+         pfree(so->orderByTypes);
          pfree(so->zeroDistances);
          pfree(so->infDistances);
+         pfree(scan->xs_orderbyvals);
+         pfree(scan->xs_orderbynulls);
      }
+
+     pfree(so);
  }

  /*

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: replication_slots usability issue
Next
From: "Joshua D. Drake"
Date:
Subject: Re: replication_slots usability issue