Re: Avoid memory leaks during ANALYZE's compute_index_stats() ? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Avoid memory leaks during ANALYZE's compute_index_stats() ?
Date
Msg-id 9003.1289275386@sss.pgh.pa.us
Whole thread Raw
In response to Re: Avoid memory leaks during ANALYZE's compute_index_stats() ?  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
Josh Berkus <josh@agliodbs.com> writes:
> On 11/8/10 5:04 PM, Tom Lane wrote:
>> The extra copying is a bit annoying, since
>> it would add cycles while accomplishing nothing useful for index
>> expressions with no intermediate results, but I'm thinking this is a
>> must-fix.

> I'd say that performance numbers is what to check on this.  How much
> does it affect low-memory expressions to do the copying?

It's noticeable but not horrible.  I tried this test case:

regression=# \d tst
          Table "public.tst"
 Column |       Type       | Modifiers
--------+------------------+-----------
 f1     | double precision |
Indexes:
    "tsti" btree ((f1 + 1.0::double precision))

with 100000 rows on a 32-bit machine (so that float8 is pass-by-ref).
The ANALYZE time went from about 625 msec to about 685.  I believe that
this is pretty much the worst case percentage-wise: the table is small
enough to fit in RAM, so no I/O is involved, and the index expression is
about as simple and cheap to evaluate as it could possibly be, and the
amount of work done analyzing the main table is about as small as it
could possibly be.  In any other situation those other components of
the ANALYZE cost would grow proportionally more than the copying cost.

Not-too-well-tested-yet patch attached.

            regards, tom lane

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 68a92dbac0142ee45be3ceb33e6c76c19c12c25d..15464c3dc66d35b254ba0f2f93228449265fcdfa 100644
*** a/src/backend/commands/analyze.c
--- b/src/backend/commands/analyze.c
*************** compute_index_stats(Relation onerel, dou
*** 695,700 ****
--- 695,706 ----
          {
              HeapTuple    heapTuple = rows[rowno];

+             /*
+              * Reset the per-tuple context each time, to reclaim any cruft
+              * left behind by evaluating the predicate or index expressions.
+              */
+             ResetExprContext(econtext);
+
              /* Set up for predicate or expression evaluation */
              ExecStoreTuple(heapTuple, slot, InvalidBuffer, false);

*************** compute_index_stats(Relation onerel, dou
*** 719,733 ****
                                 isnull);

                  /*
!                  * Save just the columns we care about.
                   */
                  for (i = 0; i < attr_cnt; i++)
                  {
                      VacAttrStats *stats = thisdata->vacattrstats[i];
                      int            attnum = stats->attr->attnum;

!                     exprvals[tcnt] = values[attnum - 1];
!                     exprnulls[tcnt] = isnull[attnum - 1];
                      tcnt++;
                  }
              }
--- 725,750 ----
                                 isnull);

                  /*
!                  * Save just the columns we care about.  We copy the values
!                  * into ind_context from the estate's per-tuple context.
                   */
                  for (i = 0; i < attr_cnt; i++)
                  {
                      VacAttrStats *stats = thisdata->vacattrstats[i];
                      int            attnum = stats->attr->attnum;

!                     if (isnull[attnum - 1])
!                     {
!                         exprvals[tcnt] = (Datum) 0;
!                         exprnulls[tcnt] = true;
!                     }
!                     else
!                     {
!                         exprvals[tcnt] = datumCopy(values[attnum - 1],
!                                                    stats->attrtype->typbyval,
!                                                    stats->attrtype->typlen);
!                         exprnulls[tcnt] = false;
!                     }
                      tcnt++;
                  }
              }

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: CLUSTER can change t_len
Next
From: Peter Eisentraut
Date:
Subject: Re: Should we use make -k on the buildfarm?