Thread: Memory leaks in record_out and record_send

Memory leaks in record_out and record_send

From
Tom Lane
Date:
I looked into the problem complained of here:
http://archives.postgresql.org/pgsql-general/2012-11/msg00279.php
which turns out to have nothing to do with joins and everything to do
with the fact that record_out() leaks memory like mad.  It leaks both
the strings returned by the per-column output functions and any column
values that have to be detoasted.  You can easily reproduce this with
an example like

create table leak (f1 int, f2 text);

insert into leak select x, 'foo' from generate_series(1,1000000) x;

select leak from leak;

The attached patch against HEAD fixes this, as well as a similar
leakage in record_send().  The added code is lifted directly from
printtup() so it's not adding any new assumptions to the system.

I wonder though if we ought to think about running output functions in
a short-lived memory context instead of the executor's main context.
We've considered that before, I think, and it's always been the path
of least resistance to fix the output functions instead --- but there
will always be another leak I'm afraid.

OTOH I can't see trying to back-patch a solution like that.   If we want
to fix this in the back branches (and note the complaint linked above is
against 8.3), I think we have to do it as attached.

Thoughts?

            regards, tom lane

diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index 13e574d4e82b4412f7ab32e02398d1671d381097..d4ed7d0ca06ac1a0dee6b4963989a5b41713fb3b 100644
*** a/src/backend/utils/adt/rowtypes.c
--- b/src/backend/utils/adt/rowtypes.c
*************** typedef struct ColumnIOData
*** 32,37 ****
--- 32,38 ----
      Oid            column_type;
      Oid            typiofunc;
      Oid            typioparam;
+     bool        typisvarlena;
      FmgrInfo    proc;
  } ColumnIOData;

*************** record_out(PG_FUNCTION_ARGS)
*** 364,369 ****
--- 365,371 ----
      {
          ColumnIOData *column_info = &my_extra->columns[i];
          Oid            column_type = tupdesc->attrs[i]->atttypid;
+         Datum        attr;
          char       *value;
          char       *tmp;
          bool        nq;
*************** record_out(PG_FUNCTION_ARGS)
*** 387,403 ****
           */
          if (column_info->column_type != column_type)
          {
-             bool        typIsVarlena;
-
              getTypeOutputInfo(column_type,
                                &column_info->typiofunc,
!                               &typIsVarlena);
              fmgr_info_cxt(column_info->typiofunc, &column_info->proc,
                            fcinfo->flinfo->fn_mcxt);
              column_info->column_type = column_type;
          }

!         value = OutputFunctionCall(&column_info->proc, values[i]);

          /* Detect whether we need double quotes for this value */
          nq = (value[0] == '\0');    /* force quotes for empty string */
--- 389,412 ----
           */
          if (column_info->column_type != column_type)
          {
              getTypeOutputInfo(column_type,
                                &column_info->typiofunc,
!                               &column_info->typisvarlena);
              fmgr_info_cxt(column_info->typiofunc, &column_info->proc,
                            fcinfo->flinfo->fn_mcxt);
              column_info->column_type = column_type;
          }

!         /*
!          * If we have a toasted datum, forcibly detoast it here to avoid
!          * memory leakage inside the type's output routine.
!          */
!         if (column_info->typisvarlena)
!             attr = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
!         else
!             attr = values[i];
!
!         value = OutputFunctionCall(&column_info->proc, attr);

          /* Detect whether we need double quotes for this value */
          nq = (value[0] == '\0');    /* force quotes for empty string */
*************** record_out(PG_FUNCTION_ARGS)
*** 416,432 ****

          /* And emit the string */
          if (nq)
!             appendStringInfoChar(&buf, '"');
          for (tmp = value; *tmp; tmp++)
          {
              char        ch = *tmp;

              if (ch == '"' || ch == '\\')
!                 appendStringInfoChar(&buf, ch);
!             appendStringInfoChar(&buf, ch);
          }
          if (nq)
!             appendStringInfoChar(&buf, '"');
      }

      appendStringInfoChar(&buf, ')');
--- 425,447 ----

          /* And emit the string */
          if (nq)
!             appendStringInfoCharMacro(&buf, '"');
          for (tmp = value; *tmp; tmp++)
          {
              char        ch = *tmp;

              if (ch == '"' || ch == '\\')
!                 appendStringInfoCharMacro(&buf, ch);
!             appendStringInfoCharMacro(&buf, ch);
          }
          if (nq)
!             appendStringInfoCharMacro(&buf, '"');
!
!         pfree(value);
!
!         /* Clean up detoasted copy, if any */
!         if (DatumGetPointer(attr) != DatumGetPointer(values[i]))
!             pfree(DatumGetPointer(attr));
      }

      appendStringInfoChar(&buf, ')');
*************** record_send(PG_FUNCTION_ARGS)
*** 714,719 ****
--- 729,735 ----
      {
          ColumnIOData *column_info = &my_extra->columns[i];
          Oid            column_type = tupdesc->attrs[i]->atttypid;
+         Datum        attr;
          bytea       *outputbytes;

          /* Ignore dropped columns in datatype */
*************** record_send(PG_FUNCTION_ARGS)
*** 734,756 ****
           */
          if (column_info->column_type != column_type)
          {
-             bool        typIsVarlena;
-
              getTypeBinaryOutputInfo(column_type,
                                      &column_info->typiofunc,
!                                     &typIsVarlena);
              fmgr_info_cxt(column_info->typiofunc, &column_info->proc,
                            fcinfo->flinfo->fn_mcxt);
              column_info->column_type = column_type;
          }

!         outputbytes = SendFunctionCall(&column_info->proc, values[i]);

          /* We assume the result will not have been toasted */
          pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
          pq_sendbytes(&buf, VARDATA(outputbytes),
                       VARSIZE(outputbytes) - VARHDRSZ);
          pfree(outputbytes);
      }

      pfree(values);
--- 750,784 ----
           */
          if (column_info->column_type != column_type)
          {
              getTypeBinaryOutputInfo(column_type,
                                      &column_info->typiofunc,
!                                     &column_info->typisvarlena);
              fmgr_info_cxt(column_info->typiofunc, &column_info->proc,
                            fcinfo->flinfo->fn_mcxt);
              column_info->column_type = column_type;
          }

!         /*
!          * If we have a toasted datum, forcibly detoast it here to avoid
!          * memory leakage inside the type's output routine.
!          */
!         if (column_info->typisvarlena)
!             attr = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
!         else
!             attr = values[i];
!
!         outputbytes = SendFunctionCall(&column_info->proc, attr);

          /* We assume the result will not have been toasted */
          pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
          pq_sendbytes(&buf, VARDATA(outputbytes),
                       VARSIZE(outputbytes) - VARHDRSZ);
+
          pfree(outputbytes);
+
+         /* Clean up detoasted copy, if any */
+         if (DatumGetPointer(attr) != DatumGetPointer(values[i]))
+             pfree(DatumGetPointer(attr));
      }

      pfree(values);

Re: Memory leaks in record_out and record_send

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> OTOH I can't see trying to back-patch a solution like that.   If we want
> to fix this in the back branches (and note the complaint linked above is
> against 8.3), I think we have to do it as attached.
>
> Thoughts?

I've been using textin(record_out(NEW)) in generic partitioning
triggers, and you can find examples of that trick in the wiki, so I
think we have users of that in the field.

Please indeed do consider backpatching!

I don't have an opinion on the opportunity to use a shorter memory
context, I feel that would need some more involved analytics than my
brainpower of the moment allows me to consider.

Thanks,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: Memory leaks in record_out and record_send

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Thoughts?

> I've been using textin(record_out(NEW)) in generic partitioning
> triggers, and you can find examples of that trick in the wiki, so I
> think we have users of that in the field.

I think explicit calls like that actually wouldn't be a problem,
since they'd be run in a per-tuple context anyway.  The cases that
are problematic are hard-coded I/O function calls.  I'm worried
about the ones like, say, plpgsql's built-in conversion operations.
We could probably fix printtup's usage with some confidence, but
there are a lot of other ones.
        regards, tom lane



Re: Memory leaks in record_out and record_send

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I think explicit calls like that actually wouldn't be a problem,
> since they'd be run in a per-tuple context anyway.  The cases that
> are problematic are hard-coded I/O function calls.  I'm worried
> about the ones like, say, plpgsql's built-in conversion operations.
> We could probably fix printtup's usage with some confidence, but
> there are a lot of other ones.

That's a good reason to get them into a shorter memory context, but
which? per transaction maybe? shorter?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: Memory leaks in record_out and record_send

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I think explicit calls like that actually wouldn't be a problem,
>> since they'd be run in a per-tuple context anyway.  The cases that
>> are problematic are hard-coded I/O function calls.  I'm worried
>> about the ones like, say, plpgsql's built-in conversion operations.
>> We could probably fix printtup's usage with some confidence, but
>> there are a lot of other ones.

> That's a good reason to get them into a shorter memory context, but
> which? per transaction maybe? shorter?

It would have to be per-tuple to do any good.  The existing behavior
is per-query and causes problems if lots of rows are output.  In plpgsql
it would be a function-call-lifespan leak.
        regards, tom lane



Re: Memory leaks in record_out and record_send

From
Robert Haas
Date:
On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wonder though if we ought to think about running output functions in
> a short-lived memory context instead of the executor's main context.
> We've considered that before, I think, and it's always been the path
> of least resistance to fix the output functions instead --- but there
> will always be another leak I'm afraid.

Such is the lot of people who code in C.  I worry that the number of
memory contexts we're kicking around already is imposing a significant
distributed overhead on the system that is hard to measure but
nevertheless real, and that this will add to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Memory leaks in record_out and record_send

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I wonder though if we ought to think about running output functions in
>> a short-lived memory context instead of the executor's main context.
>> We've considered that before, I think, and it's always been the path
>> of least resistance to fix the output functions instead --- but there
>> will always be another leak I'm afraid.

> Such is the lot of people who code in C.  I worry that the number of
> memory contexts we're kicking around already is imposing a significant
> distributed overhead on the system that is hard to measure but
> nevertheless real, and that this will add to it.

Yeah, perhaps.  I'd like to think that a MemoryContextReset is cheaper
than a bunch of retail pfree's, but it's hard to prove anything without
actually coding and testing it --- and on modern machines, effects like
cache locality could swamp pure instruction-count gains anyway.

Anyway, I committed the narrow fix for the moment.
        regards, tom lane



Re: Memory leaks in record_out and record_send

From
Robert Haas
Date:
On Tue, Nov 13, 2012 at 3:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I wonder though if we ought to think about running output functions in
>>> a short-lived memory context instead of the executor's main context.
>>> We've considered that before, I think, and it's always been the path
>>> of least resistance to fix the output functions instead --- but there
>>> will always be another leak I'm afraid.
>
>> Such is the lot of people who code in C.  I worry that the number of
>> memory contexts we're kicking around already is imposing a significant
>> distributed overhead on the system that is hard to measure but
>> nevertheless real, and that this will add to it.
>
> Yeah, perhaps.  I'd like to think that a MemoryContextReset is cheaper
> than a bunch of retail pfree's, but it's hard to prove anything without
> actually coding and testing it --- and on modern machines, effects like
> cache locality could swamp pure instruction-count gains anyway.

Yeah.  The thing that concerns me is that I think we have a pretty
decent number of memory contexts where the expected number of
allocations is very small ... and we have the context *just in case*
we do more than that in certain instances.  I've seen profiles where
the setup/teardown costs of memory contexts are significant ... which
doesn't mean that those examples would perform better with fewer
memory contexts, but it's enough to make me pause for thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Memory leaks in record_out and record_send

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> Yeah.  The thing that concerns me is that I think we have a pretty
> decent number of memory contexts where the expected number of
> allocations is very small ... and we have the context *just in case*
> we do more than that in certain instances.  I've seen profiles where
> the setup/teardown costs of memory contexts are significant ... which
> doesn't mean that those examples would perform better with fewer
> memory contexts, but it's enough to make me pause for thought.

So, for my 2c, I'm on the other side of this, personally.  We have
memory contexts for more-or-less exactly this issue.  It's one of the
great things about PG- it's resiliant and very unlikely to have large or
bad memory leaks in general, much of which can, imv, be attributed to
our use of memory contexts.
Thanks,
    Stephen

Re: Memory leaks in record_out and record_send

From
Martijn van Oosterhout
Date:
On Tue, Nov 13, 2012 at 05:50:08PM -0500, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > Yeah.  The thing that concerns me is that I think we have a pretty
> > decent number of memory contexts where the expected number of
> > allocations is very small ... and we have the context *just in case*
> > we do more than that in certain instances.  I've seen profiles where
> > the setup/teardown costs of memory contexts are significant ... which
> > doesn't mean that those examples would perform better with fewer
> > memory contexts, but it's enough to make me pause for thought.
>
> So, for my 2c, I'm on the other side of this, personally.  We have
> memory contexts for more-or-less exactly this issue.  It's one of the
> great things about PG- it's resiliant and very unlikely to have large or
> bad memory leaks in general, much of which can, imv, be attributed to
> our use of memory contexts.

If the problem is that we create memory context overhead which is not
necessary in many cases, perhaps we can reduce the overhead somehow.
IIRC we have a seperate function for resetting a context and freeing it
entirely.  If there was a quick test we could do such that resetting a
context did nothing unless at least (say) 16k had been allocated, that
might reduce the cost for many very small allocations.

Ofcourse, unless someone comes up with a way to measure the cost this
is all handwaving, but it might a nice project for someone interested
in learning to hack postgres.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer