Thread: TRUNCATE memory leak with temporary tables?

TRUNCATE memory leak with temporary tables?

From
"Nick Muerdter"
Date:
I've been seeing what looks like unbounded memory growth (until the OOM killer kicks in and kills the postgres process)
whenrunning a pl/pgsql function that performs TRUNCATE statements against various temporary tables in a loop. I think
I'vebeen able to come up with some fairly simple reproductions of the issue in isolation, but I'm trying to figure out
ifthis is a memory leak or of I'm perhaps doing something wrong with tuning or other settings.
 

What I've observed:

- The memory growth occurs if the temp table has indexes or a primary key set on it.
- Alternatively, the memory growth also occurs if the temp table has certain column types on it (eg, "text" types).
- If the table doesn't have indexes and only has integer columns present, then the memory growth does *not* occur.
- I originally saw this against a PostgreSQL 12 server, but I've tested this against PostgreSQL 9.6.22, 12.7, and 13.3
Dockercontainers and reproduced it against all versions in the containers.
 

Here are 2 separate examples that seem to show the memory growth on the server (the first being a table with a "text"
column,the second example having no text column but a primary key index):
 

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer, bar text);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer);
    ALTER TABLE pg_temp.foo ADD PRIMARY KEY (id);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

Compare that to this example (which doesn't have an index or any other column types that trigger this), which does
*not*show any memory growth:
 

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

Any help in determining what's going on here (or if there are other ways to go about this) would be greatly
appreciated!

Thank you!
Nick



Re: TRUNCATE memory leak with temporary tables?

From
Vijaykumar Jain
Date:
Yes,
I too see growth when text type is used, but not when int or even fixed size char(10) is used.

I always thought truncate was similar to delete + vacuum full, 
but checking for your scenarios, I did not see an update on pg_stat_user_table on truncate for vacuums.

then i checked 
to help understand truncation better.

but then i still do not understand how a col type text which is dynamic results in mem growth (coz there are no rows inserted, i understand for long strings db does work to compress, move them to toast tables etc) but these are empty rows.

Maybe someone else will be able to explain what is going on.




On Fri, 28 May 2021 at 06:52, Nick Muerdter <stuff@nickm.org> wrote:
I've been seeing what looks like unbounded memory growth (until the OOM killer kicks in and kills the postgres process) when running a pl/pgsql function that performs TRUNCATE statements against various temporary tables in a loop. I think I've been able to come up with some fairly simple reproductions of the issue in isolation, but I'm trying to figure out if this is a memory leak or of I'm perhaps doing something wrong with tuning or other settings.

What I've observed:

- The memory growth occurs if the temp table has indexes or a primary key set on it.
- Alternatively, the memory growth also occurs if the temp table has certain column types on it (eg, "text" types).
- If the table doesn't have indexes and only has integer columns present, then the memory growth does *not* occur.
- I originally saw this against a PostgreSQL 12 server, but I've tested this against PostgreSQL 9.6.22, 12.7, and 13.3 Docker containers and reproduced it against all versions in the containers.

Here are 2 separate examples that seem to show the memory growth on the server (the first being a table with a "text" column, the second example having no text column but a primary key index):

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer, bar text);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer);
    ALTER TABLE pg_temp.foo ADD PRIMARY KEY (id);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

Compare that to this example (which doesn't have an index or any other column types that trigger this), which does *not* show any memory growth:

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

Any help in determining what's going on here (or if there are other ways to go about this) would be greatly appreciated!

Thank you!
Nick




--
Thanks,
Vijay
Mumbai, India

Re: TRUNCATE memory leak with temporary tables?

From
Ravi Krishna
Date:
Truncate is not delete + vaccum.
It creates a new empty table , followed by rename of the existing table to the new empty table and finally dropping of the old table.

On May 28, 2021 at 7:05 AM, Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:

Yes,
I too see growth when text type is used, but not when int or even fixed size char(10) is used.

I always thought truncate was similar to delete + vacuum full, 
but checking for your scenarios, I did not see an update on pg_stat_user_table on truncate for vacuums.

then i checked 
to help understand truncation better.

but then i still do not understand how a col type text which is dynamic results in mem growth (coz there are no rows inserted, i understand for long strings db does work to compress, move them to toast tables etc) but these are empty rows.

Maybe someone else will be able to explain what is going on.




On Fri, 28 May 2021 at 06:52, Nick Muerdter <stuff@nickm.org> wrote:
I've been seeing what looks like unbounded memory growth (until the OOM killer kicks in and kills the postgres process) when running a pl/pgsql function that performs TRUNCATE statements against various temporary tables in a loop. I think I've been able to come up with some fairly simple reproductions of the issue in isolation, but I'm trying to figure out if this is a memory leak or of I'm perhaps doing something wrong with tuning or other settings.

What I've observed:

- The memory growth occurs if the temp table has indexes or a primary key set on it.
- Alternatively, the memory growth also occurs if the temp table has certain column types on it (eg, "text" types).
- If the table doesn't have indexes and only has integer columns present, then the memory growth does *not* occur.
- I originally saw this against a PostgreSQL 12 server, but I've tested this against PostgreSQL 9.6.22, 12.7, and 13.3 Docker containers and reproduced it against all versions in the containers.

Here are 2 separate examples that seem to show the memory growth on the server (the first being a table with a "text" column, the second example having no text column but a primary key index):

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer, bar text);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer);
    ALTER TABLE pg_temp.foo ADD PRIMARY KEY (id);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

Compare that to this example (which doesn't have an index or any other column types that trigger this), which does *not* show any memory growth:

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

Any help in determining what's going on here (or if there are other ways to go about this) would be greatly appreciated!

Thank you!
Nick




--
Thanks,
Vijay
Mumbai, India

Re: TRUNCATE memory leak with temporary tables?

From
Ahmet Demir
Date:
Hi Ravi,

I am not sure about that
"It creates a new empty table , followed by rename of the existing table to the new empty table and finally dropping of the old table."

You mean table is re-created with new oid?

thanks
Ahmet

On Fri, 28 May 2021 at 15:10, Ravi Krishna <ravikrishna3@icloud.com> wrote:
Truncate is not delete + vaccum.
It creates a new empty table , followed by rename of the existing table to the new empty table and finally dropping of the old table.

On May 28, 2021 at 7:05 AM, Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:

Yes,
I too see growth when text type is used, but not when int or even fixed size char(10) is used.

I always thought truncate was similar to delete + vacuum full, 
but checking for your scenarios, I did not see an update on pg_stat_user_table on truncate for vacuums.

then i checked 
to help understand truncation better.

but then i still do not understand how a col type text which is dynamic results in mem growth (coz there are no rows inserted, i understand for long strings db does work to compress, move them to toast tables etc) but these are empty rows.

Maybe someone else will be able to explain what is going on.




On Fri, 28 May 2021 at 06:52, Nick Muerdter <stuff@nickm.org> wrote:
I've been seeing what looks like unbounded memory growth (until the OOM killer kicks in and kills the postgres process) when running a pl/pgsql function that performs TRUNCATE statements against various temporary tables in a loop. I think I've been able to come up with some fairly simple reproductions of the issue in isolation, but I'm trying to figure out if this is a memory leak or of I'm perhaps doing something wrong with tuning or other settings.

What I've observed:

- The memory growth occurs if the temp table has indexes or a primary key set on it.
- Alternatively, the memory growth also occurs if the temp table has certain column types on it (eg, "text" types).
- If the table doesn't have indexes and only has integer columns present, then the memory growth does *not* occur.
- I originally saw this against a PostgreSQL 12 server, but I've tested this against PostgreSQL 9.6.22, 12.7, and 13.3 Docker containers and reproduced it against all versions in the containers.

Here are 2 separate examples that seem to show the memory growth on the server (the first being a table with a "text" column, the second example having no text column but a primary key index):

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer, bar text);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer);
    ALTER TABLE pg_temp.foo ADD PRIMARY KEY (id);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

Compare that to this example (which doesn't have an index or any other column types that trigger this), which does *not* show any memory growth:

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TEMPORARY TABLE pg_temp.foo (id integer);

    FOR i IN 1..200000000 LOOP
      TRUNCATE pg_temp.foo;
    END LOOP;
  END
$$

Any help in determining what's going on here (or if there are other ways to go about this) would be greatly appreciated!

Thank you!
Nick




--
Thanks,
Vijay
Mumbai, India

Re: TRUNCATE memory leak with temporary tables?

From
Ravi Krishna
Date:
I am not sure about that
"It creates a new empty table , followed by rename of the existing table to the new empty table and finally dropping of the old table."

You mean table is re-created with new oid?

I don't think oid changes, but the file relnode on the disk changes. So let me rephrase it

truncate does the following:

1 - create a new empty file on the disk.
2 - at commit time, map the table oid to the new empty file.
3 - drop the old file.

Re: TRUNCATE memory leak with temporary tables?

From
Tom Lane
Date:
Vijaykumar Jain <vijaykumarjain.github@gmail.com> writes:
> I too see growth when text type is used, but not when int or even fixed
> size char(10) is used.
> ...
> but then i still do not understand how a col type *text* which is dynamic
> results in mem growth (coz there are no rows inserted, i understand for
> long strings db does work to compress, move them to toast tables etc) but
> these are empty rows.

The text column would cause the table to have an associated toast table [1],
which in turn would have an index.  Both of those would be reallocated as
new files on-disk during TRUNCATE, just like the table proper.

A plausible theory here is that TRUNCATE leaks some storage associated
with an index's relcache entry, but not any for a plain table.

            regards, tom lane

[1] https://www.postgresql.org/docs/current/storage-toast.html



Re: TRUNCATE memory leak with temporary tables?

From
Vijaykumar Jain
Date:
i tried to reproduce tracking mem allocation.

demo=# DO $$
        DECLARE    i bigint;
        BEGIN
                CREATE TEMPORARY TABLE pg_temp.foo (id int) with ( AUTOVACUUM_ENABLED = 0, TOAST.AUTOVACUUM_ENABLED = 0 );
                FOR i IN 1..200000000 LOOP
                        TRUNCATE pg_temp.foo;
                END LOOP;
        END
$$;

in a parallel tmux session.

strace -p 1620 --trace=memory

no movement/ no new output



****************************
when i replace the col with type text.

demo=# DO $$
        DECLARE    i bigint;
        BEGIN
                CREATE TEMPORARY TABLE pg_temp.foo (id text) with ( AUTOVACUUM_ENABLED = 0, TOAST.AUTOVACUUM_ENABLED = 0 );
                FOR i IN 1..200000000 LOOP
                        TRUNCATE pg_temp.foo;
                END LOOP;
        END
$$;

strace -p 1620 --trace=memory
strace: Process 1620 attached
--- SIGINT {si_signo=SIGINT, si_code=SI_USER, si_pid=1878, si_uid=1001} ---
brk(0x556c502ad000)                     = 0x556c502ad000
brk(0x556c502ed000)                     = 0x556c502ed000
brk(0x556c5036d000)                     = 0x556c5036d000
brk(0x556c5046d000)                     = 0x556c5046d000
brk(0x556c5066d000)                     = 0x556c5066d000
brk(0x556c50a6d000)                     = 0x556c50a6d000
brk(0x556c5126d000)                     = 0x556c5126d000

it seems it does try memory allocation repeatedly.
I am not a C developer :), please ignore if i am diverting.




On Fri, 28 May 2021 at 18:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vijaykumar Jain <vijaykumarjain.github@gmail.com> writes:
> I too see growth when text type is used, but not when int or even fixed
> size char(10) is used.
> ...
> but then i still do not understand how a col type *text* which is dynamic
> results in mem growth (coz there are no rows inserted, i understand for
> long strings db does work to compress, move them to toast tables etc) but
> these are empty rows.

The text column would cause the table to have an associated toast table [1],
which in turn would have an index.  Both of those would be reallocated as
new files on-disk during TRUNCATE, just like the table proper.

A plausible theory here is that TRUNCATE leaks some storage associated
with an index's relcache entry, but not any for a plain table.

                        regards, tom lane

[1] https://www.postgresql.org/docs/current/storage-toast.html


--
Thanks,
Vijay
Mumbai, India

Re: TRUNCATE memory leak with temporary tables?

From
"Nick Muerdter"
Date:
On Fri, May 28, 2021, at 7:22 AM, Tom Lane wrote:
> The text column would cause the table to have an associated toast table [1],
> which in turn would have an index.  Both of those would be reallocated as
> new files on-disk during TRUNCATE, just like the table proper.
> 
> A plausible theory here is that TRUNCATE leaks some storage associated
> with an index's relcache entry, but not any for a plain table.
> 
>             regards, tom lane
> 
> [1] https://www.postgresql.org/docs/current/storage-toast.html

Yeah, I forgot to mention this originally, but I see memory growth against a "varchar(501)" field, but *not* against a
"varchar(500)"field, so I was wondering if there was some length threshold that triggered something with toast table
behaviorsomehow involved. But if the toast table involves an index, then maybe all of this gets back to just the
indexeslike you say.
 

And I originally thought this issue was limited to temp tables, but now I'm not so sure. I seem to be able to reproduce
thememory growth against regular tables (both normal and UNLOGGED) too:
 

DO $$
  DECLARE
    i bigint;
  BEGIN
    CREATE TABLE public.foo (id integer, bar text);

    FOR i IN 1..200000000 LOOP
      TRUNCATE public.foo;
    END LOOP;
  END
$$

The memory growth seems to be slower in this case, so maybe that's why I didn't catch it earlier, but I think it's
maybegrowing at the same rate, it's just that this loop goes slower against real tables than temp tables. For example,
Isee similar total memory growth by the time this reaches 100,000 loops for either temp or non-temp tables, the temp
versionjust reaches that point a lot more quickly (which makes sense).
 

Thanks!
Nick



Re: TRUNCATE memory leak with temporary tables?

From
Tom Lane
Date:
"Nick Muerdter" <stuff@nickm.org> writes:
> I've been seeing what looks like unbounded memory growth (until the OOM killer kicks in and kills the postgres
process)when running a pl/pgsql function that performs TRUNCATE statements against various temporary tables in a loop.
Ithink I've been able to come up with some fairly simple reproductions of the issue in isolation, but I'm trying to
figureout if this is a memory leak or of I'm perhaps doing something wrong with tuning or other settings. 

Hmm, so the "leak" occurs here:

#0  AllocSetAlloc (context=0x2b70820, size=528) at aset.c:730
#1  0x00000000009617fc in MemoryContextAlloc (context=0x2b70820,
    size=size@entry=528) at mcxt.c:875
#2  0x00000000009234b9 in AddInvalidationMessage (listHdr=0x2b70be8,
    msg=msg@entry=0x7ffdb35893d0) at inval.c:244
#3  0x00000000009235c0 in AddRelcacheInvalidationMessage (relId=37554,
    dbId=<optimized out>, hdr=<optimized out>) at inval.c:414
#4  RegisterRelcacheInvalidation (dbId=<optimized out>, relId=37554)
    at inval.c:520
#5  0x0000000000923b87 in CacheInvalidateRelcacheByTuple (
    classTuple=classTuple@entry=0x2bf82e8) at inval.c:1328
#6  0x00000000005da654 in index_update_stats (rel=0x7f23b49eb4d0,
    hasindex=<optimized out>, reltuples=<optimized out>) at index.c:2872
#7  0x00000000005dbca2 in index_build (
    heapRelation=heapRelation@entry=0x7f23b49eb4d0,
    indexRelation=indexRelation@entry=0x7f23b49eb900,
    indexInfo=indexInfo@entry=0x2bb5250, isreindex=isreindex@entry=true,
    parallel=parallel@entry=false) at index.c:3051
#8  0x00000000005d66be in RelationTruncateIndexes (
    heapRelation=heapRelation@entry=0x7f23b49eb4d0) at heap.c:3305
#9  0x00000000005d9875 in heap_truncate_one_rel (rel=rel@entry=0x7f23b49e72b0)
    at heap.c:3387
#10 0x000000000068047f in ExecuteTruncateGuts (explicit_rels=0x2bb4b78,
    relids=<optimized out>, relids_logged=0x0, behavior=DROP_RESTRICT,
    restart_seqs=false) at tablecmds.c:1957
#11 0x0000000000680b65 in ExecuteTruncate (stmt=0x2bcd538) at tablecmds.c:1709

That is, we're accumulating a record of system catalog invalidation
events, which have to be kept until end of transaction when they'll
be sent out to other backends.  (For the case of events on temp
tables, maybe that wouldn't be strictly necessary, but this certainly
can't be avoided for normal tables.)  So it's not really a leak, but
just a record of unfinished work caused by the TRUNCATEs.

However ... what seems odd is that we only get an inval message when
there's an index involved.  In principle such messages need to be
issued against the table as well.  I suspect that we're optimizing
away the message for the table, either on the grounds that it's temp
or that it's new in the current transaction; but the index code path
didn't get the same TLC.

Alternatively, maybe it's a bug that there's no message for the table.

            regards, tom lane



Re: TRUNCATE memory leak with temporary tables?

From
Tom Lane
Date:
I wrote:
> However ... what seems odd is that we only get an inval message when
> there's an index involved.  In principle such messages need to be
> issued against the table as well.  I suspect that we're optimizing
> away the message for the table, either on the grounds that it's temp
> or that it's new in the current transaction; but the index code path
> didn't get the same TLC.

Yeah, after a bit of digging, ExecuteTruncateGuts knows this:

         * Normally, we need a transaction-safe truncation here.  However, if
         * the table was either created in the current (sub)transaction or has
         * a new relfilenode in the current (sub)transaction, then we can just
         * truncate it in-place, because a rollback would cause the whole
         * table or the current physical file to be thrown away anyway.

The in-place-truncate code path isn't issuing any invalidation message,
which seems okay as far as I can think at the moment.  But that only
applies to truncating the table proper.  Indexes are "truncated" by
rebuilding them as empty, so that any required infrastructure such
as metapages will be recreated properly.  So that's largely the same
code path as a REINDEX, and index.c is careful to send out an inval
for that:

 * NOTE: an important side-effect of this operation is that an SI invalidation
 * message is sent out to all backends --- including me --- causing relcache
 * entries to be flushed or updated with the new data.  This must happen even
 * if we find that no change is needed in the pg_class row.  When updating
 * a heap entry, this ensures that other backends find out about the new
 * index.  When updating an index, it's important because some index AMs
 * expect a relcache flush to occur after REINDEX.

Maybe that could be optimized in this situation, but it's not totally
clear how, especially given the angle about index AM expectations.

Anyway, I don't see a lot of low-hanging fruit here as far as avoiding
storing such messages is concerned.

We could reduce the rate of the leak if inval.c were more stingy about
memory allocation for the invalidation message list.  In this particular
example, it's allocating space for 32 (FIRSTCHUNKSIZE) messages per
statement, but then only using 2.  So perhaps FIRSTCHUNKSIZE could be
reduced?  I'm not sure that anyone has ever made a survey of how many
messages different sorts of DDL tend to produce, so I suspect that number
was mostly plucked from the air.

A more aggressive idea that I've thought of but never done anything about
is that once we've accumulated more than a certain number of messages,
maybe we should just figure on issuing a full cache reset rather than
continuing to track individual inval events.  This'd not only bound
the storage required for pending invals, but very possibly be more
efficient when the time comes to actually process them.

            regards, tom lane