Re: Memory leak from ExecutorState context? - Mailing list pgsql-hackers
From: Melanie Plageman
Subject: Re: Memory leak from ExecutorState context?
Msg-id: CAAKRu_ZqOGePK6AqMisyQmUyc8vC1kz5NQHe6B4jKg0NpB_Wvg@mail.gmail.com
In response to: Re: Memory leak from ExecutorState context? (Konstantin Knizhnik <knizhnik@garret.ru>)
List: pgsql-hackers
On Thu, Apr 20, 2023 at 12:42 PM Konstantin Knizhnik <knizhnik@garret.ru> wrote:
> On 11.04.2023 8:14 PM, Jehan-Guillaume de Rorthais wrote:
> > On Sat, 8 Apr 2023 02:01:19 +0200
> > Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> >
> >> On Fri, 31 Mar 2023 14:06:11 +0200
> >> Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> >>
> >> [...]
> >>
> >> After rebasing Tomas' memory balancing patch, I did some memory measures
> >> to answer some of my questions. Please find in attachment the resulting
> >> charts "HJ-HEAD.png" and "balancing-v3.png" to compare memory consumption
> >> between HEAD and Tomas' patch. They show an alternance of numbers
> >> before/after calling ExecHashIncreaseNumBatches (see the debug patch). I
> >> didn't try to find the exact last total peak of memory consumption during
> >> the join phase and before all the BufFiles are destroyed, so the last
> >> number might be underestimated.
> >
> > I did some more analysis of the total memory consumption in filecxt for
> > HEAD and the v3 and v4 patches. My previous debug numbers only print
> > memory metrics during batch increments or hash table destruction. That
> > means:
> >
> > * for HEAD: we miss the batches consumed during the outer scan
> > * for v3: adds twice nbatch in spaceUsed, which is a rough estimation
> > * for v4: batches are tracked in spaceUsed, so they are reflected in
> >   spacePeak
> >
> > Using a breakpoint in ExecHashJoinSaveTuple to print
> > "filecxt->mem_allocated" from there, here are the maximum allocated
> > memory figures for the bufFile context for each branch:
> >
> >           batches  max bufFiles  total   spaceAllowed rise
> >   HEAD    16384    199966960     ~194MB
> >   v3      4096     65419456      ~78MB
> >   v4(*3)  2048     34273280      48MB    nbatch*sizeof(PGAlignedBlock)*3
> >   v4(*4)  1024     17170160      60.6MB  nbatch*sizeof(PGAlignedBlock)*4
> >   v4(*5)  2048     34273280      42.5MB  nbatch*sizeof(PGAlignedBlock)*5
> >
> > It seems accounting for bufFile in spaceUsed allows better memory
> > balancing and management.
> > The precise factor by which to raise spaceAllowed is yet to be defined;
> > *3 or *4 looks good, but this is based on a single artificial test case.
> >
> > Also, note that HEAD is currently reporting ~4MB of memory usage, which
> > is far from reality. So even if we don't commit the memory balancing
> > patch in v16, maybe we could account for filecxt in spaceUsed as a
> > bugfix?
> >
> > Regards,
>
> Thank you for the patch.
> I faced the same problem (OOM caused by hash join).
> I tried to create the simplest test reproducing the problem:
>
> create table t(pk int, val int);
> insert into t values (generate_series(1,100000000),0);
> set work_mem='64kB';
> explain (analyze,buffers) select count(*) from t t1 join t t2 on
> (t1.pk=t2.pk);
>
> There are three workers, and the size of each exceeds 1.3Gb.
>
> Plan is the following:
>
>  Finalize Aggregate  (cost=355905977972.87..355905977972.88 rows=1
> width=8) (actual time=212961.033..226097.513 rows=1 loops=1)
>    Buffers: shared hit=32644 read=852474 dirtied=437947 written=426374,
> temp read=944407 written=1130380
>    ->  Gather  (cost=355905977972.65..355905977972.86 rows=2 width=8)
> (actual time=212943.
> 505..226097.497 rows=3 loops=1)
>          Workers Planned: 2
>          Workers Launched: 2
>          Buffers: shared hit=32644 read=852474 dirtied=437947
> written=426374, temp read=944407 written=1130380
>          ->  Partial Aggregate  (cost=355905976972.65..355905976972.66
> rows=1 width=8) (actual time=212938.410..212940.035 rows=1 loops=3)
>                Buffers: shared hit=32644 read=852474 dirtied=437947
> written=426374, temp read=944407 written=1130380
>                ->  Parallel Hash Join  (cost=1542739.26..303822614472.65
> rows=20833345000002 width=0) (actual time=163268.274..207829.524
> rows=33333333 loops=3)
>                      Hash Cond: (t1.pk = t2.pk)
>                      Buffers: shared hit=32644 read=852474
> dirtied=437947 written=426374, temp read=944407 written=1130380
>                      ->  Parallel Seq Scan on t t1
> (cost=0.00..859144.78 rows=41666678 width=4) (actual
> time=0.045..30828.051 rows=33333333 loops=3)
>                            Buffers: shared hit=16389 read=426089 written=87
>                      ->  Parallel Hash  (cost=859144.78..859144.78
> rows=41666678 width=4) (actual time=82202.445..82202.447 rows=33333333
> loops=3)
>                            Buckets: 4096 (originally 4096) Batches:
> 32768 (originally 8192)  Memory Usage: 192kB
>                            Buffers: shared hit=16095 read=426383
> dirtied=437947 written=426287, temp read=267898 written=737164
>                            ->  Parallel Seq Scan on t t2
> (cost=0.00..859144.78 rows=41666678 width=4) (actual
> time=0.054..12647.534 rows=33333333 loops=3)
>                                  Buffers: shared hit=16095 read=426383
> dirtied=437947 written=426287
>  Planning:
>    Buffers: shared hit=69 read=38
>  Planning Time: 2.819 ms
>  Execution Time: 226113.292 ms
> (22 rows)
>
> -----------------------------
>
> So we have increased the number of batches to 32k.
> I applied your patches 0001-0004, but unfortunately they have not
> reduced memory consumption - the size of each backend is still more
> than 1.3Gb.

Is this EXPLAIN ANALYZE run on an instance with Jehan-Guillaume's
patchset applied or without?
I'm asking because the fourth patch in the series updates spaceUsed with
the size of the BufFile->buffer, but I notice in your EXPLAIN ANALYZE,
Memory Usage for the hashtable is reported as 192kB, which, while larger
than the 64kB work_mem you set, isn't as large as I might expect.

- Melanie