Re: Memory leak from ExecutorState context? - Mailing list pgsql-hackers
From: Melanie Plageman
Subject: Re: Memory leak from ExecutorState context?
Msg-id: CAAKRu_ZqOGePK6AqMisyQmUyc8vC1kz5NQHe6B4jKg0NpB_Wvg@mail.gmail.com
In response to: Re: Memory leak from ExecutorState context? (Konstantin Knizhnik <knizhnik@garret.ru>)
List: pgsql-hackers
On Thu, Apr 20, 2023 at 12:42 PM Konstantin Knizhnik <knizhnik@garret.ru> wrote:
> On 11.04.2023 8:14 PM, Jehan-Guillaume de Rorthais wrote:
> > On Sat, 8 Apr 2023 02:01:19 +0200
> > Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> >
> >> On Fri, 31 Mar 2023 14:06:11 +0200
> >> Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> >>
> >> [...]
> >>
> >> After rebasing Tomas' memory balancing patch, I did some memory measures
> >> to answer some of my questions. Please find in attachment the resulting
> >> charts "HJ-HEAD.png" and "balancing-v3.png" to compare memory consumption
> >> between HEAD and Tomas' patch. They show an alternance of numbers
> >> before/after calling ExecHashIncreaseNumBatches (see the debug patch). I
> >> didn't try to find the exact last total peak of memory consumption during
> >> the join phase and before all the BufFiles are destroyed, so the last
> >> number might be underestimated.
> >
> > I did some more analysis of the total memory consumption in filecxt for
> > HEAD and the v3 and v4 patches. My previous debug numbers only print
> > memory metrics during batch increments or hash table destruction. That
> > means:
> >
> > * for HEAD: we miss the batches consumed during the outer scan
> > * for v3: adds twice nbatch in spaceUsed, which is a rough estimation
> > * for v4: batches are tracked in spaceUsed, so they are reflected in
> >   spacePeak
> >
> > Using a breakpoint in ExecHashJoinSaveTuple to print
> > "filecxt->mem_allocated" from there, here are the maximum allocated
> > memory figures for the bufFile context for each branch:
> >
> >           batches  max bufFiles  total   spaceAllowed rise
> >   HEAD    16384    199966960     ~194MB
> >   v3      4096     65419456      ~78MB
> >   v4(*3)  2048     34273280      48MB    nbatch*sizeof(PGAlignedBlock)*3
> >   v4(*4)  1024     17170160      60.6MB  nbatch*sizeof(PGAlignedBlock)*4
> >   v4(*5)  2048     34273280      42.5MB  nbatch*sizeof(PGAlignedBlock)*5
> >
> > It seems accounting for bufFile in spaceUsed allows better memory
> > balancing and management.
> > The precise factor by which to raise spaceAllowed is yet to be defined;
> > *3 or *4 looks good, but this is based on a single artificial test case.
> >
> > Also, note that HEAD is currently reporting ~4MB of memory usage, which
> > is far from reality. So even if we don't commit the memory balancing
> > patch in v16, maybe we could account for filecxt in spaceUsed as a
> > bugfix?
> >
> > Regards,
>
> Thank you for the patch.
> I faced the same problem (OOM caused by hash join).
> I tried to create the simplest test reproducing the problem:
>
> create table t(pk int, val int);
> insert into t values (generate_series(1,100000000),0);
> set work_mem='64kB';
> explain (analyze,buffers) select count(*) from t t1 join t t2 on
> (t1.pk=t2.pk);
>
> There are three workers, and the size of each exceeds 1.3Gb.
>
> Plan is the following:
>
>  Finalize Aggregate  (cost=355905977972.87..355905977972.88 rows=1
> width=8) (actual time=212961.033..226097.513 rows=1 loops=1)
>    Buffers: shared hit=32644 read=852474 dirtied=437947 written=426374,
> temp read=944407 written=1130380
>    ->  Gather  (cost=355905977972.65..355905977972.86 rows=2 width=8)
> (actual time=212943.
> 505..226097.497 rows=3 loops=1)
>          Workers Planned: 2
>          Workers Launched: 2
>          Buffers: shared hit=32644 read=852474 dirtied=437947
> written=426374, temp read=944407 written=1130380
>          ->  Partial Aggregate  (cost=355905976972.65..355905976972.66
> rows=1 width=8) (actual time=212938.410..212940.035 rows=1 loops=3)
>                Buffers: shared hit=32644 read=852474 dirtied=437947
> written=426374, temp read=944407 written=1130380
>                ->  Parallel Hash Join  (cost=1542739.26..303822614472.65
> rows=20833345000002 width=0) (actual time=163268.274..207829.524
> rows=33333333 loops=3)
>                      Hash Cond: (t1.pk = t2.pk)
>                      Buffers: shared hit=32644 read=852474
> dirtied=437947 written=426374, temp read=944407 written=1130380
>                      ->  Parallel Seq Scan on t t1
> (cost=0.00..859144.78 rows=41666678 width=4) (actual
> time=0.045..30828.051 rows=33333333 loops=3)
>                            Buffers: shared hit=16389 read=426089 written=87
>                      ->  Parallel Hash  (cost=859144.78..859144.78
> rows=41666678 width=4) (actual time=82202.445..82202.447 rows=33333333
> loops=3)
>                            Buckets: 4096 (originally 4096) Batches:
> 32768 (originally 8192)  Memory Usage: 192kB
>                            Buffers: shared hit=16095 read=426383
> dirtied=437947 written=426287, temp read=267898 written=737164
>                            ->  Parallel Seq Scan on t t2
> (cost=0.00..859144.78 rows=41666678 width=4) (actual
> time=0.054..12647.534 rows=33333333 loops=3)
>                                  Buffers: shared hit=16095 read=426383
> dirtied=437947 written=426287
>  Planning:
>    Buffers: shared hit=69 read=38
>  Planning Time: 2.819 ms
>  Execution Time: 226113.292 ms
> (22 rows)
>
> -----------------------------
>
> So we have increased the number of batches to 32k.
> I applied your patches 0001-0004, but unfortunately they have not
> reduced memory consumption - the size of each backend is still more
> than 1.3Gb.

Is this EXPLAIN ANALYZE run on an instance with Jehan-Guillaume's
patchset applied or without?
I'm asking because the fourth patch in the series updates spaceUsed with
the size of the BufFile->buffer, but I notice in your EXPLAIN ANALYZE,
Memory Usage for the hashtable is reported as 192kB, which, while larger
than the 64kB work_mem you set, isn't as large as I might expect.

- Melanie