BUG #15592: Memory overuse with subquery containing unnest() and setoperations (11.x regression) - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #15592: Memory overuse with subquery containing unnest() and setoperations (11.x regression) |
Date | |
Msg-id | 15592-cc5729f9ba02e31c@postgresql.org Whole thread Raw |
Responses |
Re: BUG #15592: Memory overuse with subquery containing unnest() and set operations (11.x regression)
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 15592 Logged by: Dmitry Marakasov Email address: amdmi3@amdmi3.ru PostgreSQL version: 11.1 Operating system: FreeBSD 12.0 Description: Shamefully I wasn't aware of array && operator, so I've been using the following construct to check for array intersection of two columns of text[] type: SELECT ... WHERE EXISTS(SELECT unnest(first) INTERSECT SELECT unnest(second)); After upgrading from PostgreSQL 10.x to 11.x this query began to eat inadequate amounts of memory (several gigabytes per hundred thousands rows). I've narrowed it down to this simple case which demonstrates the problem: CREATE TABLE test AS SELECT generate_series(1,500000) AS id, '{a,b}'::text[] AS first, '{a}'::text[] AS second; SELECT DISTINCT EXISTS(SELECT unnest(first) INTERSECT SELECT unnest(second)) FROM test; This query makes worker process eat up about 2GBs of memory. Memory usage grows linearly during SELECT execution, afterwards the memory is freed. Memory overuse doesn't happen on PostgreSQL 10, so this may be a regression in 11. My original case was quite silly and fixable with using an array operator, but I suspect that this problem may fire in legitimate cases too, so I thought it's worth reporting it. Additional info: Execution plan (it's the same on 10 and 11): QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=36527.18..36527.29 rows=2 width=1) (actual time=5005.132..5005.132 rows=1 loops=1) Output: ((SubPlan 1)) Group Key: (SubPlan 1) Buffers: shared hit=2048 read=4125 dirtied=4125 written=4093 -> Seq Scan on public.test (cost=0.00..35215.42 rows=524705 width=1) (actual time=0.053..4875.718 rows=500000 loops=1) Output: (SubPlan 1) Buffers: shared hit=2048 read=4125 dirtied=4125 written=4093 SubPlan 1 -> HashSetOp Intersect (cost=0.00..4.54 rows=100 width=36) (actual time=0.005..0.005 rows=1 loops=500000) Output: "*SELECT* 1".unnest, (0) -> Append (cost=0.00..4.04 rows=200 width=36) (actual time=0.001..0.004 rows=3 loops=500000) -> Subquery Scan on "*SELECT* 1" (cost=0.00..1.52 rows=100 width=36) (actual time=0.001..0.002 rows=2 loops=500000) Output: "*SELECT* 1".unnest, 0 -> ProjectSet (cost=0.00..0.52 rows=100 width=32) (actual time=0.001..0.001 rows=2 loops=500000) Output: unnest(test.first) -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=500000) -> Subquery Scan on "*SELECT* 2" (cost=0.00..1.52 rows=100 width=36) (actual time=0.001..0.001 rows=1 loops=500000) Output: "*SELECT* 2".unnest, 1 -> ProjectSet (cost=0.00..0.52 rows=100 width=32) (actual time=0.001..0.001 rows=1 loops=500000) Output: unnest(test.second) -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=500000) Planning Time: 0.185 ms Execution Time: 5120.463 ms I've also managed to get MemoryContextStats(TopMemoryContext) (from my production query, no the simplified one above), here it is: https://gist.github.com/AMDmi3/b55b2f3f6acf06cf6eab59fc043c7a0f and this line looks suspicious: > 344753 more child contexts containing 2824216576 total in 344753 blocks; 2735944032 free (5 chunks); 88272544 used
pgsql-bugs by date: