
From: Amit Khandekar
Subject: Hash join in SELECT target list expression keeps consuming memory
Date:
Msg-id: CAJ3gD9cFofAWGvcxLOxDHC=B0hjtW8yGmUsF2hdGh97CM38=7g@mail.gmail.com
Responses: Re: Hash join in SELECT target list expression keeps consuming memory (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Hi,

If a SELECT target list expression is a join subquery, and the subquery
does a hash join, the query keeps consuming more and more memory. Below
is such a query:

SELECT
   (SELECT id
    FROM unnest((pg_catalog.acldefault('L',l.col1))) WITH ORDINALITY AS perm(acl,id),
         generate_series(1, l.col1::int % 2) AS gen2(gen2_id)
    WHERE id = gen2_id)
FROM largetable l;

where the table largetable is created using:

create table largetable as
  select * from generate_series(1, 32000000) as t(col1);

Here is the plan:
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Seq Scan on largetable l  (cost=0.00..672781971.36 rows=32000018 width=8)
   SubPlan 1
     ->  Hash Join  (cost=2.26..21.01 rows=500 width=8)
           Hash Cond: (gen2.gen2_id = perm.id)
           ->  Function Scan on generate_series gen2  (cost=0.01..10.01 rows=1000 width=4)
           ->  Hash  (cost=1.00..1.00 rows=100 width=8)
                 ->  Function Scan on unnest perm  (cost=0.01..1.00 rows=100 width=8)

Now, if we disable hash join, the planner chooses a merge join instead, and
that plan does not keep consuming memory.
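
(For reference, I switched join methods for the session with the standard
planner setting below; the query itself is unchanged.)

SET enable_hashjoin = off;   -- planner falls back to merge join for this query
-- ... re-run the SELECT above, then restore the default:
RESET enable_hashjoin;

The resulting plan: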

                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Seq Scan on largetable l  (cost=0.00..2275308358.33 rows=32000018 width=8)
   SubPlan 1
     ->  Merge Join  (cost=59.84..71.09 rows=500 width=8)
           Merge Cond: (perm.id = gen2.gen2_id)
           ->  Function Scan on unnest perm  (cost=0.01..1.00 rows=100 width=8)
           ->  Sort  (cost=59.83..62.33 rows=1000 width=4)
                 Sort Key: gen2.gen2_id
                 ->  Function Scan on generate_series gen2  (cost=0.01..10.01 rows=1000 width=4)

With either merge join or nested loop join, the postgres backend's memory
stays constant the whole time. (The planner picks hash join because of the
inaccurate default row estimates for unnest() and generate_series().)
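
To see the estimate mismatch in isolation, here is a hypothetical
single-value probe of the subquery (col1 replaced by the literal 10), which
lets one compare the planner's row estimates for the two set-returning
functions against the actual row counts:

EXPLAIN ANALYZE
SELECT id
FROM unnest((pg_catalog.acldefault('L', 10))) WITH ORDINALITY AS perm(acl,id),
     generate_series(1, 10 % 2) AS gen2(gen2_id)
WHERE id = gen2_id;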

I have yet to analyze the root cause of this behaviour, but I am posting it
here in the meantime, in case it turns out to be a known behaviour/issue for
hash joins, or there is already some work being done on it.

I suspected that the set-returning functions might be leaking memory, but
then we would have seen the same behaviour with the other join types as
well.
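
As a sanity check (a sketch using only the standard planner settings),
forcing a nested loop for the same query also keeps memory flat, which
points at the hash join path rather than the SRFs themselves:

SET enable_hashjoin = off;
SET enable_mergejoin = off;   -- leaves nested loop as the remaining join method
-- re-run the SELECT on largetable above; backend memory stays constant
RESET enable_hashjoin;
RESET enable_mergejoin;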

----------

I actually ran into this issue when I tried to run pg_dump against a
database with 32 million large objects (blobs). The corresponding backend
process kept consuming memory until it was killed by the OOM killer.

Steps to reproduce the issue with pg_dump:
1. create table lo_table (id numeric, lo oid);
2. -- Create 32M rows
   insert into lo_table
   select a.i, lo_from_bytea(0, E'\\xffffff0000000000')
   from generate_series(1, 32775000) as a(i);
3. Then run pg_dump against the database.

The backend serving pg_dump gets killed, and pg_dump aborts with this
message, which includes the query that consumed the memory:

pg_dump: [archiver (db)] query failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: [archiver (db)] query was:

SELECT l.oid,

  (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = l.lomowner) AS rolname,

  (SELECT pg_catalog.array_agg(acl ORDER BY row_n)
     FROM (SELECT acl, row_n
             FROM pg_catalog.unnest(coalesce(l.lomacl,pg_catalog.acldefault('L',l.lomowner)))
                  WITH ORDINALITY AS perm(acl,row_n)
            WHERE NOT EXISTS (SELECT 1
                                FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('L',l.lomowner)))
                                     AS init(init_acl)
                               WHERE acl = init_acl)) as foo) AS lomacl,

  (SELECT pg_catalog.array_agg(acl ORDER BY row_n)
     FROM (SELECT acl, row_n
             FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('L',l.lomowner)))
                  WITH ORDINALITY AS initp(acl,row_n)
            WHERE NOT EXISTS (SELECT 1
                                FROM pg_catalog.unnest(coalesce(l.lomacl,pg_catalog.acldefault('L',l.lomowner)))
                                     AS permp(orig_acl)
                               WHERE acl = orig_acl)) as foo) AS rlomacl,

  NULL AS initlomacl, NULL AS initrlomacl

FROM pg_largeobject_metadata l
     LEFT JOIN pg_init_privs pip
            ON (l.oid = pip.objoid AND pip.classoid = 'pg_largeobject'::regclass
                AND pip.objsubid = 0);

As you can see, the subplan expressions query the pg_init_privs table in
order to collect any changed extension ACLs for the large objects; so this
won't reproduce before commit 23f34fa4ba358671adab16773e79c17c92cbc870.
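
(For anyone unfamiliar with that catalog, a plain probe such as the one
below shows what the subplans are joining against; nothing in it is specific
to this report.)

SELECT objoid, objsubid, privtype, initprivs
FROM pg_init_privs
WHERE classoid = 'pg_largeobject'::regclass;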

Note: the pg_dump client process itself also consumes more and more memory,
although to a lesser extent. But that is a separate issue, and is already
known/expected behaviour:
https://www.postgresql.org/message-id/29613.1476969807%40sss.pgh.pa.us


-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

