Large join runs out of memory in 8.1 - Mailing list pgsql-bugs

From Joe Sunday
Subject Large join runs out of memory in 8.1
Date
Msg-id 20060314202928.GA9332@csh.rit.edu
Whole thread Raw
Responses Re: Large join runs out of memory in 8.1
List pgsql-bugs
I've got the following schema and identical data loaded
into both 7.4.12 and 8.1.3 running on Linux/Power5.
sort_mem/work_mem is 10240 on 7.4/8.1 respectively.

              Table "public.a"
  Column  |          Type          | Modifiers
----------+------------------------+-----------
 key_a    | character varying(50)  |
 key_b    | character varying(10)  |
 column1  | character varying(10)  |
 column2  | character varying(20)  |
 column3  | character varying(100) |
 column4  | character varying(20)  |
 column5  | character varying(100) |
 column6  | character varying(10)  |
 column7  | character varying(10)  |
 column8  | character varying(50)  |
 column9  | character varying(50)  |
 column10 | character varying(20)  |
 column11 | character varying(100) |
 column12 | character(1)           |
 column13 | character varying(50)  |
Indexes:
    "a_idx1" btree (key_a, key_b)

              Table "public.b"
 Column  |         Type          | Modifiers
---------+-----------------------+-----------
 key_a   | character varying(50) |
 key_b   | character varying(10) |
 local_a | character varying(50) |
 local_b | character varying(10) |
 flag    | boolean               |
Indexes:
    "b_idx1" btree (key_a, key_b)

a has 9,195,222 rows, b has 9,402,255. Both databases
have been analyzed after loading.

Given the following query:
SELECT a.key_a, a.key_b,
  a.column1, a.column2, a.column3,
  b.local_a, b.local_b
INTO TEMP x
FROM a a, b b
WHERE a.key_a = b.key_a
  AND a.key_b = b.key_b
  AND b.local_a is not null;

The query plan on 7.4 looks like this:
 Hash Join  (cost=325251.03..1427754.15 rows=2050172 width=91)
   Hash Cond: ((("outer".key_a)::text = ("inner".key_a)::text) AND (("outer".key_b)::text = ("inner".key_b)::text))
   ->  Seq Scan on a  (cost=0.00..387576.90 rows=8405790 width=67)
   ->  Hash  (cost=204254.15..204254.15 rows=8774776 width=48)
         ->  Seq Scan on b  (cost=0.00..204254.15 rows=8774776 width=48)
               Filter: (local_a IS NOT NULL)
(6 rows)

and this on 8.1:
 Hash Join  (cost=323425.35..1468437.88 rows=2146226 width=91)
   Hash Cond: ((("outer".key_a)::text = ("inner".key_a)::text) AND (("outer".key_b)::text = ("inner".key_b)::text))
   ->  Seq Scan on a  (cost=0.00..395483.23 rows=9196423 width=67)
   ->  Hash  (cost=201370.84..201370.84 rows=8266102 width=48)
         ->  Seq Scan on b  (cost=0.00..201370.84 rows=8266102 width=48)
               Filter: (local_a IS NOT NULL)
(6 rows)

7.4 completes as expected, with 8,149,534 rows in the resultant temp table.
Memory according to top never goes much above 25 megs in use during the query.

8.1 grows until it uses about 4 GB, at which point it dies with the
following error:
ERROR:  out of memory
DETAIL:  Failed on request of size 8224.

--Joe

--
Joe Sunday <sunday@csh.rit.edu>  http://www.csh.rit.edu/~sunday/
Computer Science House, Rochester Inst. Of Technology

pgsql-bugs by date:

Previous
From: Richard Huxton
Date:
Subject: Re: Random hang during commit
Next
From: Tom Lane
Date:
Subject: Re: Large join runs out of memory in 8.1