IN subquery not using a hash - Mailing list pgsql-general

From Paul Tillotson
Subject IN subquery not using a hash
Date
Msg-id 42DEDF3F.200@adelphia.net
Whole thread Raw
Responses Re: IN subquery not using a hash
List pgsql-general
For the following query, postgres is running the IN subquery over and
over again (once for each row scanned in the parent table.)


I would have expected it to run the whole query once and create a hash
which would then be probed once for every row scanned in the parent
table.  I assumed that it was not doing so because it thought that the
resulting hash table would exceed sort_mem, but setting sort_mem to half
a gigabyte did not make any difference.  Is there some other reason that
the optimizer is not using a hash table?


563 pages * 8 KB per page * 296 tuples fetched / 52085 tuples in the
whole table = 25 KB.  Shouldn't the optimizer think that the subquery
will only fetch 25 KB worth of rows?

(Later, I realized that the official name for "sort_mem" is now
work_mem.  Now, does this mean that my set sort_mem = 500000 did not do
anything?)

Regards,
Paul Tillotson

omnis=> select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 8.0.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.3
20040412 (Red Hat Linux 3.3.3-7)
(1 row)
omnis=> show sort_mem;
20000
omnis=> set sort_mem = 500000;
SET
omnis=> explain analyze select 1 from parents where lname = 'SMITH' OR
parentid IN (SELECT parentid FROM child where childlast = 'SMITH');
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on parents (cost=357.78..226649.83 rows=59785 width=0) (actual
time=127.644..104568.639 rows=855 loops=1)
    Filter: (((lname)::text = 'SMITH'::text) OR (subplan))
    SubPlan
        -> Materialize (cost=357.78..360.74 rows=296 width=4) (actual
time=0.001..0.257 rows=313 loops=117943)
            -> Index Scan using child_childlast_index on child
(cost=0.00..357.48 rows=296 width=4) (actual time=0.073..1.325 rows=313
loops=1)
            Index Cond: ((childlast)::text = 'SMITH'::text)
Total runtime: 104569.800 ms
(7 rows)

omnis=> select reltuples, relpages from pg_class where relname = 'child';
reltuples | relpages
-----------+----------
52085 | 563
(1 row)


pgsql-general by date:

Previous
From: "Ed L."
Date:
Subject: Re: help: production db stuck in startup mode
Next
From: Tom Lane
Date:
Subject: Re: IN subquery not using a hash