poor row estimates with multi-column joins - Mailing list pgsql-performance

From Robert Treat
Subject poor row estimates with multi-column joins
Date
Msg-id 200805141834.55014.xzilla@users.sourceforge.net
Whole thread Raw
List pgsql-performance
The following query produces some fairly off estimates for the number of rows
that should be returned (this is based on a much more complex query, but
whittling it down to this which seems to be the heart of the problem)

peii=# explain analyze select * from adv.peii_fast_lookup pfl1 join
adv.lsteml_m le1 on (pfl1.ctm_nbr = le1.ctm_nbr and pfl1.emal_id =
le1.emal_id) ;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=386721.95..1848154.67 rows=7 width=100) (actual
time=11407.555..103368.646 rows=18348993 loops=1)
Hash Cond: (((le1.ctm_nbr)::text = (pfl1.ctm_nbr)::text) AND
((le1.emal_id)::text = (pfl1.emal_id)::text))
-> Seq Scan on lsteml_m le1 (cost=0.00..435026.44 rows=18712844 width=67)
(actual time=0.027..7057.486 rows=18703401 loops=1)
-> Hash (cost=172924.18..172924.18 rows=9371918 width=33) (actual
time=11387.413..11387.413 rows=9368565 loops=1)
-> Seq Scan on peii_fast_lookup pfl1 (cost=0.00..172924.18 rows=9371918
width=33) (actual time=0.006..2933.512 rows=9368565 loops=1)
Total runtime: 108132.205 ms

default_stats_target is 100, both tables freshly analyzed
all join columns on both sides are varchar(12)
and we're on 8.3.1

I notice that it seems to give a better number of rows when doing single
column joins (explain only, didnt want to wait for it to actually run this)

peii=# explain select * from adv.peii_fast_lookup pfl1 join adv.lsteml_m le1
on (pfl1.ctm_nbr = le1.ctm_nbr) ;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Merge Join  (cost=7243997.70..8266364.43 rows=65065332 width=100)
   Merge Cond: ((pfl1.ctm_nbr)::text = (le1.ctm_nbr)::text)
   ->  Sort  (cost=1917159.20..1940589.00 rows=9371918 width=33)
         Sort Key: pfl1.ctm_nbr
         ->  Seq Scan on peii_fast_lookup pfl1  (cost=0.00..172924.18
rows=9371918 width=33)
   ->  Materialize  (cost=5326833.82..5560745.31 rows=18712919 width=67)
         ->  Sort  (cost=5326833.82..5373616.12 rows=18712919 width=67)
               Sort Key: le1.ctm_nbr
               ->  Seq Scan on lsteml_m le1  (cost=0.00..435028.19
rows=18712919 width=67)
(9 rows)

peii=# explain select * from adv.peii_fast_lookup pfl1 join adv.lsteml_m le1
on (pfl1.emal_id = le1.emal_id) ;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Hash Join  (cost=363292.16..1754557.17 rows=18712919 width=100)
   Hash Cond: ((le1.emal_id)::text = (pfl1.emal_id)::text)
   ->  Seq Scan on lsteml_m le1  (cost=0.00..435028.19 rows=18712919 width=67)
   ->  Hash  (cost=172924.18..172924.18 rows=9371918 width=33)
         ->  Seq Scan on peii_fast_lookup pfl1  (cost=0.00..172924.18
rows=9371918 width=33)
(5 rows)


for kicks, I upped the stats target and reran everything...

peii=# set default_statistics_target = 400;
SET
peii=# analyze verbose adv.peii_fast_lookup;
INFO:  analyzing "adv.peii_fast_lookup"
INFO:  "peii_fast_lookup": scanned 79205 of 79205 pages, containing 9368569
live rows and 316 dead rows; 120000 rows in sample, 9368569 estimated total
rows
ANALYZE
peii=# analyze verbose adv.lsteml_m;
INFO:  analyzing "adv.lsteml_m"
INFO:  "lsteml_m": scanned 120000 of 247899 pages, containing 9050726 live
rows and 110882 dead rows; 120000 rows in sample, 18697216 estimated total
rows
ANALYZE
peii=# explain analyze select * from adv.peii_fast_lookup pfl1 join
adv.lsteml_m le1 on (pfl1.ctm_nbr = le1.ctm_nbr and pfl1.emal_id =
le1.emal_id) ;
                                                                  QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=386611.22..1847063.87 rows=4 width=100) (actual
time=11169.338..95460.560 rows=18348993 loops=1)
   Hash Cond: (((le1.ctm_nbr)::text = (pfl1.ctm_nbr)::text) AND
((le1.emal_id)::text = (pfl1.emal_id)::text))
   ->  Seq Scan on lsteml_m le1  (cost=0.00..434871.16 rows=18697216 width=67)
(actual time=0.008..7012.533 rows=18703401 loops=1)
   ->  Hash  (cost=172890.69..172890.69 rows=9368569 width=33) (actual
time=11160.329..11160.329 rows=9368569 loops=1)
         ->  Seq Scan on peii_fast_lookup pfl1  (cost=0.00..172890.69
rows=9368569 width=33) (actual time=0.005..2898.336 rows=9368569 loops=1)
 Total runtime: 100223.220 ms
(6 rows)

peii=# set enable_hashjoin = false;
SET
peii=# explain analyze select * from adv.peii_fast_lookup pfl1 join
adv.lsteml_m le1 on (pfl1.ctm_nbr = le1.ctm_nbr and pfl1.emal_id =
le1.emal_id) ;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=5322783.26..5972103.39 rows=4 width=100) (actual
time=415082.543..606999.689 rows=18348993 loops=1)
   Merge Cond: (((pfl1.emal_id)::text = (le1.emal_id)::text) AND
((pfl1.ctm_nbr)::text = (le1.ctm_nbr)::text))
   ->  Index Scan using peii_fast_lookup_pkey on peii_fast_lookup pfl1
(cost=0.00..462635.50 rows=9368569 width=33) (actual time=0.031..7342.227
rows=9368569 loops=1)
   ->  Materialize  (cost=5322446.84..5556162.04 rows=18697216 width=67)
(actual time=414700.258..519877.718 rows=18703401 loops=1)
         ->  Sort  (cost=5322446.84..5369189.88 rows=18697216 width=67)
(actual time=414700.254..506652.718 rows=18703401 loops=1)
               Sort Key: le1.emal_id, le1.ctm_nbr
               Sort Method:  external merge  Disk: 1620632kB
               ->  Seq Scan on lsteml_m le1  (cost=0.00..434871.16
rows=18697216 width=67) (actual time=0.006..6776.725 rows=18703401 loops=1)
 Total runtime: 611728.059 ms
(9 rows)

Still the same issue, so this doesn't seem like something specific to hash
joins. I'll note that this is the behavior I recall from 8.2, so I'm not sure
if this is a bug, or just an outright deficiancy, but thought I would post to
see if anyone had any thoughts on it. (If there is some additional info I can
provide, please lmk).


--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

pgsql-performance by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: I/O on select count(*)
Next
From: "Subbiah Stalin-XCGF84"
Date:
Subject: Update performance degrades over time