Re: estimates for nested loop very wrong? - Mailing list pgsql-sql

From Tom Lane
Subject Re: estimates for nested loop very wrong?
Date
Msg-id 3742.1049984229@sss.pgh.pa.us
Whole thread Raw
In response to estimates for nested loop very wrong?  (joostje@komputilo.org)
Responses Re: estimates for nested loop very wrong?  (joostje@komputilo.org)
List pgsql-sql
joostje@komputilo.org writes:
> When JOINing two tabels (one with 23 entries, one with 2.5e6 entries),
> psql estimates the cost of the nested loop method way to high, causing
> it to use Hash Join, even though Hash Join actually takes 30 seconds,
> agianst 0.020 seconds for Nested Loop.

Have you done an ANALYZE or VACUUM ANALYZE recently?

> Nested Loop  (cost=0.00..208256.60 rows=61140 width=38) (actual time=0.92..18.49 rows=756 loops=1)
>   ->  Seq Scan on tmp1  (cost=0.00..1.23 rows=23 width=7) (actual time=0.24..0.39 rows=23 loops=1)
>   ->  Index Scan using db_id_idx on db  (cost=0.00..9021.35 rows=2658 width=31) (actual time=0.32..0.69 rows=33
loops=23)
> Total runtime: 19.20 msec

The planner is evidently estimating that each row of tmp1 will match 2600+
rows of db, whereas in reality there is only one match.  Rather than
mess with enable_hashjoin, you need to find out why that estimate is so
badly off.  Are the entries in tmp1 specially selected to correspond to
unique rows of db?
        regards, tom lane



pgsql-sql by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: estimates for nested loop very wrong?
Next
From: Ian Barwick
Date:
Subject: INSERT INTO ... SELECT (PostgreSQL vs. MySQL)