Any idea on how to improve the statistics estimates for this plan? - Mailing list pgsql-performance

From Guillaume Smet
Subject Any idea on how to improve the statistics estimates for this plan?
Date
Msg-id CALt0+o9EdN9HDWuzNO2-+Y4XWU6OT03jzvHdzOCrHsLa5dgyMQ@mail.gmail.com
Whole thread Raw
Responses Re: Any idea on how to improve the statistics estimates for this plan?
List pgsql-performance
Hi,

I'm struggling with a query for some time and the major problem of the
query is that the statistics are way wrong on a particular operation:
 ->  Nested Loop  (cost=3177.72..19172.84 rows=*2* width=112) (actual
time=139.221..603.929 rows=*355331* loops=1)
     Join Filter: (l.location_id = r.location_id)
     ->  Hash Join  (cost=3177.71..7847.52 rows=*33914* width=108)
(actual time=138.343..221.852 rows=*36664* loops=1)
             Hash Cond: (el.location_id = l.location_id)
             ...
     ->  Index Scan using idx_test1 on representations r
(cost=0.01..0.32 rows=*1* width=12) (actual time=0.002..0.008
rows=*10* loops=36664)
             ...
(extracted from the original plan which is quite massive)

I tried to improve the statistics of l.location_id, el.location_id,
r.location_id and idx_test1.location_id (up to 5000) but it doesn't
get better.

Any idea on how I could get better statistics in this particular
example and why the estimate of the nested loop is so wrong while the
ones for each individual operations are quite good?

This is with PostgreSQL 9.2.1.

Thanks.

--
Guillaume


pgsql-performance by date:

Previous
From: Shaun Thomas
Date:
Subject: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance
Next
From: Niels Kristian Schjødt
Date:
Subject: Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance