Re: same query different execution plan (hash join vs. semi-hash join) - Mailing list pgsql-performance

From Huang, Suya
Subject Re: same query different execution plan (hash join vs. semi-hash join)
Date
Msg-id D83E55F5F4D99B4A9B4C4E259E6227CD01499446@AUX1EXC01.apac.experian.local
Whole thread Raw
In response to Re: same query different execution plan (hash join vs. semi-hash join)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: same query different execution plan (hash join vs. semi-hash join)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Thank you Tom. But the time spent on scanning table test1 is less than 1 second (91.738 compares to 87.869), so I guess
thisshouldn't be the issue? 


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Friday, May 16, 2014 12:58 PM
To: Huang, Suya
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] same query different execution plan (hash join vs. semi-hash join)

"Huang, Suya" <Suya.Huang@au.experian.com> writes:
> I've got a query as below, it runs several times with different execution plan and totally different execution time.
Theone using hash-join is slow and the one using semi-hash join is very fast. However, I have no control over the
optimizerbehavior of PostgreSQL database. Or, do I have? 

A salient feature of the slow plan is that the planner is misinformed about the size of test1:

>                      ->  Seq Scan on test1  (cost=0.00..5153.94
> rows=63294 width=516) (actual time=0.068..91.378 rows=441736 loops=1)

whereas in the fast plan its rows estimate for that scan is dead on.
It looks like the two cases also have different ideas of how many distinct values are in the test1.userid column,
thoughthis is more a guess than an indisputable fact. 

In short, I suspect you're recreating the test1 table and not bothering to ANALYZE it after you fill it.  This leaves
youat the mercy of when the autovacuum daemon gets around to analyzing the table before you'll get good plans for it. 

            regards, tom lane


pgsql-performance by date:

Previous
From: Geoff Hull
Date:
Subject: View has different query plan than select statement
Next
From: David Rowley
Date:
Subject: Re: View has different query plan than select statement