Re: bad performances using hashjoin - Mailing list pgsql-performance

From David Brown
Subject Re: bad performances using hashjoin
Date
Msg-id 42194F00.3030704@bigpond.net.au
Whole thread Raw
In response to Re: bad performances using hashjoin  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: bad performances using hashjoin
List pgsql-performance
Tom Lane wrote:

>However: the reason the second plan wins is because there are zero rows
>fetched from sat_request, and so the bulk of the plan is never executed
>at all.  I doubt the second plan would win if there were any matching
>sat_request rows.
>
That's what I thought at first, but if you look more closely, that's
having very little impact on either the cost or actual time:

->  Index Scan using idx_id_url_sat_request on sat_request sr  (cost=0.00..3.04 rows=1 width=8) (actual
time=0.031..0.031rows=0 loops=1) 


The real problem appears to be here:

->  Hash Left Join  (cost=16.14..89.16 rows=1196 width=159) (actual time=3.504..809.262 rows=1203 loops=1)


As Gaetano points out in his follow-up post, the problem still exists
after he removed the sorts:

->  Hash Left Join  (cost=16.14..80.19 rows=1196 width=4) (actual time=7.291..13.620 rows=1203 loops=1)


The planner is not breaking up the outer join in his v_packages view:
  SELECT *
  FROM packages p LEFT OUTER JOIN package_security ps USING (id_package)

It's not being selective at all with packages, despite id_package being
the link to sat_request.

If this is too complex for the planner, could he re-arrange his outer
join so that's it's processed later? If he puts it in his actual query,
for instance, will the planner flatten it out anyway?

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: bad performances using hashjoin
Next
From: Christopher Browne
Date:
Subject: Re: Effects of IDLE processes