Re: Hash Join cost estimates - Mailing list pgsql-hackers

From	Jeff Davis
Subject	Re: Hash Join cost estimates
Date	March 29, 2013 20:23:31
Msg-id	1364588591.1187.46.camel@sussancws0025
In response to	Hash Join cost estimates (Stephen Frost <sfrost@snowman.net>)
Responses	Re: Hash Join cost estimates Re: Hash Join cost estimates
List	pgsql-hackers

On Thu, 2013-03-28 at 19:56 -0400, Stephen Frost wrote:
>   41K hashed, seqscan 4M: 115030.10 + 1229.46 = 116259.56
>   4M hashed, seqscan 41K: 1229.46 + 211156.20 = 212385.66

I think those are backwards -- typo?

>   In the end, I think the problem here is that we are charging far too
>   much for these bucket costs (particularly when we're getting them so
>   far wrong) and not nearly enough for the cost of building the hash
>   table in the first place.
> 
>   Thoughts?  Ideas about how we can 'fix' this?  Have others run into
>   similar issues?

Yes, I have run into this issue (or something very similar). I don't
understand why the bucketsize even matters much -- assuming few hash
collisions, we are not actually evaluating the quals any more times than
necessary. So why all of the hashjoin-specific logic in determining the
number of qual evaluations? The only reason I can think of is to model
the cost of comparing the hashes themselves.
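
To make that concrete, here is a toy sketch (not PostgreSQL code; the function name probe_counts and the chained-bucket layout are made up for illustration). It assumes each stored entry keeps its full hash value, so the join qual is only evaluated when the full hashes match, and it counts hash comparisons and qual evaluations separately:

```python
from collections import defaultdict


def probe_counts(inner_keys, outer_keys, nbuckets):
    """Probe a toy chained hash table and count the work done.

    Returns (hash_comparisons, qual_evaluations, matches).
    Assumption: every entry stores its full hash value, and the qual
    is evaluated only when the stored hash equals the probe's hash.
    """
    buckets = defaultdict(list)
    for key in inner_keys:
        h = hash(key)
        buckets[h % nbuckets].append((h, key))

    hash_comparisons = 0
    qual_evaluations = 0
    matches = 0
    for key in outer_keys:
        h = hash(key)
        # Walk the whole bucket chain: one hash comparison per entry.
        for stored_hash, stored_key in buckets.get(h % nbuckets, ()):
            hash_comparisons += 1
            if stored_hash != h:
                continue  # different hash: the qual is never evaluated
            qual_evaluations += 1
            if stored_key == key:
                matches += 1
    return hash_comparisons, qual_evaluations, matches


if __name__ == "__main__":
    inner = range(1000)
    outer = range(1000)
    for nbuckets in (10, 1000):
        print(nbuckets, probe_counts(inner, outer, nbuckets))
```

In this toy model, shrinking the number of buckets lengthens the chains and raises the hash-comparison count, but the qual-evaluation count stays the same as long as distinct keys get distinct hashes. Whether that matches what the executor really does per bucket entry is exactly the assumption to check.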

Also, searching the archives turns up at least one other report, though I
think I've seen more:

http://www.postgresql.org/message-id/A82128A6-4E3B-43BD-858D-21B129F7BEEB@richrelevance.com

Regards,
	Jeff Davis
