Re: Hash Join cost estimates - Mailing list pgsql-hackers

From	ktm@rice.edu
Subject	Re: Hash Join cost estimates
Date	April 5, 2013 00:11:24
Msg-id	20130404211113.GN32580@aart.rice.edu
In response to	Re: Hash Join cost estimates (Stephen Frost <sfrost@snowman.net>)
List	pgsql-hackers

On Thu, Apr 04, 2013 at 04:16:12PM -0400, Stephen Frost wrote:
> * Stephen Frost (sfrost@snowman.net) wrote:
> > It does look like reducing bucket depth, as I outlined before through
> > the use of a 2-level hashing system, might help speed up
> > ExecScanHashBucket, as it would hopefully have very few (eg: 1-2)
> > entries to consider instead of more.  Along those same lines, I really
> > wonder if we're being too generous wrt the bucket-depth goal of '10'
> > instead of, say, '1', especially when we've got plenty of work_mem
> > available.
> 
> Rerunning using a minimally configured build (only --enable-openssl
> and --enable-debug passed to configure) with NTUP_PER_BUCKET set to '1'
> results in a couple of interesting things-
> 
> First, the planner actually picks the plan to hash the small table and
> seqscan the big one.  That also, finally, turns out to be *faster* for
> this test case.
> 
> ...
> 
> I'm certainly curious about those, but I'm also very interested in the
> possibility of making NTUP_PER_BUCKET much smaller, or perhaps variable
> depending on the work_mem setting.  It's only used in
> ExecChooseHashTableSize, so while making it variable or depending on
> work_mem could slow planning down a bit, it's not a per-tuple cost item.
> 
+1 for adjusting this based on the work_mem value.

Ken
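
To make the idea in the quoted text concrete, here is a rough sketch of what a work_mem-dependent bucket depth could look like. This is not the PostgreSQL code in ExecChooseHashTableSize; the function names, the size estimate, and the all-or-nothing rule (depth 1 if everything fits in work_mem, otherwise the current goal of 10) are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round n up to the next power of two. */
static size_t
next_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical policy: choose the target number of tuples per bucket.
 * Assumption: if the tuples plus one bucket pointer per tuple fit in
 * work_mem, aim for a depth of 1; otherwise keep the depth goal of 10
 * mentioned in the thread.
 */
static int
choose_ntup_per_bucket(double ntuples, size_t tuple_width,
                       size_t work_mem_bytes)
{
    double needed = ntuples * (double) (tuple_width + sizeof(void *));

    return (needed <= (double) work_mem_bytes) ? 1 : 10;
}

/*
 * Derive a bucket count from the estimated tuple count and the chosen
 * depth. The result is a power of two so a hash value can be masked
 * rather than divided.
 */
static size_t
choose_nbuckets(double ntuples, int ntup_per_bucket)
{
    size_t want;

    assert(ntup_per_bucket > 0);
    want = (size_t) (ntuples / ntup_per_bucket) + 1;
    return next_pow2(want);
}
```

With these assumptions, 1000 estimated tuples get 128 buckets at a depth goal of 10 and 1024 buckets at a depth goal of 1. The choice runs once per hash table, at sizing time, which matches the point above that this is not a per-tuple cost.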