Re: Suspicious call of initial_cost_hashjoin() - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Suspicious call of initial_cost_hashjoin()
Msg-id CAEepm=3sB=BXFGxwhUKVQED=aF61S95NS05ek+be1c6P5NFDCQ@mail.gmail.com
In response to Suspicious call of initial_cost_hashjoin()  (Antonin Houska <ah@cybertec.at>)
Responses Re: Re: Suspicious call of initial_cost_hashjoin()  (David Steele <david@pgmasters.net>)
List pgsql-hackers
On Fri, Dec 22, 2017 at 10:45 PM, Antonin Houska <ah@cybertec.at> wrote:
> try_partial_hashjoin_path() passes constant true for the parallel_hash
> argument of initial_cost_hashjoin(). Shouldn't it instead pass the
> parallel_hash argument that it receives?

Thanks.  Yeah.  When initial_cost_hashjoin() calls
get_parallel_divisor() on a non-partial inner path I think it would
return 1.0, so no damage was done there.  But when
ExecChooseHashTableSize() receives try_combined_work_mem == true it
might underestimate the number of batches required for a partial hash
join without parallel hash, because it would incorrectly assume that a
single-batch join could use the combined work_mem budget.  This was
quite well hidden because ExecHashTableCreate() calls
ExecChooseHashTableSize() again (rather than reusing the results from
planning time), so the bad nbatch estimate doesn't show up anywhere.
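
To illustrate the underestimate, here is a toy model of the batch
calculation.  It is only a sketch: toy_choose_nbatch() and its
parameters are hypothetical names made up for this example, not the
real ExecChooseHashTableSize() signature, and the sizing logic is
heavily simplified.  The assumption is that with
try_combined_work_mem the budget is multiplied by the number of
participants for the single-batch test, and otherwise each process
gets only its own work_mem.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the planner's batch estimate.
 * Not PostgreSQL code: sizes are plain byte counts and all per-tuple
 * and bucket overhead is ignored.
 */
static int
toy_choose_nbatch(double inner_bytes, double work_mem_bytes,
                  bool try_combined_work_mem, int parallel_workers)
{
    double      budget = work_mem_bytes;
    int         nbatch = 1;

    assert(work_mem_bytes > 0 && parallel_workers >= 0);

    /* Assumed: a shared hash table may pool every participant's budget. */
    if (try_combined_work_mem)
        budget += work_mem_bytes * parallel_workers;

    /* Single-batch case: the whole inner side fits in the budget. */
    if (inner_bytes <= budget)
        return 1;

    /*
     * Otherwise plan for multiple batches using the per-process budget,
     * doubling so that nbatch stays a power of two.
     */
    while (inner_bytes / nbatch > work_mem_bytes)
        nbatch *= 2;

    return nbatch;
}
```

With a 10MB inner side, 4MB of work_mem and 2 workers, this model
returns 1 batch when try_combined_work_mem is true but 4 batches when
it is false.  A partial hash join without parallel hash builds a
private hash table in each process, so the second figure is the one
the planner should have been costing.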

-- 
Thomas Munro
http://www.enterprisedb.com
