Re: Hash Join performance - Mailing list pgsql-performance

From Tom Lane
Subject Re: Hash Join performance
Msg-id 22599.1236985691@sss.pgh.pa.us
In response to Re: Hash Join performance  (Vamsidhar Thummala <vamsi@cs.duke.edu>)
Responses Re: Hash Join performance  (Vamsidhar Thummala <vamsi@cs.duke.edu>)
List pgsql-performance

Vamsidhar Thummala <vamsi@cs.duke.edu> writes:
> I am wondering why are we subtracting the entire Seq Scan time of Lineitem
> from the total time to calculate the HashJoin time.

Well, if you're trying to identify the speed of the join itself and not
how long it takes to provide the input for it, that seems like a
sensible calculation to make.

> Here is another plan I have for the same TPC-H 18 query with different
> configuration parameters (shared_buffers set to 400MB, just for experimental
> purposes) and HashJoin seems to take longer time (at least 155.58s based on
> above calculation):

Yeah, that seems to work out to about 25us per row instead of 3us, which
is a lot slower.  Maybe the hash got split up into multiple batches ...
what have you got work_mem set to?  Try turning on log_temp_files and
see if it records any temp files as getting created.

            regards, tom lane
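
The arithmetic discussed in the reply can be sketched as follows. This is only an illustration, not something from the thread: the function names are hypothetical, and the row count is an assumption (the Lineitem size at TPC-H scale factor 1), since the thread does not state the scale factor. Only the 155.58s figure comes from the quoted message.

```python
def join_only_seconds(total_seconds: float, input_seconds: float) -> float:
    """Time attributed to the join itself: total node time minus the
    time spent producing its input (e.g. the Seq Scan feeding it)."""
    return total_seconds - input_seconds


def per_row_microseconds(join_seconds: float, rows: int) -> float:
    """Average join cost per input row, in microseconds."""
    return join_seconds / rows * 1_000_000


# ASSUMPTION: Lineitem row count at TPC-H scale factor 1; the thread
# does not say which scale factor was used.
ASSUMED_LINEITEM_ROWS = 6_001_215

# 155.58 s is the join time quoted in the thread for the slower plan.
slow_join_us = per_row_microseconds(155.58, ASSUMED_LINEITEM_ROWS)
print(f"{slow_join_us:.1f} us per row")
```

Under that assumed row count the result lands close to the roughly 25us per row mentioned above; a different scale factor would change the figure proportionally.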
