Re: Hash Join performance - Mailing list pgsql-performance

From Tom Lane
Subject Re: Hash Join performance
Date
Msg-id 18410.1236980078@sss.pgh.pa.us
Whole thread Raw
In response to Hash Join performance  (Vamsidhar Thummala <vamsi@cs.duke.edu>)
Responses Re: Hash Join performance  (Vamsidhar Thummala <vamsi@cs.duke.edu>)
List pgsql-performance
Vamsidhar Thummala <vamsi@cs.duke.edu> writes:
> 1) The actual time on Seq Scan on Lineitem shows that the first record is
> fetched at time 0.022ms and the last record is fetched at 53.5s. Does it
> mean the sequential scan is completed with-in first 53.4s (absolute time)?

No, it means that we spent a total of 53.5 seconds executing that plan
node and its children.  There's no direct way to determine how that was
interleaved with the execution of a peer plan node.  In the particular
case here, since that seqscan is the outer child of a hash join, you
can infer that all the time charged to the inner child (the Hash node
and its children) happened first, while we were building the hashtable,
which is then probed for each row of the outer relation.

> 2) Why is the Hash Join (top most) so slow?

Doesn't look that bad to me.  The net time charged to the HashJoin node
is 186107.210 - 53597.555 - 112439.592 = 20070.063 msec.  In addition it
would be reasonable to count the hashtable build time, which evidently
is 112439.592 - 111855.510 = 584.082 msec.  So the hashtable build took
about 10 msec/row, in addition to the data fetching; and then the actual
join spent about 3 microsec per outer row, again exclusive of obtaining
those rows.  The table build seems a bit slow, maybe, but I don't see a
problem with the join speed.

            regards, tom lane

pgsql-performance by date:

Previous
From: Vamsidhar Thummala
Date:
Subject: Hash Join performance
Next
From: Vamsidhar Thummala
Date:
Subject: Re: Hash Join performance