Hash Join node sometimes slow - Mailing list pgsql-performance

From Dave Roberge
Subject Hash Join node sometimes slow
Date
Msg-id loom.20140702T144820-187@post.gmane.org
Whole thread Raw
Responses Re: Hash Join node sometimes slow
List pgsql-performance
Hi, I'm in the process of attempting to tune some slow queries. I came
across a scenario where I'm not entirely sure how I
might figure out why a node is taking awhile to process. I'm not concerned
with the query itself, we are working to figure
out how we can make it faster. But I was hoping someone might be able to
provide some insight into why a hash join is
sometimes slow.

For example, running explain (analyze, buffers) with the query, 4/5 times we
will see the following:

->  Hash Join  (cost=16385.76..103974.09 rows=523954 width=64) (actual
time=532.634..4018.678 rows=258648 loops=1)
      Hash Cond: (p.a = c.c)
      Buffers: shared hit=4 read=29147, temp read=12943 written=12923
      ->  Seq Scan on p (cost=0.00..38496.88 rows=1503188 width=60) (actual
time=0.013..1388.205 rows=1503188 loops=1)
            Buffers: shared hit=1 read=23464
      ->  Hash  (cost=15382.47..15382.47 rows=57703 width=12) (actual
time=527.237..527.237 rows=57789 loops=1)
            Buckets: 4096  Batches: 4  Memory Usage: 632kB
            Buffers: shared hit=3 read=5683, temp read=617 written=771

The other times, we will see something like this:

->  Hash Join  (cost=16385.76..103974.09 rows=523954 width=64) (actual
time=587.277..15208.621 rows=258648 loops=1)
      Hash Cond: (p.a = c.c)
      Buffers: shared hit=26 read=29125, temp read=12943 written=12923
      ->  Seq Scan on p  (cost=0.00..38496.88 rows=1503188 width=60) (actual
time=0.013..1525.608 rows=1503188 loops=1)
            Buffers: shared hit=22 read=23443
      ->  Hash  (cost=15382.47..15382.47 rows=57703 width=12) (actual
time=581.638..581.638 rows=57789 loops=1)
            Buckets: 4096  Batches: 4  Memory Usage: 632kB
            Buffers: shared hit=4 read=5682, temp read=617 written=771

Does anyone have ideas on what might be causing the difference in timing for
the hash join node?

Thanks

pgsql-performance by date:

Previous
From: Marc Mamin
Date:
Subject: fragmention issue with ext4: e4defrag?
Next
From: Tom Lane
Date:
Subject: Re: Hash Join node sometimes slow