Re: hash join vs nested loop join - Mailing list pgsql-performance

From Huan Ruan
Subject Re: hash join vs nested loop join
Date
Msg-id CAD1stZv-CsoBSpH3Oimu-OrhXPAf6ouA-pYLjbE6obgNpVNX8A@mail.gmail.com
In response to Re: hash join vs nested loop join  ("Kevin Grittner" <kgrittn@mail.com>)
List pgsql-performance

Hi Kevin

On 13 December 2012 10:47, Kevin Grittner <kgrittn@mail.com> wrote:
Huan Ruan wrote:

> is a lot slower than a nested loop join.

Giving actual numbers is more useful than terms like "a lot". Even
better is to provide the output of EXPLAIN ANALYZE rather than
just EXPLAIN. This shows estimates against actual numbers, and gives
timings. For more suggestions see this page:

http://wiki.postgresql.org/wiki/SlowQueryQuestions
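(For reference, the plans below were presumably gathered with a statement of this shape; the exact query text isn't quoted in the thread, so the join is reconstructed here from the table and column names in the plan output.)

```sql
-- Reconstruction from the plan output; the actual query text is not in the thread.
-- The BUFFERS option adds shared-buffer hit/read counts to the timed output.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM invtran bigtable
JOIN im_match_table smalltable
  ON bigtable.invtranref = smalltable.invtranref;
```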

You are right. I realised my information wasn't accurate; I was a bit slack and cancelled the slower run before it finished. The full outputs are:

Hash 1st run
"QUERY PLAN"
"Hash Join  (cost=1681.87..6414169.04 rows=48261 width=171) (actual time=2182.450..88158.645 rows=48257 loops=1)"
"  Hash Cond: (bigtable.invtranref = smalltable.invtranref)"
"  Buffers: shared hit=3950 read=3046219"
"  ->  Seq Scan on invtran bigtable  (cost=0.00..4730787.28 rows=168121728 width=108) (actual time=0.051..32581.052 rows=168121657 loops=1)"
"        Buffers: shared hit=3351 read=3046219"
"  ->  Hash  (cost=1078.61..1078.61 rows=48261 width=63) (actual time=21.751..21.751 rows=48261 loops=1)"
"        Buckets: 8192  Batches: 1  Memory Usage: 4808kB"
"        Buffers: shared hit=596"
"        ->  Seq Scan on im_match_table smalltable  (cost=0.00..1078.61 rows=48261 width=63) (actual time=0.007..8.299 rows=48261 loops=1)"
"              Buffers: shared hit=596"
"Total runtime: 88162.417 ms"
Hash 2nd run (after disconnect and reconnect)

"QUERY PLAN"
"Hash Join  (cost=1681.87..6414169.04 rows=48261 width=171) (actual time=2280.390..87934.540 rows=48257 loops=1)"
"  Hash Cond: (bigtable.invtranref = smalltable.invtranref)"
"  Buffers: shared hit=3982 read=3046187"
"  ->  Seq Scan on invtran bigtable  (cost=0.00..4730787.28 rows=168121728 width=108) (actual time=0.052..32747.805 rows=168121657 loops=1)"
"        Buffers: shared hit=3383 read=3046187"
"  ->  Hash  (cost=1078.61..1078.61 rows=48261 width=63) (actual time=62.161..62.161 rows=48261 loops=1)"
"        Buckets: 8192  Batches: 1  Memory Usage: 4808kB"
"        Buffers: shared hit=596"
"        ->  Seq Scan on im_match_table smalltable  (cost=0.00..1078.61 rows=48261 width=63) (actual time=0.006..8.209 rows=48261 loops=1)"
"              Buffers: shared hit=596"
"Total runtime: 87938.584 ms"
NL 1st run
"QUERY PLAN"
"Nested Loop  (cost=0.00..6451637.88 rows=48261 width=171) (actual time=0.056..551.438 rows=48257 loops=1)"
"  Buffers: shared hit=242267"
"  ->  Seq Scan on im_match_table smalltable  (cost=0.00..1078.61 rows=48261 width=63) (actual time=0.009..7.353 rows=48261 loops=1)"
"        Buffers: shared hit=596"
"  ->  Index Scan using pk_invtran on invtran bigtable  (cost=0.00..133.65 rows=1 width=108) (actual time=0.010..0.010 rows=1 loops=48261)"
"        Index Cond: (invtranref = smalltable.invtranref)"
"        Buffers: shared hit=241671"
"Total runtime: 555.336 ms"
NL 2nd run (after disconnect and reconnect)
"QUERY PLAN"
"Nested Loop  (cost=0.00..6451637.88 rows=48261 width=171) (actual time=0.058..554.215 rows=48257 loops=1)"
"  Buffers: shared hit=242267"
"  ->  Seq Scan on im_match_table smalltable  (cost=0.00..1078.61 rows=48261 width=63) (actual time=0.009..7.416 rows=48261 loops=1)"
"        Buffers: shared hit=596"
"  ->  Index Scan using pk_invtran on invtran bigtable  (cost=0.00..133.65 rows=1 width=108) (actual time=0.010..0.010 rows=1 loops=48261)"
"        Index Cond: (invtranref = smalltable.invtranref)"
"        Buffers: shared hit=241671"
"Total runtime: 558.095 ms"
 


> I don't understand why the optimiser chooses the hash join in
> favor of the nested loop. What can I do to get the optimiser to
> make a better decision (nested loop in this case)? I have run
> analyze on both tables.

> Config changes are
>
>  - shared_buffers = 6GB
>  - effective_cache_size = 18GB
>  - work_mem = 10MB
>  - maintenance_work_mem = 3GB

As already suggested, there was a change made in 9.2 which may have
over-penalized nested loops using index scans. This may be fixed in
the next minor release.

Will keep this in mind. 
 

Also, as already suggested, you may want to reduce random_page_cost
to bring it in line with the actual cost relative to
seq_page_cost based on your cache hit ratio.
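(A rough way to apply that advice, assuming for illustration that cached reads cost approximately nothing and that 90% of random page reads are served from cache:

    effective_random_cost ~= (1 - 0.90) * 4.0 ~= 0.4

i.e. with a high cache hit ratio the true cost of a "random" page is far below the default random_page_cost of 4, and can even approach seq_page_cost. The 90% figure is purely illustrative; the buffers lines in the plans above give the real hit/read numbers for this workload.)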

Additionally, I just routinely set cpu_tuple_cost higher than the
default of 0.01. I find that 0.03 to 0.05 better models the actual
relative cost of processing a tuple.

I originally reduced random_page_cost to 2 to get the nested loop join. Now, with cpu_tuple_cost set to 0.05 and random_page_cost reset to the default of 4, I also get a nested loop join.

I'm still new to Postgres, but I'm worried that a random_page_cost of 2 is too low, so perhaps increasing cpu_tuple_cost is the better choice. All this tuning probably also depends on the possible fix mentioned above. Can you see any obvious issues with the other memory settings I changed?
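(The experiment described above can be run per session with SET, so planner cost changes can be tried before editing postgresql.conf; the query shape is a stand-in reconstructed from the plan output.)

```sql
-- Hypothetical session-level experiment: SET affects only this connection.
SET cpu_tuple_cost = 0.05;   -- raised from the 0.01 default
SET random_page_cost = 4;    -- left at the default
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM invtran bigtable
JOIN im_match_table smalltable
  ON bigtable.invtranref = smalltable.invtranref;
```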

Thanks for your help.

Cheers
Huan


-Kevin
