Home > mailing lists

Re: Why hash join instead of nested loop? - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Why hash join instead of nested loop?
Date	August 9, 2005 14:13:36
Msg-id	17253.1123607579@sss.pgh.pa.us Whole thread Raw
In response to	Re: Why hash join instead of nested loop? (Rhett Garber <rhettg@gmail.com>)
Responses	Re: Why hash join instead of nested loop?
List	pgsql-performance

Tree view

Rhett Garber <rhettg@gmail.com> writes:
> Duplicated your setup in a separate DB.
> At least its reproducable for me.....

Hmm.  Well, we now have several data points but they seem to be on
wildly varying hardware.  To try to normalize the results a little,
I computed the total actual time for the hash plan divided by the sum
of the actual times for the two scan nodes.  Thus, for your example:

> Hash Join  (cost=4.83..5.91 rows=1 width=14) (actual time=7.148..7.159
> rows=1 loops=1)
>    Hash Cond: ("outer".id = "inner".obj2)
>    ->  Seq Scan on rtmessagestate  (cost=0.00..1.05 rows=5 width=14)
> (actual time=0.007..0.015 rows=5 loops=1)
>    ->  Hash  (cost=4.83..4.83 rows=1 width=4) (actual
> time=0.055..0.055 rows=0 loops=1)
>          ->  Index Scan using connection_regid_obj1_index on
> connection  (cost=0.00..4.83 rows=1 width=4) (actual time=0.028..0.032
> rows=1 loops=1)
>                Index Cond: ((connection_registry_id = 40105) AND (obj1
> = 73582)) Total runtime: 7.693 ms
> (7 rows)

this would be 7.159 / (0.015 + 0.032).  This is probably not an
enormously robust statistic but it at least focuses attention in the
right place.  Here's what I get (rounded off to 4 digits which is surely
as much precision as we have in the numbers):

 Tom 7.4.8+      1.619
 Ian 7.4.8      6.000
 Ian 7.4.2     13.95
 Steinar 7.4.7      8.833
 Rhett orig    108.3
 Rhett test    152.3
 Michael 7.4.1      2.015

My number seems to be a bit of an outlier to the low side, but yours are
way the heck to the high side.  And Michael's test seems to rule out the
idea that it's something broken in 7.4.1 in particular.

I'm now thinking you've got either a platform- or compiler-specific
problem.  Exactly what is the hardware (the CPU not the disks)?  How did
you build or come by the Postgres executables (compiler, configure
options, etc)?

            regards, tom lane

pgsql-performance by date:

From: Rhett Garber
Date: 09 August 2005, 13:51:57
Subject: Re: Why hash join instead of nested loop?

From: Dan Harris
Date: 09 August 2005, 15:02:03
Subject: Table locking problems?

Re: Why hash join instead of nested loop? - Mailing list pgsql-performance

Previous

Next