Thread: Re: [PERFORM] 7.4 vs 7.3 ( hash join issue )

From: Greg Stark
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Yeah, I was just looking at doing that.

Well, I imagine it takes you as long to read my patch as it would to write it
yourself. But anyway, it's still useful to me as an exercise.

> It would also be interesting to prefetch one row from the outer table and fall
> out immediately (without building the hash table) if the outer table is
> empty.  This seems to require some contortion of the code though :-(

Why is it any more complicated than just moving the hash build down lower?
There's one small special case needed in ExecHashJoinOuterGetTuple but it's
pretty non-intrusive.
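
To make the control flow concrete, here is a rough sketch of the idea in Python
rather than the executor's C. It is a toy model, not the patch: the names
lazy_hash_join, key and stats are invented for illustration, and it ignores
batching entirely. The one "special case" is that the prefetched outer row has
to be handed back before the rest of the outer input is read.

```python
from itertools import chain


def lazy_hash_join(outer, inner, key, stats):
    """Toy hash join that delays the hash build until the outer side
    has produced at least one row. Hypothetical illustration only."""
    stats["hash_built"] = False
    outer_iter = iter(outer)

    # Prefetch one outer row. If there is none, fall out immediately
    # without ever scanning the inner input.
    try:
        first = next(outer_iter)
    except StopIteration:
        return

    # Only now pay for the hash build over the inner input.
    table = {}
    for row in inner:
        table.setdefault(key(row), []).append(row)
    stats["hash_built"] = True

    # Special case: replay the prefetched row before the remaining ones.
    for outer_row in chain([first], outer_iter):
        for inner_row in table.get(key(outer_row), ()):
            yield (outer_row, inner_row)
```

With an empty outer input the generator finishes with stats["hash_built"] still
False, which is the toy equivalent of the "never executed" Hash node in the plan
output.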

It seems to work for me but I can't test multiple batches easily. I think I've
convinced myself that they would work fine but...

test=# explain analyze select * from a natural join b;
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Hash Join  (cost=22.50..345.00 rows=5000 width=4) (actual time=0.005..0.005 rows=0 loops=1)
   Hash Cond: ("outer".a = "inner".a)
   ->  Seq Scan on a  (cost=0.00..20.00 rows=1000 width=4) (actual time=0.002..0.002 rows=0 loops=1)
   ->  Hash  (cost=20.00..20.00 rows=1000 width=4) (never executed)
         ->  Seq Scan on b  (cost=0.00..20.00 rows=1000 width=4) (never executed)
 Total runtime: 0.070 ms
(6 rows)



--
greg

Re: [PERFORM] 7.4 vs 7.3 ( hash join issue )

From: Tom Lane
Greg Stark <gsstark@mit.edu> writes:
>> It would also be interesting to prefetch one row from the outer table and fall
>> out immediately (without building the hash table) if the outer table is
>> empty.  This seems to require some contortion of the code though :-(

> Why is it any more complicated than just moving the hash build down lower?

Having to inject the consideration into ExecHashJoinOuterGetTuple seems
messy to me.

On reflection I'm not sure it would be a win anyway, for a couple of reasons.
(1) Assuming that the planner has gotten things right and put the larger
relation on the outside, the case of an empty outer relation and a
nonempty inner one should rarely arise.
(2) Doing this would lose some of the benefit from the optimization to
detect an empty inner relation.  If the outer subplan is a slow-start
one (such as another hashjoin), it would lose a lot of the benefit :-(
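
The trade-off in (2) can be illustrated with a toy cost model. This is only a
sketch under made-up assumptions: the function cost_to_detect_empty, the
"startup" and "empty" fields, and all the numbers are hypothetical and have
nothing to do with the planner's real cost units. It just counts how much
startup work is paid before the join can conclude that it returns no rows,
depending on which input is looked at first.

```python
def cost_to_detect_empty(order, outer, inner):
    """Return the startup cost paid before an empty input is found,
    probing the inputs in the given order, or None if neither is empty.
    Hypothetical model: each input is a dict with 'startup' and 'empty'."""
    paid = 0
    for side in order:
        rel = outer if side == "outer" else inner
        paid += rel["startup"]  # cost to get the first row (or learn there is none)
        if rel["empty"]:
            return paid
    return None


# Slow-start outer (for example another hash join) and an empty inner.
slow_outer = {"startup": 100, "empty": False}
empty_inner = {"startup": 1, "empty": True}

inner_first = cost_to_detect_empty(("inner", "outer"), slow_outer, empty_inner)
outer_first = cost_to_detect_empty(("outer", "inner"), slow_outer, empty_inner)
```

In this model, looking at the inner side first pays only the inner startup cost,
while prefetching from the outer side first pays the slow outer startup as well
before the empty inner relation is noticed.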

            regards, tom lane