Home > mailing lists

Re: BUG #2930: Hash join abyssmal with many null fields. - Mailing list pgsql-bugs

From	Tom Lane
Subject	Re: BUG #2930: Hash join abyssmal with many null fields.
Date	January 26, 2007 13:22:35
Msg-id	25727.1169832146@sss.pgh.pa.us Whole thread Raw
In response to	BUG #2930: Hash join abyssmal with many null fields. ("Maciej Babinski" <maciej+postgres@apathy.killer-robot.net>)
Responses	Re: BUG #2930: Hash join abyssmal with many null fields.
List	pgsql-bugs

Tree view

Maciej Babinski <maciej@apathy.killer-robot.net> writes:
> Tom Lane wrote:
>> I see no bug here.  AFAICT your "much faster" query gets that way by
>> having eliminated all the candidate join rows on the B side.

> The additional clause eliminates no rows beyond what the existing
> clause would.  Any row eliminated by "b.join_id IS NOT NULL" could not
> possibly have satisfied "a.join_id = b.join_id".

Hmm.  You assume that the = operator is strict, which is probably true,
but the hash join code isn't assuming that.

It might be worth checking for the case, though.  What's happening,
since we go ahead and put the null rows into the hash table, is that
they all end up in the same hash chain because they all get hash code 0.
And then that very long chain gets searched for each null outer row.
If we knew the join operator is strict we could discard null rows
immediately on both sides.

> Please note that if the join columns are not null, but still produce
> no matches for the join, the results are fast without the need for an
> extra clause in the join:

Yeah, because the rows get reasonably well distributed into different
hash buckets.  The optimizer will avoid a hash if it sees the data is
not well-distributed, but IIRC it's not considering nulls when it
makes that decision.

            regards, tom lane

pgsql-bugs by date:

From: Tom Lane
Date: 26 January 2007, 01:49:59
Subject: Re: BUG #2930: Hash join abyssmal with many null fields.

From: "Michael Schmidt"
Date: 26 January 2007, 16:07:56
Subject: BUG #2931: Can't capture pg_dump Password prompt

Re: BUG #2930: Hash join abyssmal with many null fields. - Mailing list pgsql-bugs

Previous

Next