Re: [PERFORM] Hash Anti Join performance degradation - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [PERFORM] Hash Anti Join performance degradation
Date
Msg-id BANLkTim=Tp55eM1Ch6VMtkHT5Tr=Xvnq5g@mail.gmail.com
Whole thread Raw
In response to Re: [PERFORM] Hash Anti Join performance degradation  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PERFORM] Hash Anti Join performance degradation
List pgsql-hackers
On Wed, Jun 1, 2011 at 4:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Because of the way that a bitmap heap scan works, the rows are
> guaranteed to be loaded into the hash table in physical order, which
> means (in the fast case) that the row with the largest "id" value gets
> loaded last.  And because ExecHashTableInsert pushes each row onto the
> front of its hash chain, that row ends up at the front of the hash
> chain.  Net result: for all the outer rows that aren't the one with
> maximum id, we get a joinqual match at the very first entry in the hash
> chain.  Since it's an antijoin, we then reject that outer row and go
> on to the next.  The join thus ends up costing us only O(N) time.

Ah!  Make sense.  If I'm reading your explanation right, this means
that we could have hit a similar pathological case on a nestloop as
well, just with a data ordering that is the reverse of what we have
here?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PERFORM] Hash Anti Join performance degradation
Next
From: Tom Lane
Date:
Subject: Re: [BUGS] BUG #6034: pg_upgrade fails when it should not.