Re: [PERFORM] Hash Anti Join performance degradation - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: [PERFORM] Hash Anti Join performance degradation
Date	June 1, 2011 11:40:36
Msg-id	BANLkTim-DqDC2AbVJ_1t-XAS4NYq2tQYZg@mail.gmail.com
In response to	Re: [PERFORM] Hash Anti Join performance degradation (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [PERFORM] Hash Anti Join performance degradation Re: [PERFORM] Hash Anti Join performance degradation
List	pgsql-hackers

On Tue, May 31, 2011 at 11:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> With respect to the root of the issue (why does the anti-join take so
>> long?), my first thought was that perhaps the OP was very unlucky and
>> had a lot of values that hashed to the same bucket.  But that doesn't
>> appear to be the case.
>
> Well, yes it is.  Notice what the subquery is doing: for each row in
> "box", it's pulling all matching "box_id"s from message and running a
> self-join across those rows.  The hash join condition is a complete
> no-op.  And some of the box_ids have hundreds of thousands of rows.
>
> I'd just write it off as being a particularly stupid way to find the
> max(), except I'm not sure why deleting just a few thousand rows
> improves things so much.  It looks like it ought to be an O(N^2)
> situation, so the improvement should be noticeable but not amazing.

Yeah, this is what I was getting at, though perhaps I didn't say it
well.  If the last 78K rows were particularly pathological in some
way, that might explain something, but as far as one can see they are
not a whole heck of a lot different from the rest of the data.
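
To put a rough number on the "noticeable but not amazing" expectation,
here is a minimal sketch of the O(N^2) argument. It is an illustration
only, not the OP's query or the executor's code: the function names and
row counts are hypothetical, and it counts the worst case in which every
probe walks the whole bucket. A real anti-join stops probing at the first
match, so the actual cost also depends on the order of rows in the bucket.

```python
def anti_join_max(ids):
    """Find the max of ids the anti-join way: keep each row for which no
    row with a larger id exists. All rows are assumed to share one box_id,
    so the hash condition filters nothing and every probe sees every row.
    Returns (survivors, comparisons), counting the worst case."""
    comparisons = 0
    survivors = []
    for outer in ids:
        has_greater = False
        for inner in ids:
            comparisons += 1          # one evaluation of "inner.id > outer.id"
            if inner > outer:
                has_greater = True    # worst case: keep scanning, no early exit
        if not has_greater:
            survivors.append(outer)
    return survivors, comparisons


def work_ratio(n_before, n_deleted):
    """Worst-case work after deleting n_deleted of n_before rows from one
    box_id, as a fraction of the work before the delete."""
    n_after = n_before - n_deleted
    return (n_after * n_after) / (n_before * n_before)


if __name__ == "__main__":
    survivors, comparisons = anti_join_max(list(range(10)))
    print(survivors, comparisons)
    print(work_ratio(100, 10))
```

Under these assumptions, deleting a fraction f of the rows for a box_id
leaves (1 - f)^2 of the work, so a small delete should buy only a modest
speedup, which is why the reported improvement looks out of proportion.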

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
