Home > mailing lists

Re: RV: bad result in a query!! :-( - Mailing list pgsql-general

From	Tom Lane
Subject	Re: RV: bad result in a query!! :-(
Date	October 15, 2002 10:41:26
Msg-id	14922.1034692884@sss.pgh.pa.us Whole thread
In response to	Re: RV: bad result in a query!! :-( (Richard Huxton <dev@archonet.com>)
Responses	Re: RV: bad result in a query!! hopeless
List	pgsql-general

Tree view

Richard Huxton <dev@archonet.com> writes:
> On Tuesday 15 Oct 2002 12:12 pm, Nigel J. Andrews wrote:
>> On Tue, 15 Oct 2002, Jose Antonio Leo wrote:
>>> ->  Sort  (cost=10821.77..10821.77 rows=1485
> width=118) (actual time=16453.64..199329.55 rows=49801240 loops=1)
>>
>> What is this all about, the seqscan only returns 75918 rows?

> Yep - very strange. I'm not sure where the 4 million comes from - I can't see
> any relationship with the 75918.

I think what is happening is that there are many equal keys in the
relations being joined.  If you think about how a mergejoin works,
it has to back up and rescan a segment of the inner relation each
time it advances to a new outer tuple that has a key matching the
prior key.  I believe that the EXPLAIN ANALYZE machinery counts each
row fetched from the inner relation afresh, even if it's a re-fetch
of a row already fetched.

There is not currently any code in the planner to try to account
for this effect; if there were, it might choose a different plan.
(Not that I'm sure a hash join would be much better.)

Jose, how many distinct cod_ae1 values have you actually got in
each table?  Can you use additional join conditions (perhaps
cod_ae2, cod_ae3) to improve the specificity of the match between
the tables?

            regards, tom lane

pgsql-general by date:

From: Richard Huxton
Date: 15 October 2002, 09:30:11
Subject: Re: RV: bad result in a query!! :-(

From: Josh Burdick
Date: 15 October 2002, 10:46:01
Subject: Re: question about executing JOINs

Re: RV: bad result in a query!! :-( - Mailing list pgsql-general

Previous

Next