Re: BUG #5084: Query gives different number of rows depending on ORDER BY - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #5084: Query gives different number of rows depending on ORDER BY
Date
Msg-id 13692.1254168177@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #5084: Query gives different number of rows depending on ORDER BY  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #5084: Query gives different number of rows depending on ORDER BY  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-bugs
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> Notice that the (k = k) qual is being dropped somewhere, which changes
>> the output since that's a disguised not-null condition.

> Huh ... I'm surprised it only does that with the ORDER BY present.
> I suppose it's got something to do with the equivalence-class machinery.

After digging into it, I find that:

1. Without ORDER BY, process_equivalence generates an equivalence class
that lists k twice.  This is pretty bogus but it happens to produce the
desired results in the example at hand.  (In some other cases you'll get
redundant clauses out, because the eclass machinery isn't expecting
this.)

2. With ORDER BY k, the code first creates a single-element equivalence
class containing k, because it needs that to represent the desired
pathkey.  Then, process_equivalence finds that both sides of the k = k
clause are already known to be in the same eclass, so it concludes that
this is redundant information.

I'm inclined to think that the best solution is to have
process_equivalence just reject any clauses that have equal() left and
right sides, ie, throw them back to be processed as ordinary
non-equivalence clauses.  The only case I can think of where this might
be less than ideal is if you have "k = k AND k = x"; if both operators
are strict then the k = k test is indeed redundant and could be
discarded, but it won't be.  But it doesn't seem like that's going to
come up enough to be worth stressing about.  If we wanted to be smart
about it we'd have to have two kinds of single-element equivalence
classes (one that implies a k = k check is needed, and one that does
not).  It doesn't seem worth the complication.

            regards, tom lane

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #5084: Query gives different number of rows depending on ORDER BY
Next
From: Tim Bunce
Date:
Subject: Re: BUG #5066: plperl issues with perl_destruct() and END blocks