Re: Question about sorting internals - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Question about sorting internals
Msg-id 15974.1386775807@sss.pgh.pa.us
In response to Question about sorting internals  (hubert depesz lubaczewski <depesz@depesz.com>)
List pgsql-hackers
hubert depesz lubaczewski <depesz@depesz.com> writes:
> There are two simple queries: ...
> They differ only in order of queries in union all part.
> The thing is that they return the same result. Why isn't one of them returning
> "2005" for 6th "miesiac"?

With such a small amount of data, you're getting an in-memory quicksort,
and a well-known property of quicksort is that it isn't stable --- that
is, there are no guarantees about the order in which it will return items
that have equal keys.  In this case it's evidently making different
partitioning choices, as a consequence of the different arrival order of
the rows, that just by chance end up with the 6/2004/6 row being returned
before the 6/2005/6 row in both cases.  You could trace through the logic
and see exactly how that's happening, but I doubt it'd be a very edifying
exercise.
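
To make the point concrete, here is a minimal sketch using a textbook
quicksort (Lomuto partition, last element as pivot).  This is not
PostgreSQL's sort code, which uses a different and more elaborate variant;
the row layout (miesiac, year) and the way the rows are split between the
two UNION ALL branches are made up for illustration.  It only shows that
a non-stable sort can return tied rows in the same order for two
different arrival orders.

```python
def quicksort(rows, key, lo=0, hi=None):
    """In-place textbook quicksort (Lomuto partition, last element as pivot).

    Illustrative only: this is NOT the algorithm variant PostgreSQL uses.
    """
    if hi is None:
        hi = len(rows) - 1
    if lo >= hi:
        return
    pivot = key(rows[hi])
    i = lo
    for j in range(lo, hi):
        # Only strictly smaller keys move left; rows with equal keys
        # end up wherever the swaps happen to put them.
        if key(rows[j]) < pivot:
            rows[i], rows[j] = rows[j], rows[i]
            i += 1
    rows[i], rows[hi] = rows[hi], rows[i]
    quicksort(rows, key, lo, i - 1)
    quicksort(rows, key, i + 1, hi)


def sort_by_miesiac(rows):
    """Return a copy of rows sorted on the first column only."""
    out = list(rows)
    quicksort(out, key=lambda r: r[0])
    return out


# Hypothetical rows: (miesiac, year).  The two lists stand in for the two
# branches of a UNION ALL, concatenated in either order.
first_branch = [(6, 2004)]
second_branch = [(6, 2005), (5, 2004)]

order_a = sort_by_miesiac(first_branch + second_branch)
order_b = sort_by_miesiac(second_branch + first_branch)

# Both arrival orders put (6, 2004) ahead of (6, 2005).
print(order_a)
print(order_b)

# The sort is still not stable: with just the two tied rows,
# their arrival order is reversed.
pair = sort_by_miesiac([(6, 2004), (6, 2005)])
print(pair)
```

With this particular pivot rule both concatenation orders happen to come
out the same, yet the two-row case shows the tied rows being swapped.  A
different pivot rule or a different data set could just as easily give
the opposite outcome.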

If you want to get well-defined results with DISTINCT ON, you should
make the ORDER BY sort by a candidate key, that is, a set of columns whose
values uniquely identify each row.  Anything less opens you to
uncertainty about which rows the DISTINCT will select.
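
The effect of sorting by a full key can be sketched with a small model of
DISTINCT ON: sort the rows, then keep the first row of each group.  This
is a simplified emulation, not how PostgreSQL executes the query, and the
(miesiac, year) rows are hypothetical.  Python's sorted() is stable, so
here the ambiguity shows up as a dependence on arrival order rather than
on partitioning choices, but the conclusion is the same.

```python
from itertools import groupby, permutations


def distinct_on_miesiac(rows, sort_key):
    """Model of DISTINCT ON (miesiac) ... ORDER BY <sort_key>.

    Sorts the rows by sort_key and keeps the first row of each
    miesiac group (miesiac is column 0).
    """
    ordered = sorted(rows, key=sort_key)
    return [next(group) for _, group in groupby(ordered, key=lambda r: r[0])]


# Hypothetical rows: (miesiac, year).
rows = [(5, 2004), (6, 2004), (6, 2005)]

# ORDER BY miesiac only: the sort key has ties, so the surviving row
# for miesiac = 6 depends on the order in which the rows arrived.
partial_key_results = {
    tuple(distinct_on_miesiac(p, lambda r: r[0])) for p in permutations(rows)
}

# ORDER BY miesiac, year: the sort key is unique per row (a candidate
# key for this data), so every arrival order gives the same answer.
full_key_results = {
    tuple(distinct_on_miesiac(p, lambda r: (r[0], r[1])))
    for p in permutations(rows)
}

print(len(partial_key_results))  # 2 distinct outcomes
print(len(full_key_results))     # 1 outcome
```

In SQL terms the second variant corresponds to adding the tie-breaking
column to the ORDER BY list after the DISTINCT ON expression.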
        regards, tom lane


