Thread: additional message to the bug #7499

additional message to the bug #7499

From
Denis Kolesnik
Date:
I have now VERY strong argument to consider it is as a bug:

if there a understandable for SQL language sequence which sorts
in other fashion when adding "LIMIT".

I did try the same with a last name starting with "G" (there also more
than one entry with identical surnames) and it worked ok(the results
were represented as I waited).

this one last example brings me to conseder it is as a bug.

 id  |    str_last_name
-----+----------------------
  83 | GXXXXXXXXX
 175 | GXXXXXXXXX

and

 id |    str_last_name
----+----------------------
 83 | GXXXXXXXXX
(1 =D1=81=D1=82=D1=80=D0=BE=D0=BA=D0=B0)

select id, str_last_name from tbl_owners_individual order by
str_last_name offset 26;

and

select id, str_last_name from tbl_owners_individual order by
str_last_name offset 26 limit 1;

corresponding...

and even sorting by id:
select id, str_last_name from tbl_owners_individual where id in
(83,175,111,1) order by str_last_name;

 id  |    str_last_name
-----+----------------------
  83 | GXXXXXXXXX
 175 | GXXXXXXXXX
   1 | Kolesnik
 111 | Kolesnik
(4 =D1=81=D1=82=D1=80=D0=BE=D0=BA=D0=B8)

select id, str_last_name from tbl_owners_individual where id in
(83,175,111,1) order by id;

 id  |    str_last_name
-----+----------------------
   1 | Kolesnik
  83 | GXXXXXXXXX
 111 | Kolesnik
 175 | GXXXXXXXXX
(4 =D1=81=D1=82=D1=80=D0=BE=D0=BA=D0=B8)

anyway sorted by id results the record with the "1" id appear before
the record with the id "111".

Re: additional message to the bug #7499

From
"Kevin Grittner"
Date:
Denis Kolesnik <lirex.software=40gmail.com> wrote:
=20
> I have now VERY strong argument to consider it is as a bug:
=20
No, you appear to have very strong feelings about it, but you are
not making an argument that holds water.
=20
> if there a understandable for SQL language sequence which sorts
> in other fashion when adding =22LIMIT=22.
=20
Each query is evaluated in terms of satisfying what is requested in
that query, and the fastest plan which returns those results is
chosen.  If you want results to be generated in a specific order, it
is incumbent on you to specify that in the query -- there is no
=22natural order=22 to rows which is used as a tie-breaker.  There are
even optimizations to have one query which is going to scan a table
start at the point that another table scan, already in progress is
at, to prevent extra reads -- so exactly the same query run at about
the same time, with no concurrent database changes can easily return
rows in different orders.  That=27s a feature, not a bug.  If you want
them in a particular order, say so, and appropriate index usage or
sorts will be added to the query execution to provide them the way
you ask, even though that is slower than it would be if you didn=27t
care about the order.
=20
> and even sorting by id:
> select id, str_last_name from tbl_owners_individual where id in
> (83,175,111,1) order by str_last_name;
>=20
>  id  =7C    str_last_name
> -----+----------------------
>   83 =7C GXXXXXXXXX
>  175 =7C GXXXXXXXXX
>    1 =7C Kolesnik
>  111 =7C Kolesnik
> (4 ******)
=20
No, it didn=27t go out of its way to sort that way, it just happened
to fall out that way that time; don=27t count on it always being that
way, even if it happens many times in a row.
=20
test=3D=23 create table tbl_owners_individual
test-=23   (id int not null primary key, str_last_name text not null);
CREATE TABLE
test=3D=23 insert into tbl_owners_individual values
test-=23   (1, =27Kolesnik=27),
test-=23   (83, =27GXXXXXXXXX=27),
test-=23   (111, =27Kolesnik=27),
test-=23   (175, =27GXXXXXXXXX=27);
INSERT 0 4
test=3D=23 select id, str_last_name from tbl_owners_individual where id
in
test-=23 (83,175,111,1) order by str_last_name;
 id  =7C str_last_name=20
-----+---------------
  83 =7C GXXXXXXXXX
 175 =7C GXXXXXXXXX
   1 =7C Kolesnik
 111 =7C Kolesnik
(4 rows)
=20
-Kevin

Re: additional message to the bug #7499

From
"Kevin Grittner"
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
> Denis Kolesnik <lirex.software@gmail.com> wrote:

>> and even sorting by id:
>> select id, str_last_name from tbl_owners_individual where id in
>> (83,175,111,1) order by str_last_name;
>>
>>  id  |    str_last_name
>> -----+----------------------
>>   83 | GXXXXXXXXX
>>  175 | GXXXXXXXXX
>>    1 | Kolesnik
>>  111 | Kolesnik
>> (4 ******)
>
> No, it didn't go out of its way to sort that way, it just happened
> to fall out that way that time; don't count on it always being
> that way, even if it happens many times in a row.
>
> test=# create table tbl_owners_individual
> test-#   (id int not null primary key, str_last_name text not
> null);
> CREATE TABLE
> test=# insert into tbl_owners_individual values
> test-#   (1, 'Kolesnik'),
> test-#   (83, 'GXXXXXXXXX'),
> test-#   (111, 'Kolesnik'),
> test-#   (175, 'GXXXXXXXXX');
> INSERT 0 4
> test=# select id, str_last_name from tbl_owners_individual
> test-# where id in (83,175,111,1) order by str_last_name;
>  id  | str_last_name
> -----+---------------
>   83 | GXXXXXXXXX
>  175 | GXXXXXXXXX
>    1 | Kolesnik
>  111 | Kolesnik
> (4 rows)

Dang!  I missed the tail of that session, which was the part that
made my point.  If you follow the above with a VACUUM ANALYZE and
then run the same query again, you get a different order:

test=# vacuum analyze tbl_owners_individual;
VACUUM
test=# select id, str_last_name from tbl_owners_individual
test-# where id in (83,175,111,1) order by str_last_name;
 id  | str_last_name
-----+---------------
 175 | GXXXXXXXXX
  83 | GXXXXXXXXX
 111 | Kolesnik
   1 | Kolesnik
(4 rows)

With better statistics from the VACUUM ANALYZE it realized that the
index usage was pointless and slower, so it just used a table scan.

-Kevin

Re: additional message to the bug #7499

From
Craig Ringer
Date:
On 08/22/2012 08:36 AM, Denis Kolesnik wrote:
> I have now VERY strong argument to consider it is as a bug:
>
> if there a understandable for SQL language sequence which sorts
> in other fashion when adding "LIMIT".

Underspecified sorts can have unstable results, that's allowed by the
spec and is a reasonable performance optimisation. If you want stable
sorts you have to provide a fully defined ordering.

To say this is a bug is a bit like saying that undefined behaviour in C
is a bug. It isn't, the language and compiler are giving you what
they're supposed to, just not what you want.

http://en.wikipedia.org/wiki/Undefined_behavior

Unstable sorts are a common optimisation

http://en.wikipedia.org/wiki/Sorting_algorithm#Stability

and a very useful one. Not allowing Pg to use unstable sorts, or use
different sorts for different query plans, would slow correct queries
down to make an incorrect query like yours run how you expect.

--
Craig Ringer