Re: Very puzzling sort behavior - Mailing list pgsql-general

From Steve Crawford
Subject Re: Very puzzling sort behavior
Date
Msg-id CAEfWYyyB3uEn63v3rH_f0LFLMBPXWLoJ1rVjDDktVtJo0LR7xQ@mail.gmail.com
Whole thread Raw
In response to Very puzzling sort behavior  (Ken Tanzer <ken.tanzer@gmail.com>)
List pgsql-general
Any null values in first name??

-Steve

On Thu, Sep 10, 2015 at 12:35 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:
Hi.  In a table that includes these columns:

my_db=> \d tbl_client
...
 name_last                   | character varying(40)          | not null
 name_first                  | character varying(30)          | not null
...

I am extremely puzzled by the sorting of the "CLARKE"s in this list:

my_db=> SELECT name_last,length(name_last),name_last='CLARK' AS clark,name_last='CLARKE' AS clarke FROM tbl_client WHERE name_last ILIKE 'CLARK%' OR name_last ILIKE 'ADAMS%' ORDER BY name_last || ', ' || name_first;
 name_last | length | clark | clarke 
-----------+--------+-------+--------
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMSON   |      7 | f     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARKE    |      6 | f     | t
 CLARKE    |      6 | f     | t
 CLARKE    |      6 | f     | t
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
(17 rows)

The ADAMS are included just to show a similar example is ordering correctly.  I put the length and equality test columns in to try to make sure there weren't some bizarre characters in the data.  This is only happening on one particular database.  I did a reindex on the table just for good measure.  If I remove the name_first piece of the ORDER BY (which doesn't seem like it should matter), it sorts as expected:

my_db=> SELECT name_last,length(name_last),name_last='CLARK' AS clark,name_last='CLARKE' AS clarke FROM tbl_client WHERE name_last ILIKE 'CLARK%' OR name_last ILIKE 'ADAMS%' ORDER BY name_last || ', '               
;
 name_last | length | clark | clarke 
-----------+--------+-------+--------
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMS     |      5 | f     | f
 ADAMSON   |      7 | f     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARK     |      5 | t     | f
 CLARKE    |      6 | f     | t
 CLARKE    |      6 | f     | t
 CLARKE    |      6 | f     | t
(17 rows)

I tried selecting those 17 rows from tbl_client into a new table, and get the same odd behavior.  However, if I run with new data I get an expected order:

CREATE TEMP TABLE test (
  name_first VARCHAR(40),
  name_last VARCHAR(30)

);

INSERT INTO test VALUES ('JOE','CLARKE'),('BILL','CLARK');

SELECT * FROM test ORDER BY name_last;
SELECT * FROM test ORDER BY name_last || ', ' || name_first;

Any thoughts about what's going on, what to do about it, or what obvious point I missing?  Thanks in advance!

my_db=> SELECT version();
                                                    version                                                     
----------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.2.13 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit
(1 row)

Ken



--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.

pgsql-general by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Very puzzling sort behavior
Next
From: Tom Lane
Date:
Subject: Re: Very puzzling sort behavior