Thread: Sorting Discrepancy in PostgreSQL 14.13
==============================================
POSTGRESQL BUG REPORT TEMPLATE
==============================================
Your name : minji-kim
Your email address : hzuiw33@gmail.com
# System Configuration:
---------------------
Architecture (example: Intel Pentium) : Intel(R) Core(TM) Ultra 7 155H
Operating System (example: Linux 2.4.18) : VMware Workstation Pro (Ubuntu-22.04)
PostgreSQL version (example: PostgreSQL 9.6.6) : PostgreSQL 14.13
Compiler used (example: gcc 3.3.5) : X (sudo apt install postgresql postgresql-contrib)
# Please enter a FULL description of your problem:
------------------------------------------------
Sorting Discrepancy in PostgreSQL 14.13
When running the following command in PostgreSQL 14.13:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0, MIN(ABS(CAST(c0 AS BIGINT))) OVER (ORDER BY c0 NULLS FIRST) AS min_function_cast FROM t0;
DROP TABLE IF EXISTS t0;
```
The result is:
```
c0 | min_function_cast
-----+-------------------
-10 | 10
15 | 10
20 | 10
-25 | 10
-5 | 5
(5 rows)
```
However, in other DBMS (SQLite, MySQL, OracleDB) and PostgreSQL version 17.0, the output is:
```
c0 | min_function_cast
-----+-------------------
-10 | 10
-25 | 10
-5 | 5
15 | 5
20 | 5
(5 rows)
```
This discrepency is due to different string sorting orders.
The minimized PoC is:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0 FROM t0 ORDER BY c0;
```
In PostgreSQL 14.13, the order is incorrect:
```
-10
15
20
-25
-5
```
While the correct order should be:
```
-10
-25
-5
10
20
```
as '-' is smaller than '1', or '2' in ascii.
I'm doubtful this is a collation issue, as most collations basically respect ASCII order.
Even if this issue appears to be related to collation, no warnings are provided when migrating from this version.
# Please describe a way to repeat the problem.
# Please try to provide a concise reproducible example, if at all possible:
----------------------------------------------------------------------
Running the following command in PostgreSQL 14.13:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0, MIN(ABS(CAST(c0 AS BIGINT))) OVER (ORDER BY c0 NULLS FIRST) AS min_function_cast FROM t0;
DROP TABLE IF EXISTS t0;
```
Results in the incorrect sorting behavior.
If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
Using the correct sort criteria will solve this problem.
POSTGRESQL BUG REPORT TEMPLATE
==============================================
Your name : minji-kim
Your email address : hzuiw33@gmail.com
# System Configuration:
---------------------
Architecture (example: Intel Pentium) : Intel(R) Core(TM) Ultra 7 155H
Operating System (example: Linux 2.4.18) : VMware Workstation Pro (Ubuntu-22.04)
PostgreSQL version (example: PostgreSQL 9.6.6) : PostgreSQL 14.13
Compiler used (example: gcc 3.3.5) : X (sudo apt install postgresql postgresql-contrib)
# Please enter a FULL description of your problem:
------------------------------------------------
Sorting Discrepancy in PostgreSQL 14.13
When running the following command in PostgreSQL 14.13:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0, MIN(ABS(CAST(c0 AS BIGINT))) OVER (ORDER BY c0 NULLS FIRST) AS min_function_cast FROM t0;
DROP TABLE IF EXISTS t0;
```
The result is:
```
c0 | min_function_cast
-----+-------------------
-10 | 10
15 | 10
20 | 10
-25 | 10
-5 | 5
(5 rows)
```
However, in other DBMS (SQLite, MySQL, OracleDB) and PostgreSQL version 17.0, the output is:
```
c0 | min_function_cast
-----+-------------------
-10 | 10
-25 | 10
-5 | 5
15 | 5
20 | 5
(5 rows)
```
This discrepency is due to different string sorting orders.
The minimized PoC is:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0 FROM t0 ORDER BY c0;
```
In PostgreSQL 14.13, the order is incorrect:
```
-10
15
20
-25
-5
```
While the correct order should be:
```
-10
-25
-5
10
20
```
as '-' is smaller than '1', or '2' in ascii.
I'm doubtful this is a collation issue, as most collations basically respect ASCII order.
Even if this issue appears to be related to collation, no warnings are provided when migrating from this version.
# Please describe a way to repeat the problem.
# Please try to provide a concise reproducible example, if at all possible:
----------------------------------------------------------------------
Running the following command in PostgreSQL 14.13:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0, MIN(ABS(CAST(c0 AS BIGINT))) OVER (ORDER BY c0 NULLS FIRST) AS min_function_cast FROM t0;
DROP TABLE IF EXISTS t0;
```
Results in the incorrect sorting behavior.
If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
Using the correct sort criteria will solve this problem.
Hi, On 11/14/24 13:49, [3반]김민지_4904 wrote: > > ... > > The minimized PoC is: > > ```sql > CREATE TABLE t0 (c0 TEXT); > INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25'); > SELECT c0 FROM t0 ORDER BY c0; > ``` > > In PostgreSQL 14.13, the order is incorrect: > ``` > -10 > 15 > 20 > -25 > -5 > ``` > > While the correct order should be: > > ``` > -10 > -25 > -5 > 10 > 20 > ``` > > as '-' is smaller than '1', or '2' in ascii. > > I'm doubtful this is a collation issue, as most collations basically > respect ASCII order. > This is 99.999% due to the collation, so which collations are being used on these systems? Also, I don't get this "incorrect" behavior on 14.13, it behaves the same as 17 for me, producing the expected result. regards -- Tomas Vondra
Tomas Vondra <tomas@vondra.me> writes: > On 11/14/24 13:49, [3반]김민지_4904 wrote: >> I'm doubtful this is a collation issue, as most collations basically >> respect ASCII order. > This is 99.999% due to the collation, so which collations are being used > on these systems? Also, I don't get this "incorrect" behavior on 14.13, > it behaves the same as 17 for me, producing the expected result. It surely is a collation issue. Using a glibc-based system, I get u8=# CREATE TABLE t0 (c0 TEXT); CREATE TABLE u8=# INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25'); INSERT 0 5 u8=# select * from t0 order by c0 collate "C"; c0 ----- -10 -25 -5 15 20 (5 rows) u8=# select * from t0 order by c0 collate "en_US"; c0 ----- -10 15 20 -25 -5 (5 rows) (In point of fact, most glibc collations do NOT "respect ASCII order". They tend to ignore punctuation until it's needed as a tiebreaker.) So this is surely down to the PG 14.13 installation having a different default collation than whatever it's compared to, which most likely is caused by having run initdb with a different locale environment. regards, tom lane
On 14.11.24 13:49, [3반]김민지_4904 wrote: > as '-' is smaller than '1', or '2' in ascii. > > I'm doubtful this is a collation issue, as most collations basically > respect ASCII order. See also here for a possible explanation: https://peter.eisentraut.org/blog/2023/04/12/how-collation-of-punctuation-and-whitespace-works