Re: insensitive collations - Mailing list pgsql-hackers

From Daniel Verite
Subject Re: insensitive collations
Date
Msg-id eab7bd7b-7b2b-47f6-85b9-493e9d8cd834@manitou-mail.org
Whole thread Raw
In response to Re: insensitive collations  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: insensitive collations  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
    Peter Eisentraut wrote:

> Here is an updated patch.


When using GROUP BY and ORDER BY on a field with a non-deterministic
collation, this pops out:

CREATE COLLATION myfr (locale='fr-u-ks-level1',
  provider='icu', deterministic=false);

=# select n from (values ('été' collate "myfr"), ('ete')) x(n)
  group by 1 order by 1 ;
  n
-----
 ete
(1 row)

=#  select n from (values ('été' collate "myfr"), ('ete')) x(n)
  group by 1 order by 1 desc;
  n
-----
 été
(1 row)

The single-row output is different whether it's sorted in the ASC or
DESC direction, even though in theory, ORDER BY is done after GROUP
BY, where it shouldn't make that difference.

EXPLAIN shows that the sort is done before grouping, which might
explain why it happens, but isn't that plan incorrect given the context?

postgres=# explain select n from (values ('été' collate "myfr"), ('ete'))
x(n)
  group by 1 order by 1  desc;
                QUERY PLAN
--------------------------------------------------------------------------
 Group    (cost=0.04..0.04 rows=2 width=32)
   Group Key: "*VALUES*".column1
   ->  Sort  (cost=0.04..0.04 rows=2 width=32)
     Sort Key: "*VALUES*".column1 COLLATE myfr DESC
     ->  Values Scan on "*VALUES*"    (cost=0.00..0.03 rows=2 width=32)
(5 rows)


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


pgsql-hackers by date:

Previous
From: Adrien Mobile
Date:
Subject: Re: Log a sample of transactions
Next
From: Adrien Mobile
Date:
Subject: Re: New GUC to sample log queries