Thread: cast to domain with default collation issue.
CREATE DOMAIN testdomain AS text;
--asume the default collation is as per show LC_COLLATE;
– on my pc, it is C.UTF-8.
--So the testdomain will be collation "C.UTF-8"
=> \d collate_test1
Table "test.collate_test1"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
b | text | en-x-icu | not null |
=> \d collate_test2
Table "test.collate_test2"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
b | text | sv-x-icu | |
=> \d collate_test3
Table "test.collate_test3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | | b | text | C | |
My guess is that the following should be the same. Since the same content in the end will be cast to the same collation. However the following output contradicts with my understanding.
SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;
/*
+---+-----+
| a | b |
+---+-----+
| 1 | abc |
| 4 | ABC |
| 2 | äbc |
| 3 | bbc |
+---+-----+
*/
SELECT a, b::testdomain FROM collate_test2 ORDER BY 2;
/*
+---+-----+
| a | b |
+---+-----+
| 1 | abc |
| 4 | ABC |
| 3 | bbc |
| 2 | äbc |
+---+-----+
*/
SELECT a, b::testdomain FROM collate_test3 ORDER BY 2;
/*
+---+-----+
| a | b |
+---+-----+
| 4 | ABC |
| 1 | abc |
| 3 | bbc |
| 2 | äbc |
+---+-----+
*/
CREATE DOMAIN testdomain AS text;
--asume the default collation is as per show LC_COLLATE;
– on my pc, it is C.UTF-8.
--So the testdomain will be collation "C.UTF-8"
------------=> \d collate_test1
Table "test.collate_test1"
Column | Type | Collation | Nullable | Default
--------+---------+-----------
+----------+--------- a | integer | | |
b | text | en-x-icu | not null |
------------------------------
----------------- My guess is that the following should be the same.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Monday, May 23, 2022, jian he <jian.universality@gmail.com> wrote: >> CREATE DOMAIN testdomain AS text; >> --asume the default collation is as per show LC_COLLATE; >> – on my pc, it is C.UTF-8. >> --So the testdomain will be collation "C.UTF-8" > My reading of the docs say this is consistent with outcome #2. > https://www.postgresql.org/docs/current/collation.html Yeah. The comments in parse_collate.c are clear that this behavior is intentional: case T_CoerceToDomain: { /* * If the domain declaration included a non-default COLLATE * spec, then use that collation as the output collation of * the coercion. Otherwise allow the input collation to * bubble up. (The input should be of the domain's base type, * therefore we don't need to worry about it not being * collatable when the domain is.) */ Perhaps this should be documented more clearly, but it's not obviously wrong. If the domain declaration doesn't include an explicit COLLATE then casting to the domain doesn't create an explicit collation requirement. (That is, the domain *doesn't* have a specific collation attached to it, any more than type text does.) regards, tom lane
I wrote: > Perhaps this should be documented more clearly, but it's not obviously > wrong. If the domain declaration doesn't include an explicit COLLATE > then casting to the domain doesn't create an explicit collation > requirement. (That is, the domain *doesn't* have a specific > collation attached to it, any more than type text does.) Perhaps we could improve matters like this? diff --git a/doc/src/sgml/ref/create_domain.sgml b/doc/src/sgml/ref/create_domain.sgml index 81a8924926..e4b856d630 100644 --- a/doc/src/sgml/ref/create_domain.sgml +++ b/doc/src/sgml/ref/create_domain.sgml @@ -94,7 +94,8 @@ CREATE DOMAIN <replaceable class="parameter">name</replaceable> [ AS ] <replacea <listitem> <para> An optional collation for the domain. If no collation is - specified, the underlying data type's default collation is used. + specified, the domain has the same collation behavior as its + underlying data type. The underlying type must be collatable if <literal>COLLATE</literal> is specified. </para> regards, tom lane
Otherwise, all input expressions must have the same implicit collation derivation or the default collation. If any non-default collation is present, that is the result of the collation combination. Otherwise, the result is the default collation.I think the above quote part can be used to explain the following examples.CREATE TABLE test1 ( a text COLLATE "de_DE", b text COLLATE "es_ES", c text );
SELECT a < 'foo' FROM test1;SELECT c < 'foo' FROM test1;But the non-default seems not that correct for me. Like a column if it does not mention anything, then the default value is null. So
create table test111( a text) The default collation for column a is the same as the output of show lc_collate.so there is no non-default?
On Monday, May 23, 2022, jian he <jian.universality@gmail.com> wrote:CREATE DOMAIN testdomain AS text;
--asume the default collation is as per show LC_COLLATE;
– on my pc, it is C.UTF-8.
--So the testdomain will be collation "C.UTF-8"
------------=> \d collate_test1
Table "test.collate_test1"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
b | text | en-x-icu | not null |
-----------------------------------------------
My guess is that the following should be the same.
My reading of the docs say this is consistent with outcome #2.David J.
--
On Tuesday, May 24, 2022, jian he <jian.universality@gmail.com> wrote:
Otherwise, all input expressions must have the same implicit collation derivation or the default collation. If any non-default collation is present, that is the result of the collation combination. Otherwise, the result is the default collation.I think the above quote part can be used to explain the following examples.CREATE TABLE test1 ( a text COLLATE "de_DE", b text COLLATE "es_ES", c text );
SELECT a < 'foo' FROM test1;SELECT c < 'foo' FROM test1;But the non-default seems not that correct for me. Like a column if it does not mention anything, then the default value is null. So
create table test111( a text) The default collation for column a is the same as the output of show lc_collate.so there is no non-default?
Otherwise, all input expressions must have the same implicit collation derivation or the default collation. If any non-default collation is present, that is the result of the collation combination. Otherwise, the result is the default collation.
For example, consider this table definition:
CREATE TABLE test1 ( a text COLLATE "de_DE", b text COLLATE "es_ES", ... );Then in
SELECT a < 'foo' FROM test1;the
<
comparison is performed according tode_DE
rules, because the expression combines an implicitly derived collation with the default collation.
The collation of an expression can be the “default” collation, which means the locale settings defined for the database. It is also possible for an expression's collation to be indeterminate. In such cases, ordering operations and other operations that need to know the collation will fail.
Please don’t top-post.
On Tuesday, May 24, 2022, jian he <jian.universality@gmail.com> wrote:Otherwise, all input expressions must have the same implicit collation derivation or the default collation. If any non-default collation is present, that is the result of the collation combination. Otherwise, the result is the default collation.I think the above quote part can be used to explain the following examples.CREATE TABLE test1 ( a text COLLATE "de_DE", b text COLLATE "es_ES", c text );
SELECT a < 'foo' FROM test1;SELECT c < 'foo' FROM test1;But the non-default seems not that correct for me. Like a column if it does not mention anything, then the default value is null. So
create table test111( a text) The default collation for column a is the same as the output of show lc_collate.so there is no non-default?I’m not following the point you are trying to make. table111.a contributes the default collation for any expression needing a collation implicitly resolved.David J.
--
On Wednesday, May 25, 2022, jian he <jian.universality@gmail.com> wrote:
I personally feel wording non-default may not be that correct. Because if the column is text then it automatically at least has default collation.
I wrote:
> Perhaps this should be documented more clearly, but it's not obviously
> wrong. If the domain declaration doesn't include an explicit COLLATE
> then casting to the domain doesn't create an explicit collation
> requirement. (That is, the domain *doesn't* have a specific
> collation attached to it, any more than type text does.)
Perhaps we could improve matters like this?
diff --git a/doc/src/sgml/ref/create_domain.sgml b/doc/src/sgml/ref/create_domain.sgml
index 81a8924926..e4b856d630 100644
--- a/doc/src/sgml/ref/create_domain.sgml
+++ b/doc/src/sgml/ref/create_domain.sgml
@@ -94,7 +94,8 @@ CREATE DOMAIN <replaceable class="parameter">name</replaceable> [ AS ] <replacea
<listitem>
<para>
An optional collation for the domain. If no collation is
- specified, the underlying data type's default collation is used.
+ specified, the domain has the same collation behavior as its
+ underlying data type.
The underlying type must be collatable if <literal>COLLATE</literal>
is specified.
</para>
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Tue, May 24, 2022 at 7:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Perhaps we could improve matters like this? >> - specified, the underlying data type's default collation is used. >> + specified, the domain has the same collation behavior as its >> + underlying data type. > +1 Pushed then, thanks for thinking about it. regards, tom lane