BUG #17868: Inconsistent collation in PL/pgSQL function parameters can lead to errors - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #17868: Inconsistent collation in PL/pgSQL function parameters can lead to errors
Date
Msg-id 17868-77406e24291c4548@postgresql.org
Whole thread Raw
Responses Re: BUG #17868: Inconsistent collation in PL/pgSQL function parameters can lead to errors  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17868
Logged by:          Tanner Sterling
Email address:      tanner.sterling@outlook.com
PostgreSQL version: 15.2
Operating system:   Windows 10
Description:

First of all, I would like to confirm with you that for PL/pgSQL functions
with inconsistent collation of incoming parameters, the documentation
mentions that
If there are no parameters of collatable data types, or no common collation
can be identified for them, then parameters and local variables use the
default collation of their data type (which is usually the database's
default collation, but could be different for variables of domain types)
[1].
whether it is expressed through this use case:

CREATE OR REPLACE FUNCTION less_than(a text, b text) RETURNS boolean AS $$
BEGIN
    RETURN a < b;
END;
$$ LANGUAGE plpgsql;

select less_than(param_1, param_2) from
LATERAL (SELECT 'abc' COLLATE "en_US") AS t_param_1(param_1),
LATERAL (SELECT '123' COLLATE "C") AS t_param_2(param_2);

Results of execution:
 less_than
-----------
 f
(1 row)

But when I tried to simplify this call method, the situation became
unpredictable. The following use case looks equivalent to the one above,
except that it passes two values with different collation directly into the
function, but the execution results are different from the above use case.

select less_than('abc' COLLATE "en_US", '123' COLLATE "C");

Execution error reported:
ERROR:  collation mismatch between explicit collations "en_US" and "C"
LINE 1: select less_than('abc' COLLATE "en_US", '123' COLLATE "C")

Why doesn't this example follow the principle mentioned above, which should
have converted the collations to the default collation instead of raising an
error? Is this a bug in the processing of PL/pgSQL function parameters?
It is possible that to me, these two invocations appear equivalent in
different formats. Some people may prefer to use the LATERAL keyword to give
a name to a value and then pass it as a parameter, while others may prefer
to directly fill in the parameter list. Therefore, these two invocations
should not produce different results, and this could be a bug in the
processing of PL/pgSQL function parameters.

By the way, according to the documentation [2], when multiple collations
need to be combined but there is a conflict that cannot be resolved, an
error is raised for the explicit collations that cannot agree, while the
default collation is usually promoted to the implicit collation. Does the
handling of collation combination conflicts in this case violate this
principle? Although the first example gives a name to the values, it still
explicitly specifies the collation types of the two values using the COLLATE
clause in the LATERAL subqueries, but it does not raise any error.

Similarly, I noticed that the reporter of BUG #17859 [3] mentioned that in a
recursive query, the collation type of the non-recursive item is first
assumed, and then the recursive item needs to match the non-recursive item
that has already been assumed, although they may not strictly follow the
principles of collation combination.

What I want to say is that these different ways of handling collation
conflicts seem to be inconsistent throughout PostgreSQL, and there are
always surprising special cases even after understanding the basic
principles of combination. Is it possible that this could be improved to
make it more consistent and easier to use.

regards, Tanner Sterling


[1]

https://www.postgresql.org/docs/15/plpgsql-declarations.html#PLPGSQL-DECLARATION-COLLATION:~:text=then%20parameters%20and%20local%20variables%20use%20the%20default%20collation%20of%20their%20data%20type
[2]

https://www.postgresql.org/docs/15/collation.html#:~:text=When%20multiple%20collations%20need%20to%20be%20combined%2C%20for%20example%20in%20a%20function%20call%2C%20the%20following%20rules%20are%20used%3A
[3]
https://www.postgresql.org/message-id/flat/17859-c530b7716e786d04%40postgresql.org


pgsql-bugs by date:

Previous
From: "Bonu, Miriam, Sony Music Italy"
Date:
Subject: R: RHEL8 Got error: unknown type name ‘sigjmp_buf’ while compiling C simple code after including pg_type.h
Next
From: Tom Lane
Date:
Subject: Re: BUG #17868: Inconsistent collation in PL/pgSQL function parameters can lead to errors