Thread: BUG #17859: Suspected collation conflict when using recursive query

BUG #17859: Suspected collation conflict when using recursive query

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17859
Logged by:          Jiangshan Liu
Email address:      jiangshan.liu@tju.edu.cn
PostgreSQL version: 15.2
Operating system:   Ubuntu 18.04.6 LTS
Description:

Dear PostgreSQL developers,
I encountered an error while executing a recursive query in PostgreSQL and
would like to report it. The query is as follows:

WITH RECURSIVE run(n, t) AS (
  SELECT 1, ''::text
    UNION ALL
  SELECT n + 1, (SELECT (select tableowner FROM pg_tables LIMIT 1)::text)
  FROM run
  WHERE n < 5
)
SELECT * FROM run;

The error message is
> ERROR:  recursive query "run" column 2 has collation "default" in
non-recursive term but collation "C" overall
LINE 2:  SELECT 1, ''::text
                   ^
HINT:  Use the COLLATE clause to set the collation of the non-recursive
term.

In fact, the collation of column "tableowner" in table "pg_tables" is "C".

SELECT column_name, collation_name 
FROM information_schema.columns 
WHERE table_name = 'pg_tables' AND column_name = 'tableowner';

 column_name | collation_name 
-------------+----------------
 tableowner  | C

I would like to inquire if this error is caused by a bug in the recursive
query implementation. According to the PostgreSQL documentation, the
collation of the result should be the non-default collation if any
non-default collation is present[1]. In this case, since the collation of
column 2 is "C", the collation of the result should also be "C". However,
the error suggests otherwise.

Furthermore, I noticed that the same behavior works in a non-recursive
query:

SELECT 1, ''::text
    UNION ALL
SELECT 2, (SELECT (select tableowner FROM pg_tables LIMIT 1)::text);

The documentation mentions that the collation of the input expression is
used when the database system has to perform an ordering or a character
classification[2]. However, the recursive query does not use any additional
ordering or character classification on the second column, and I couldn't
find any relevant information in the documentation.

I appreciate your attention to this matter and apologize if I made any
mistakes in my analysis. Thank you for your time and effort in maintaining
PostgreSQL.

Best regards, Jiangshan Liu


[1]

https://www.postgresql.org/docs/15/collation.html#:~:text=If%20any%20non%2Ddefault%20collation%20is%20present%2C%20that%20is%20the%20result%20of%20the%20collation%20combination
[2]

https://www.postgresql.org/docs/15/collation.html#:~:text=When%20the%20database%20system%20has%20to%20perform%20an%20ordering%20or%20a%20character%20classification%2C%20it%20uses%20the%20collation%20of%20the%20input%20expression


PG Bug reporting form <noreply@postgresql.org> writes:
> I encountered an error while executing a recursive query in PostgreSQL and
> would like to report it. The query is as follows:

> WITH RECURSIVE run(n, t) AS (
>   SELECT 1, ''::text
>     UNION ALL
>   SELECT n + 1, (SELECT (select tableowner FROM pg_tables LIMIT 1)::text)
>   FROM run
>   WHERE n < 5
> )
> SELECT * FROM run;

> The error message is
>> ERROR:  recursive query "run" column 2 has collation "default" in
> non-recursive term but collation "C" overall
> LINE 2:  SELECT 1, ''::text
>                    ^
> HINT:  Use the COLLATE clause to set the collation of the non-recursive
> term.

This is not a bug, and the HINT is telling you what to do to fix it:

  SELECT 1, ''::text COLLATE "C"
    UNION ALL
  ...

> I would like to inquire if this error is caused by a bug in the recursive
> query implementation. According to the PostgreSQL documentation, the
> collation of the result should be the non-default collation if any
> non-default collation is present[1].

You are interpreting "present" far too broadly.  The type and
collation of the result are deduced from the non-recursive term, and
then we parse the recursive term *using that information* to assign
type info to the recursive references.  Then the resolved output type
details of the UNION have to match what we assumed, or we throw this
error.  Any other procedure would be circular logic, since the
conclusion about what's the output type of the recursive term might
well depend on what we assumed about the recursive references.

> Furthermore, I noticed that the same behavior works in a non-recursive
> query:

Non-recursive cases don't require making assumptions about the types
of recursive references.

            regards, tom lane