Thread: planner does not detect same-as-default collation.

planner does not detect same-as-default collation.

From
Jeff Janes
Date:
I don't know if this is a bug, or a surprising feature, or just a
misunderstanding.

If I specify an explicit collation which happens to be the same as the
default collation, it is not recognized as being the same as the
default.

select version();
                                                    version
---------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit

(But also in at least 9.5rc1 and 9.6dev)

create database test2 encoding='utf8' lc_collate='en_US.utf8'
template=template0;
\c test2
create table foo as select generate_series::text as bar from
generate_series(1,100000);
create index on foo (bar );
cluster foo USING foo_bar_idx ;
set enable_sort TO off;
explain select * from foo order by bar;

<walks the index>

explain select * from foo order by bar collate "en_US.utf8";

<does the sort>

Am I the only one surprised by this?

Cheers,

Jeff


Re: planner does not detect same-as-default collation.

From
Thomas Munro
Date:
On Tue, Jan 5, 2016 at 6:39 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I don't know if this is a bug, or a surprising feature, or just a
> misunderstanding.
>
> If I specify an explicit collation which happens to be the same as the
> default collation, it is not recognized as being the same as the
> default.
>
> select version();
>                                                     version
> ---------------------------------------------------------------------------------------------------------------
>  PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
> 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
>
> (But also in at least 9.5rc1 and 9.6dev)
>
> create database test2 encoding='utf8' lc_collate='en_US.utf8'
> template=template0;
> \c test2
> create table foo as select generate_series::text as bar from
> generate_series(1,100000);
> create index on foo (bar );
> cluster foo USING foo_bar_idx ;
> set enable_sort TO off;
> explain select * from foo order by bar;
>
> <walks the index>
>
> explain select * from foo order by bar collate "en_US.utf8";
>
> <does the sort>
>
> Am I the only one surprised by this?

It works if you say: order by bar collate "default".  "default" and
"en_US.utf8" are different collations with different OIDs even if
'en_US.utf8' is the collation name listed in pg_database.datcollate.
I suppose if you knew the OID of that collation, you could teach
indxpath.c and (and I don't know what other planner machinery) to
consider that OID to be equivalent to DEFAULT_COLLATION_OID when
comparing them to consider an index path.

There was another email somewhere talking about constraint exclusion's
treatment of explicit and default collations.  Maybe predtest.c could
use the same type of logic.

--
Thomas Munro
http://www.enterprisedb.com