Re: Warts with SELECT DISTINCT - Mailing list pgsql-hackers

From Bruno Wolff III
Subject Re: Warts with SELECT DISTINCT
Date
Msg-id 20060504031057.GA30219@wolff.to
Whole thread Raw
In response to Warts with SELECT DISTINCT  (Greg Stark <gsstark@mit.edu>)
Responses Re: Warts with SELECT DISTINCT  (Greg Stark <gsstark@mit.edu>)
Re: Warts with SELECT DISTINCT  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, May 03, 2006 at 17:58:07 -0400, Greg Stark <gsstark@mit.edu> wrote:
> 
> Though it's optimized poorly and does a superfluous sort step:
> 
> stark=> explain select col1 from test group by col1 order by upper(col1);
>                                 QUERY PLAN                                 
> ---------------------------------------------------------------------------
>  Sort  (cost=99.72..100.22 rows=200 width=32)
>    Sort Key: upper(col1)
>    ->  Group  (cost=85.43..92.08 rows=200 width=32)
>          ->  Sort  (cost=85.43..88.50 rows=1230 width=32)
>                Sort Key: col1
>                ->  Seq Scan on test  (cost=0.00..22.30 rows=1230 width=32)
> (6 rows)
> 
> 
> Whereas it shouldn't be hard to prove that this is equivalent:
> 
> stark=> explain select col1 from test group by upper(col1),col1 order by upper(col1);
>                              QUERY PLAN                              
> ---------------------------------------------------------------------
>  Group  (cost=88.50..98.23 rows=200 width=32)
>    ->  Sort  (cost=88.50..91.58 rows=1230 width=32)
>          Sort Key: upper(col1), col1
>          ->  Seq Scan on test  (cost=0.00..25.38 rows=1230 width=32)
> (4 rows)

I don't think you can assume that that will be true for any locale. If there
are two different characters that both have the same uppercase version, this
will break things.

And while you would expect that x = y => upper(x) = upper(y) I am not sure
that is guarenteed for locales. I can imagine having two different characters
that are treated the same for ordering purposes, but have uppercase versions
that are considered different for ordering purposes.


pgsql-hackers by date:

Previous
From: "Nikolay Samokhvalov"
Date:
Subject: Re: [SoC] Relation between project "XML improvements" and "pgxml"
Next
From: Greg Stark
Date:
Subject: Re: Warts with SELECT DISTINCT