Re: is this a bug or I am blind? - Mailing list pgsql-general

From Bruce Momjian
Subject Re: is this a bug or I am blind?
Date
Msg-id 200512171726.jBHHQsm11468@candle.pha.pa.us
In response to Re: is this a bug or I am blind?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: is this a bug or I am blind?
List pgsql-general

Where are we on this?  Given the original report:

    online=# select * from common_logins where username = 'potyty';
     uid | username | password | lastlogin | status | usertype | loginnum
    -----+----------+----------+-----------+--------+----------+----------
    (0 rows)

    online=# select * from common_logins where username like 'potyty';
      uid   | username | password |         lastlogin          | status | usertype | loginnum
    --------+----------+----------+----------------------------+--------+----------+----------
     155505 | potyty   | board    | 2004-08-16 17:45:55.723829 | A      | S        |        1
      60067 | potyty   | board    | 2004-07-07 20:22:17.68699  | A      | S        |        3
     174041 | potyty   | board    | 2005-02-17 00:00:13.706144 | A      | S        |        3
    (3 rows)

    online=# select username, username = 'potyty' from common_logins where username like 'potyty';
     username | ?column?
    ----------+----------
     potyty   | t
     potyty   | t
     potyty   | t
    (3 rows)

I don't think we can state that our current behavior is correct. I
realize we are being hit by the length comparison optimization, but
ultimately the issue is that the Hungarian-specific locale treats
"tyty" and "tty" as the same string, which confuses our indexing
comparisons.

Is our fix going to be a Hungarian-specific one?
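
To make the mismatch concrete, here is a minimal Python sketch of the
two comparisons disagreeing. Everything in it is an assumption made for
illustration: toy_xfrm() and toy_strcoll() are hypothetical stand-ins
for strxfrm() and strcoll(), the single "tty" -> "tyty" rule is a toy
and not the real Hungarian collation, and eq_with_length_shortcut() only
imitates the idea of a length check in front of the collation call. It
is not PostgreSQL's code.

```python
# Toy model only: NOT the real hu_HU collation and NOT PostgreSQL's code.

def toy_xfrm(s: str) -> str:
    """Hypothetical stand-in for strxfrm(): expand "tty" to "tyty"."""
    return s.replace("tty", "tyty")


def toy_strcoll(a: str, b: str) -> int:
    """Hypothetical stand-in for strcoll(): returns -1, 0 or 1."""
    ka, kb = toy_xfrm(a), toy_xfrm(b)
    return (ka > kb) - (ka < kb)


def eq_with_length_shortcut(a: str, b: str) -> bool:
    """Equality that rejects inputs of different length before collating."""
    if len(a) != len(b):
        return False
    return toy_strcoll(a, b) == 0


# The ordering comparison says "equal", the equality test says "different".
print(toy_strcoll("tty", "tyty"))              # 0
print(eq_with_length_shortcut("tty", "tyty"))  # False
```

An index ordered by the first function but probed with the second can
disagree about whether a matching entry exists, which is the kind of
inconsistency the report above suggests.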

---------------------------------------------------------------------------

Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > On Fri, Dec 16, 2005 at 01:06:58PM -0500, Tom Lane wrote:
> >> Ah.  So we could redefine hashtext() to return the hash of the strxfrm
> >> value.  Slow, but a lot better than giving up hash join and hash
> >> aggregation altogether...
>
> > Not to put too fine a point on it, but either you want locale-sensitive
> > sorting or you don't.
>
> Nobody's said anything about giving up locale-sensitive sorting.  The
> question is about locale-sensitive equality: does it really make sense
> that 'tty' = 'tyty'?  Would your answer change in the context
> '/dev/tty' = '/dev/tyty'?  Are you willing to *not have access* to a
> text comparison operator that will make the distinction?
>
> I'm inclined to think that this is more like the occasional need for
> accent-insensitive comparisons.  It seems generally agreed that you want
> something like smash('ab') = smash('áb') rather than making the
> strings equal in all contexts.
>
> Of course, not being a native speaker of any of the affected languages,
> my opinion shouldn't be taken too seriously ...
>
>             regards, tom lane
>
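
For illustration, the two ideas in the quoted text can be sketched in
Python. Both functions are assumptions: smash() here strips combining
marks after Unicode decomposition, which is only one possible meaning of
the smash() mentioned above, and hash_collation_key() is a hypothetical
analogue of hashing the strxfrm value, built on Python's locale.strxfrm()
under whatever LC_COLLATE the process has set. Neither is PostgreSQL's
hashtext().

```python
import locale
import unicodedata


def smash(s: str) -> str:
    """One possible smash(): decompose, then drop combining accents."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def hash_collation_key(s: str) -> int:
    """Hash the collation key instead of the raw string (hypothetical).

    Strings that strcoll() treats as equal share a strxfrm() key under the
    same locale, so they land in the same hash bucket.
    """
    return hash(locale.strxfrm(s))


# Plain equality still tells the strings apart; the smashed forms match.
print("ab" == "áb")                # False
print(smash("ab") == smash("áb"))  # True
```

The point of the explicit function is that the caller opts in to the
looser comparison, while the ordinary = operator keeps distinguishing
the strings.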

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
