Bug #610: collation fails sorting because of strcoll() bug - Mailing list pgsql-bugs

From pgsql-bugs@postgresql.org
Subject Bug #610: collation fails sorting because of strcoll() bug
Date
Msg-id 20020307193616.1D835476A44@postgresql.org
Responses Re: Bug #610: collation fails sorting because of strcoll() bug  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Mathias August Gruber (mgruber) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
collation fails sorting because of strcoll() bug

Long Description
Hi there,

I tried to migrate an MS SQL Server database to PostgreSQL about two years ago and could not make things
work because I needed collation.
 
Although the documentation states that collation will work, this is not true for strings that contain blanks.
What happens is that the strings are sorted as if they had no spaces.
This was really bad.
I have now taken up this project again and noticed that the problem is still there. So I read all the docs and the
source code and ran a lot of tests.
 
Your regression tests are also lacking on this topic: they only sort single-word strings.

Now I have a verdict: the problem is in the GNU C library's strcoll()
function.

I have attached a little C program that reproduces this behavior. Just
compile and run it (and don't forget to set LC_ALL to any western language;
I've tested with pt_BR, but the problem occurs with almost any other
configuration).

Hope I could help you with this superb project.

Very Best Regards


Sample Code

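#include <stdio.h>
#include <stdlib.h>     /* qsort */
#include <string.h>     /* strcmp, strcoll, memcpy */
#include <strings.h>    /* strcasecmp */
#include <locale.h>

/* qsort() passes pointers to the 32-byte rows, so wrap each string
   comparator in a function with the signature qsort() expects. */
static int cmp_strcmp(const void *a, const void *b)
{
    return strcmp((const char *) a, (const char *) b);
}

static int cmp_strcasecmp(const void *a, const void *b)
{
    return strcasecmp((const char *) a, (const char *) b);
}

static int cmp_strcoll(const void *a, const void *b)
{
    return strcoll((const char *) a, (const char *) b);
}

int main(void)
{
    int i;
    char src[4][32] =
    {
        "Joseval Almeida",
        "Jose Valter",
        "JOSE CAMARGO",
        "Jose Americo",
    };
    char arr[4][32];

    memcpy(arr, src, sizeof(src));

    /* Use the current locale settings (in my case LC_ALL=pt_BR), which use
    conventional LATIN 1 collation settings. */
    setlocale(LC_ALL, "");

    /* Print the input array */
    puts("The input array is:\n");
    for(i = 0; i < 4; i++)
        puts(arr[i]);

    /* Sort the array with strcmp */
    qsort(arr, 4, sizeof(arr[0]), cmp_strcmp);

    /* Print the output */
    puts("\nThe strcmp sorted array is:\n");
    for(i = 0; i < 4; i++)
        puts(arr[i]);

    /* Restore the input and sort it with strcasecmp */
    memcpy(arr, src, sizeof(src));
    qsort(arr, 4, sizeof(arr[0]), cmp_strcasecmp);

    /* Print the output */
    puts("\nThe strcasecmp sorted array is:\n");
    for(i = 0; i < 4; i++)
        puts(arr[i]);

    /* Restore the input and sort it with strcoll */
    memcpy(arr, src, sizeof(src));
    qsort(arr, 4, sizeof(arr[0]), cmp_strcoll);

    /* Print the output */
    puts("\nThe strcoll sorted array is:\n");
    for(i = 0; i < 4; i++)
        puts(arr[i]);

    return 0;
}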


No file was uploaded with this report
