Thread: pgsql/src bin/psql/Makefile bin/psql/print.c t ...

From: ishii@postgresql.org
Date:
CVSROOT:    /cvsroot
Module name:    pgsql
Changes by:    ishii@postgresql.org    01/10/14 21:25:10

Modified files:
    src/bin/psql   : Makefile print.c
    src/test/mb/expected: unicode.out
Added files:
    src/bin/psql   : mbprint.c

Log message:
    Commit Patrice's patches except:

    > - corrects a bit the UTF-8 code from Tatsuo to allow Unicode 3.1
    >  characters (characters with values >= 0x10000, which are encoded on
    >  four bytes).

    Also, update mb/expected/unicode.out. This is necessary since the
    patches affect the result of queries using UTF-8.
    ---------------------------------------------------------------
    Hi,

    I should have sent the patch earlier, but got delayed by other stuff.
    Anyway, here is the patch:

    - most of the functionality is only activated when MULTIBYTE is
    defined,

    - checks for valid UTF-8 characters, client-side only for now, and
    only on output; you can still send invalid UTF-8 to the server (so
    it's only partly compliant with Unicode 3.1, but that's better than
    nothing).

    - formats with the correct number of columns (that's why I made it in
    the first place after all), but only for UNICODE. However, the code
    allows routines for other encodings to be plugged in, as Tatsuo did
    for the other multibyte functions.

    - corrects a bit the UTF-8 code from Tatsuo to allow Unicode 3.1
    characters (characters with values >= 0x10000, which are encoded on
    four bytes).

    - doesn't depend on the locale capabilities of the glibc (useful for
    remote telnet).
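
A minimal sketch of the column-counting idea in the list above, assuming that one character occupies one display column (wide and combining characters are ignored here). The names utf8_seq_len and utf8_columns are hypothetical and are not the functions in mbprint.c; the sketch only shows why a byte count differs from a column count, and how four-byte sequences (values >= 0x10000) fit in.

```c
/* Hypothetical helper: length in bytes of a UTF-8 sequence, judged
 * from its lead byte. Returns 0 if the byte cannot start a sequence. */
static int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx: values >= 0x10000 */
    return 0;                   /* continuation byte or invalid lead */
}

/* Hypothetical helper: number of characters (not bytes) in a
 * NUL-terminated string, used here as a stand-in for display columns.
 * Assumption: one column per character.
 * Returns -1 if a sequence is malformed or truncated. */
static int utf8_columns(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    int cols = 0;

    while (*p)
    {
        int len = utf8_seq_len(*p);
        int i;

        if (len == 0)
            return -1;
        /* Every following byte must look like 10xxxxxx. A NUL fails
         * this test too, so we never read past the terminator. */
        for (i = 1; i < len; i++)
            if ((p[i] & 0xC0) != 0x80)
                return -1;
        p += len;
        cols++;
    }
    return cols;
}
```

With these definitions, the five-byte string "\xC3\xA9t\xC3\xA9" (the word "été") counts as 3 columns, which is the number a table formatter would need for padding.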

    I would like somebody to check it closely, as it is my first patch to
    pgsql.  Also, I created dummy .orig files so that the two files I
    created are included; I hope that's the right way.

    Now, a lot of functionality is NOT included here, but I will keep that
    for 7.3 :) That includes all string checking on the server side (which
    will have to be a bit more optimised ;) ), and the input checking on
    the client side for UTF-8, though that should not be difficult: it is
    just a matter of passing the strings through mbvalidate() before
    sending them to the server. Strong checking of UTF-8 strings is
    mandatory for compliance with Unicode 3.1+.
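
As an illustration of what strong checking can involve, here is a sketch of a strict validator. The name utf8_is_valid is hypothetical; it is not mbvalidate() and makes no claim about how that function behaves. The rules applied are those of current UTF-8 definitions (no overlong forms, no surrogates, nothing above 0x10FFFF), which may be stricter than what Unicode 3.1 itself required.

```c
#include <stddef.h>

/* Hypothetical strict check: returns 1 if buf[0..len) is well-formed
 * UTF-8, 0 otherwise. */
static int utf8_is_valid(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = buf[i];
        unsigned long cp;       /* decoded code point */
        size_t n;               /* expected sequence length */
        size_t k;

        if (c < 0x80)
        {
            i++;                /* plain ASCII */
            continue;
        }
        else if ((c & 0xE0) == 0xC0) { n = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { n = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { n = 4; cp = c & 0x07; }
        else
            return 0;           /* stray continuation or bad lead byte */

        if (len - i < n)
            return 0;           /* sequence truncated at end of buffer */

        for (k = 1; k < n; k++)
        {
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;       /* not a 10xxxxxx continuation byte */
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }

        /* Reject non-shortest (overlong) encodings. */
        if ((n == 2 && cp < 0x80) ||
            (n == 3 && cp < 0x800) ||
            (n == 4 && cp < 0x10000))
            return 0;
        /* Reject UTF-16 surrogates and values beyond the Unicode range. */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;
        if (cp > 0x10FFFF)
            return 0;

        i += n;
    }
    return 1;
}
```

A client could call a check of this kind on each string and refuse to send it when the result is 0; how that would be wired into psql is left open here.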

    Do I have time to look for a patch to include iso-8859-15 for 7.2?
    The euro is coming on 1 January 2002 (before 7.3!) and over 280
    million people in Europe will need the euro sign, and only iso-8859-15
    and iso-8859-16 have it (and unfortunately, I don't think all Unices
    will switch to Unicode in the meantime)....

    err... yes, I know that not every single person in Europe uses
    PostgreSQL, so it's not exactly 280m, but it's just a matter of
    time! ;)
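
To make the euro question concrete: in iso-8859-15 the euro sign is byte 0xA4, which corresponds to U+20AC and to the three bytes E2 82 AC in UTF-8. The sketch below shows that conversion. The function names latin9_to_unicode and utf8_encode are hypothetical and are not part of the patch; the mapping covers the eight positions where iso-8859-15 differs from iso-8859-1.

```c
/* Hypothetical helper: map an iso-8859-15 byte to its Unicode code
 * point. Only eight positions differ from iso-8859-1, where each byte
 * value equals its code point. */
static unsigned long latin9_to_unicode(unsigned char b)
{
    switch (b)
    {
        case 0xA4: return 0x20AC;   /* euro sign */
        case 0xA6: return 0x0160;   /* S with caron */
        case 0xA8: return 0x0161;   /* s with caron */
        case 0xB4: return 0x017D;   /* Z with caron */
        case 0xB8: return 0x017E;   /* z with caron */
        case 0xBC: return 0x0152;   /* OE ligature */
        case 0xBD: return 0x0153;   /* oe ligature */
        case 0xBE: return 0x0178;   /* Y with diaeresis */
        default:   return b;
    }
}

/* Hypothetical helper: encode a code point as UTF-8 into out, which
 * must have room for 4 bytes. Returns the number of bytes written, or
 * 0 if the value is a surrogate or lies beyond 0x10FFFF. */
static int utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80)
    {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000)
    {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogates are not characters */
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF)
    {
        out[0] = (unsigned char) (0xF0 | (cp >> 18));
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Calling utf8_encode(latin9_to_unicode(0xA4), out) writes E2 82 AC and returns 3.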

    I'll come back (on pgsql-hackers) later to ask a few questions
    regarding the full unicode support (normalisation, collation,
    regexes,...) on the server side :)

    Here is the patch !

    Patrice.

    --
    Patrice HÉDÉ ------------------------------- patrice à islande org -----
    --  Isn't it weird  how scientists  can imagine  all the matter of the
    universe exploding out of a dot smaller than the head of a pin, but they
    can't come up with a more evocative name for it than "The Big Bang" ?
    -- What would _you_ call the creation of the universe ?
    -- "The HORRENDOUS SPACE KABLOOIE !"               - Calvin and Hobbes
    ------------------------------------------ http://www.islande.org/ -----