Re: Per-column collation, proof of concept - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: Per-column collation, proof of concept |
Date | |
Msg-id | AANLkTiloowmvpPEkcrxJ4WhxfJterBwS8wETHQjTo6it@mail.gmail.com |
In response to | Per-column collation, proof of concept (Peter Eisentraut <peter_e@gmx.net>) |
Responses | Re: Per-column collation, proof of concept |
List | pgsql-hackers |
Hello

I have only one question: if I understand correctly, you can use COLLATE only for sorting. What is your plan for range search operations? Sorting is interesting, and I am sure it is important for multilingual applications, but for me case-sensitive vs. case-insensitive and accent-sensitive vs. accent-insensitive filtering is more important. Do you have a plan for it?

Regards

Pavel Stehule

2010/7/13 Peter Eisentraut <peter_e@gmx.net>:
> Here is a proof of concept for per-column collation support.
>
> Here is how it works: When creating a table, an optional COLLATE clause
> can specify a collation name, which is stored (by OID) in pg_attribute.
> This becomes part of the type information and is propagated through the
> expression parse analysis, like typmod. When an operator or function
> call is parsed (transformed), the collations of the arguments are
> unified, using some rules (like type analysis, but different in detail).
> The collations of the function/operator arguments come either from Var
> nodes which in turn got them from pg_attribute, or from other
> function and operator calls, or you can override them with explicit
> COLLATE clauses (not yet implemented, but will work a bit like
> RelabelType). At the end, each function or operator call gets one
> collation to use.
>

What about the DISTINCT clause, and maybe the GROUP BY clause?

regards

Pavel

> The function call itself can then look up the collation using the
> fcinfo->flinfo->fn_expr field. (Works for operator calls, but doesn't
> work for sort operations, needs more thought.)
>
> A collation is in this implementation defined as an lc_collate string
> and an lc_ctype string. The implementation of functions interested in
> that information, such as comparison operators, or upper and lower
> functions, will take the collation OID that is passed in, look up the
> locale string, and use the xlocale.h interface (newlocale(),
> strcoll_l()) to compute the result.
>
> (Note that the xlocale stuff is only 10 or so lines in this patch. It
> should be feasible to allow other appropriate locale libraries to be
> used.)
>
> Loose ends:
>
> - Support function calls (currently only operator calls) (easy)
>
> - Implementation of sort clauses
>
> - Indexing support/integration
>
> - Domain support (should be straightforward)
>
> - Make all expression node types deal with collation information
>   appropriately
>
> - Explicit COLLATE clause on expressions
>
> - Caching and not leaking memory of locale lookups
>
> - I have typcollatable to mark which types can accept collation
>   information, but perhaps there should also be proicareaboutcollation
>   to skip collation resolution when none of the functions in the
>   expression tree care.
>
> You can start by reading the collate.sql regression test file to see
> what it can do. Btw., regression tests only work with "make check
> MULTIBYTE=UTF8". And it (probably) only works with glibc for now.
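
For readers following the quoted description, the step "look up the locale string, and use the xlocale.h interface (newlocale(), strcoll_l())" might look roughly like the sketch below. This is not code from the patch. The helper name compare_with_collation is hypothetical, the catalog lookup from collation OID to locale string is left out, and one locale name is used for both lc_collate and lc_ctype as a simplification. It assumes a POSIX.1-2008 system such as glibc; some other platforms declare these functions in <xlocale.h> instead.

```c
#define _POSIX_C_SOURCE 200809L   /* expose newlocale() and strcoll_l() */
#include <assert.h>
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: compare two strings
 * under the locale named by locale_name, without touching the
 * process-wide locale set by setlocale().
 *
 * Returns <0, 0 or >0 like strcoll().  *ok is set to 1 if the named
 * locale could be loaded, or to 0 if it could not, in which case the
 * result is a plain strcmp() comparison.
 */
static int
compare_with_collation(const char *locale_name,
                       const char *a, const char *b, int *ok)
{
    locale_t    loc;
    int         result;

    assert(locale_name != NULL && a != NULL && b != NULL && ok != NULL);

    /* Build a locale object for just this call (simplification: the
     * same name is used for the collate and ctype categories). */
    loc = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, locale_name,
                    (locale_t) 0);
    if (loc == (locale_t) 0)
    {
        *ok = 0;
        return strcmp(a, b);    /* locale not installed: byte order */
    }

    result = strcoll_l(a, b, loc);

    /* Free the object right away; caching it instead is what the
     * "Caching and not leaking memory" loose end is about. */
    freelocale(loc);

    *ok = 1;
    return result;
}
```

A caller would pass the locale string stored for the column, for example a name such as "sv_SE.utf8" if that locale happens to be installed (the name is only an illustration). Creating and freeing the locale object on every comparison keeps the sketch short but would be slow in a real sort or index scan.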