Charset/collate support and function parameters - Mailing list pgsql-hackers

From Dennis Bjorklund
Subject Charset/collate support and function parameters
Msg-id Pine.LNX.4.44.0410301839490.2015-100000@zigo.dhs.org
List pgsql-hackers
I have a long-term plan to implement charset support in pg, and now that I
have dropped the work on the timestamps, I've been looking into this
subject.

Today we store the max length of a string in the typmod field, but that
has to be extended so we also store the charset and the collation of the
string. That's simple but we need functions that take a string of a
specific charset and collation as an input and give that as a result.
Currently the only information we have about function arguments is the OID
of the type. The function argument OIDs are stored in an array in pg_proc,
and I suggest that instead of this array we have a table pg_parameters
that is much like

http://www.postgresql.org/docs/7.4/static/infoschema-parameters.html

Notice how there are a lot of columns describing the dynamic parts of a
type, like character_maximum_length, character_set_name,
datetime_precision. We would of course not store the name of a charset,
but the oid (and so on).

Most of these are NULL since they only apply to a specific type, but
that's okay since NULL values are stored in a bitmap so the row width will
still be small.
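
As a rough sketch of the idea, the following models one row of such a table and expands today's OID array into one row per argument. The class name, the column names, and the helper rows_from_argtypes are illustrative assumptions loosely following the information schema view, not a proposed catalog definition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParameterRow:
    """One row of a hypothetical pg_parameters table (column names are assumptions)."""
    proc_oid: int                                    # function this parameter belongs to
    ordinal_position: int                            # 1-based position in the argument list
    type_oid: int                                    # the only thing pg_proc stores today
    character_maximum_length: Optional[int] = None   # NULL unless a string type
    character_set_oid: Optional[int] = None          # an OID, not a charset name
    collation_oid: Optional[int] = None
    datetime_precision: Optional[int] = None         # NULL unless a date/time type


def rows_from_argtypes(proc_oid: int, argtypes: List[int]) -> List[ParameterRow]:
    """Expand the current OID array into one row per argument.

    All type-specific columns are left as None (NULL), which is the
    "old properties only" starting point.
    """
    return [
        ParameterRow(proc_oid, position, type_oid)
        for position, type_oid in enumerate(argtypes, start=1)
    ]


# bit_and (OID 2238) takes a single argument of type OID 23.
rows = rows_from_argtypes(2238, [23])
```

Most columns stay None for any given row, which mirrors the point above that the NULLs cost little.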

Before one starts to work on charset/collation support I think it would be
good if one could make the above change with just the old properties. As a
result we could write functions like
 foo (bar varchar(5))

We probably won't write functions like that very often, but as a first
step this is what we want.

Changing this is a lot of work, especially when one looks in pg_proc.h and
realizes that one needs to alter 3000 lines of
 DATA(insert OID = 2238 ( bit_and PGNSP PGUID 12 t f f f i 1 23 "23" _null_ aggregate_dummy - _null_));
 DESCR("bitwise-and integer aggregate");
 

into another form. The "23" should be pulled out and it would become a row 
in the pg_parameters table. Maybe some job for a script :-)
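
Such a script could start out roughly like this. It is only a sketch: it assumes that the first double-quoted field on a DATA line is the space-separated list of argument type OIDs, as in the line quoted above, and the function name extract_argtypes is made up for the illustration.

```python
import re
from typing import List, Tuple

DATA_LINE = (
    'DATA(insert OID = 2238 ( bit_and PGNSP PGUID 12 t f f f i 1 23 "23" '
    '_null_ aggregate_dummy - _null_));'
)


def extract_argtypes(data_line: str) -> Tuple[int, str, List[int]]:
    """Return (function OID, function name, argument type OIDs) for one DATA line.

    Assumption: the first double-quoted field holds the argument type OIDs.
    """
    header = re.search(r'insert\s+OID\s*=\s*(\d+)\s*\(\s*(\S+)', data_line)
    quoted = re.search(r'"([^"]*)"', data_line)
    if header is None or quoted is None:
        raise ValueError("unrecognized DATA line")
    # An empty quoted field ("") means the function takes no arguments.
    argtypes = [int(token) for token in quoted.group(1).split()]
    return int(header.group(1)), header.group(2), argtypes


proc_oid, proc_name, argtypes = extract_argtypes(DATA_LINE)
```

Each OID in the returned list would then become one pg_parameters row, numbered from 1 in list order.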
Sometimes I wish that (at least part of) the bootstrap was at a higher
level and that the above was just normal SQL statements:
 CREATE FUNCTION bit_and ( .... ) AS ...

In addition to the function arguments we also need to treat the function
return value in a similar way. The natural solution is to extend pg_proc
with many of the same columns as in the pg_parameters table. One could
also reuse the pg_parameters table and store a parameter with ordinal
number 0 to be the return value. But then there would be some columns that
do not apply to return values.
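
To make the second alternative concrete, here is a small sketch in which the return value is stored as the row with ordinal number 0. The dictionary keys and the helper name are assumptions for illustration; parameter_mode stands in for the kind of column that has no meaning for a return value.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


def rows_with_return_value(proc_oid: int, rettype_oid: int,
                           argtype_oids: List[int]) -> List[Row]:
    """Build hypothetical pg_parameters rows, using ordinal 0 for the return value."""
    rows: List[Row] = [{
        "proc_oid": proc_oid,
        "ordinal_position": 0,      # 0 marks the return value
        "type_oid": rettype_oid,
        "parameter_mode": None,     # does not apply to a return value
    }]
    for position, type_oid in enumerate(argtype_oids, start=1):
        rows.append({
            "proc_oid": proc_oid,
            "ordinal_position": position,
            "type_oid": type_oid,
            "parameter_mode": "IN",
        })
    return rows


# bit_and: one argument of type OID 23, returning type OID 23.
bit_and_rows = rows_with_return_value(2238, 23, [23])
```

The None in the ordinal-0 row is exactly the drawback mentioned above: some columns would simply be unused for return values.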

My current plan is

A) Implement a pg_parameters table and let everything else work as today.
   Also, the return values have to be taken care of in a similar way.

B) Change function overloading so we can have functions with the same
   name but different properties. For example, for strings that means
   different max lengths are used to resolve overloading.

C) Work on charset / collation.
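
Step B leaves the resolution rule open. One possible rule, shown here purely as an assumption and not as how pg resolves overloads, is to pick the candidate whose declared max length is the tightest one that still fits the argument, and to fall back to an unconstrained candidate otherwise. The signature strings and the function name are hypothetical.

```python
from typing import Dict, Optional


def resolve_overload(candidates: Dict[str, Optional[int]], arg_length: int) -> str:
    """Pick a candidate signature for a string argument of the given length.

    candidates maps a signature to its declared max length; None means the
    length is unconstrained. Assumed rule: tightest fitting bound wins,
    and an unconstrained candidate is used only when no bounded one fits.
    """
    fitting = [
        (max_length, signature)
        for signature, max_length in candidates.items()
        if max_length is not None and max_length >= arg_length
    ]
    if fitting:
        return min(fitting)[1]
    unbounded = [sig for sig, max_length in candidates.items() if max_length is None]
    if unbounded:
        return unbounded[0]
    raise LookupError("no candidate accepts a string of that length")


overloads = {
    "foo(varchar(5))": 5,
    "foo(varchar(20))": 20,
    "foo(varchar)": None,
}
chosen = resolve_overload(overloads, 3)
```

Other rules are conceivable, for example requiring an exact match on the declared length; the sketch only shows that the max length has to be available per parameter for any such rule to work.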

Probably not all of these will happen for 8.1, but I hope to finish A and
B. It all depends on how much trouble I run into and how much time I can
put into it. The function overload parts in pg are far from trivial, but I
will not worry about that until I get that far.

Any comments about this plan?

-- 
/Dennis Björklund


