Thread: Re: Proposal for fixing numeric type-resolution issues

Re: Proposal for fixing numeric type-resolution issues

From: Thomas Lockhart
Date:
> Here is a proposal for fixing these problems.

Sounds good. We would be looking up this info in a table, right? So we
can integrate this type hierarchy fully into our type extensibility
system.

Another 7.1 project is to work on alternate languages and character
sets, to decouple multibyte and locale from the default SQL_TEXT
character set. This will probably bring up issues similar to the
numeric problems, and since these character sets will be added as
user-defined types it will be important for the backend to understand
how to convert them for comparison operations, for example.
                         - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: Proposal for fixing numeric type-resolution issues

From: Peter Eisentraut
Date:
Thomas Lockhart writes:

> Another 7.1 project is to work on alternate languages and character
> sets, to decouple multibyte and locale from the default SQL_TEXT
> character set. This will probably bring up issues similar to the
> numeric problems, and since these character sets will be added as
> user-defined types it will be important for the backend to understand
> how to convert them for comparison operations, for example.

Really? I always thought the character set would be some separate entity
and perhaps an oid reference would be stored with every character string
and attribute. That would get you around any type conversion as long as
the functions acting on character types take this "header" field into
account.
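
A minimal sketch of the "header" idea, assuming each string value carries
an identifier for its character set and that comparison functions check
it before looking at the bytes. The names TaggedText, tagged_eq and the
CHARSET_* constants are hypothetical and only stand in for oids; nothing
here is taken from the backend.

```python
from dataclasses import dataclass

# Hypothetical identifiers standing in for character set oids.
CHARSET_LATIN1 = 1
CHARSET_EUC_JP = 2


@dataclass(frozen=True)
class TaggedText:
    charset_id: int  # the "header" field stored with every string
    data: bytes      # raw bytes encoded in that character set


def tagged_eq(a: TaggedText, b: TaggedText) -> bool:
    """Compare two strings, refusing to mix character sets."""
    if a.charset_id != b.charset_id:
        raise ValueError("cannot compare strings with different character sets")
    return a.data == b.data
```

With this layout no type conversion is needed, but every function that
touches character data has to inspect charset_id itself.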

If you want to go the data type way then you'd need to have some sort of
most general character set to cast to. That could be Unicode but that
would require that every user-defined character set be a subset of
Unicode, which is perhaps not a good assumption to make. Also, I wonder
how collations would fit in there. Collations definitely can't be ordered
at all, so casting can't be done in a controlled fashion.
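
A rough sketch of the "cast to a most general character set" idea, using
Python's built-in codecs with Unicode (Python's str) as the pivot. That
choice is an assumption made only for illustration; the function names
are hypothetical. A character set containing characters outside the
pivot could not be handled this way, which is the concern raised above.

```python
def to_pivot(data: bytes, charset: str) -> str:
    """Cast bytes in a given character set up to the pivot (Unicode here)."""
    return data.decode(charset)


def from_pivot(text: str, charset: str) -> bytes:
    """Cast back down; raises UnicodeEncodeError if a character does not fit."""
    return text.encode(charset)


def equal_via_pivot(a: bytes, a_charset: str, b: bytes, b_charset: str) -> bool:
    """Compare two strings from different character sets through the pivot."""
    return to_pivot(a, a_charset) == to_pivot(b, b_charset)
```

Casting up always succeeds for the character sets the codec knows about,
while casting down can fail, so the conversions are not symmetric.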

Just wondering...


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: Proposal for fixing numeric type-resolution issues

From: Tatsuo Ishii
Date:
> Thomas Lockhart writes:
> 
> > Another 7.1 project is to work on alternate languages and character
> > sets, to decouple multibyte and locale from the default SQL_TEXT
> > character set. This will probably bring up issues similar to the
> > numeric problems, and since these character sets will be added as
> > user-defined types it will be important for the backend to understand
> > how to convert them for comparison operations, for example.
> 
> Really? I always thought the character set would be some separate entity
> and perhaps an oid reference would be stored with every character string
> and attribute. That would get you around any type conversion as long as
> the functions acting on character types take this "header" field into
> account.

I think so too. If what Thomas is suggesting is that, to make a
user-defined charset, one needs to define everything, such as operators,
the charset itself, and functions that work with indexes (as when
defining a new data type), that would be too painful.

> If you want to go the data type way then you'd need to have some sort of
> most general character set to cast to. That could be Unicode but that
> would require that every user-defined character set be a subset of
> Unicode, which is perhaps not a good assumption to make.

Right. But the problem is that SQL92 actually requires such a charset,
called "SQL_TEXT." For me, the only candidate for SQL_TEXT at this
point seems to be "mule internal code." Basically it is a variant of
ISO-2022 and can accommodate most of the charsets defined in
ISO-2022. I think we could expand it so that it becomes a
superset even of Unicode. Of course the problem is that mule internal
code is an "internal code" and is not widely used in the world. Even
so, we could use it for purely internal purposes (for the parse
tree etc.).

> Also, I wonder
> how collations would fit in there. Collations definitely can't be ordered
> at all, so casting can't be done in a controlled fashion.

Hmm... Collations seem to be a different issue. I think there is no
such concept as "collation casting" in SQL92.
--
Tatsuo Ishii



Re: Proposal for fixing numeric type-resolution issues

From: Thomas Lockhart
Date:
All good ideas and thoughts. I have been thinking that essentially
separate types per character set is the right thing, but we'll have
plenty of time to talk about it.

One point is that SQL92 assigns a specific character set and collation
sequence to every character string and every column definition; if we
embedded this "type" identification into every string then we would be
replicating the existing Postgres type system one layer down (at least
for argument's sake ;)

There also need to be well-defined conversions between character
sets/collations, and some or most combinations will be illegal (e.g.
how do you collate American English against Japanese?). The Postgres
type system can enforce this simply by not providing conversion or
comparison functions for the relevant mixture of types.
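
A minimal sketch of that enforcement, assuming each character set is
modeled as its own type and comparison functions are looked up by the
pair of argument types. The type names and the COMPARATORS table are
hypothetical, and the code-point comparison is only a placeholder for a
real collation.

```python
from typing import Callable, Dict, Tuple

Comparator = Callable[[str, str], int]


def _codepoint_cmp(a: str, b: str) -> int:
    """Placeholder ordering: returns -1, 0 or 1 by code point."""
    return (a > b) - (a < b)


# Only combinations listed here are legal; anything else has no function.
COMPARATORS: Dict[Tuple[str, str], Comparator] = {
    ("text_en_us", "text_en_us"): _codepoint_cmp,
    ("text_ja_jp", "text_ja_jp"): _codepoint_cmp,
}


def compare(type_a: str, a: str, type_b: str, b: str) -> int:
    """Look up a comparator for the type pair; fail if none is provided."""
    try:
        fn = COMPARATORS[(type_a, type_b)]
    except KeyError:
        raise TypeError(f"no comparison defined for {type_a} and {type_b}") from None
    return fn(a, b)
```

Mixing text_en_us with text_ja_jp fails at lookup time simply because
no entry exists for that pair; no explicit "forbidden" list is needed.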
                    - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California