Re: Charset/collate support and function parameters - Mailing list pgsql-hackers

From Dennis Bjorklund
Subject Re: Charset/collate support and function parameters
Date
Msg-id Pine.LNX.4.44.0410302044190.2015-100000@zigo.dhs.org
In response to Re: Charset/collate support and function parameters  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Charset/collate support and function parameters
List pgsql-hackers
On Sat, 30 Oct 2004, Tom Lane wrote:

> > Are you worried about performance or is it the smaller change that you
> > want?
> 
> I'm worried about the fact that instead of, say, one length(text)
> function, we would now have to have a different one for every
> characterset/collation.

This is not about how the parameter information is stored, but let's
discuss it anyway. These are important issues.

I was hoping that we could implement functions where one doesn't have to
specify the charset and collation (but could if one wants to).

For some functions one really wants different implementations depending on
the charset. Take the length function: the length has to be calculated
differently for each charset, so we can never have one length function
that works for every possible charset. We could have one pg function that
does N different things inside depending on the charset, but that's not
really a simplification.
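
To make the point concrete, here is a minimal sketch in Python (not
PostgreSQL code). Python byte strings stand in for stored strings, and the
names length_latin1, length_utf8 and LENGTH_BY_CHARSET are made up for
illustration. It shows the "one function that does N things inside"
variant: the per-charset code still has to exist, it is just hidden behind
a dispatch table.

```python
# Illustrative sketch only; the charset names and helpers are hypothetical.

def length_latin1(data: bytes) -> int:
    # Latin-1 is a single-byte charset: one byte is one character.
    return len(data)

def length_utf8(data: bytes) -> int:
    # UTF-8 is variable-width: count every byte that is not a
    # continuation byte (continuation bytes have the form 0b10xxxxxx).
    return sum(1 for b in data if (b & 0xC0) != 0x80)

# One entry per charset; this is the N-way branch inside the single function.
LENGTH_BY_CHARSET = {
    "latin1": length_latin1,
    "utf8": length_utf8,
}

def length(data: bytes, charset: str) -> int:
    # Pick the charset-specific implementation and apply it.
    return LENGTH_BY_CHARSET[charset](data)

word = "Bj\u00f6rklund"  # 9 characters, one of them outside ASCII
as_latin1 = word.encode("latin-1")  # 9 bytes
as_utf8 = word.encode("utf-8")      # 10 bytes
```

Both length(as_latin1, "latin1") and length(as_utf8, "utf8") give 9, while
applying the latin1 rule to the UTF-8 bytes gives 10, which is the wrong
character count.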

For functions where one has not specified the charset of an argument, we
need to be able to pass that type information on to wherever the argument
is used. Variables already have a type, and if we have a (pseudo code)
function like

foo (a varchar) returns int
{ select length(a);
}

and call it with

foo ('foo' charset latin1) 

then we need to make sure that the variable a inside the function body of
foo gets its type from the caller. The call to length(a) will then work
out, since it would select the length function for latin1. I think it
should work, but an implementation is the only way to know.
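
The idea can be sketched in Python, using functools.singledispatch as a
stand-in for function resolution by argument type. This is only an analogy
under stated assumptions, not how the server would do it: the classes
Varchar, Latin1Varchar and Utf8Varchar are hypothetical, and the charset
lives in the type rather than inside each value.

```python
from functools import singledispatch

class Varchar(bytes):
    """Hypothetical varchar with no charset specified."""

class Latin1Varchar(Varchar):
    """Hypothetical varchar whose type carries charset latin1."""

class Utf8Varchar(Varchar):
    """Hypothetical varchar whose type carries charset utf8."""

@singledispatch
def length(a) -> int:
    # No charset-specific implementation matched the argument type.
    raise TypeError(f"no length function for {type(a).__name__}")

@length.register
def _(a: Latin1Varchar) -> int:
    # Single-byte charset: one byte per character.
    return len(a)

@length.register
def _(a: Utf8Varchar) -> int:
    # Count the bytes that start a character (skip 0b10xxxxxx bytes).
    return sum(1 for b in a if (b & 0xC0) != 0x80)

def foo(a: Varchar) -> int:
    # foo never names a charset. The variable a keeps the type the
    # caller gave it, so the matching length() is selected here.
    return length(a)
```

Calling foo(Latin1Varchar(...)) ends up in the latin1 length function and
foo(Utf8Varchar(...)) in the utf8 one, although foo itself is written only
once. Calling foo with a plain Varchar raises TypeError in this sketch,
since no charset is known.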

Every string does in the end need to know what charset and what collation
it is in. Otherwise it cannot be used for anything, not even to compare it
with another string.
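
A small Python sketch of why even a comparison needs the collation. The two
collations here are toys made up for illustration ("codepoint" and
"sv_SE_toy"); real collations follow locale rules that are far more
involved, and the toy Swedish key only handles the letters in its own
alphabet string.

```python
# Toy collations for illustration only; all names here are hypothetical.

# Swedish alphabet order: a..z followed by a-ring, a-umlaut, o-umlaut.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"

def key_codepoint(s: str) -> list:
    # Order characters by their Unicode code point.
    return [ord(c) for c in s]

def key_swedish_toy(s: str) -> list:
    # Order characters by their position in the Swedish alphabet.
    # Raises ValueError for characters outside the toy alphabet.
    return [SWEDISH_ALPHABET.index(c) for c in s.lower()]

COLLATIONS = {
    "codepoint": key_codepoint,
    "sv_SE_toy": key_swedish_toy,
}

def less_than(a: str, b: str, collation: str) -> bool:
    # The same two strings can compare differently per collation.
    key = COLLATIONS[collation]
    return key(a) < key(b)

a_umlaut = "\u00e4"  # U+00E4
a_ring = "\u00e5"    # U+00E5
```

With "codepoint", a_umlaut sorts before a_ring because U+00E4 is less than
U+00E5. With "sv_SE_toy" it is the other way around, because a-ring comes
before a-umlaut in the Swedish alphabet. Without knowing the collation the
question "is a less than b" has no answer.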

I could even imagine having different functions for each
charset/collation. Not that many built-in functions are affected,
and not all of them need to work with every collation. The user just needs
to call them with the correct one. I don't expect any functions like

 foo (a varchar collation sv_SE, b varchar collation en_US)

or any other mixed combination of a and b. If anything, a and b will be of
the same type. So there would not be arbitrarily many combinations (but
still a lot).

The alternative is storing the charset and collation inside each string.
That seems like too big a price to pay; it belongs in the type.

> Not to mention one for every possible N in varchar(N).

This doesn't matter, since one can always implement functions to take
varchar arguments without any limit, and then any shorter string can be
implicitly cast up to that type. Or one can treat the length exactly like
the charset above.

Of course you do not want one length function for each length.

-- 
/Dennis Björklund


