Thread: Per-column collation, work in progress
Following up on my previous patch [0], here is a fairly complete implementation of this feature. The general description and implementation outline of the previous message still apply. This patch contains documentation and regression tests, which can serve as further explanations. As this patch touches pretty much everything in the system, there are probably countless bugs and bogosities, some of which I have marked with FIXME, TODO, etc. But all the functionality is basically there, so it's time someone else gives this a serious examination. Note: As previously, regression tests only work with "make check MULTIBYTE=UTF8" and the feature overall only works on Linux/glibc. [0] http://archives.postgresql.org/message-id/1279045531.32647.14.camel@vanquo.pezone.net
Attachment
On Thu, Sep 16, 2010 at 5:46 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > Following up on my previous patch [0], here is a fairly complete > implementation of this feature. The general description and > implementation outline of the previous message still apply. This patch > contains documentation and regression tests, which can serve as further > explanations. I tested the patch on database with encoding=UTF8 and locale-C. I have a couple of questions and comments. * CREATE TABLE (LIKE table_with_collation) doesn't inherit collations. We need to copy collations by default, or add INCLUDINGCOLLATE option. * upper() doesn't work if a column has a collation. It still works if a column doesn't have a collation. postgres=# \d tbl Table "public.tbl"Column | Type | Modifiers --------+------+--------------------c | text | collate Cja | text | collate ja_JP.utf8 postgres=# SELECT name, setting FROM pg_settings WHERE name IN ('lc_ctype', 'lc_collate'); name | setting ------------+---------lc_collate | Clc_ctype | C (2 rows) postgres=# SELECT upper(c) FROM tbl; ERROR: invalid multibyte character for locale HINT: The server's LC_CTYPE locale is probably incompatible with the database encoding. postgres=# SELECT upper(ja) FROM tbl; ERROR: invalid multibyte character for locale HINT: The server's LC_CTYPE locale is probably incompatible with the database encoding * Comparison of strings with different collations is forbidden, but assignment is allowed, right? postgres=# SELECT * FROM tbl WHERE c = ja; ERROR: collation mismatch between implicit collations "C" and "ja_JP.utf8" LINE 1: SELECT * FROM tbl WHERE c = ja; ^ HINT: You can override the collation by applying the COLLATE clause to one or both expressions. postgres=# INSERT INTO tbl(c, ja) SELECT ja, c FROM tbl; INSERT 0 6 * psql \d needs a separator between collate and not null modifiers. postgres=# ALTER TABLE tbl ALTER COLUMN c SET NOT NULL; ALTER TABLE postgres=# \d tbl Table "public.tbl"Column | Type | Modifiers --------+------+--------------------c | text | collate Cnot null <= HEREja | text | collate ja_JP.utf8 > the feature overall only works on Linux/glibc. We could support it also on MSVC. http://msdn.microsoft.com/en-us/library/a7cwbx4t(v=VS.90).aspx -- _strcoll_l http://msdn.microsoft.com/en-us/library/45119yx3(v=VS.90).aspx -- _towupper_l -- Itagaki Takahiro
On ons, 2010-09-22 at 19:44 +0900, Itagaki Takahiro wrote: > * CREATE TABLE (LIKE table_with_collation) doesn't inherit collations. > We need to copy collations by default, or add INCLUDING COLLATE option. OK, should be easy to fix. > * upper() doesn't work if a column has a collation. > It still works if a column doesn't have a collation. I think what you are observing is the result of mixing C and non-C locales. Of course that should also be fixed, but it doesn't have much to do with what upper() does. Note that there is a regression test case for lower(), which works mostly the same way. > * Comparison of strings with different collations is forbidden, > but assignment is allowed, right? Correct. > * psql \d needs a separator between collate and not null modifiers. OK. > We could support it also on MSVC. > http://msdn.microsoft.com/en-us/library/a7cwbx4t(v=VS.90).aspx -- _strcoll_l > http://msdn.microsoft.com/en-us/library/45119yx3(v=VS.90).aspx -- _towupper_l Great.
Hello I am playing with your patch now. I found a few issues: 1. It's doesn't work with SQL 92 rules for sortby list. I can understand so explicit COLLATE using doesn't work, but the implicit using doesn't work too: CREATE TABLE foo(a text, b text COLLATE "cs_CZ.UTF8") SELECT * FROM foo ORDER BY 1 -- produce wrong order 2. Why default encoding for collate is static? There are latin2 for czech, cs_CZ and cs_CZ.iso88592. So any user with UTF8 has to write encoding explicitly. But the more used and preferred encoding is UTF8 now. I am thinking so cs_CZ on utf8 database should mean cs_CS.UTF8. 3. postgres=# select to_char(current_date,'tmday') collate "cs_CZ.utf8";to_char ──────────thursday -- bad result (1 row) 4. is somewhere ToDo for collation implementation? 5. postgres=# create table xy(a text, b text collate "cs_CZ"); ERROR: collation "cs_CZ" for current database encoding "UTF8" does not exist can be there some more friendly message or hint ? like "you cannot to use a different encoding". This collate is in pg_collates table. -- patch was applied cleanly and works in very well. Thank you. Regards Pavel Stehule 2010/9/15 Peter Eisentraut <peter_e@gmx.net>: > Following up on my previous patch [0], here is a fairly complete > implementation of this feature. The general description and > implementation outline of the previous message still apply. This patch > contains documentation and regression tests, which can serve as further > explanations. > > As this patch touches pretty much everything in the system, there are > probably countless bugs and bogosities, some of which I have marked with > FIXME, TODO, etc. But all the functionality is basically there, so it's > time someone else gives this a serious examination. > > Note: As previously, regression tests only work with "make check > MULTIBYTE=UTF8" and the feature overall only works on Linux/glibc. > > [0] > http://archives.postgresql.org/message-id/1279045531.32647.14.camel@vanquo.pezone.net > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > >
On Thu, Sep 23, 2010 at 5:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 3. postgres=# select to_char(current_date,'tmday') collate "cs_CZ.utf8"; > to_char > ────────── > thursday -- bad result > (1 row) COLLATE means "collation" rather than "locale", no? > 5. > postgres=# create table xy(a text, b text collate "cs_CZ"); > ERROR: collation "cs_CZ" for current database encoding "UTF8" does not exist > can be there some more friendly message or hint ? I hope Postgres automatically detects the omitted encoding because it knows the database encoding is UTF8. -- Itagaki Takahiro
2010/9/23 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > On Thu, Sep 23, 2010 at 5:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> 3. postgres=# select to_char(current_date,'tmday') collate "cs_CZ.utf8"; >> to_char >> ────────── >> thursday -- bad result >> (1 row) > > COLLATE means "collation" rather than "locale", no? ok. > >> 5. >> postgres=# create table xy(a text, b text collate "cs_CZ"); >> ERROR: collation "cs_CZ" for current database encoding "UTF8" does not exist >> can be there some more friendly message or hint ? > > I hope Postgres automatically detects the omitted encoding > because it knows the database encoding is UTF8. I know what this issue means, but it needs some detail or hint I think Regards Pavel > > -- > Itagaki Takahiro >
On tor, 2010-09-23 at 10:12 +0200, Pavel Stehule wrote: > 1. It's doesn't work with SQL 92 rules for sortby list. I can > understand so explicit COLLATE using doesn't work, but the implicit > using doesn't work too: > > CREATE TABLE foo(a text, b text COLLATE "cs_CZ.UTF8") > > SELECT * FROM foo ORDER BY 1 -- produce wrong order I can't reproduce that. Please provide more details. > 2. Why default encoding for collate is static? There are latin2 for > czech, cs_CZ and cs_CZ.iso88592. So any user with UTF8 has to write > encoding explicitly. But the more used and preferred encoding is UTF8 > now. I am thinking so cs_CZ on utf8 database should mean cs_CS.UTF8. That's tweakable. One idea I had is to strip the ".utf8" suffix from locale names when populating the pg_collation catalog, or create both versions. I agree that the current way is a bit cumbersome. > 3. postgres=# select to_char(current_date,'tmday') collate "cs_CZ.utf8"; > to_char > ────────── > thursday -- bad result > (1 row) As was already pointed out, collation only covers lc_collate and lc_ctype. (It could cover other things, for example an application to the money type was briefly discussed, but that's outside the current mandate.) As a point of order, what you wrote above attaches a collation to the result of the function call. To get the collation to apply to the function call itself, you have to put the collate clause on one of the arguments, e.g., select to_char(current_date,'tmday' collate "cs_CZ.utf8"); > 4. is somewhere ToDo for collation implementation? At the moment it's mostly in the source code. I have a list of notes locally that I can clean up and put in the wiki once we agree on the general direction. > 5. > > postgres=# create table xy(a text, b text collate "cs_CZ"); > ERROR: collation "cs_CZ" for current database encoding "UTF8" does not exist > > can be there some more friendly message or hint ? like "you cannot to > use a different encoding". This collate is in pg_collates table. That can surely be polished.
On tor, 2010-09-23 at 17:29 +0900, Itagaki Takahiro wrote: > On Thu, Sep 23, 2010 at 5:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > 5. > > postgres=# create table xy(a text, b text collate "cs_CZ"); > > ERROR: collation "cs_CZ" for current database encoding "UTF8" does not exist > > can be there some more friendly message or hint ? > > I hope Postgres automatically detects the omitted encoding > because it knows the database encoding is UTF8. I would rather not build too many expectations into this yet. The collation names are chosen by the user, the locale names are from the operating system. There is not necessarily a correspondence. The best fix is probably what I described earlier, populate the pg_collation table with the ".utf8" suffix stripped.
2010/9/23 Peter Eisentraut <peter_e@gmx.net>: > On tor, 2010-09-23 at 10:12 +0200, Pavel Stehule wrote: >> 1. It's doesn't work with SQL 92 rules for sortby list. I can >> understand so explicit COLLATE using doesn't work, but the implicit >> using doesn't work too: >> >> CREATE TABLE foo(a text, b text COLLATE "cs_CZ.UTF8") >> >> SELECT * FROM foo ORDER BY 1 -- produce wrong order > > I can't reproduce that. Please provide more details. sorry, it is ok - I was confused > >> 2. Why default encoding for collate is static? There are latin2 for >> czech, cs_CZ and cs_CZ.iso88592. So any user with UTF8 has to write >> encoding explicitly. But the more used and preferred encoding is UTF8 >> now. I am thinking so cs_CZ on utf8 database should mean cs_CS.UTF8. > > That's tweakable. One idea I had is to strip the ".utf8" suffix from > locale names when populating the pg_collation catalog, or create both > versions. I agree that the current way is a bit cumbersome. > yes. now almost all databases are in utf8 >> 3. postgres=# select to_char(current_date,'tmday') collate "cs_CZ.utf8"; >> to_char >> ────────── >> thursday -- bad result >> (1 row) > > As was already pointed out, collation only covers lc_collate and > lc_ctype. (It could cover other things, for example an application to > the money type was briefly discussed, but that's outside the current > mandate.) > ook > As a point of order, what you wrote above attaches a collation to the > result of the function call. To get the collation to apply to the > function call itself, you have to put the collate clause on one of the > arguments, e.g., > > select to_char(current_date,'tmday' collate "cs_CZ.utf8"); I am thinking, collates can be used for this purpose too. I see some impacts - this syntax changes a stable function to immutable and it cannot be simple to solve. > >> 4. is somewhere ToDo for collation implementation? > > At the moment it's mostly in the source code. I have a list of notes > locally that I can clean up and put in the wiki once we agree on the > general direction. > >> 5. >> >> postgres=# create table xy(a text, b text collate "cs_CZ"); >> ERROR: collation "cs_CZ" for current database encoding "UTF8" does not exist >> >> can be there some more friendly message or hint ? like "you cannot to >> use a different encoding". This collate is in pg_collates table. > > That can surely be polished. > > Regards Pavel Stehule
On Wed, Sep 22, 2010 at 10:29 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> We could support it also on MSVC. >> http://msdn.microsoft.com/en-us/library/a7cwbx4t(v=VS.90).aspx -- _strcoll_l >> http://msdn.microsoft.com/en-us/library/45119yx3(v=VS.90).aspx -- _towupper_l > > Great. If we support both glibc and msvc, how to we handle CREATE TABLE DDLs in pg_dump? Since collation names depend on platforms, a backup dumped at UNIX cannot be reloaded to Windows. We might need to normalize locale names to generate a portable dump. -- Itagaki Takahiro
On fre, 2010-09-24 at 09:32 +0900, Itagaki Takahiro wrote: > On Wed, Sep 22, 2010 at 10:29 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > >> We could support it also on MSVC. > >> http://msdn.microsoft.com/en-us/library/a7cwbx4t(v=VS.90).aspx -- _strcoll_l > >> http://msdn.microsoft.com/en-us/library/45119yx3(v=VS.90).aspx -- _towupper_l > > > > Great. > > If we support both glibc and msvc, how to we handle CREATE TABLE DDLs > in pg_dump? Since collation names depend on platforms, a backup dumped > at UNIX cannot be reloaded to Windows. We might need to normalize > locale names to generate a portable dump. It is not necessary that the SQL collation names are the same as the OS locale names. That is just a convenient way to initialize it. If you need to transport dumps, you can create your own SQL collation entry at the target that locally fits what you are trying to do (or perhaps rename the collation at the source). I don't think there is a universally applicable way to "normalize" locale names, because we want to support user-defined OS locales.
On tor, 2010-09-23 at 11:55 +0200, Pavel Stehule wrote: > > select to_char(current_date,'tmday' collate "cs_CZ.utf8"); > > I am thinking, collates can be used for this purpose too. I see some > impacts - this syntax changes a stable function to immutable and it > cannot be simple to solve. I don't understand how you come to that conclusion.
2010/9/24 Peter Eisentraut <peter_e@gmx.net>: > On tor, 2010-09-23 at 11:55 +0200, Pavel Stehule wrote: >> > select to_char(current_date,'tmday' collate "cs_CZ.utf8"); >> >> I am thinking, collates can be used for this purpose too. I see some >> impacts - this syntax changes a stable function to immutable and it >> cannot be simple to solve. > > I don't understand how you come to that conclusion. sorry, I was wrong - it has sense for date output function. to_char is immutable everywhere Pavel > > >
Hello Peter Is there any reason why you prohibit a different encodings per one database? Actually people expect from collate per column a possibility to store a two or more different encodings per one database. Without this possibility - only UTF8 is possible for practical work - and for other encodings only pairs (national locale + C). Yes - it is from my perspective (as Czech programmer) - very typical situation and request is mix latin2 and latin1. I can live with limit, but it is very hard limit and should be documented. Regards Pavel 2010/9/15 Peter Eisentraut <peter_e@gmx.net>: > Following up on my previous patch [0], here is a fairly complete > implementation of this feature. The general description and > implementation outline of the previous message still apply. This patch > contains documentation and regression tests, which can serve as further > explanations. > > As this patch touches pretty much everything in the system, there are > probably countless bugs and bogosities, some of which I have marked with > FIXME, TODO, etc. But all the functionality is basically there, so it's > time someone else gives this a serious examination. > > Note: As previously, regression tests only work with "make check > MULTIBYTE=UTF8" and the feature overall only works on Linux/glibc. > > [0] > http://archives.postgresql.org/message-id/1279045531.32647.14.camel@vanquo.pezone.net > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > >
On Sun, Sep 26, 2010 at 1:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > Is there any reason why you prohibit a different encodings per one > database? Actually people expect from collate per column a possibility > to store a two or more different encodings per one database. These are two completely separate problems that only look related. The main difference is that while collation is a property of the comparison or sort you're performing encoding is actually a property of the string itself. It doesn't make sense to specify a different encoding than what the string actually contains. You could actually do what you want now by using bytea columns and convert_to/convert_from and it wouldn't be much easier if the support were built into text since you would still have to keep track of the encoding it's in and the encoding you want. We could have a encoded_text data type which includes both the encoding and the string and which any comparison function automatically handles conversion based on the encoding of the collation requested -- but I wouldn't want that to be the default text datatype. It would impose a lot of overhead on the basic text operations and magnify the effects of choosing the wrong collation. -- greg
On 09/26/2010 09:37 AM, Greg Stark wrote: > We could have a > encoded_text data type which includes both the encoding and the string > and which any comparison function automatically handles conversion > based on the encoding of the collation requested -- but I wouldn't > want that to be the default text datatype. It would impose a lot of > overhead on the basic text operations and magnify the effects of > choosing the wrong collation. Yeah, but it would be a nice gadget to have. cheers andrew
On Fri, Sep 24, 2010 at 1:57 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > On fre, 2010-09-24 at 09:32 +0900, Itagaki Takahiro wrote: >> On Wed, Sep 22, 2010 at 10:29 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> >> We could support it also on MSVC. >> >> http://msdn.microsoft.com/en-us/library/a7cwbx4t(v=VS.90).aspx -- _strcoll_l >> >> http://msdn.microsoft.com/en-us/library/45119yx3(v=VS.90).aspx -- _towupper_l >> > >> > Great. >> >> If we support both glibc and msvc, how to we handle CREATE TABLE DDLs >> in pg_dump? Since collation names depend on platforms, a backup dumped >> at UNIX cannot be reloaded to Windows. We might need to normalize >> locale names to generate a portable dump. > > It is not necessary that the SQL collation names are the same as the OS > locale names. That is just a convenient way to initialize it. If you > need to transport dumps, you can create your own SQL collation entry at > the target that locally fits what you are trying to do (or perhaps > rename the collation at the source). > > I don't think there is a universally applicable way to "normalize" > locale names, because we want to support user-defined OS locales. What's the status of this patch? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On ons, 2010-10-13 at 19:15 -0400, Robert Haas wrote: > What's the status of this patch? I would appreciate some more review of the basic architecture.
On Thu, Oct 14, 2010 at 12:53 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On ons, 2010-10-13 at 19:15 -0400, Robert Haas wrote: >> What's the status of this patch? > > I would appreciate some more review of the basic architecture. <reads patch> Wow, what a patch. On the whole, I think this looks pretty good. Of course, the obvious thing to be dismayed about is how many parts of the system this touches. To some extent, that's probably inevitable and maybe not that bad, but I wonder if there is some preparatory refactoring that could be done to trim it down a bit. I notice, for example, that a lot of places that looked at <asc/desc, nulls first/last> now look at <asc/desc, nulls first/last, collation>. In particular, all the pathkey stuff is like this. And similarly places that used to care about <type, typmod> now have to care about <type, tymod, collation>. There might be ways to restructure some of this code so that these things can be changed without having to touch quite so many places. If we're extending these lists from two items to three, do we need to worry about what happens when they grow to four or five or six? I particularly think this is in issue for the type information. We are still finding bugs where typemod isn't carried through properly; this kind of thing is only going to make it much worse. We need to encapsulate it in some future-proof way. It seems you've falsified the header comment in pathkeys_useful_for_merging(), although I guess it's already false because it doesn't seem to have been updated for the NULLS ASC/DESC stuff, and the interior comment in right_merge_direction() also needs adjusting. But this might be more than a documentation problem, because the choice of merge direction really *is* arbitrary in the case of ASC/DESC and NULLS FIRST/LAST, but I'm not sure whether that's actually true for collation. If collation affects the definition of equality then it certainly isn't true. It looks like you've define collations as objects that exist within particular namespaces, but there's no CREATE COLLATION statement, so I don't see what purpose this serves. I suppose we could leave that to be added later, but is there actually a use case for having collations in individual schemas, or should we treat them more like we do casts - i.e. as database-global objects? Why does the executor ever need to see collate clauses? In the department of minor nits, the use of the word "respectively" in the CREATE INDEX documentation doesn't make sense to me. The message about "has a collation conflict" is not self-explanatory. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On tor, 2010-10-14 at 22:54 -0400, Robert Haas wrote: > and maybe not that bad, but I wonder if there is some preparatory > refactoring that could be done to trim it down a bit. I notice, for > example, that a lot of places that looked at <asc/desc, nulls > first/last> now look at <asc/desc, nulls first/last, collation>. In > particular, all the pathkey stuff is like this. And similarly places > that used to care about <type, typmod> now have to care about <type, > tymod, collation>. There might be ways to restructure some of this > code so that these things can be changed without having to touch quite > so many places. Yeah, I think that's what I'll try to do next. We already have TypeName as a structure that contains type and typmod (and collation, in my patch). We could make that a primnode instead of a parsenode, and use it in more places, or we could make a new leaner structure that only contains the numeric info. We could then, for example, change things like this: typedef struct Var { Expr xpr; ... Oid vartype; int32 vartypmod; ... } into this typedef struct Var { Expr xpr; ... TypeName/TypeFoo vartype; ... } This would save boatloads of duplicate code. > It looks like you've define collations as objects that exist within > particular namespaces, but there's no CREATE COLLATION statement, so I > don't see what purpose this serves. I suppose we could leave that to > be added later, but is there actually a use case for having collations > in individual schemas, or should we treat them more like we do casts - > i.e. as database-global objects? The SQL standard defines it that way, and there should be a CREATE COLLATION statement later. Application-specific collation sequences might not be unreasonable in the future. > Why does the executor ever need to see collate clauses? Hmm, maybe not. I think it did in an earlier working draft.
On Thu, Oct 21, 2010 at 2:44 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On tor, 2010-10-14 at 22:54 -0400, Robert Haas wrote: >> and maybe not that bad, but I wonder if there is some preparatory >> refactoring that could be done to trim it down a bit. I notice, for >> example, that a lot of places that looked at <asc/desc, nulls >> first/last> now look at <asc/desc, nulls first/last, collation>. In >> particular, all the pathkey stuff is like this. And similarly places >> that used to care about <type, typmod> now have to care about <type, >> tymod, collation>. There might be ways to restructure some of this >> code so that these things can be changed without having to touch quite >> so many places. > > Yeah, I think that's what I'll try to do next. > > We already have TypeName as a structure that contains type and typmod > (and collation, in my patch). We could make that a primnode instead of > a parsenode, and use it in more places, or we could make a new leaner > structure that only contains the numeric info. > > We could then, for example, change things like this: > > typedef struct Var > { > Expr xpr; > ... > Oid vartype; > int32 vartypmod; > ... > } > > into this > > typedef struct Var > { > Expr xpr; > ... > TypeName/TypeFoo vartype; > ... > } > > This would save boatloads of duplicate code. I think that the idea of having a node that represents "a type in all its glory" is a very good one. I'm somewhat inclined not to reuse TypeName, because I think we'll end up wanting to use this in places where "names" and "location" are not available. In fact, judging by some of the logic in LookupTypeNames(), we have some cases like that already, which might be worth trying to clean up. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Peter Eisentraut <peter_e@gmx.net> writes: > We already have TypeName as a structure that contains type and typmod > (and collation, in my patch). We could make that a primnode instead of > a parsenode, and use it in more places, or we could make a new leaner > structure that only contains the numeric info. TypeName per se is completely inappropriate for use beyond the first stage of parsing, because it requires catalog lookups to make any sense of. I think the post-parsing representation should still start with a type OID. I can agree with replacing typmod with a struct, though. regards, tom lane
On Thu, Oct 21, 2010 at 4:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter_e@gmx.net> writes: >> We already have TypeName as a structure that contains type and typmod >> (and collation, in my patch). We could make that a primnode instead of >> a parsenode, and use it in more places, or we could make a new leaner >> structure that only contains the numeric info. > > TypeName per se is completely inappropriate for use beyond the first > stage of parsing, because it requires catalog lookups to make any sense > of. I think the post-parsing representation should still start with a > type OID. I can agree with replacing typmod with a struct, though. I think we should have both the type OID and the typmod in the struct.Carrying the type OID separately from the typmod hascaused us enough heartache already. No? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Oct 21, 2010 at 4:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> TypeName per se is completely inappropriate for use beyond the first >> stage of parsing, because it requires catalog lookups to make any sense >> of. �I think the post-parsing representation should still start with a >> type OID. �I can agree with replacing typmod with a struct, though. > I think we should have both the type OID and the typmod in the struct. > Carrying the type OID separately from the typmod has caused us enough > heartache already. No? I think that that would probably involve a whole lot more notational busywork than just replacing typmod with something more complicated. However, we're talking about breaking vast amounts of code in either case, so maybe making it even vaster isn't a real consideration. regards, tom lane
> I think that that would probably involve a whole lot more notational > busywork than just replacing typmod with something more complicated. > However, we're talking about breaking vast amounts of code in either > case, so maybe making it even vaster isn't a real consideration. Gods, yes. Please let's not extend typemod any further without an overhaul. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
On tor, 2010-10-14 at 22:54 -0400, Robert Haas wrote: > It seems you've falsified the header comment in > pathkeys_useful_for_merging(), although I guess it's already false > because it doesn't seem to have been updated for the NULLS ASC/DESC > stuff, and the interior comment in right_merge_direction() also needs > adjusting. But this might be more than a documentation problem, > because the choice of merge direction really *is* arbitrary in the > case of ASC/DESC and NULLS FIRST/LAST, but I'm not sure whether that's > actually true for collation. If collation affects the definition of > equality then it certainly isn't true. I did check that again and didn't arrive at the conclusion that the comments would need updating either with respect to this patch or some previous change. Could you check again and possibly provide a suggestion?
On ons, 2010-09-22 at 19:44 +0900, Itagaki Takahiro wrote: > * CREATE TABLE (LIKE table_with_collation) doesn't inherit collations. This was fixed in the CF2010-11 patch. > * psql \d needs a separator between collate and not null modifiers. And this as well.
On Wed, Nov 24, 2010 at 3:37 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On tor, 2010-10-14 at 22:54 -0400, Robert Haas wrote: >> It seems you've falsified the header comment in >> pathkeys_useful_for_merging(), although I guess it's already false >> because it doesn't seem to have been updated for the NULLS ASC/DESC >> stuff, and the interior comment in right_merge_direction() also needs >> adjusting. But this might be more than a documentation problem, >> because the choice of merge direction really *is* arbitrary in the >> case of ASC/DESC and NULLS FIRST/LAST, but I'm not sure whether that's >> actually true for collation. If collation affects the definition of >> equality then it certainly isn't true. > > I did check that again and didn't arrive at the conclusion that the > comments would need updating either with respect to this patch or some > previous change. Could you check again and possibly provide a > suggestion? I think that you are right and that my previous comment was erroneous.Sorry for the noise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company