Thread: to_char and i18n
Now that Oracle supports i18n dependant behavior in its to_char formatting functions (at least for its 10g release) I was wondering if a patch to support this in PostgreSQL will get accepted. I was hoping to work on this now that I have some spare time. Regards, Manuel.
"Manuel Sugawara" <masm@fciencias.unam.mx> wrote > Now that Oracle supports i18n dependant behavior in its to_char > formatting functions (at least for its 10g release) I was wondering if > a patch to support this in PostgreSQL will get accepted. I was hoping > to work on this now that I have some spare time. > Can you give a small introduction of i18n and what's your plan in PostgreSQL? Regards, Qingqing
Manuel Sugawara <masm@fciencias.unam.mx> writes: > Now that Oracle supports i18n dependant behavior in its to_char > formatting functions (at least for its 10g release) I was wondering if > a patch to support this in PostgreSQL will get accepted. I thought to_char already had i18n behavior. What exactly are you thinking of changing? regards, tom lane
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes: > Can you give a small introduction of i18n and what's your plan in > PostgreSQL? i18n == Internationalization (maybe I should say l10n == localization). This means that to_char functions might lead to different results depending on the i18n settings. For instance, nowadays, select to_char(now(), 'dd-mon-yy') returns 21-dec-05 regardless of the i18n settings. This should lead 21-dic-05 in the es_MX localization. This also applies to the concurrency symbol, thousand separator, etc. (Some time ago I proposed an--incomplete--patch and it was rejectd by Karel arguing that to_char functions should behave *exactly* the same way that they do in Oracle.) Regards, Manuel.
Tom Lane <tgl@sss.pgh.pa.us> writes: > I thought to_char already had i18n behavior. What exactly are you > thinking of changing? The modifiers that are suitable to localize. Month and day names comes to mind and maybe others, I'm not sure what the state of the code is, but I can say that, at least, the 'month' and 'day' modifiers does not behave in a localized way. Regards, Manuel.
Manuel Sugawara <masm@fciencias.unam.mx> writes: > (Some time ago I proposed an--incomplete--patch and it was rejectd by > Karel arguing that to_char functions should behave *exactly* the same > way that they do in Oracle.) That is the accepted plan for to_char ... of course, if Oracle changes to_char every so often, it'll get more interesting to decide what to do. regards, tom lane
"Manuel Sugawara" <masm@fciencias.unam.mx> wrote > > i18n == Internationalization (maybe I should say l10n == > localization). Good hint, I got it :-) Just like a crossword puzzle. 18 means there are 18 characters between 'i' and 'n' ... Regards, Qingqing
Qingqing Zhou wrote: > > "Manuel Sugawara" <masm@fciencias.unam.mx> wrote > > > > i18n == Internationalization (maybe I should say l10n == > > localization). > > Good hint, I got it :-) Just like a crossword puzzle. 18 means there are 18 > characters between 'i' and 'n' ... Huh? I don't understand. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Manuel Sugawara <masm@fciencias.unam.mx> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> I thought to_char already had i18n behavior. What exactly are you >> thinking of changing? > The modifiers that are suitable to localize. Month and day names comes > to mind and maybe others, I'm not sure what the state of the code is, > but I can say that, at least, the 'month' and 'day' modifiers does not > behave in a localized way. Can we spell the names differently but keep to the same field widths? I can see where it might cause problems to change the widths --- other than that, no objection. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Can we spell the names differently but keep to the same field widths? > I can see where it might cause problems to change the widths --- other > than that, no objection. Quite impossible. But if someone is relaying in the current behavior of to_char she might set lc_time accordingly as this parameter is not used in the code, AFAIK Regards, Manuel.
On Wed, 21 Dec 2005, Tom Lane wrote: > Manuel Sugawara <masm@fciencias.unam.mx> writes: > > (Some time ago I proposed an--incomplete--patch and it was rejectd by > > Karel arguing that to_char functions should behave *exactly* the same > > way that they do in Oracle.) > > That is the accepted plan for to_char ... of course, if Oracle changes > to_char every so often, it'll get more interesting to decide what to do. There's some functionality in 10g which PostgreSQL does not have: TZD - returns the short timezone string with daylight saving information, eg: PDT TZM - timezone offset minutes part TZH - timezone offset hours part TZR - timezone region (US/Pacific, for example) RR/RRRR - accept 'rounded' years, eg 99-1-1 = 1999-1-1 (kind of pointless) FF - specify how many digits to the right of the decimal place to display, when looking at factions of seconds. Eg: HH:MM:SS.FF3 would produce 15:56:22.123 X - the local radix character. Eg: HH:MM:SSXFF would produce 15:56:22.123 E - Era name (like, Japanese Imperial) (kind of pointless) EE - Full era name DS - Locale formatted short date. For example, DD/MM/YYYY for the Brits, MM/DD/YYYY for the Yanks DL - Locale formatted long date. Eg: fmDay, dd. Month yyyy in Germany SCC - Like 'CC', but will carry a - (minus) for BC dates (I'm not sure if this implies that Oracle wants BC dates to be marked 'BC'. I don't have an Oracle system around at the moment to check though :-() TS - Locale formatted short time. YEAR - Year in words SYEAR - Year in words, prefixed by minus sign for BC dates SYYYY - YYYY, prefixed by minus sign for BC dates Gavin
Gavin Sherry <swm@linuxworld.com.au> writes: > There's some functionality in 10g which PostgreSQL does not have: Good to know. I'm not an Oracle expert, actually I knew this reading an article in a past issue of the Oracle's magazine about i18n; essentially they were talking about how easy was for an Oracle database to get i18n as each parameter in the to_char functions will behave accordingly to the i18n settings. Regards, Manuel.
On Wed, 2005-12-21 at 23:50 -0500, Tom Lane wrote: > Manuel Sugawara <masm@fciencias.unam.mx> writes: > > Tom Lane <tgl@sss.pgh.pa.us> writes: > >> I thought to_char already had i18n behavior. What exactly are you > >> thinking of changing? > > > The modifiers that are suitable to localize. Month and day names comes > > to mind and maybe others, I'm not sure what the state of the code is, > > but I can say that, at least, the 'month' and 'day' modifiers does not > > behave in a localized way. The names for months and days are hardcoded to to_char code and it's in English only. > Can we spell the names differently but keep to the same field widths? That's important point. How resolve this problem Oracle? Maybe we can say (in docs) that with non-English locales it works with days/months names as in FM (fill) mode. # select length( to_char(now(), 'Day') ) as Normal, length( to_char(now(), 'FMDay') ) as FM;normal | fm --------+---- 9 | 8 It means 'FM' uses variable size of Day/Month field -- without FM is the size fixed to 9 chars. I think that for backward compatibility the locale sensitive to_char() should be implemented as separate call "to_char(datetime, format, locale)" or we should add new modifiers to the current to_char, something like "to_char(datetime, "LCMonth") or both. I don't have any time to work on to_char(), I can help to review patches only. Karel -- Karel Zak <zakkr@zf.jcu.cz>
--- Karel Zak <zakkr@zf.jcu.cz> escreveu: I have a patch like this. But this was for 7.4.x. I have to take a look at it. > That's important point. How resolve this problem Oracle? Maybe we can > say (in docs) that with non-English locales it works with days/months > names as in FM (fill) mode. > Yeah. We could make the new mode (TM?) ignores the FX mode and write a note in docs. > I think that for backward compatibility the locale sensitive > to_char() > should be implemented as separate call "to_char(datetime, format, > locale)" or we should add new modifiers to the current to_char, > something like "to_char(datetime, "LCMonth") or both. > I vote for another modifier (TM?). That's more flexible than another function overload because to_char() implements modifiers yet. > I don't have any time to work on to_char(), I can help to review > patches > only. > OK. I'll send a revised patch ASAP. Euler Taveira de Oliveira euler[at]yahoo_com_br _______________________________________________________ Yahoo! doce lar. Faça do Yahoo! sua homepage. http://br.yahoo.com/homepageset.html
--- Euler Taveira de Oliveira <eulerto@yahoo.com.br> escreveu: > I have a patch like this. But this was for 7.4.x. I have to take a > look > at it. > The patch is attached. It implements day and month i18n. I fixed a few misspelling comments. Docs is attached too. template1=# select to_char(now(), 'Day, DD Month YYYY'); to_char ------------------------------ Sunday , 25 December 2005 (1 registro) template1=# select to_char(now(), 'TMDay, DD TMMonth YYYY'); to_char --------------------------- Domingo, 25 Dezembro 2005 (1 registro) template1=# Comments? Euler Taveira de Oliveira euler[at]yahoo_com_br _______________________________________________________ Yahoo! doce lar. Faça do Yahoo! sua homepage. http://br.yahoo.com/homepageset.html
Attachment
On Sun, 2005-12-25 at 17:56 -0300, Euler Taveira de Oliveira wrote: > --- Euler Taveira de Oliveira <eulerto@yahoo.com.br> escreveu: > > > I have a patch like this. But this was for 7.4.x. I have to take a > > look > > at it. > > > The patch is attached. It implements day and month i18n. I fixed a few > misspelling comments. Docs is attached too. > > template1=# select to_char(now(), 'Day, DD Month YYYY'); > to_char > ------------------------------ > Sunday , 25 December 2005 > (1 registro) > > template1=# select to_char(now(), 'TMDay, DD TMMonth YYYY'); > to_char > --------------------------- > Domingo, 25 Dezembro 2005 > (1 registro) > > template1=# > > > Comments? I think it looks like a good patch. There's small problem that the current to_char() output is possible use as argument for to_timestamp() or to_date() function. It means you should implement vice-versa conversion from string with TMMonth/TMDay to timestamp. to_timestamp('Domingo, 25 Dezembro 2005', 'TMDay, DD TMMonth YYYY') Or.. at least describe in the docs that this way is unsupported for 'TM' prefix. Karel -- Karel Zak <zakkr@zf.jcu.cz>
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --------------------------------------------------------------------------- Euler Taveira de Oliveira wrote: > --- Euler Taveira de Oliveira <eulerto@yahoo.com.br> escreveu: > > > I have a patch like this. But this was for 7.4.x. I have to take a > > look > > at it. > > > The patch is attached. It implements day and month i18n. I fixed a few > misspelling comments. Docs is attached too. > > template1=# select to_char(now(), 'Day, DD Month YYYY'); > to_char > ------------------------------ > Sunday , 25 December 2005 > (1 registro) > > template1=# select to_char(now(), 'TMDay, DD TMMonth YYYY'); > to_char > --------------------------- > Domingo, 25 Dezembro 2005 > (1 registro) > > template1=# > > > Comments? > > Euler Taveira de Oliveira > euler[at]yahoo_com_br > > > > > > > > > _______________________________________________________ > Yahoo! doce lar. Fa?a do Yahoo! sua homepage. > http://br.yahoo.com/homepageset.html Content-Description: 1242239392-i18n-date.diff [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Patch applied. Thanks. --------------------------------------------------------------------------- Euler Taveira de Oliveira wrote: > --- Euler Taveira de Oliveira <eulerto@yahoo.com.br> escreveu: > > > I have a patch like this. But this was for 7.4.x. I have to take a > > look > > at it. > > > The patch is attached. It implements day and month i18n. I fixed a few > misspelling comments. Docs is attached too. > > template1=# select to_char(now(), 'Day, DD Month YYYY'); > to_char > ------------------------------ > Sunday , 25 December 2005 > (1 registro) > > template1=# select to_char(now(), 'TMDay, DD TMMonth YYYY'); > to_char > --------------------------- > Domingo, 25 Dezembro 2005 > (1 registro) > > template1=# > > > Comments? > > Euler Taveira de Oliveira > euler[at]yahoo_com_br > > > > > > > > > _______________________________________________________ > Yahoo! doce lar. Fa?a do Yahoo! sua homepage. > http://br.yahoo.com/homepageset.html Content-Description: 1242239392-i18n-date.diff [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Added to TODO: * Allow to_date() and to_timestamp() accept localized month names Comment added to the C code to show where it has to happen. --------------------------------------------------------------------------- Karel Zak wrote: > On Sun, 2005-12-25 at 17:56 -0300, Euler Taveira de Oliveira wrote: > > --- Euler Taveira de Oliveira <eulerto@yahoo.com.br> escreveu: > > > > > I have a patch like this. But this was for 7.4.x. I have to take a > > > look > > > at it. > > > > > The patch is attached. It implements day and month i18n. I fixed a few > > misspelling comments. Docs is attached too. > > > > template1=# select to_char(now(), 'Day, DD Month YYYY'); > > to_char > > ------------------------------ > > Sunday , 25 December 2005 > > (1 registro) > > > > template1=# select to_char(now(), 'TMDay, DD TMMonth YYYY'); > > to_char > > --------------------------- > > Domingo, 25 Dezembro 2005 > > (1 registro) > > > > template1=# > > > > > > Comments? > > I think it looks like a good patch. There's small problem that the > current to_char() output is possible use as argument for to_timestamp() > or to_date() function. It means you should implement vice-versa > conversion from string with TMMonth/TMDay to timestamp. > > to_timestamp('Domingo, 25 Dezembro 2005', 'TMDay, DD TMMonth YYYY') > > Or.. at least describe in the docs that this way is unsupported > for 'TM' prefix. > > Karel > > -- > Karel Zak <zakkr@zf.jcu.cz> > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster > -- Bruce Momjian http://candle.pha.pa.us SRA OSS, Inc. http://www.sraoss.com + If your life is a hard drive, Christ can be your backup. +
Added to TODO: * Add missing parameter handling in to_char() http://archives.postgresql.org/pgsql-hackers/2005-12/msg00948.php I added a URL in TODO because it is a single message of detail I need to reference. --------------------------------------------------------------------------- Gavin Sherry wrote: > On Wed, 21 Dec 2005, Tom Lane wrote: > > > Manuel Sugawara <masm@fciencias.unam.mx> writes: > > > (Some time ago I proposed an--incomplete--patch and it was rejectd by > > > Karel arguing that to_char functions should behave *exactly* the same > > > way that they do in Oracle.) > > > > That is the accepted plan for to_char ... of course, if Oracle changes > > to_char every so often, it'll get more interesting to decide what to do. > > There's some functionality in 10g which PostgreSQL does not have: > > TZD - returns the short timezone string with daylight saving information, > eg: PDT > > TZM - timezone offset minutes part > > TZH - timezone offset hours part > > TZR - timezone region (US/Pacific, for example) > > RR/RRRR - accept 'rounded' years, eg 99-1-1 = 1999-1-1 (kind of pointless) > > FF - specify how many digits to the right of the decimal place to display, > when looking at factions of seconds. Eg: HH:MM:SS.FF3 would produce > 15:56:22.123 > > X - the local radix character. Eg: HH:MM:SSXFF would produce 15:56:22.123 > > E - Era name (like, Japanese Imperial) (kind of pointless) > EE - Full era name > > DS - Locale formatted short date. For example, DD/MM/YYYY for the Brits, > MM/DD/YYYY for the Yanks > > DL - Locale formatted long date. Eg: fmDay, dd. Month yyyy in Germany > > SCC - Like 'CC', but will carry a - (minus) for BC dates (I'm not sure if > this implies that Oracle wants BC dates to be marked 'BC'. I don't have > an Oracle system around at the moment to check though :-() > > TS - Locale formatted short time. > > YEAR - Year in words > > SYEAR - Year in words, prefixed by minus sign for BC dates > > SYYYY - YYYY, prefixed by minus sign for BC dates > > Gavin > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- Bruce Momjian http://candle.pha.pa.us SRA OSS, Inc. http://www.sraoss.com + If your life is a hard drive, Christ can be your backup. +
> Gavin Sherry wrote: > > On Wed, 21 Dec 2005, Tom Lane wrote: > > > > > Manuel Sugawara <masm@fciencias.unam.mx> writes: > > > > (Some time ago I proposed an--incomplete--patch and it was rejectd by > > > > Karel arguing that to_char functions should behave *exactly* the same > > > > way that they do in Oracle.) > > > > > > That is the accepted plan for to_char ... of course, if Oracle changes > > > to_char every so often, it'll get more interesting to decide what to do. > > > > There's some functionality in 10g which PostgreSQL does not have: > > > > TZD - returns the short timezone string with daylight saving information, > > eg: PDT This is the same as TZ and it is easy to implement. > > > > TZM - timezone offset minutes part Trivial > > > > TZH - timezone offset hours part Trivial > > > > TZR - timezone region (US/Pacific, for example) We don't currently have an offset -> region name lookup table but it should be easy enough to implement... > > > > RR/RRRR - accept 'rounded' years, eg 99-1-1 = 1999-1-1 (kind of pointless) > > > > FF - specify how many digits to the right of the decimal place to display, > > when looking at factions of seconds. Eg: HH:MM:SS.FF3 would produce > > 15:56:22.123 Trivial > > > > X - the local radix character. Eg: HH:MM:SSXFF would produce 15:56:22.123 > > I don't know how to get this character... is it included in the locale data some where (and where, specifically) > > E - Era name (like, Japanese Imperial) (kind of pointless) > > EE - Full era name No idea where to get this data. > > > > DS - Locale formatted short date. For example, DD/MM/YYYY for the Brits, > > MM/DD/YYYY for the Yanks Is this desirable? It may lead to confusion with datestyle. > > > > DL - Locale formatted long date. Eg: fmDay, dd. Month yyyy in Germany Should be straight forward - if the underlying library will honour locale. > > > > SCC - Like 'CC', but will carry a - (minus) for BC dates (I'm not sure if > > this implies that Oracle wants BC dates to be marked 'BC'. I don't have > > an Oracle system around at the moment to check though :-() Thoughts? > > > > TS - Locale formatted short time. Should be straight forward - if the underlying library will honour locale. > > > > YEAR - Year in words Hmmm. This would be hard to do if we want to support local language settings. > > > > SYEAR - Year in words, prefixed by minus sign for BC dates As above. > > > > SYYYY - YYYY, prefixed by minus sign for BC dates Should be straight forward. Any comments on the above? Gavin
>>> E - Era name (like, Japanese Imperial) (kind of pointless) >>> EE - Full era name Some stuff here: http://java.sun.com/javase/6/docs/guide/intl/calendar.doc.html