Thread: Created feature for to_date() conversion using patterns 'YYYY-WW','YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
Created feature for to_date() conversion using patterns 'YYYY-WW','YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
postgres
Date:
Hi, some days ago I ran into a problem with the to_date() function. I originally described it on StackExchange: https://dba.stackexchange.com/questions/250111/unexpected-behaviour-for-to-date-with-week-number-and-week-day The problem: If you want to parse a date string with year, week and day of week, you can do this using the ISO week pattern: 'IYYY-IW-ID'. This works as expected: date string | to_date() ------------+------------ '2019-1-1' | 2018-12-31 -> Monday of the first week of the year (defined as the week that includes the 4th of January) '2019-1-2' | 2019-01-01 '2019-1-3' | 2019-01-02 '2019-1-4' | 2019-01-03 '2019-1-5' | 2019-01-04 '2019-1-6' | 2019-01-05 '2019-1-7' | 2019-01-06 '2019-2-1' | 2019-01-07 '2019-2-2' | 2019-01-08 But if you are trying this with the non-ISO pattern 'YYYY-WW-D', the result was not expected: date string | to_date() ------------------------- '2019-1-1' | 2019-01-01 '2019-1-2' | 2019-01-01 '2019-1-3' | 2019-01-01 '2019-1-4' | 2019-01-01 '2019-1-5' | 2019-01-01 '2019-1-6' | 2019-01-01 '2019-1-7' | 2019-01-01 '2019-2-1' | 2019-01-08 '2019-2-2' | 2019-01-08 As you can see, the 'D' part of the pattern doesn't influence the resulting date. The answer of Laurenz Albe pointed to a part of the documentation, I missed so far: "In to_timestamp and to_date, weekday names or numbers (DAY, D, and related field types) are accepted but are ignored for purposes of computing the result. The same is true for quarter (Q) fields." (https://www.postgresql.org/docs/12/functions-formatting.html) So, I had a look at the relevant code part. I decided to try a patch by myself. Now it works as I would expect it: date string | to_date() ------------------------- '2019-1-1' | 2018-12-30 -> Sunday (!) of the first week of the year (the first week is at the first day of year) '2019-1-2' | 2018-12-31 '2019-1-3' | 2019-01-01 '2019-1-4' | 2019-01-02 '2019-1-5' | 2019-01-03 '2019-1-6' | 2019-01-04 '2019-1-7' | 2019-01-05 '2019-2-1' | 2019-01-06 '2019-2-2' | 2019-01-07 Furthermore, if you left the 'D' part, the date would be always set to the first day of the corresponding week (in that case it is Sunday, in contrast to the ISO week, which starts mondays). To be consistent, I added similar code for the week of month pattern ('W'). So, using the pattern 'YYYY-MM-W-D' yields in: date string | to_date() --------------------------- '2018-12-5-1' | 2018-12-23 '2018-12-6-1' | 2018-12-30 '2019-1-1-1' | 2018-12-30 -> First day (Su) of the first week of the first month of the year '2019-2-2-1' | 2019-02-03 -> First day (Su) of the second week of February '2019-10-3-5' | 2019-10-17 -> Fifth day (Th) of the third week of October If you left the 'D', it would be set to 1 as well. The code can be seen here: https://github.com/S-Man42/postgres/commit/534e6bd70e23864f385d60ecf187496c7f4387c9 I hope, keeping the code style of the surrounding code (especially the ISO code) is ok for you. Now the questions: 1. Although the ignorance of the 'D' pattern is well documented, does the new behaviour might be interesting for you? 2. Does it work as you'd expect it? 3. Because this could be my very first contribution to the PostgreSQL code base, I really want you to be as critical as possible. I am not quite sure if I didn't miss something important. 4. Currently something like '2019-1-8' does not throw an exception but results in the same as '2019-2-1' (8th is the same as the 1st of the next week). On the other hand, currently, the ISO week conversion gives out the result of '2019-1-7' for every 'D' >= 7. I am not sure if this is better. I think a consistent exception handling should be discussed separately (date roll over vs. out of range exception vs. ISO week behaviour) So far, I am very curious about your opinions! Kind regards, Mark/S-Man42
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Mark Lorenz
Date:
Hi, I apologize for the mistake. For the mailing list correspondence I created this mail account. But I forgot to change the sender name. So, the "postgres" name appeared as sender name in the mailing list. I changed it. Kind regards, Mark/S-Man42 > Hi, > > some days ago I ran into a problem with the to_date() function. I > originally described it on StackExchange: > https://dba.stackexchange.com/questions/250111/unexpected-behaviour-for-to-date-with-week-number-and-week-day > > The problem: > > If you want to parse a date string with year, week and day of week, > you can do this using the ISO week pattern: 'IYYY-IW-ID'. This works > as expected: > > date string | to_date() > ------------+------------ > '2019-1-1' | 2018-12-31 -> Monday of the first week of the year > (defined as the week that includes the 4th of January) > '2019-1-2' | 2019-01-01 > '2019-1-3' | 2019-01-02 > '2019-1-4' | 2019-01-03 > '2019-1-5' | 2019-01-04 > '2019-1-6' | 2019-01-05 > '2019-1-7' | 2019-01-06 > > '2019-2-1' | 2019-01-07 > '2019-2-2' | 2019-01-08 > > But if you are trying this with the non-ISO pattern 'YYYY-WW-D', the > result was not expected: > > date string | to_date() > ------------------------- > '2019-1-1' | 2019-01-01 > '2019-1-2' | 2019-01-01 > '2019-1-3' | 2019-01-01 > '2019-1-4' | 2019-01-01 > '2019-1-5' | 2019-01-01 > '2019-1-6' | 2019-01-01 > '2019-1-7' | 2019-01-01 > > '2019-2-1' | 2019-01-08 > '2019-2-2' | 2019-01-08 > > As you can see, the 'D' part of the pattern doesn't influence the > resulting date. > > The answer of Laurenz Albe pointed to a part of the documentation, I > missed so far: > > "In to_timestamp and to_date, weekday names or numbers (DAY, D, and > related field types) are accepted but are ignored for purposes of > computing the result. The same is true for quarter (Q) fields." > (https://www.postgresql.org/docs/12/functions-formatting.html) > > So, I had a look at the relevant code part. I decided to try a patch > by myself. Now it works as I would expect it: > > date string | to_date() > ------------------------- > '2019-1-1' | 2018-12-30 -> Sunday (!) of the first week of the year > (the first week is at the first day of year) > '2019-1-2' | 2018-12-31 > '2019-1-3' | 2019-01-01 > '2019-1-4' | 2019-01-02 > '2019-1-5' | 2019-01-03 > '2019-1-6' | 2019-01-04 > '2019-1-7' | 2019-01-05 > > '2019-2-1' | 2019-01-06 > '2019-2-2' | 2019-01-07 > > Furthermore, if you left the 'D' part, the date would be always set to > the first day of the corresponding week (in that case it is Sunday, in > contrast to the ISO week, which starts mondays). > > To be consistent, I added similar code for the week of month pattern > ('W'). So, using the pattern 'YYYY-MM-W-D' yields in: > > date string | to_date() > --------------------------- > '2018-12-5-1' | 2018-12-23 > '2018-12-6-1' | 2018-12-30 > '2019-1-1-1' | 2018-12-30 -> First day (Su) of the first week of the > first month of the year > '2019-2-2-1' | 2019-02-03 -> First day (Su) of the second week of > February > '2019-10-3-5' | 2019-10-17 -> Fifth day (Th) of the third week of > October > > If you left the 'D', it would be set to 1 as well. > > The code can be seen here: > https://github.com/S-Man42/postgres/commit/534e6bd70e23864f385d60ecf187496c7f4387c9 > > I hope, keeping the code style of the surrounding code (especially the > ISO code) is ok for you. > > Now the questions: > 1. Although the ignorance of the 'D' pattern is well documented, does > the new behaviour might be interesting for you? > 2. Does it work as you'd expect it? > 3. Because this could be my very first contribution to the PostgreSQL > code base, I really want you to be as critical as possible. I am not > quite sure if I didn't miss something important. > 4. Currently something like '2019-1-8' does not throw an exception but > results in the same as '2019-2-1' (8th is the same as the 1st of the > next week). On the other hand, currently, the ISO week conversion > gives out the result of '2019-1-7' for every 'D' >= 7. I am not sure > if this is better. I think a consistent exception handling should be > discussed separately (date roll over vs. out of range exception vs. > ISO week behaviour) > > So far, I am very curious about your opinions! > > Kind regards, > Mark/S-Man42
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Mark Lorenz
Date:
Hi, while preparing the patch for the Commitfest, I found a bug in the to_char() function that is quite correlated with this issue: SELECT to_char('1997-02-01'::date, 'YYYY-WW-D') returns: 1997-05-7 -> which is ok, I believe. Feb, 1st was on Saturday, so counting from Sundays, it was day 7 of week 5. SELECT to_char('1997-02-03'::date, 'YYYY-WW-D') returns: 1997-05-2 -> This cannot be. The input date is two days laters, but the result is 5 days earlier. I'd expect 1997-06-2 as result, but this occurs another week later: SELECT to_char('1997-02-10'::date, 'YYYY-WW-D') This is wrong, because this should be week 7 instead. On the other hand, the ISO week formats work very well. I'll have a look at the code and try to fix it in the patch as well. Kind regards, Mark Am 2019-10-08 17:49, schrieb Mark Lorenz: > Hi, > > I apologize for the mistake. > > For the mailing list correspondence I created this mail account. But I > forgot to change the sender name. So, the "postgres" name appeared as > sender name in the mailing list. I changed it. > > Kind regards, > Mark/S-Man42 > >> Hi, >> >> some days ago I ran into a problem with the to_date() function. I >> originally described it on StackExchange: >> https://dba.stackexchange.com/questions/250111/unexpected-behaviour-for-to-date-with-week-number-and-week-day >> >> The problem: >> >> If you want to parse a date string with year, week and day of week, >> you can do this using the ISO week pattern: 'IYYY-IW-ID'. This works >> as expected: >> >> date string | to_date() >> ------------+------------ >> '2019-1-1' | 2018-12-31 -> Monday of the first week of the year >> (defined as the week that includes the 4th of January) >> '2019-1-2' | 2019-01-01 >> '2019-1-3' | 2019-01-02 >> '2019-1-4' | 2019-01-03 >> '2019-1-5' | 2019-01-04 >> '2019-1-6' | 2019-01-05 >> '2019-1-7' | 2019-01-06 >> >> '2019-2-1' | 2019-01-07 >> '2019-2-2' | 2019-01-08 >> >> But if you are trying this with the non-ISO pattern 'YYYY-WW-D', the >> result was not expected: >> >> date string | to_date() >> ------------------------- >> '2019-1-1' | 2019-01-01 >> '2019-1-2' | 2019-01-01 >> '2019-1-3' | 2019-01-01 >> '2019-1-4' | 2019-01-01 >> '2019-1-5' | 2019-01-01 >> '2019-1-6' | 2019-01-01 >> '2019-1-7' | 2019-01-01 >> >> '2019-2-1' | 2019-01-08 >> '2019-2-2' | 2019-01-08 >> >> As you can see, the 'D' part of the pattern doesn't influence the >> resulting date. >> >> The answer of Laurenz Albe pointed to a part of the documentation, I >> missed so far: >> >> "In to_timestamp and to_date, weekday names or numbers (DAY, D, and >> related field types) are accepted but are ignored for purposes of >> computing the result. The same is true for quarter (Q) fields." >> (https://www.postgresql.org/docs/12/functions-formatting.html) >> >> So, I had a look at the relevant code part. I decided to try a patch >> by myself. Now it works as I would expect it: >> >> date string | to_date() >> ------------------------- >> '2019-1-1' | 2018-12-30 -> Sunday (!) of the first week of the year >> (the first week is at the first day of year) >> '2019-1-2' | 2018-12-31 >> '2019-1-3' | 2019-01-01 >> '2019-1-4' | 2019-01-02 >> '2019-1-5' | 2019-01-03 >> '2019-1-6' | 2019-01-04 >> '2019-1-7' | 2019-01-05 >> >> '2019-2-1' | 2019-01-06 >> '2019-2-2' | 2019-01-07 >> >> Furthermore, if you left the 'D' part, the date would be always set to >> the first day of the corresponding week (in that case it is Sunday, in >> contrast to the ISO week, which starts mondays). >> >> To be consistent, I added similar code for the week of month pattern >> ('W'). So, using the pattern 'YYYY-MM-W-D' yields in: >> >> date string | to_date() >> --------------------------- >> '2018-12-5-1' | 2018-12-23 >> '2018-12-6-1' | 2018-12-30 >> '2019-1-1-1' | 2018-12-30 -> First day (Su) of the first week of the >> first month of the year >> '2019-2-2-1' | 2019-02-03 -> First day (Su) of the second week of >> February >> '2019-10-3-5' | 2019-10-17 -> Fifth day (Th) of the third week of >> October >> >> If you left the 'D', it would be set to 1 as well. >> >> The code can be seen here: >> https://github.com/S-Man42/postgres/commit/534e6bd70e23864f385d60ecf187496c7f4387c9 >> >> I hope, keeping the code style of the surrounding code (especially the >> ISO code) is ok for you. >> >> Now the questions: >> 1. Although the ignorance of the 'D' pattern is well documented, does >> the new behaviour might be interesting for you? >> 2. Does it work as you'd expect it? >> 3. Because this could be my very first contribution to the PostgreSQL >> code base, I really want you to be as critical as possible. I am not >> quite sure if I didn't miss something important. >> 4. Currently something like '2019-1-8' does not throw an exception but >> results in the same as '2019-2-1' (8th is the same as the 1st of the >> next week). On the other hand, currently, the ISO week conversion >> gives out the result of '2019-1-7' for every 'D' >= 7. I am not sure >> if this is better. I think a consistent exception handling should be >> discussed separately (date roll over vs. out of range exception vs. >> ISO week behaviour) >> >> So far, I am very curious about your opinions! >> >> Kind regards, >> Mark/S-Man42
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Mark Lorenz
Date:
Hi, I fixed the described issue in the to char() function. The output of the current version is: postgres=# SELECT to_char('1997-02-01'::date, 'YYYY-WW-D'); to_char --------- 1997-05-7 (1 row) postgres=# SELECT to_char('1997-02-03'::date, 'YYYY-WW-D'); to_char --------- 1997-05-2 (1 row) postgres=# SELECT to_char('1997-02-10'::date, 'YYYY-WW-D'); to_char --------- 1997-06-2 (1 row) As you can see, the week day of the Feb 3rd - which is two days AFTER Feb 1st - yields in a result which is 5 days BEFORE the earlier date, which obviously cannot be. Furthermore, using the Gregorian calendar, Feb 3rd is in week 6. So, the Feb 10th cannot be in week 6 as well. The bug was, that the week day of Jan 1st was not considered in the calculation of the week number. So, a possible offset has not been set. New output: postgres=# SELECT to_char('1997-02-03'::date, 'YYYY-WW-D'); to_char --------- 1997-06-2 (1 row) postgres=# SELECT to_char('1997-02-01'::date, 'YYYY-WW-D'); to_char --------- 1997-05-7 (1 row) postgres=# SELECT to_char('1997-02-10'::date, 'YYYY-WW-D'); to_char --------- 1997-07-2 (1 row) ------------------- Furthermore I adjusted the to_date() functionality for the WW-D pattern as well. As said before in the thread, I know, ignoring the D part is known and documented, but I think, if the ISO format recognizes the day part, the non-ISO format should as well - especially when the "back" operation does as well (meaning to_char()): Output in the current version: postgres=# SELECT to_date('2019-1-1', 'YYYY-WW-D'); to_date ------------ 2019-01-01 (1 row) postgres=# SELECT to_date('2019-1-2', 'YYYY-WW-D'); to_date ------------ 2019-01-01 (1 row) postgres=# SELECT to_date('2019-1-3', 'YYYY-WW-D'); to_date ------------ 2019-01-01 (1 row) postgres=# SELECT to_date('2019-1-7', 'YYYY-WW-D'); to_date ------------ 2019-01-01 (1 row) postgres=# SELECT to_date('2019-2-1', 'YYYY-WW-D'); to_date ------------ 2019-01-08 (1 row) New output: postgres=# SELECT to_date('2019-1-1', 'YYYY-WW-D'); to_date ------------ 2018-12-30 (1 row) postgres=# SELECT to_date('2019-1-2', 'YYYY-WW-D'); to_date ------------ 2018-12-31 (1 row) postgres=# SELECT to_date('2019-1-3', 'YYYY-WW-D'); to_date ------------ 2019-01-01 (1 row) postgres=# SELECT to_date('2019-1-7', 'YYYY-WW-D'); to_date ------------ 2019-01-05 (1 row) postgres=# SELECT to_date('2019-2-1', 'YYYY-WW-D'); to_date ------------ 2019-01-06 (1 row) I added the patch as plain text attachment. It contains the code and, of course, the regression tests. Some existing tests failed, because they worked with the old output. I have changed their expected output. Hope you'll find it helpful. Best regards, Mark Lorenz
Attachment
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Mark Lorenz
Date:
Hi, I got the advice to split the patches for: - fixing the to_char() function - changing the to_date()/to_timestamp() behaviour So I appended the split patches. Kind regards, Mark Lorenz
Attachment
Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Tom Lane
Date:
Mark Lorenz <postgres@four-two.de> writes: > I got the advice to split the patches for: > - fixing the to_char() function > - changing the to_date()/to_timestamp() behaviour > So I appended the split patches. I'm a bit skeptical of the premise here. The fine manual says In to_timestamp and to_date, weekday names or numbers (DAY, D, and related field types) are accepted but are ignored for purposes of computing the result. The same is true for quarter (Q) fields. You appear to be trying to change that, but it's not at all clear what behavior you're changing it to, or whether the result is going to be any more sensible than it was before. In any case, this is certainly not a "bug fix", because the code is working as documented. It's a redefinition, and you haven't specified the new definition. Another point is that these functions are meant to be Oracle-compatible, so I wonder what Oracle does in not-terribly-well-defined cases like these. regards, tom lane
Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Tom Lane
Date:
Mark Lorenz <postgres@four-two.de> writes: > while preparing the patch for the Commitfest, I found a bug in the > to_char() function that is quite correlated with this issue: > SELECT to_char('1997-02-01'::date, 'YYYY-WW-D') > returns: 1997-05-7 -> which is ok, I believe. Feb, 1st was on Saturday, > so counting from Sundays, it was day 7 of week 5. > SELECT to_char('1997-02-03'::date, 'YYYY-WW-D') > returns: 1997-05-2 -> This cannot be. Why not? These format codes are specified as D day of the week, Sunday (1) to Saturday (7) WW week number of year (1–53) (the first week starts on the first day of the year) I don't see anything there that says that "D" is correlated with "WW". We do have a connection between "ID" and "IW", so that ID ought to specify a day within an IW week, but there's no connection between "D" and either "W" or "WW" week numbering. It's a day of the week, as per the regular calendar. Trying to define it as something else is just going to break stuff. The only way to make "D" as it stands compatible with a week-numbering system is to ensure that your weeks always start on Sundays, that is, just as confusing as ISO weeks but slightly differently confusing. Perhaps it would be worth inventing format codes that do have the same relationship to "W" and/or "WW" as "ID" does to "IW". But repurposing "D" for that is a bad idea. regards, tom lane
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Mark Lorenz
Date:
Hi Tom, thanks for answering! I commited two different patches: ------- The first one is for the strange behaviour of to_char(), which could be seen as a bug, I believe. As described earlier, to_char() with the 'WW-D' pattern could return wrong week numbers. The non-ISO week number is defined for weeks beginning with Sundays and ending with Saturdays. The first week of the year is the week with January, 1st. For example: postgres=# SELECT to_char('1997-01-01'::date, 'YYYY-WW-D'); to_char --------- 1997-01-4 (1 row) 1997-01-01 was a Wednesday. So the first week in 1997 was from Jan 1st to Jan 4th (Saturday). Week 2 started on Jan 5th. But to_char() gives out week number 1 until Tuesday (!), Jan 7th. postgres=# SELECT to_char('1997-01-07'::date, 'YYYY-WW-D'); to_char --------- 1997-01-3 (1 row) After that, on Jan 8th, the output switches from 01-3 to 02-4, which makes no sense in my personal opinion. The week number should be consistent from Sun to Sat and should not switch during any day in the week. Furthermore, it is not clear why Jan 7th should return an earlier week day (3) than Jan 1st (4). The bug is, that the calculation of the week number only considers the number of days of the current year. But it ignores the first week day, which defines an offset. This has been fixed in the first patch. ------- Second patch: As you stated correctly, this is not a bug fix, because the current behaviour is documented and it works as the documentation states. I tried to describe my confusion in the very first post of this thread: I was wondering why the D part is not recognized in the non-ISO week pattern while the ISO day is working very well. Although this is documented, there could be a chance that this simply was not implemented right now - so I tried. The main aspect, I believe, is, that to_date() or to_timestamp() is some kind of "back" operation of the to_char() function. So, a new definition simply should recognize the week day as the to_char() function does, instead of setting the day part fix to any number (please see the examples in the very first post for that). ------- Combining both patches, the to_char() fix and the to_date() change, it is possible to calculate the non-ISO week pattern in both directions: SELECT to_date(to_char(anydate, 'YYYY-WW-D'), 'YYYY-WW-D') would result in "anydate". Currently it does not: postgres=# SELECT to_date(to_char('1997-01-07'::date, 'YYYY-WW-D'), 'YYYY-WW-D') to_char --------- 1997-01-01 (1 row) postgres=# SELECT to_char(to_date('1997-01-07', 'YYYY-WW-D'), 'YYYY-WW-D') to_char --------- 1997-01-04 (1 row) On the other hand, the ISO week calculations work as expected, especially the there-and-back operation results in the original value: postgres=# SELECT to_date(to_char('1997-01-07'::date, 'IYYY-IW-ID'), 'IYYY-IW-ID') to_char --------- 1997-01-07 (1 row) postgres=# SELECT to_char(to_date('1997-01-07', 'IYYY-IW-ID'), 'IYYY-IW-ID') to_char --------- 1997-01-7 (1 row) The only difference between ISO and non-ISO weeks is the beginning on Mondays and the definition of the first week. But this cannot be the reason why one operation results in right values (comparing with a calendar) and the other one does not. Does this explanation make it clearer?
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Mark Lorenz
Date:
>> while preparing the patch for the Commitfest, I found a bug in the >> to_char() function that is quite correlated with this issue: > >> SELECT to_char('1997-02-01'::date, 'YYYY-WW-D') > >> returns: 1997-05-7 -> which is ok, I believe. Feb, 1st was on >> Saturday, >> so counting from Sundays, it was day 7 of week 5. > >> SELECT to_char('1997-02-03'::date, 'YYYY-WW-D') > >> returns: 1997-05-2 -> This cannot be. > > Why not? These format codes are specified as > > D day of the week, Sunday (1) to Saturday (7) > WW week number of year (1–53) (the first week starts on the first day > of the year) > Because 1997-05-2 is earlier than 1997-05-7. But 1997-02-03 is later than 1997-02-01. From my point of view, this is confusing. > I don't see anything there that says that "D" is correlated with "WW". > We do have a connection between "ID" and "IW", so that ID ought to > specify a day within an IW week, but there's no connection between "D" > and either "W" or "WW" week numbering. It's a day of the week, as > per the regular calendar. Trying to define it as something else is > just going to break stuff. > > The only way to make "D" as it stands compatible with a week-numbering > system is to ensure that your weeks always start on Sundays, that is, > just as confusing as ISO weeks but slightly differently confusing. > > Perhaps it would be worth inventing format codes that do have the > same relationship to "W" and/or "WW" as "ID" does to "IW". But > repurposing "D" for that is a bad idea. > > regards, tom lane I don't want to create any connection here. The day is calculated correctly. But the week number is wrong. 1997-02-03 was in week number 6, as well as 1997-02-04. But Postgres returns 5. The problem with to_char() is, that the week number is considering only the nmber of days in the year and divides them by 7. So, there is no diffence whether the year starts on Sunday or any other week day. So, an offset is missing, which yields in wrong week numbers, as I can see...
Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Tom Lane
Date:
Mark Lorenz <postgres@four-two.de> writes: >> Why not? These format codes are specified as >> D day of the week, Sunday (1) to Saturday (7) >> WW week number of year (1–53) (the first week starts on the first day >> of the year) > I don't want to create any connection here. The day is calculated > correctly. But the week number is wrong. 1997-02-03 was in week number > 6, as well as 1997-02-04. But Postgres returns 5. The week number is only wrong if you persist in ignoring the very clear definition given in the manual. According to the stated definition of WW, "week 1" consists of Jan 1 to Jan 7, "week 2" to Jan 8-14, etc. So it's correct for both of those dates to be in "week 5". There are other possible definitions of "week" of course, such as the ISO week, under which both those dates would be in week 6 (of 1997 anyway, not all other years). But if you want ISO week you should ask for it with "IW", not expect that we'll change the longstanding behavior of "WW" to match. As far as I can see, the only way to make a week definition that gives sensible results in combination with "D" is to do something like what ISO does, but with Sunday as the start day instead of Monday. But having three different week definitions seems more likely to confuse people (even more) than to be helpful. Plus you'd also need analogs of IYYY, IDDD, etc. Why not just use IYYY-IW-ID, instead? You'd have to adapt to week-starts-on-Monday, but you'd be using a notation that a lot of people are already familiar with, instead of inventing your own. Another possibility, perhaps, is to use WW in combination with some new field that counts 1-7, 1-7, 1-7, ... starting on Jan 1. But then that wouldn't have any easy mapping to day names, so there's no free lunch. Throwing MM into the mix makes it even more exciting, as month boundaries don't correspond with week boundaries either. I don't see any rational way to make YYYY-MM-W or YYYY-MM-W-D patterns that behave in a numerically consistent fashion. (Note that ISO didn't try --- there is no "ISO month".) The bottom line is that these various definitions aren't mutually consistent, and that's just a fact of life, not something that can be fixed. In any case, backwards compatibility alone would be a sufficient reason to reject a patch that changes the established behavior of the existing format codes. Whether you think they're buggy or not, other people are relying on the existing documented behavior. Perhaps we'd consider a patch that adds some new format codes with new behavior. But personally I'd vote against implementing new format codes unless you can point to well-established standards supporting their definitions. to_char/to_date are impossibly complex and unmaintainable already; we don't need to add more features with narrow use-cases to them. regards, tom lane
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Mark Lorenz
Date:
Updated the chg_to_date_yyyywwd.patch with additional tests (because it works not only for 'D' pattern but also for all day patterns like 'Day' or 'DY'). Added the necessary documentation change. (The fix_to_char_yyyywwd.patch from f4e740a8de3ad1e762a28f6ff253ea4f%40four-two.de is still up-to-date)
Attachment
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Cleysson Lima
Date:
Em sex., 10 de jan. de 2020 às 09:22, Mark Lorenz <postgres@four-two.de> escreveu:
Updated the chg_to_date_yyyywwd.patch with additional tests (because it
works not only for 'D' pattern but also for all day patterns like 'Day'
or 'DY'). Added the necessary documentation change.
(The fix_to_char_yyyywwd.patch from
f4e740a8de3ad1e762a28f6ff253ea4f%40four-two.de is still up-to-date)
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
Hi Mark,
this is a review of the patch: chg_to_date_yyyywwd.patch
There hasn't been any problem, at least that I've been able to find.
This one applies cleanly.
The entire compilation went without error as well.
############# Without patch #############
postgres=# SELECT to_date('2019-1-1', 'YYYY-WW-D');
to_date
------------
2019-01-01
(1 row)
postgres=# SELECT to_date('2019-1-2', 'YYYY-WW-D');
to_date
------------
2019-01-01
(1 row)
postgres=# SELECT to_date('2019-1-9', 'YYYY-WW-D');
to_date
------------
2019-01-01
(1 row)
############# With patch #############
postgres=# SELECT to_date('2019-1-1', 'YYYY-WW-D');
to_date
------------
2018-12-30
(1 row)
postgres=# SELECT to_date('2019-1-2', 'YYYY-WW-D');
to_date
------------
2018-12-31
(1 row)
postgres=# SELECT to_date('2019-1-9', 'YYYY-WW-D');
to_date
------------
2019-01-07
(1 row)
+1 for committer review
--
Cleysson Lima
Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Tom Lane
Date:
Cleysson Lima <cleyssondba@gmail.com> writes: > this is a review of the patch: chg_to_date_yyyywwd.patch > There hasn't been any problem, at least that I've been able to find. AFAICS, the point of this patch is to make to_date symmetrical with the definition of WW that the other patch wants for to_char. But the other patch is wrong, for the reasons I explained upthread, so I doubt that we want this one either. I still think that it'd be necessary to invent at least one new format field code in order to get to a sane version of this feature. As they stand, 'WW' and 'D' do not agree on what a week is, and changing the behavior of either one in order to make them agree is just not going to happen. BTW, I went to check on what Oracle thinks about this, since these functions are allegedly Oracle-compatible. On PG, I get this for the WW and D values for the next few days: select to_char(current_date+n, 'YYYY-MM-DD YYYY-WW-D Day') from generate_series(0,10) n; to_char -------------------------------- 2020-01-31 2020-05-6 Friday 2020-02-01 2020-05-7 Saturday 2020-02-02 2020-05-1 Sunday 2020-02-03 2020-05-2 Monday 2020-02-04 2020-05-3 Tuesday 2020-02-05 2020-06-4 Wednesday 2020-02-06 2020-06-5 Thursday 2020-02-07 2020-06-6 Friday 2020-02-08 2020-06-7 Saturday 2020-02-09 2020-06-1 Sunday 2020-02-10 2020-06-2 Monday (11 rows) I did the same calculations using Oracle 11g R2 on sqlfiddle.com and got the same results. Interestingly, though, I also tried it on https://rextester.com/l/oracle_online_compiler and here's what I get there: 2020-01-31 2020-05-5 Freitag 2020-02-01 2020-05-6 Samstag 2020-02-02 2020-05-7 Sonntag 2020-02-03 2020-05-1 Montag 2020-02-04 2020-05-2 Dienstag 2020-02-05 2020-06-3 Mittwoch 2020-02-06 2020-06-4 Donnerstag 2020-02-07 2020-06-5 Freitag 2020-02-08 2020-06-6 Samstag 2020-02-09 2020-06-7 Sonntag 2020-02-10 2020-06-1 Montag (I don't know how to switch locales on these sites, so I don't have any way to check what happens in other locales.) So we agree with Oracle on what WW means, but they count D as 1-7 starting on either Sunday or Monday according to locale. I wonder whether we should change to match that? Maybe "TMD" should act that way? It's already the case that their "Day" acts like our "TMDay", evidently. Either way, though, the WW weeks don't line up with the D weeks, and we're not likely to make them do so. So I think an acceptable version of this feature has to involve defining at least one new format code and maybe as many as three, to produce year, week and day values that agree on whichever definition of "a week" you want to use, and then to_date has to enforce that input uses matching year/week/day field types, very much like it already does for ISO versus Gregorian dates. I also notice that neither patch touches the documentation. A minimum requirement here is defining what you think the underlying "week" is, if it's neither ISO nor the existing WW definition. As I said before, it'd also be a good idea to provide some evidence that there are other people using that same week definition. regards, tom lane
Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Tom Lane
Date:
I wrote: > Either way, though, the WW weeks don't line up with the D weeks, > and we're not likely to make them do so. > So I think an acceptable version of this feature has to involve > defining at least one new format code and maybe as many as three, > to produce year, week and day values that agree on whichever > definition of "a week" you want to use, and then to_date has to > enforce that input uses matching year/week/day field types, > very much like it already does for ISO versus Gregorian dates. A different line of thought could be to accept the current to_char() behavior for WW and D, and go ahead and teach to_date() to invert that. That is, take YYYY plus WW as specifying a seven-day interval, and then D chooses the matching day within that interval. This would still have the property you complained about originally that WW-plus-D don't form a monotonically increasing sequence, but I think that ship has sailed. regards, tom lane
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Alvaro Herrera
Date:
I just noticed that this patch has been classified under "bug fixes", but per Tom's comments, this is not a bug fix -- it seems we would need a new format code to implement some different week numbering mechanism. That seems a new feature, not a bug fix. Therefore I propose to move this in Commitfest from "Bug fixes" to "Server features". This has implications such as not automatically moving to next commitfest if no update appears during this one. I've never personally had to write calendaring applications, so I don't have an opinion on whether this is useful. Why isn't it sufficient to rely on ISO week/day numbering (IW/ID), which appears to be more consistent? I think we should consider adding more codes only if real-world use cases exist for them. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Created feature for to_date() conversion using patterns'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Mark Lorenz
Date:
Hi Tom, with a bit space to this issue, I re-read your comments. I am beginning to understand what you mean or - better - what's wrong with my thoughts. When I understand you correctly, you say, the WW can start at any weekday, and is not fixed to Sunday, right? In your opinion the WW starts with the weekday of Jan, 1st? That's what could be my problem: I always thought (maybe triggered through the D pattern), that WW has to start sundays. But, now I agree with you, the docs fit better to your interpretation: "the first week starts on the first day of the year" I interpreted it with: It starts on the week, which includes the first of the year, but the Sunday before. Did I understand you correctly? In that case, I accept, that my patch is no bugfix (I think, it would be one, if my interpretion would be the expected behaviour.). But, nevertheless, what about adding the function to accept the DAY, D (and maybe the Q) patterns for to_date() - in this case, of course, in the uncorrelated version? to_char() handles them properly. And, from my point of view, there is no reason why they should give "1" instead the real day number. What do you think?
Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Tom Lane
Date:
Mark Lorenz <postgres@four-two.de> writes: > with a bit space to this issue, I re-read your comments. I am beginning > to understand what you mean or - better - what's wrong with my thoughts. > When I understand you correctly, you say, the WW can start at any > weekday, and is not fixed to Sunday, right? In your opinion the WW > starts with the weekday of Jan, 1st? That's what could be my problem: I > always thought (maybe triggered through the D pattern), that WW has to > start sundays. But, now I agree with you, the docs fit better to your > interpretation: > "the first week starts on the first day of the year" Yes, that's clearly what our code, and what Oracle's does too, given the tests I showed upthread. > But, nevertheless, what about adding the function to accept the DAY, D > (and maybe the Q) patterns for to_date() - in this case, of course, in > the uncorrelated version? to_char() handles them properly. And, from my > point of view, there is no reason why they should give "1" instead the > real day number. What do you think? The trick is to produce something sane. I think that a reasonable precedent for this would be what to_date does with ISO-week fields: you can ask it to parse IYYY-IW-ID but you can't mix that with regular month/day/year fields. So for example, it seems like it'd be possible to reconstruct a date from YYYY-WW-D, because that's enough to uniquely identify a day. The D field isn't monotonically increasing within a week, but nonetheless there's exactly one day in each YYYY-WW week that has a particular D value. However you probably don't want to allow inconsistent mixtures like YYYY-WW-ID, because that's just a mess (and more than likely, it's a mistake). And I would not be in favor of allowing YYYY-Q either, because that would not be enough to uniquely identify a day, so there's really no point in allowing Q to enter into to_date's considerations at all. Whether there is actually any field demand for such a feature is not clear to me. AFAICT Oracle doesn't support it. regards, tom lane
Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
From
Tom Lane
Date:
I wrote: > Mark Lorenz <postgres@four-two.de> writes: >> But, nevertheless, what about adding the function to accept the DAY, D >> (and maybe the Q) patterns for to_date() - in this case, of course, in >> the uncorrelated version? to_char() handles them properly. And, from my >> point of view, there is no reason why they should give "1" instead the >> real day number. What do you think? > The trick is to produce something sane. I think that a reasonable > precedent for this would be what to_date does with ISO-week fields: > you can ask it to parse IYYY-IW-ID but you can't mix that with regular > month/day/year fields. So for example, it seems like it'd be possible > to reconstruct a date from YYYY-WW-D, because that's enough to uniquely > identify a day. The D field isn't monotonically increasing within a > week, but nonetheless there's exactly one day in each YYYY-WW week that > has a particular D value. However you probably don't want to allow > inconsistent mixtures like YYYY-WW-ID, because that's just a mess (and > more than likely, it's a mistake). And I would not be in favor of > allowing YYYY-Q either, because that would not be enough to uniquely > identify a day, so there's really no point in allowing Q to enter into > to_date's considerations at all. > Whether there is actually any field demand for such a feature is > not clear to me. AFAICT Oracle doesn't support it. Since we're certainly not going to commit these patches as-presented, and nothing has happened on this thread since early April, I've marked both the CF entries as Returned With Feedback. If you do write a patch to make to_date work as above, please file a new CF entry. (BTW, having two CF entries pointing at the same email thread is pretty confusing to our not-that-bright tools. It's probably better to have just one entry per thread in the future.) regards, tom lane