Thread: Unexpected casts while using date_trunc()
I have an application generating the following query on a DATE column in PostgreSQL 10.1: > SELECT TO_CHAR(DATE_TRUNC('month', jobs.active_until), 'YYYY-MM') > FROM jobs > GROUP BY DATE_TRUNC('month', jobs.active_until) > LIMIT 500 I wanted to support it with an expression index, but was surprised to find that the DATE_TRUNC call was not immutable: > CREATE INDEX ON jobs (DATE_TRUNC('month', active_until)); > ERROR: functions in index expression must be marked IMMUTABLE The documentation explains that DATE is first cast to TIMESTAMP. (As I understand it, this is an immutable cast; sounds find and appropriate.) https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC But in my testing, the date value is actually cast to TIMESTAMPTZ: > SELECT pg_typeof(DATE_TRUNC('month', active_until)) > FROM jobs LIMIT 1; > pg_typeof > -------------------------- > timestamp with time zone > (1 row) Indeed, casting to TIMESTAMP first allows me to create the index: > CREATE INDEX ON jobs (DATE_TRUNC('month', active_until::timestamp)); However, this index doesn't help me because the query is generated by an application outside of my control. 1. It seems to me it is worth mentioning in the docs that DATE_TRUNC accepts and returns TIMESTAMPTZ. N.B. There's no mention of it in the table of functions: https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-TABLE 2. Since this implicit cast to TIMESTAMPTZ is used rather than TIMESTAMP, it may be worthwhile to add an explicit implementation that accepts DATE. (Thanks to Andrew Gierth for explaining the mechanics on IRC.) -- Chris
Chris Bandy <bandy.chris@gmail.com> writes: > The documentation explains that DATE is first cast to TIMESTAMP. (As I > understand it, this is an immutable cast; sounds find and appropriate.) > https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC > But in my testing, the date value is actually cast to TIMESTAMPTZ: Yeah. There are two relevant variants of date_trunc(): regression=# \df date_trunc List of functions Schema | Name | Result data type | Argument data types | Type ------------+------------+-----------------------------+-----------------------------------+------ pg_catalog | date_trunc | interval | text, interval | func pg_catalog | date_trunc | timestamp with time zone | text, timestamp with time zone | func pg_catalog | date_trunc | timestamp without time zone | text, timestamp without time zone | func (3 rows) and since timestamptz is the preferred type in the datetime category, the parser will prefer that one over the plain-timestamp one. There's no resolution heuristic that would let it decide that timestamp without tz is a better semantic match to a "date" input. The closest we could get with the available machinery is to make date-to-timestamptz not be an implicitly available cast, which I'm afraid would break as many things as it would fix. Just for grins, I tried changing that cast to "automatic" and ran the regression tests that way. It does have the perhaps-desirable effect that date_trunc now behaves like you expected, but it also breaks one unrelated test case that would now need an explicit cast: it's a union between a date column and a timestamptz column. Maybe that's acceptable collateral damage for some future release, but it's sure not something we could back-patch. So we probably ought to change the docs here. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> Yeah. There are two relevant variants of date_trunc(): [...] Tom> So we probably ought to change the docs here. There's also the option of adding an explicit function date_trunc(text,date) returns date, which is a workaround that I (and probably quite a few other people) have used. I think having such a function added to core would be less surprising than the current behavior. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> Yeah. There are two relevant variants of date_trunc(): > [...] > Tom> So we probably ought to change the docs here. > There's also the option of adding an explicit function > date_trunc(text,date) returns date, which is a workaround that I (and > probably quite a few other people) have used. I think having such a > function added to core would be less surprising than the current > behavior. Ah! Yes, of course, that would be better. Seems like a workable solution for Chris, too. We still can't back-patch it, though. regards, tom lane
On 5/24/18 2:31 PM, Tom Lane wrote: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> Tom> Yeah. There are two relevant variants of date_trunc(): >> [...] >> Tom> So we probably ought to change the docs here. > >> There's also the option of adding an explicit function >> date_trunc(text,date) returns date, which is a workaround that I (and >> probably quite a few other people) have used. I think having such a >> function added to core would be less surprising than the current >> behavior. > > Ah! Yes, of course, that would be better. Seems like a workable > solution for Chris, too. We still can't back-patch it, though. > > regards, tom lane > I could take a pass at this about two weeks from now. (I won't be sad if someone else beats me to it.) Are we in agreement that the return type should be date? I wasn't able to find a definitive reference for the expected behavior of date_trunc. Shall I replicate the behavior of casting to/from timestamp? What should happen when the user requests some time portion (e.g. hour) be truncated? -- Chris
Chris Bandy <bandy.chris@gmail.com> writes: > On 5/24/18 2:31 PM, Tom Lane wrote: >> Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >>> There's also the option of adding an explicit function >>> date_trunc(text,date) returns date, which is a workaround that I (and >>> probably quite a few other people) have used. > Are we in agreement that the return type should be date? That is a good question, actually. That would be a larger behavior change than just avoiding the undesired conversion to TZ. I had imagined this as just being equivalent to date_trunc(text, date::timestamp). Casting the result back down to date seems safe, though. Another thing to consider is that the effective range of date is wider than timestamp's, meaning coerce-to-timestamp can fail. Is it worth providing a whole additional code path so we never coerce the date to timestamp at all? I'd tend to think not, but perhaps somebody wants to argue differently. regards, tom lane