Thread: Why *exactly* is date_trunc() not immutable ?
Hi all, we (GNUmed) run a medical database on PostgreSQL. We are very pleased with it (PostgreSQL, that is ;-) in all aspects. The date-of-birth field in our table holding patients is of type "timestamp with time zone". One of our patient search queries uses the date-of-birth field to find matches. Since users enter day, month, and year but not hour, minute, and second of the DOB we run the query with select ... where ... and date_trunc('day', dob) = date_trunc('day', what_the_user_entered_as_dob) and ... ; (appropriately escaped, of course) The actual DOB is indeed stored with hour, minute and second so the time information is not redundant but we don't need it for searching. So I figured it would make sense to add a functional index on date_trunc('day', dob) to the patients table. Which worked (appeared to, at least) with PG 7.4. One of our users is on PG 8.2 and gets the warning that date_trunc() is not immutable and can thus not be used in a functional index. Makes sense all by itself. What I don't understand, however, is exactly *why* date_trunc is not immutable ? All it does is extracting part of the information that's there anyways. One would assume it to be the timestamp equivalent of substring(), no ? (not a good example, perhaps, as that might depend on encoding settings...) It *shouldn't* really depend on, say, date/time related locale settings, should it ? I'd be happy to provide more details if that is needed for which I'd have to contact the user in question. Thanks for any insight offered, Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
followup to self: On Sun, Feb 18, 2007 at 12:29:17PM +0100, Karsten Hilbert wrote: > So I figured it would make sense to add a functional index > on date_trunc('day', dob) to the patients table. Which > worked (appeared to, at least) with PG 7.4. > > One of our users is on PG 8.2 PostgreSQL 8.1 I was to say. > and gets the warning that > date_trunc() is not immutable and can thus not be used in a > functional index. The code to create the index: create index idx_identity_dob_ymd on dem.identity(date_trunc('day', dob)) The exact error it emits: functions in index expression must be marked IMMUTABLE Those were lifted from the error log without further editing. I know that I could fake immutability by wrapping date_trunc() in a stored procedure marked IMMUTABLE but I wonder what pitfalls that might hold. Thanks, Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Feb 18, 2007, at 20:29 , Karsten Hilbert wrote: > What I don't understand, however, is exactly *why* date_trunc is > not immutable ? I believe it's because the result of date_trunc depends on the time zone setting for the session. test=# select date_trunc('day', current_timestamp); date_trunc ------------------------ 2007-02-18 00:00:00+09 (1 row) test=# set time zone 'EST5EDT'; SET test=# select date_trunc('day', current_timestamp); date_trunc ------------------------ 2007-02-18 00:00:00-05 (1 row) So, given the same arguments, ('day', and current_timestamp), date_trunc is returning two different results. (Casting to date has the same issue.) Michael Glaesemann grzm seespotcode net
On Sun, Feb 18, 2007 at 09:19:43PM +0900, Michael Glaesemann wrote: > >What I don't understand, however, is exactly *why* date_trunc is > >not immutable ? > > I believe it's because the result of date_trunc depends on the time > zone setting for the session. ... > So, given the same arguments, ('day', and current_timestamp), > date_trunc is returning two different results. (Casting to date has > the same issue.) Ah, I see. That makes sense. Now, if I'd write a date_trunc_utc(precision, timestamp with time zone) which converts input timestamps to UTC I could fairly safely mark that IMMUTABLE, no ? Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
Karsten Hilbert <Karsten.Hilbert@gmx.net> writes: >> So I figured it would make sense to add a functional index >> on date_trunc('day', dob) to the patients table. Which >> worked (appeared to, at least) with PG 7.4. For the record, this was changed just before 8.0 release: http://archives.postgresql.org/pgsql-hackers/2004-10/msg00030.php regards, tom lane
Michael Glaesemann wrote: > > On Feb 18, 2007, at 20:29 , Karsten Hilbert wrote: > >> What I don't understand, however, is exactly *why* date_trunc is not >> immutable ? > > I believe it's because the result of date_trunc depends on the time zone > setting for the session. I understand the reasoning, but _under the conditions_ it is being used by the OP it could have been immutable, right? The index values will still match up with the queried values if they are in the same time zone. I'm not asking to change it back to immutable (it isn't), I just realized that the stability of functions may actually be conditional. -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
On Feb 19, 2007, at 18:04 , Alban Hertroys wrote: > Michael Glaesemann wrote: >> >> On Feb 18, 2007, at 20:29 , Karsten Hilbert wrote: >> >>> What I don't understand, however, is exactly *why* date_trunc is not >>> immutable ? >> >> I believe it's because the result of date_trunc depends on the >> time zone >> setting for the session. > > I understand the reasoning, but _under the conditions_ it is being > used > by the OP it could have been immutable, right? *Under the conditions* doesn't really make sense wrt immutable functions. Immutable means is immutable under all conditions. Michael Glaesemann grzm seespotcode net
On Feb 18, 2007, at 23:12 , Karsten Hilbert wrote: > On Sun, Feb 18, 2007 at 09:19:43PM +0900, Michael Glaesemann wrote: > >>> What I don't understand, however, is exactly *why* date_trunc is >>> not immutable ? >> >> I believe it's because the result of date_trunc depends on the time >> zone setting for the session. > > ... > >> So, given the same arguments, ('day', and current_timestamp), >> date_trunc is returning two different results. (Casting to date has >> the same issue.) > > Ah, I see. That makes sense. > > Now, if I'd write a > > date_trunc_utc(precision, timestamp with time zone) > > which converts input timestamps to UTC I could fairly safely > mark that IMMUTABLE, no ? Yeah, I think if you normalized it to UTC you could mark your new function as immutable. Michael Glaesemann grzm seespotcode net
Michael Glaesemann wrote: > > On Feb 19, 2007, at 18:04 , Alban Hertroys wrote: > >> Michael Glaesemann wrote: >>> >>> On Feb 18, 2007, at 20:29 , Karsten Hilbert wrote: >>> >>>> What I don't understand, however, is exactly *why* date_trunc is not >>>> immutable ? >>> >>> I believe it's because the result of date_trunc depends on the time zone >>> setting for the session. >> >> I understand the reasoning, but _under the conditions_ it is being used >> by the OP it could have been immutable, right? > > *Under the conditions* doesn't really make sense wrt immutable > functions. Immutable means is immutable under all conditions. What I'm trying to say is not that it _is_ immutable, but that it _behaves_ immutable (under said conditions). This could imply that if a certain condition is available in a query on which such a function operates, it would behave immutable. -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
On Mon, Feb 19, 2007 at 10:16:12AM +0100, Alban Hertroys wrote: > What I'm trying to say is not that it _is_ immutable, but that it > _behaves_ immutable (under said conditions). > > This could imply that if a certain condition is available in a query on > which such a function operates, it would behave immutable. That is precisely why I didn't get the idea upfront that date_trunc() wouldn't be immutable just so. I'll solve it with a date_trunc_utc() wrapper. Thanks to all who chipped in. Something new to learn every day. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Mon, Feb 19, 2007 at 10:36:36AM +0100, Karsten Hilbert wrote: > On Mon, Feb 19, 2007 at 10:16:12AM +0100, Alban Hertroys wrote: > > > What I'm trying to say is not that it _is_ immutable, but that it > > _behaves_ immutable (under said conditions). > > > > This could imply that if a certain condition is available in a query on > > which such a function operates, it would behave immutable. > That is precisely why I didn't get the idea upfront that > date_trunc() wouldn't be immutable just so. > > I'll solve it with a date_trunc_utc() wrapper. It should be noted the date_truc(timestamptz) is not immutable, whereas date_trunc(timestamp) is. Thus you should be able to make an index on: date_trunc( timestamptz_column AT TIME ZONE 'UTC', 'foo' ) OTOH, if you're only storing times in UTC, then timestamp without timezone might be better anyway. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On Mon, Feb 19, 2007 at 10:58:50AM +0100, Martijn van Oosterhout wrote: > > I'll solve it with a date_trunc_utc() wrapper. > > It should be noted the date_truc(timestamptz) is not immutable, whereas > date_trunc(timestamp) is. Thus you should be able to make an index on: > > date_trunc( timestamptz_column AT TIME ZONE 'UTC', 'foo' ) Ah, that makes it clear *why* this should work. I would assume to get meaningful results from a query using that index I'd have to normalize input timestamps to UTC, too, before putting them into the query, right ? > OTOH, if you're only storing times in UTC, then timestamp without > timezone might be better anyway. Well, PostgreSQL itself is storing UTC anyways but we need the timezone bit since our frontend delivers timestamps from various timezones and they are note normalized to UTC before they get to the database. IOW, I want the database to force programmers to have to think about from which timezone they deliver timestamps into a date-of-birth field into. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Mon, Feb 19, 2007 at 12:03:07PM +0100, Karsten Hilbert wrote: > > date_trunc( timestamptz_column AT TIME ZONE 'UTC', 'foo' ) > Ah, that makes it clear *why* this should work. > > I would assume to get meaningful results from a query using > that index I'd have to normalize input timestamps to UTC, > too, before putting them into the query, right ? Well, your queries need to use the same form, ie: SELECT blah FROM foo WHERE date_trunc( 'entered_timestamp'::timestamptz AT TIME ZONE 'UTC', 'foo' ) That seems a bit error prone though, so your idea of making a simple SQL function to wrap it will probably save you much heartache. It will also make it clearer to people reading the code *why* it is written that way. > Well, PostgreSQL itself is storing UTC anyways but we need > the timezone bit since our frontend delivers timestamps from > various timezones and they are note normalized to UTC before > they get to the database. Yeah, the AT TIME ZONE 'UTC' needs to be put somewhere, and in the index is probably easier than everywhere else. Just checking you'd thought about it. :) > IOW, I want the database to force programmers to have to > think about from which timezone they deliver timestamps into > a date-of-birth field into. Right. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On Mon, Feb 19, 2007 at 12:53:15PM +0100, Martijn van Oosterhout wrote: > Well, your queries need to use the same form, ie: > > SELECT blah FROM foo > WHERE date_trunc( 'entered_timestamp'::timestamptz AT TIME ZONE 'UTC', 'foo' ) Thought so. > That seems a bit error prone though, so your idea of making a simple > SQL function to wrap it will probably save you much heartache. It will > also make it clearer to people reading the code *why* it is written > that way. Yep, and the COMMENT ON FUCNTION provides for a nice place to document it :-) > > Well, PostgreSQL itself is storing UTC anyways but we need > > the timezone bit since our frontend delivers timestamps from > > various timezones and they are note normalized to UTC before > > they get to the database. > > Yeah, the AT TIME ZONE 'UTC' needs to be put somewhere, and in the > index is probably easier than everywhere else. Just checking you'd > thought about it. :) Thanks ! Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
Alban Hertroys <alban@magproductions.nl> writes: > What I'm trying to say is not that it _is_ immutable, but that it > _behaves_ immutable (under said conditions). > This could imply that if a certain condition is available in a query on > which such a function operates, it would behave immutable. Right, but we don't have any way to represent such a fact in date_trunc's pg_proc entry, so we have to mark it as "not immutable". There was a related discussion awhile ago when designing the current set of "what time is it" functions --- transaction_timestamp(), statement_timestamp(), and clock_timestamp(). The original proposal had just a single function that took a parameter telling which value you wanted. The trouble with that was that it'd have had to be marked volatile, thereby defeating any ability to optimize conditions using it. By splitting into three functions, we were able to limit the "volatile" label to clock_timestamp(). regards, tom lane
On Sun, Feb 18, 2007 at 12:29:17 +0100, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote: > > The date-of-birth field in our table holding patients is of > type "timestamp with time zone". One of our patient search > queries uses the date-of-birth field to find matches. Since > users enter day, month, and year but not hour, minute, and > second of the DOB we run the query with That seems like an odd choice. Is there some reason they didn't use a type of date? Maybe you could get them to change it?
On Mon, Feb 19, 2007 at 12:41:11PM -0600, Bruno Wolff III wrote: > Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote: > > > > The date-of-birth field in our table holding patients is of > > type "timestamp with time zone". One of our patient search > > queries uses the date-of-birth field to find matches. Since > > users enter day, month, and year but not hour, minute, and > > second of the DOB we run the query with > > That seems like an odd choice. Is there some reason they didn't use a type > of date? Maybe you could get them to change it? What time of day were you born ? http://en.wikipedia.org/wiki/Apgar What is the technical reason that makes you wonder ? Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Mon, Feb 19, 2007 at 20:48:07 +0100, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote: > > What time of day were you born ? > > http://en.wikipedia.org/wiki/Apgar > > What is the technical reason that makes you wonder ? Because it would make doing the queries simpler. If you aren't collecting the data, it doesn't make sense to deal with the extra headaches involved with pretending you know what time of day someone was born.
On Mon, Feb 19, 2007 at 03:28:01PM -0600, Bruno Wolff III wrote: > > What is the technical reason that makes you wonder ? > > Because it would make doing the queries simpler. > If you aren't collecting the data, it doesn't make sense to deal with the > extra headaches involved with pretending you know what time of day someone > was born. Oh, I see. When I said that users don't enter the hour and minute that was targetted at search time. They do enter the time part when entering a new patient, of course. So, it's surely collected. It's just not used for searching. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
I have a question about the query optimizer of a postgres.
As long as I understood through a postgres manual, the postgres query optimizer is implemented using a genetic algorithm.
I'm thinking to modify the query optimizer.
Are there any postgres version which uses typical dynamic programming approach for query optimization?
Also, are there any body who have tried to modify the optimizer?
jungmin
jungmin shin escribió: > I have a question about the query optimizer of a postgres. > As long as I understood through a postgres manual, the postgres query > optimizer is implemented using a *genetic algorithm.* There is an algorithm said to be "genetic", but it only kicks in with big joins; 12 tables on the default configuration. On queries with less tables, the optimizer uses exhaustive search and lots of smarts. See http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/README?rev=1.39;content-type=text%2Fplain > I'm thinking to modify the query optimizer. > Are there any postgres version which uses typical dynamic programming > approach for query optimization? > Also, are there any body who have tried to modify the optimizer? Sure, we have a very active optimizer hacker. He is too clever for the rest of us to follow though :-( (I should speak only for myself here of course). He goes by the name of Tom Lane. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support