Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check - Mailing list pgsql-hackers

From Vitaly Burovoy
Subject Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
Date
Msg-id CAKOSWNmraFH4jjFKd5TKBdUoBBo_G7pT9cE9Y-n3XCK=8e1M=Q@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check  (Mark Dilger <hornschnorter@gmail.com>)
Responses Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
List pgsql-hackers
On 2016-03-15, Mark Dilger <hornschnorter@gmail.com> wrote:
>
>> On Mar 14, 2016, at 5:12 PM, Vitaly Burovoy <vitaly.burovoy@gmail.com>
>> wrote:
>>
>> On 3/14/16, Mark Dilger <hornschnorter@gmail.com> wrote:
>>> The first thing I notice about this patch is that
>>> src/include/datatype/timestamp.h
>>> has some #defines that are brittle.  The #defines have comments
>>> explaining
>>> their logic, but I'd rather embed that in the #define directly:
>>>
>>> This:
>>>
>>> +#ifdef HAVE_INT64_TIMESTAMP
>>> +#define MIN_TIMESTAMP  INT64CONST(-211813488000000000)
>>> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC */
>>> +#define MAX_TIMESTAMP  INT64CONST(9223371331200000000)
>>> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 *
>>> USECS_PER_SEC
>>> */
>>> +#else
>>> +#define MIN_TIMESTAMP  -211813488000.0
>>> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 */
>>> +#define MAX_TIMESTAMP  9223371331200.0
>>> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 */
>>> +#endif
>>>
>>> Could be written as:
>>>
>>> #ifdef HAVE_INT64_TIMESTAMP
>>> #define MIN_TIMESTAMP
>>> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
>>> #define MAX_TIMESTAMP
>>> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
>>> #else
>>> #define MIN_TIMESTAMP
>>> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY)
>>> #define MAX_TIMESTAMP
>>> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY)
>>> #endif
>>>
>>> I assume modern compilers would convert these to the same constants at
>>> compile-time,
>>
>> Firstly, Postgres is compiling not only by modern compilers.
>
> Do you have an example of a compiler that will not do this constant folding
> at compile time?

No, I'm not good at knowing features of all versions and all kings of
compilers, but I'm sure constants are better than expressions for big
values. =)

>>> rather than impose a runtime penalty.
>>
>> Secondary, It is hard to write it correctly obeying Postgres' coding
>> standard (e.g. 80-columns border) and making it clear: it is not so
>> visual difference between USECS_PER_DAY and SECS_PER_DAY in the
>> expressions above (but it is vivid in comments in the file).
>
> Hmm.  I think using USECS_PER_DAY is perfectly clear, but that is a
> personal
> opinion.  I don't see any way to argue if you don't see it that way.

I'm talking about perception of the constants when they a very similar
but they are not justified by a single column (where difference
_in_expressions_ are clear). There was a difference by a single char
("U") only which is not so obvious without deep looking on it (be
honest I'd missed it until started to write an answer).

>> or "Why is INT64CONST set for
>> MIN_TIMESTAMP, but not for MAX_TIMESTAMP?" (JULIAN_MAX4STAMPS is _not_
>> int64).
>
> I was only casting the zero to int64.  That doesn't seem necessary, so it
> can
> be removed.  Both MIN_TIMESTAMP and MAX_TIMESTAMP were defined
> in terms of USECS_PER_DAY, which itself is defined in terms of INT64CONST,
> so I believe they both work out to be an int64 constant.

I hope so. But in such cases I'd prefer to _begin_ calculations from
int64, not to _finish_ by it.
It is impossible to pass constants (like JULIAN_MAX4STAMPS) to
INT64CONST macros. Inserting "(int64)" makes rows larger by 7 chars...
;-)

>>> The #defines would be less brittle in
>>> the event, for example, that the postgres epoch were ever changed.
>>
>> I don't think it is real, and even in such case all constants are
>> collected together in the file and will be found and changed at once.
>
> I agree that they would be found at once.  I disagree that the example
> is not real, as I have changed the postgres epoch myself in some builds,
> to be able to use int32 timestamps on small devices.

Wow! I'm sorry, I didn't know about it.
But in such case (tighten to int32) there are more than two places
which should be changed. Two more (four with disabled
HAVE_INT64_TIMESTAMP) constants are not big deal with it.

-- 
Best regards,
Vitaly Burovoy



pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Minor typos in optimizer/README
Next
From: Mark Dilger
Date:
Subject: Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check