Re: INTERVAL overflow detection is terribly broken - Mailing list pgsql-hackers

From Rok Kralj
Subject Re: INTERVAL overflow detection is terribly broken
Date
Msg-id CAMWF=HQ0oT0C9Q6m1aYmrPYBmS4O40wXqsNP-Oh0z8iYiPNS+w@mail.gmail.com
Whole thread Raw
In response to INTERVAL overflow detection is terribly broken  (Rok Kralj <rok.kralj@gmail.com>)
List pgsql-hackers
So, any insights on these problems?

They might not be critical, but might be silently corrupting someone's data.


2013/6/23 Rok Kralj <rok.kralj@gmail.com>
Hi, after studying ITERVAL and having a long chat with RhoidumToad and StuckMojo on #postgresql, I am presenting you 3 bugs regarding INTERVAL.

As far as I understand, the Interval struct (binary internal representation) consists of:

int32 months
int32 days
int64 microseconds

1. OUTPUT ERRORS: Since 2^63 microseconds equals 2,562,047,788 > 2^31 hours, the overflow in pg_tm when displaying the value causes overflow. The value of Interval struct is actually correct, error happens only on displaying it.

SELECT INTERVAL '2147483647 hours' + INTERVAL '5 hours'
"-2147483644:00:00"

Even wireder:

SELECT INTERVAL '2147483647 hours' + '1 hour'
"--2147483648:00:00"

notice the double minus? Don't ask how I came across this two bugs.

2. OPERATION ERRORS: When summing two intervals, the user is not notified when overflow occurs:

SELECT INT '2147483647' + INT '1'
ERROR: integer out of range

SELECT INTERVAL '2147483647 days' + INTERVAL '1 day'
"-2147483648 days"

This should be analogous.

3. PARSER / INPUT ERRORS:

This is perhaps the hardest one to explain, since this is an architectural flaw. You are checking the overflows when parsing string -> pg_tm struct. However, at this point, the parser doesn't know, that weeks and days are going to get combined, or years are going to get converted to months, for example.

Unawarness of underlying Interval struct causes two types of suberrors:

a) False positive

SELECT INTERVAL '2147483648 microseconds'
ERROR:  interval field value out of range: "2147483648 microseconds"

This is not right. Microseconds are internally stored as 64 bit signed integer. The catch is: this amount of microseconds is representable in Interval data structure, this shouldn't be an error.

b) False negative

SELECT INTERVAL '1000000000 years'
"-73741824 years"

We don't catch errors like this, because parser only checks for overflow in pg_tm. If overflow laters happens in Interval, we don't seem to care.

4. POSSIBLE SOLUTIONS:

a) Move the overflow checking just after the conversion of pg_tm -> Interval is made. This way, you can accurately predict if the result is really not store-able.

b) Because of 1), you have to declare tm_hour as int64, if you want to use that for the output. But, why not use Interval struct for printing directly, without intermediate pg_tm?

5. Offtopic: Correct the documentation, INTERVAL data size is 16 bytes, not 12.

Rok Kralj



--
eMail: rok.kralj@gmail.com

pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Kudos for Reviewers -- straw poll
Next
From: "Joshua D. Drake"
Date:
Subject: Re: Kudos for Reviewers -- straw poll