Thread: seeming overflow during avg() of intervals without errors/warnings

seeming overflow during avg() of intervals without errors/warnings

From
Vick Khera
Date:
I'm attempting to compute an average age of some records based on a create_date field.

The simple query of

select avg(now() - user_created) from user_list where owner_id = 1 and user_status = 'active';

returns nonsense: -487 days -06:46:14.61501

The minimum user_created value is '2006-06-05 09:31:13-04', and the clock is correct for the current time. The user_created field is type "timestamp(0) with timezone, not null"

Couple of other interesting points:

select min(now() - user_created) from user_list where owner_id=1;
       min
-----------------
 00:58:10.419604

select max(now() - user_created) from user_list where owner_id=1;
            max
---------------------------
 2535 days 23:50:57.208109

So basically, it seems like however avg() on intervals is being computed it is overflowing somewhere without warning. I cannot figure out how to issue appropriate casts to make it not do so. In this case there are several million records that are involved.

I did find an alternate query that does not return nonsense:

select avg(age(now(),user_created)) from user_list where owner_id = 1 and user_status = 'active';

which says: 2 years 3 mons 42 days 19:19:15.100571

PostgreSQL 9.2.4 on FreeBSD 9.1 64-bit.

Advice? The original query seems to work for other subsets of the user_list, so to have it fail in such a crazy manor without any warnings is troublesome.
Vick Khera <vivek@khera.org> writes:
> So basically, it seems like however avg() on intervals is being computed it
> is overflowing somewhere without warning.

Hmmm ... yeah, it looks like interval_pl is missing any overflow check:

regression=# select '2147483648 days'::interval;
ERROR:  interval field value out of range: "2147483648 days"
LINE 1: select '2147483648 days'::interval;
               ^
regression=# select '2147483647 days'::interval;
    interval
-----------------
 2147483647 days
(1 row)

regression=# select '2147483647 days'::interval + '1 day'::interval;
     ?column?
------------------
 -2147483648 days
(1 row)

Somebody ought to fix that.

A larger question is whether it'd be worth the trouble for interval
avg() to use some wider internal representation so it could avoid
the overflow.  Given the lack of prior complaints I'm inclined to
suspect it's not.

            regards, tom lane


Re: seeming overflow during avg() of intervals without errors/warnings

From
Vick Khera
Date:
On Wed, May 15, 2013 at 10:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Somebody ought to fix that.

A larger question is whether it'd be worth the trouble for interval
avg() to use some wider internal representation so it could avoid
the overflow.  Given the lack of prior complaints I'm inclined to
suspect it's not.


Should I file a formal bug report? Personally, I care not either way it is fixed, but I never want to get garbage answers.

Thanks for looking into it.