Thread: Re: [PERFORM] autovacuum daemon stops doing work after about an hour

Re: [PERFORM] autovacuum daemon stops doing work after about an hour

From
Vivek Khera
Date:
>>>>> "MTO" == Matthew T O'Connor <matthew@zeut.net> writes:

MTO> Could this be the recently reported bug where time goes backwards on
MTO> FreeBSD?  Can anyone who knows more about this problem chime in, I know it
MTO> was recently discussed on Hackers.


Time does not go backwards -- the now and then variables are properly
incrementing in time as you see from the debugging output.

The error appears to be with the computation of the "diff".  It is
either a C programming error, or a compiler error.  I'm not a C "cop"
so I can't tell you which it is.

Witness this program, below, compiled as "cc -g -o t t.c" and the
output here:

% ./t
seconds = 3509
seconds1 = 3509000000
useconds = -452486
stepped diff = 3508547514
seconds2 = -785967296
seconds3 = 3509000000
diff = -786419782
long long diff = 3508547514
%

apperantly, if you compute (now.tv_sec - then.tv_sec) * 1000000 all at
once, it overflows since the RHS is all computed using longs rather
than long longs.  Fix is to cast at least one of the values to long
long on the RHS, as in the computation of seconds3 below.  compare
that to the computation of seconds2 and you'll see that this is the
cause.

I'd be curious to see the output of this program on other platforms
and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
4.8+.

That all being said, you should never sleep less than the base time,
and never for more than a max amount, perhaps 1 hour?


--cut here--
#include <sys/time.h>
#include <stdio.h>

int
main()
{
  struct timeval now, then;
  long long diff = 0;
  long long seconds, seconds1, seconds2, seconds3, useconds;

  now.tv_sec = 1070565077L;
  now.tv_usec = 216477L;

  then.tv_sec = 1070561568L;
  then.tv_usec = 668963L;

  seconds = now.tv_sec - then.tv_sec;
  printf("seconds = %lld\n",seconds);
  seconds1 = seconds * 1000000;
  printf("seconds1 = %lld\n",seconds1);
  useconds = now.tv_usec - then.tv_usec;
  printf("useconds = %lld\n",useconds);

  diff = seconds1 + useconds;
  printf("stepped diff = %lld\n",diff);

  /* this appears to be the culprit... it should be same as seconds1 */
  seconds2 = (now.tv_sec - then.tv_sec) * 1000000;
  printf("seconds2 = %lld\n",seconds2);

  /* seems we need to cast long's to long long's for this computation */
  seconds3 = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000;
  printf("seconds3 = %lld\n",seconds3);


  diff = (now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);
  printf ("diff = %lld\n",diff);

  diff = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);
  printf ("long long diff = %lld\n",diff);

  exit(0);
}


--cut here--

Re: [PERFORM] autovacuum daemon stops doing work after about an

From
Larry Rosenman
Date:

--On Thursday, December 04, 2003 16:20:22 -0500 Vivek Khera
<khera@kcilink.com> wrote:

>>>>>> "MTO" == Matthew T O'Connor <matthew@zeut.net> writes:
>
> MTO> Could this be the recently reported bug where time goes backwards on
> MTO> FreeBSD?  Can anyone who knows more about this problem chime in, I
> know it MTO> was recently discussed on Hackers.
>
>
> Time does not go backwards -- the now and then variables are properly
> incrementing in time as you see from the debugging output.
>
> The error appears to be with the computation of the "diff".  It is
> either a C programming error, or a compiler error.  I'm not a C "cop"
> so I can't tell you which it is.
>
> Witness this program, below, compiled as "cc -g -o t t.c" and the
> output here:
>
> % ./t
> seconds = 3509
> seconds1 = 3509000000
> useconds = -452486
> stepped diff = 3508547514
> seconds2 = -785967296
> seconds3 = 3509000000
> diff = -786419782
> long long diff = 3508547514
> %
>
> apperantly, if you compute (now.tv_sec - then.tv_sec) * 1000000 all at
> once, it overflows since the RHS is all computed using longs rather
> than long longs.  Fix is to cast at least one of the values to long
> long on the RHS, as in the computation of seconds3 below.  compare
> that to the computation of seconds2 and you'll see that this is the
> cause.
>
> I'd be curious to see the output of this program on other platforms
> and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
> 4.8+.
this is with the UnixWare compiler:
$ cc -O -o testvk testvk.c
$ ./testvk
seconds = 3509
seconds1 = 3509000000
useconds = -452486
stepped diff = 3508547514
seconds2 = -785967296
seconds3 = 3509000000
diff = -786419782
long long diff = 3508547514
$


I think this is a C bug.



>
> That all being said, you should never sleep less than the base time,
> and never for more than a max amount, perhaps 1 hour?
>
>
> --cut here--
># include <sys/time.h>
># include <stdio.h>
>
> int
> main()
> {
>   struct timeval now, then;
>   long long diff = 0;
>   long long seconds, seconds1, seconds2, seconds3, useconds;
>
>   now.tv_sec = 1070565077L;
>   now.tv_usec = 216477L;
>
>   then.tv_sec = 1070561568L;
>   then.tv_usec = 668963L;
>
>   seconds = now.tv_sec - then.tv_sec;
>   printf("seconds = %lld\n",seconds);
>   seconds1 = seconds * 1000000;
>   printf("seconds1 = %lld\n",seconds1);
>   useconds = now.tv_usec - then.tv_usec;
>   printf("useconds = %lld\n",useconds);
>
>   diff = seconds1 + useconds;
>   printf("stepped diff = %lld\n",diff);
>
>   /* this appears to be the culprit... it should be same as seconds1 */
>   seconds2 = (now.tv_sec - then.tv_sec) * 1000000;
>   printf("seconds2 = %lld\n",seconds2);
>
>   /* seems we need to cast long's to long long's for this computation */
>   seconds3 = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000;
>   printf("seconds3 = %lld\n",seconds3);
>
>
>   diff = (now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec -
> then.tv_usec);   printf ("diff = %lld\n",diff);
>
>   diff = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000 +
> (now.tv_usec - then.tv_usec);   printf ("long long diff = %lld\n",diff);
>
>   exit(0);
> }
>
>
> --cut here--
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>



--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

Attachment

Re: [PERFORM] autovacuum daemon stops doing work after about an

From
Vivek Khera
Date:
>>>>> "LR" == Larry Rosenman <ler@lerctr.org> writes:

>> I'd be curious to see the output of this program on other platforms
>> and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
>> 4.8+.
LR> this is with the UnixWare compiler:
LR> $ cc -O -o testvk testvk.c
LR> $ ./testvk
LR> seconds = 3509
LR> seconds1 = 3509000000
LR> useconds = -452486
LR> stepped diff = 3508547514
LR> seconds2 = -785967296
LR> seconds3 = 3509000000
LR> diff = -786419782
LR> long long diff = 3508547514
LR> $

LR> I think this is a C bug.

Upon further reflection, I think so to.  The entire RHS is long's so
the arithmetic is done in longs, then assigned to a long long when
done (after things have overflowed).  Forcing any one of the RHS
values to be long long causes the arithmetic to all be done using long
longs, and then you get the numbers you expect.

I think you only notice this in autovacuum when it takes a long time
to complete the work, like my example of about 3500 seconds.