Thread: Re: [PERFORM] autovacuum daemon stops doing work after about an hour
>>>>> "MTO" == Matthew T O'Connor <matthew@zeut.net> writes: MTO> Could this be the recently reported bug where time goes backwards on MTO> FreeBSD? Can anyone who knows more about this problem chime in, I know it MTO> was recently discussed on Hackers. Time does not go backwards -- the now and then variables are properly incrementing in time as you see from the debugging output. The error appears to be with the computation of the "diff". It is either a C programming error, or a compiler error. I'm not a C "cop" so I can't tell you which it is. Witness this program, below, compiled as "cc -g -o t t.c" and the output here: % ./t seconds = 3509 seconds1 = 3509000000 useconds = -452486 stepped diff = 3508547514 seconds2 = -785967296 seconds3 = 3509000000 diff = -786419782 long long diff = 3508547514 % apperantly, if you compute (now.tv_sec - then.tv_sec) * 1000000 all at once, it overflows since the RHS is all computed using longs rather than long longs. Fix is to cast at least one of the values to long long on the RHS, as in the computation of seconds3 below. compare that to the computation of seconds2 and you'll see that this is the cause. I'd be curious to see the output of this program on other platforms and other compilers. I'm using gcc 2.95.4 as shipped with FreeBSD 4.8+. That all being said, you should never sleep less than the base time, and never for more than a max amount, perhaps 1 hour? --cut here-- #include <sys/time.h> #include <stdio.h> int main() { struct timeval now, then; long long diff = 0; long long seconds, seconds1, seconds2, seconds3, useconds; now.tv_sec = 1070565077L; now.tv_usec = 216477L; then.tv_sec = 1070561568L; then.tv_usec = 668963L; seconds = now.tv_sec - then.tv_sec; printf("seconds = %lld\n",seconds); seconds1 = seconds * 1000000; printf("seconds1 = %lld\n",seconds1); useconds = now.tv_usec - then.tv_usec; printf("useconds = %lld\n",useconds); diff = seconds1 + useconds; printf("stepped diff = %lld\n",diff); /* this appears to be the culprit... it should be same as seconds1 */ seconds2 = (now.tv_sec - then.tv_sec) * 1000000; printf("seconds2 = %lld\n",seconds2); /* seems we need to cast long's to long long's for this computation */ seconds3 = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000; printf("seconds3 = %lld\n",seconds3); diff = (now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec); printf ("diff = %lld\n",diff); diff = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec); printf ("long long diff = %lld\n",diff); exit(0); } --cut here--
--On Thursday, December 04, 2003 16:20:22 -0500 Vivek Khera <khera@kcilink.com> wrote: >>>>>> "MTO" == Matthew T O'Connor <matthew@zeut.net> writes: > > MTO> Could this be the recently reported bug where time goes backwards on > MTO> FreeBSD? Can anyone who knows more about this problem chime in, I > know it MTO> was recently discussed on Hackers. > > > Time does not go backwards -- the now and then variables are properly > incrementing in time as you see from the debugging output. > > The error appears to be with the computation of the "diff". It is > either a C programming error, or a compiler error. I'm not a C "cop" > so I can't tell you which it is. > > Witness this program, below, compiled as "cc -g -o t t.c" and the > output here: > > % ./t > seconds = 3509 > seconds1 = 3509000000 > useconds = -452486 > stepped diff = 3508547514 > seconds2 = -785967296 > seconds3 = 3509000000 > diff = -786419782 > long long diff = 3508547514 > % > > apperantly, if you compute (now.tv_sec - then.tv_sec) * 1000000 all at > once, it overflows since the RHS is all computed using longs rather > than long longs. Fix is to cast at least one of the values to long > long on the RHS, as in the computation of seconds3 below. compare > that to the computation of seconds2 and you'll see that this is the > cause. > > I'd be curious to see the output of this program on other platforms > and other compilers. I'm using gcc 2.95.4 as shipped with FreeBSD > 4.8+. this is with the UnixWare compiler: $ cc -O -o testvk testvk.c $ ./testvk seconds = 3509 seconds1 = 3509000000 useconds = -452486 stepped diff = 3508547514 seconds2 = -785967296 seconds3 = 3509000000 diff = -786419782 long long diff = 3508547514 $ I think this is a C bug. > > That all being said, you should never sleep less than the base time, > and never for more than a max amount, perhaps 1 hour? > > > --cut here-- ># include <sys/time.h> ># include <stdio.h> > > int > main() > { > struct timeval now, then; > long long diff = 0; > long long seconds, seconds1, seconds2, seconds3, useconds; > > now.tv_sec = 1070565077L; > now.tv_usec = 216477L; > > then.tv_sec = 1070561568L; > then.tv_usec = 668963L; > > seconds = now.tv_sec - then.tv_sec; > printf("seconds = %lld\n",seconds); > seconds1 = seconds * 1000000; > printf("seconds1 = %lld\n",seconds1); > useconds = now.tv_usec - then.tv_usec; > printf("useconds = %lld\n",useconds); > > diff = seconds1 + useconds; > printf("stepped diff = %lld\n",diff); > > /* this appears to be the culprit... it should be same as seconds1 */ > seconds2 = (now.tv_sec - then.tv_sec) * 1000000; > printf("seconds2 = %lld\n",seconds2); > > /* seems we need to cast long's to long long's for this computation */ > seconds3 = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000; > printf("seconds3 = %lld\n",seconds3); > > > diff = (now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec - > then.tv_usec); printf ("diff = %lld\n",diff); > > diff = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000 + > (now.tv_usec - then.tv_usec); printf ("long long diff = %lld\n",diff); > > exit(0); > } > > > --cut here-- > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: ler@lerctr.org US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
Attachment
>>>>> "LR" == Larry Rosenman <ler@lerctr.org> writes: >> I'd be curious to see the output of this program on other platforms >> and other compilers. I'm using gcc 2.95.4 as shipped with FreeBSD >> 4.8+. LR> this is with the UnixWare compiler: LR> $ cc -O -o testvk testvk.c LR> $ ./testvk LR> seconds = 3509 LR> seconds1 = 3509000000 LR> useconds = -452486 LR> stepped diff = 3508547514 LR> seconds2 = -785967296 LR> seconds3 = 3509000000 LR> diff = -786419782 LR> long long diff = 3508547514 LR> $ LR> I think this is a C bug. Upon further reflection, I think so to. The entire RHS is long's so the arithmetic is done in longs, then assigned to a long long when done (after things have overflowed). Forcing any one of the RHS values to be long long causes the arithmetic to all be done using long longs, and then you get the numbers you expect. I think you only notice this in autovacuum when it takes a long time to complete the work, like my example of about 3500 seconds.