Thread: about explain analyze

about explain analyze

From

Teodor Sigaev

Date:

27 November 2003, 13:14:53

Hello!

Explain analyze takes 3 times more time for execution. Why?

wow=# \timing
Timing is on.
wow=# select max(click.accesses) from click;   max
---------- 10944762
(1 row)

Time: 105,654 ms
wow=# explain analyze select max(click.accesses) from click;                                                   QUERY
PLAN
 

-----------------------------------------------------------------------------------------------------------------
Aggregate (cost=1103.70..1103.70 rows=1 width=8) (actual 
 
time=289.391..289.393 rows=1 loops=1)   ->  Seq Scan on click  (cost=0.00..971.36 rows=52936 width=8) (actual 
time=0.013..133.943 rows=52936 loops=1) Total runtime: 289.498 ms
(3 rows)

Time: 290,695 ms
wow=# select version();                                                 version 

--------------------------------------------------------------------------------------------------------- PostgreSQL
7.4on i386-unknown-freebsd5.1, compiled by GCC gcc (GCC) 3.2.2 
 
[FreeBSD] 20030205 (release)
(1 row)


-- 
Teodor Sigaev                                  E-mail: teodor@sigaev.ru

Re: about explain analyze

From

Tom Lane

Date:

28 November 2003, 15:48:57

Teodor Sigaev <teodor@sigaev.ru> writes:
> Explain analyze takes 3 times more time for execution. Why?

Measurement overhead.  It would seem your platform has a particularly
slow version of gettimeofday() though ... I've never noticed such a
large discrepancy myself.
        regards, tom lane

Re: about explain analyze

From

"Marc G. Fournier"

Date:

28 November 2003, 16:03:56

On Fri, 28 Nov 2003, Tom Lane wrote:

> Teodor Sigaev <teodor@sigaev.ru> writes:
> > Explain analyze takes 3 times more time for execution. Why?
>
> Measurement overhead.  It would seem your platform has a particularly
> slow version of gettimeofday() though ... I've never noticed such a
> large discrepancy myself.

Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE"
seems to be showing some *very* high ms for execution time, but if you run
the actual query, it doesn't seem to take even 1/10th the time reported
...

This is on a FreeBSD 4.x box ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: about explain analyze

From

Tom Lane

Date:

28 November 2003, 16:14:05

"Marc G. Fournier" <scrappy@postgresql.org> writes:
> Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE"
> seems to be showing some *very* high ms for execution time, but if you run
> the actual query, it doesn't seem to take even 1/10th the time reported
> ...

Example?  I don't see anything out of line here.
        regards, tom lane

Re: about explain analyze

From

"Marc G. Fournier"

Date:

28 November 2003, 17:02:27

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE"
> > seems to be showing some *very* high ms for execution time, but if you run
> > the actual query, it doesn't seem to take even 1/10th the time reported
> > ...
>
> Example?  I don't see anything out of line here.

pgsql74# /usr/local/pgsql/bin/psql -U pgsql -c "SELECT count(rec_id) FROM url" 186_archivescount
--------209872
(1 row)

pgsql74# /usr/local/pgsql/bin/psql -U pgsql -c "EXPLAIN ANALYZE SELECT max(rec_id) FROM url" 186_archives
                                    QUERY PLAN
 

--------------------------------------------------------------------------------------------------------------------Aggregate
(cost=11177.19..11177.19 rows=1 width=4) (actual time=2400.579..2400.585 rows=1 loops=1)  ->  Seq Scan on url
(cost=0.00..10741.55rows=174255 width=4) (actual time=0.056..1387.803 rows=209872 loops=1)Total runtime: 2407.095 ms
 
(3 rows)

pgsql74# time /usr/local/pgsql/bin/psql -U pgsql -c "SELECT max(rec_id) FROM url" 186_archives max
--------690582
(1 row)

0.000u 0.004s 0:00.32 0.0%      0+0k 0+0io 0pf+0w

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: about explain analyze

From

Teodor Sigaev

Date:

28 November 2003, 17:06:21

How many times is gettimeofday called?
wow=# explain analyze select 1;                                     QUERY PLAN 

------------------------------------------------------------------------------------ Result  (cost=0.00..0.01 rows=1
width=0)(actual time=0.008..0.010 rows=1 loops=1) Total runtime: 0.047 ms
 
(2 rows)

Several thousands?

Tom Lane wrote:
> Teodor Sigaev <teodor@sigaev.ru> writes:
> 
>>Explain analyze takes 3 times more time for execution. Why?
> 
> 
> Measurement overhead.  It would seem your platform has a particularly
> slow version of gettimeofday() though ... I've never noticed such a
> large discrepancy myself.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings

-- 
Teodor Sigaev                                  E-mail: teodor@sigaev.ru

Re: about explain analyze

From

Tom Lane

Date:

28 November 2003, 17:31:26

Teodor Sigaev <teodor@sigaev.ru> writes:
> How many times is gettimeofday called?

Twice per plan node visit, if you are doing EXPLAIN ANALYZE.  (The
number of "visits" is one more than the number of rows returned.)
        regards, tom lane

Re: about explain analyze

From

Tom Lane

Date:

28 November 2003, 18:47:52

> ... if its just me mis-reading the numbers, let me
> know ... it just "feels" off

Here's what I see:

> time psql -c "explain analyze SELECT count(rec_id) FROM url" 186_archives
      QUERY PLAN
 

--------------------------------------------------------------------------------------------------------------------Aggregate
(cost=11177.19..11177.19 rows=1 width=4) (actual time=3539.180..3539.184 rows=1 loops=1)  ->  Seq Scan on url
(cost=0.00..10741.55rows=174255 width=4) (actual time=0.111..2292.234 rows=215552 loops=1)Total runtime: 3539.922 ms
 
(3 rows)

0.006u 0.000s 0:03.57 0.0%      0+0k 0+0io 0pf+0w

So there doesn't seem to be any discrepancy between what EXPLAIN says
and what time(1) says.  It is true that the overhead of EXPLAIN ANALYZE
looks a bit high:

> time psql -c "SELECT count(rec_id) FROM url" 186_archivescount
--------215169
(1 row)

0.000u 0.004s 0:00.77 0.0%      0+0k 0+0io 0pf+0w

If I'm doing the arithmetic correctly this makes the measurement
overhead about 13 microseconds per row.  Presumably that's almost
entirely spent on the two gettimeofday() calls, so they are costing
upwards of 6 microseconds apiece, which seems like a lot for a modern
processor.  Might be worth griping to the BSD kernel folk...

Still though I think we could be proud that we've gotten the price of
a seqscan down to the point where a couple of gettimeofday() calls per
row are dominating the runtime.
        regards, tom lane

Re: about explain analyze

From

"Marc G. Fournier"

Date:

28 November 2003, 20:02:23

None that I've heard of ... Sean?

Tom, any way of writing a quite test C program for this?  Something to
'simulate' the same thing, but without having to build the whole
postmaster?

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > On Fri, 28 Nov 2003, Tom Lane wrote:
> >> Now that I recall, didn't you complain of something similar with a beta?
>
> > Yup ... and I bet its not reproducible yet again, is it? :)  That would
> > make for twice though, with v7.4, that I've come up with - results :)
>
> Well, it's not reproducibly negative, but it seems reproducibly wrong:
>
>  Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=49641.603..49641.611 rows=1 loops=1)
>    ->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual time=34.854..724754.474 rows=3570252
loops=1)
>  Total runtime: 49688.524 ms
>
>  Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=36625.013..36625.018 rows=1 loops=1)
>    ->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual time=0.128..-676113.173 rows=3572298
loops=1)
>  Total runtime: 36625.779 ms
>
>  Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=41380.881..41380.885 rows=1 loops=1)
>    ->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual time=0.091..718200.092 rows=3575264
loops=1)
>  Total runtime: 41381.504 ms
> (3 rows)
>
> I'm wondering if there's an actual bug in gettimeofday() in this
> platform, such that once in a while it returns a value that's off
> a minute or so ...
>
>             regards, tom lane
>

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

28 November 2003, 20:02:33

I said:
> I'm wondering if there's an actual bug in gettimeofday() in this
> platform, such that once in a while it returns a value that's off
> a minute or so ...

Whoa.  Try the following test program.  Then send it in to your friendly
local BSD hackers ....

> uname -a
FreeBSD pgsql74.hub.org 4.9-PRERELEASE FreeBSD 4.9-PRERELEASE #3: Sat Sep 20 14:16:48 ADT 2003
root@neptune.hub.org:/usr/obj/usr/src/sys/kernel i386
 

> ./a.out
out of order tv_sec: 1070050646 939970, prev 1070051342 516165
out of order tv_sec: 1070051342 789127, prev 1070050646 939970
out of order tv_usec: 1070051342 789127, prev 1070050646 939970
out of order tv_sec: 1070050654 690255, prev 1070051350 216448
out of order tv_sec: 1070051350 402412, prev 1070050654 690255
out of order tv_usec: 1070051350 402412, prev 1070050654 690255
out of order tv_sec: 1070050672 990359, prev 1070051368 416546
out of order tv_sec: 1070051368 501648, prev 1070050672 990359
out of order tv_usec: 1070051368 501648, prev 1070050672 990359
out of order tv_sec: 1070050672 990363, prev 1070051368 516557
out of order tv_sec: 1070051368 700227, prev 1070050672 990363
out of order tv_usec: 1070051368 700227, prev 1070050672 990363
out of order tv_sec: 1070050675 740524, prev 1070051371 216718
out of order tv_sec: 1070051371 346521, prev 1070050675 740524
out of order tv_usec: 1070051371 346521, prev 1070050675 740524
out of order tv_sec: 1070050676 440415, prev 1070051371 916605
out of order tv_usec: 1070050676 440415, prev 1070051371 916605
out of order tv_sec: 1070051372 238348, prev 1070050676 440415
out of order tv_usec: 1070051372 238348, prev 1070050676 440415
^C
> 
        regards, tom lane


#include <stdio.h>
#include <sys/time.h>

int
main()
{struct timeval prevtime;struct timeval curtime;
gettimeofday(&prevtime, NULL);
for (;;){    gettimeofday(&curtime, NULL);
    if (curtime.tv_usec < 0 || curtime.tv_usec >= 1000000)        fprintf(stderr, "bogus tv_usec: %ld %ld, prev %ld
%ld\n",               (long int) curtime.tv_sec,                (long int) curtime.tv_usec,                (long int)
prevtime.tv_sec,               (long int) prevtime.tv_usec);
 
    if (curtime.tv_sec < prevtime.tv_sec ||        curtime.tv_sec > prevtime.tv_sec + 1)        fprintf(stderr, "out of
ordertv_sec: %ld %ld, prev %ld %ld\n",                (long int) curtime.tv_sec,                (long int)
curtime.tv_usec,               (long int) prevtime.tv_sec,                (long int) prevtime.tv_usec);
 
    if (curtime.tv_usec < prevtime.tv_usec &&        curtime.tv_sec != prevtime.tv_sec + 1)        fprintf(stderr, "out
oforder tv_usec: %ld %ld, prev %ld %ld\n",                (long int) curtime.tv_sec,                (long int)
curtime.tv_usec,               (long int) prevtime.tv_sec,                (long int) prevtime.tv_usec);
 
    prevtime = curtime;}
return 0;
}

Re: about explain analyze

From

Tom Lane

Date:

28 November 2003, 20:08:36

"Marc G. Fournier" <scrappy@postgresql.org> writes:
> On Fri, 28 Nov 2003, Tom Lane wrote:
>> Now that I recall, didn't you complain of something similar with a beta?

> Yup ... and I bet its not reproducible yet again, is it? :)  That would
> make for twice though, with v7.4, that I've come up with - results :)

Well, it's not reproducibly negative, but it seems reproducibly wrong:
Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=49641.603..49641.611 rows=1 loops=1)  ->  Seq Scan on
ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual time=34.854..724754.474 rows=3570252 loops=1)Total runtime:
49688.524ms
 
Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=36625.013..36625.018 rows=1 loops=1)  ->  Seq Scan on
ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual time=0.128..-676113.173 rows=3572298 loops=1)Total runtime:
36625.779ms
 
Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=41380.881..41380.885 rows=1 loops=1)  ->  Seq Scan on
ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual time=0.091..718200.092 rows=3575264 loops=1)Total runtime:
41381.504ms
 
(3 rows)

I'm wondering if there's an actual bug in gettimeofday() in this
platform, such that once in a while it returns a value that's off
a minute or so ...
        regards, tom lane

Re: about explain analyze

From

Tom Lane

Date:

28 November 2003, 20:09:33

Teodor Sigaev <teodor@sigaev.ru> writes:
> Tom, I am afraid that I don't understand. My first example:
> wow=# explain analyze select max(click.accesses) from click;
>                                                     QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>   Aggregate  (cost=1103.70..1103.70 rows=1 width=8) (actual 
> time=289.391..289.393 rows=1 loops=1)
>     ->  Seq Scan on click  (cost=0.00..971.36 rows=52936 width=8) (actual 
> time=0.013..133.943 rows=52936 loops=1)
>   Total runtime: 289.498 ms
> (3 rows)

> Time: 290,695 ms

> select max() returns only one row... If you mean rows in Seq Scan than
> gettimeofday was called (52936+1)*2. Huge value :(

Right, that many times at the Seq Scan plan node, and (1+1)*2 more times
at the Aggregate node.
        regards, tom lane

Re: about explain analyze

From

"Marc G. Fournier"

Date:

28 November 2003, 20:10:31

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > did you happen to take a look at the other set of queries I sent you, that
> > were showing 39k and 41k explain results?
>
> You mean this one?

Yup ...

> > time psql -c "explain analyze select max(intag) from ndict3" 186_archives
>                                                          QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=48953.823..48953.827 rows=1 loops=1)
>    ->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual time=4.903..-666785.605 rows=3516680
loops=1)
>  Total runtime: 48982.514 ms
> (3 rows)
>
> 0.000u 0.005s 0:49.06 0.0%      0+0k 0+0io 0pf+0w
> > time psql -c "select max(intag) from ndict3" 186_archives
>     max
> ------------
>  2147418368
> (1 row)
>
> 0.000u 0.005s 0:03.06 0.0%      0+0k 0+0io 0pf+0w
>
> This looks like 46 seconds of overhead for 3516680 rows, or still right
> about 13 microseconds per row, so that's consistent.  The negative
> "actual time" measurement for the Seq Scan row seems pretty broken
> though :-(.
>
> Now that I recall, didn't you complain of something similar with a beta?

Yup ... and I bet its not reproducible yet again, is it? :)  That would
make for twice though, with v7.4, that I've come up with - results :)

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: about explain analyze

From

Teodor Sigaev

Date:

28 November 2003, 20:11:54

Tom, I am afraid that I don't understand. My first example:
wow=# explain analyze select max(click.accesses) from click;                                                   QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate (cost=1103.70..1103.70 rows=1 width=8) (actual 
 
time=289.391..289.393 rows=1 loops=1)   ->  Seq Scan on click  (cost=0.00..971.36 rows=52936 width=8) (actual 
time=0.013..133.943 rows=52936 loops=1) Total runtime: 289.498 ms
(3 rows)

Time: 290,695 ms

select max() returns only one row... If you mean rows in Seq Scan than
gettimeofday was called (52936+1)*2. Huge value :(



Tom Lane wrote:
> Teodor Sigaev <teodor@sigaev.ru> writes:
> 
>>How many times is gettimeofday called?
> 
> 
> Twice per plan node visit, if you are doing EXPLAIN ANALYZE.  (The
> number of "visits" is one more than the number of rows returned.)
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings

-- 
Teodor Sigaev                                  E-mail: teodor@sigaev.ru

Re: about explain analyze

From

Tom Lane

Date:

28 November 2003, 20:12:36

"Marc G. Fournier" <scrappy@postgresql.org> writes:
> did you happen to take a look at the other set of queries I sent you, that
> were showing 39k and 41k explain results?

You mean this one?

> time psql -c "explain analyze select max(intag) from ndict3" 186_archives
          QUERY PLAN
 

----------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=40168.96..40168.96 rows=1 width=4) (actual time=48953.823..48953.827 rows=1 loops=1)  ->  Seq Scan on ndict3
(cost=0.00..34560.57rows=2243357 width=4) (actual time=4.903..-666785.605 rows=3516680 loops=1)Total runtime: 48982.514
ms
(3 rows)

0.000u 0.005s 0:49.06 0.0%      0+0k 0+0io 0pf+0w
> time psql -c "select max(intag) from ndict3" 186_archives   max
------------2147418368
(1 row)

0.000u 0.005s 0:03.06 0.0%      0+0k 0+0io 0pf+0w

This looks like 46 seconds of overhead for 3516680 rows, or still right
about 13 microseconds per row, so that's consistent.  The negative
"actual time" measurement for the Seq Scan row seems pretty broken
though :-(.

Now that I recall, didn't you complain of something similar with a beta?
        regards, tom lane

Re: about explain analyze

From

"Marc G. Fournier"

Date:

28 November 2003, 20:14:07

did you happen to take a look at the other set of queries I sent you, that
were showing 39k and 41k explain results?  do they show pretty much the
same 'lag' in gettimeofday()?

On Fri, 28 Nov 2003, Tom Lane wrote:

> > ... if its just me mis-reading the numbers, let me
> > know ... it just "feels" off
>
> Here's what I see:
>
> > time psql -c "explain analyze SELECT count(rec_id) FROM url" 186_archives
>                                                      QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=11177.19..11177.19 rows=1 width=4) (actual time=3539.180..3539.184 rows=1 loops=1)
>    ->  Seq Scan on url  (cost=0.00..10741.55 rows=174255 width=4) (actual time=0.111..2292.234 rows=215552 loops=1)
>  Total runtime: 3539.922 ms
> (3 rows)
>
> 0.006u 0.000s 0:03.57 0.0%      0+0k 0+0io 0pf+0w
>
> So there doesn't seem to be any discrepancy between what EXPLAIN says
> and what time(1) says.  It is true that the overhead of EXPLAIN ANALYZE
> looks a bit high:
>
> > time psql -c "SELECT count(rec_id) FROM url" 186_archives
>  count
> --------
>  215169
> (1 row)
>
> 0.000u 0.004s 0:00.77 0.0%      0+0k 0+0io 0pf+0w
>
> If I'm doing the arithmetic correctly this makes the measurement
> overhead about 13 microseconds per row.  Presumably that's almost
> entirely spent on the two gettimeofday() calls, so they are costing
> upwards of 6 microseconds apiece, which seems like a lot for a modern
> processor.  Might be worth griping to the BSD kernel folk...
>
> Still though I think we could be proud that we've gotten the price of
> a seqscan down to the point where a couple of gettimeofday() calls per
> row are dominating the runtime.
>
>             regards, tom lane
>

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Darcy Buskermolen

Date:

28 November 2003, 22:53:51

On November 28, 2003 12:33 pm, Tom Lane wrote:
> I said:
> > I'm wondering if there's an actual bug in gettimeofday() in this
> > platform, such that once in a while it returns a value that's off
> > a minute or so ...
>
> Whoa.  Try the following test program.  Then send it in to your friendly
> local BSD hackers ....
>
> > uname -a
>
> FreeBSD pgsql74.hub.org 4.9-PRERELEASE FreeBSD 4.9-PRERELEASE #3: Sat Sep
> 20 14:16:48 ADT 2003     root@neptune.hub.org:/usr/obj/usr/src/sys/kernel 
> i386
>
> > ./a.out
>
> out of order tv_sec: 1070050646 939970, prev 1070051342 516165
> out of order tv_sec: 1070051342 789127, prev 1070050646 939970
> out of order tv_usec: 1070051342 789127, prev 1070050646 939970
> out of order tv_sec: 1070050654 690255, prev 1070051350 216448
> out of order tv_sec: 1070051350 402412, prev 1070050654 690255
> out of order tv_usec: 1070051350 402412, prev 1070050654 690255
> out of order tv_sec: 1070050672 990359, prev 1070051368 416546
> out of order tv_sec: 1070051368 501648, prev 1070050672 990359
> out of order tv_usec: 1070051368 501648, prev 1070050672 990359
> out of order tv_sec: 1070050672 990363, prev 1070051368 516557
> out of order tv_sec: 1070051368 700227, prev 1070050672 990363
> out of order tv_usec: 1070051368 700227, prev 1070050672 990363
> out of order tv_sec: 1070050675 740524, prev 1070051371 216718
> out of order tv_sec: 1070051371 346521, prev 1070050675 740524
> out of order tv_usec: 1070051371 346521, prev 1070050675 740524
> out of order tv_sec: 1070050676 440415, prev 1070051371 916605
> out of order tv_usec: 1070050676 440415, prev 1070051371 916605
> out of order tv_sec: 1070051372 238348, prev 1070050676 440415
> out of order tv_usec: 1070051372 238348, prev 1070050676 440415
> ^C
>


I've been running this code on a pair of FreeBSD (i386) boxen, for some time 
now, one of which is a 4.8-STABLE, the other is a 5.2-BETA.

In 10 minutes of execution the 4.8 box has only shown the following:

out of order tv_sec: 1070063393 115226, prev 1070063391 381087
out of order tv_usec: 1070063393 115226, prev 1070063391 381087

the 5.2 box, hasn't outputed anything.  I'm goignt o run this test on some 
other boxen  I have and see what/ifany results are happening.

>
>             regards, tom lane
>
>
> #include <stdio.h>
> #include <sys/time.h>
>
> int
> main()
> {
>     struct timeval prevtime;
>     struct timeval curtime;
>
>     gettimeofday(&prevtime, NULL);
>
>     for (;;)
>     {
>         gettimeofday(&curtime, NULL);
>
>         if (curtime.tv_usec < 0 || curtime.tv_usec >= 1000000)
>             fprintf(stderr, "bogus tv_usec: %ld %ld, prev %ld %ld\n",
>                     (long int) curtime.tv_sec,
>                     (long int) curtime.tv_usec,
>                     (long int) prevtime.tv_sec,
>                     (long int) prevtime.tv_usec);
>
>         if (curtime.tv_sec < prevtime.tv_sec ||
>             curtime.tv_sec > prevtime.tv_sec + 1)
>             fprintf(stderr, "out of order tv_sec: %ld %ld, prev %ld %ld\n",
>                     (long int) curtime.tv_sec,
>                     (long int) curtime.tv_usec,
>                     (long int) prevtime.tv_sec,
>                     (long int) prevtime.tv_usec);
>
>         if (curtime.tv_usec < prevtime.tv_usec &&
>             curtime.tv_sec != prevtime.tv_sec + 1)
>             fprintf(stderr, "out of order tv_usec: %ld %ld, prev %ld %ld\n",
>                     (long int) curtime.tv_sec,
>                     (long int) curtime.tv_usec,
>                     (long int) prevtime.tv_sec,
>                     (long int) prevtime.tv_usec);
>
>         prevtime = curtime;
>     }
>
>     return 0;
> }
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend

-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

28 November 2003, 23:15:01

Darcy Buskermolen <darcy@wavefire.com> writes:
> On November 28, 2003 12:33 pm, Tom Lane wrote:
>> Whoa.  Try the following test program.  Then send it in to your friendly
>> local BSD hackers ....

> I've been running this code on a pair of FreeBSD (i386) boxen, for some time 
> now, one of which is a 4.8-STABLE, the other is a 5.2-BETA.

Could it be a hardware problem on Marc's box?  Or specific to some other
aspect of that installation (Marc, is pgsql74.hub.org multi-CPU, for
example?)

The failure is definitely quite repeatable on pgsql74.hub.org.  I don't
see it on svr1.postgresql.org, though, which seems to be running almost
the same kernel.
        regards, tom lane

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

28 November 2003, 23:20:15

I said:
> The failure is definitely quite repeatable on pgsql74.hub.org.  I don't
> see it on svr1.postgresql.org, though, which seems to be running almost
> the same kernel.

After looking more closely, I take that back: svr1 is failing too,
though not as often:

> uname -a
FreeBSD svr1.postgresql.org 4.9-PRERELEASE FreeBSD 4.9-PRERELEASE #4: Sat Sep 20 14:41:58 ADT 2003
root@jupiter.hub.org:/usr/obj/usr/src/sys/kernel i386
 
> ./a.out
out of order tv_sec: 1070064182 410497, prev 1070064877 836691     <-----------
out of order tv_usec: 1070064182 410497, prev 1070064877 836691
out of order tv_sec: 1070064877 920579, prev 1070064182 410497
out of order tv_sec: 1070064901 126624, prev 1070064899 897160
out of order tv_usec: 1070064901 126624, prev 1070064899 897160
out of order tv_sec: 1070064907 306286, prev 1070064905 481423
out of order tv_usec: 1070064907 306286, prev 1070064905 481423
out of order tv_sec: 1070064218 861047, prev 1070064914 337241     <-----------
out of order tv_sec: 1070064914 570717, prev 1070064218 861047
out of order tv_usec: 1070064914 570717, prev 1070064218 861047
out of order tv_sec: 1070064241 411391, prev 1070064936 837585     <-----------
out of order tv_usec: 1070064241 411391, prev 1070064936 837585
out of order tv_sec: 1070064937 497925, prev 1070064241 411391
out of order tv_sec: 1070064251 811548, prev 1070064947 337739     <-----------
out of order tv_sec: 1070064947 508364, prev 1070064251 811548
out of order tv_usec: 1070064947 508364, prev 1070064251 811548

Maybe it's a 4.9-PRERELEASE bug?
        regards, tom lane

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Darcy Buskermolen

Date:

28 November 2003, 23:23:15

On November 28, 2003 12:33 pm, Tom Lane wrote:
> I said:
> > I'm wondering if there's an actual bug in gettimeofday() in this
> > platform, such that once in a while it returns a value that's off
> > a minute or so ...
>
> Whoa.  Try the following test program.  Then send it in to your friendly
> local BSD hackers ....
>
> > uname -a
>
> FreeBSD pgsql74.hub.org 4.9-PRERELEASE FreeBSD 4.9-PRERELEASE #3: Sat Sep
> 20 14:16:48 ADT 2003     root@neptune.hub.org:/usr/obj/usr/src/sys/kernel 
> i386
>
> > ./a.out
----8<------ snip
Further to my last email, after 30 minutes of running I was able to get 
simular behavior off the 5.2 test system.

-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Nigel J. Andrews"

Date:

28 November 2003, 23:30:48

On Fri, 28 Nov 2003, Tom Lane wrote:

> Darcy Buskermolen <darcy@wavefire.com> writes:
> > On November 28, 2003 12:33 pm, Tom Lane wrote:
> >> Whoa.  Try the following test program.  Then send it in to your friendly
> >> local BSD hackers ....
> 
> > I've been running this code on a pair of FreeBSD (i386) boxen, for some time 
> > now, one of which is a 4.8-STABLE, the other is a 5.2-BETA.
> 
> Could it be a hardware problem on Marc's box?  Or specific to some other
> aspect of that installation (Marc, is pgsql74.hub.org multi-CPU, for
> example?)
> 
> The failure is definitely quite repeatable on pgsql74.hub.org.  I don't
> see it on svr1.postgresql.org, though, which seems to be running almost
> the same kernel.

On an Intel Linux 2.4.18 I get them quite often, 25 in 1'45", but they are all
just a microsecond.

On an Intel FreeBSD 3.3 I had one just after starting the program that was 2
secs behind. Then I stopped to restart with a leading under time and I've not
seen any since. (4'50" it was running)  .529 .2

user/sys time = 0.53 on the Linux system
user/sys time = 0.2 on the FreeBSD one

Make of that what you will.

--
Nigel Andrews

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Darcy Buskermolen

Date:

28 November 2003, 23:34:30

On November 28, 2003 04:19 pm, Tom Lane wrote:
> I said:
> > The failure is definitely quite repeatable on pgsql74.hub.org.  I don't
> > see it on svr1.postgresql.org, though, which seems to be running almost
> > the same kernel.
>
> After looking more closely, I take that back: svr1 is failing too,
>

After 30+ minutes I'm now starting to see the same problems on my 5.2 testbed 
as well. I'm going to fire this test up on my FreeBSD 3.x and 2.x boxen and 
see if' it's there too. I can confirm that this not a SMP issue as that it's 
happeing on both UP and SMP boxen for me.



> though not as often:
> > uname -a
>
> FreeBSD svr1.postgresql.org 4.9-PRERELEASE FreeBSD 4.9-PRERELEASE #4: Sat
> Sep 20 14:41:58 ADT 2003    
> root@jupiter.hub.org:/usr/obj/usr/src/sys/kernel  i386
>
> > ./a.out
>
> out of order tv_sec: 1070064182 410497, prev 1070064877 836691    
> <----------- out of order tv_usec: 1070064182 410497, prev 1070064877
> 836691
> out of order tv_sec: 1070064877 920579, prev 1070064182 410497
> out of order tv_sec: 1070064901 126624, prev 1070064899 897160
> out of order tv_usec: 1070064901 126624, prev 1070064899 897160
> out of order tv_sec: 1070064907 306286, prev 1070064905 481423
> out of order tv_usec: 1070064907 306286, prev 1070064905 481423
> out of order tv_sec: 1070064218 861047, prev 1070064914 337241    
> <----------- out of order tv_sec: 1070064914 570717, prev 1070064218 861047
> out of order tv_usec: 1070064914 570717, prev 1070064218 861047
> out of order tv_sec: 1070064241 411391, prev 1070064936 837585    
> <----------- out of order tv_usec: 1070064241 411391, prev 1070064936
> 837585
> out of order tv_sec: 1070064937 497925, prev 1070064241 411391
> out of order tv_sec: 1070064251 811548, prev 1070064947 337739    
> <----------- out of order tv_sec: 1070064947 508364, prev 1070064251 811548
> out of order tv_usec: 1070064947 508364, prev 1070064251 811548
>
> Maybe it's a 4.9-PRERELEASE bug?
>
>             regards, tom lane

-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

28 November 2003, 23:54:51

"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> On an Intel Linux 2.4.18 I get them quite often, 25 in 1'45", but they
> are all just a microsecond.

What do you mean by "just a microsecond"?

Attached is a tightened-up test program that will only complain if the
value of gettimeofday goes backward (at all) or forward by more than
10 seconds (adjustable as MAX_SKIP).  This should be suitable to run on
moderately loaded machines where the test program might occasionally not
get dispatched for a few seconds.

I still see no failures on my own (non-BSD) machines, fairly frequent
failures on pgsql74.hub.org:

> ./a.out
out of order tv_sec: 1070066031 262048, prev 1070066726 688240
out of order tv_sec: 1070066726 785019, prev 1070066031 262048
out of order tv_sec: 1070066062 62622, prev 1070066757 588814
out of order tv_sec: 1070066757 771848, prev 1070066062 62622
out of order tv_sec: 1070066093 262974, prev 1070066788 689167
out of order tv_sec: 1070066788 777486, prev 1070066093 262974
out of order tv_sec: 1070066114 113410, prev 1070066809 589602
out of order tv_sec: 1070066809 724663, prev 1070066114 113410
out of order tv_sec: 1070066145 113899, prev 1070066840 590097
out of order tv_sec: 1070066840 726558, prev 1070066145 113899
out of order tv_sec: 1070066155 263911, prev 1070066850 690103
out of order tv_sec: 1070066850 781343, prev 1070066155 263911
out of order tv_sec: 1070066176 164307, prev 1070066871 590505
out of order tv_sec: 1070066871 643350, prev 1070066176 164307
out of order tv_sec: 1070066217 264846, prev 1070066912 691039
out of order tv_sec: 1070066912 775989, prev 1070066217 264846
out of order tv_sec: 1070066248 65394, prev 1070066943 591592
out of order tv_sec: 1070066943 773822, prev 1070066248 65394
^C

and rarer failures on cvs.postgresql.org:

> ./a.out
out of order tv_sec: 1070066099 389427, prev 1070066794 865617
out of order tv_sec: 1070066795 17855, prev 1070066099 389427
out of order tv_sec: 1070066252 541729, prev 1070066947 967921
out of order tv_sec: 1070066948 38715, prev 1070066252 541729
out of order tv_sec: 1070066371 393525, prev 1070067066 869715
out of order tv_sec: 1070067068 24754, prev 1070066371 393525
^C

It seems consistent that the error is 695 seconds and change
when it happens, and that the very next read gives a correct
(or at least plausible) value again.
        regards, tom lane


#include <stdio.h>
#include <sys/time.h>

#define MAX_SKIP  10

int
main()
{struct timeval prevtime;struct timeval curtime;
gettimeofday(&prevtime, NULL);
for (;;){    gettimeofday(&curtime, NULL);
    if (curtime.tv_usec < 0 || curtime.tv_usec >= 1000000)        fprintf(stderr, "bogus tv_usec: %ld %ld, prev %ld
%ld\n",               (long int) curtime.tv_sec,                (long int) curtime.tv_usec,                (long int)
prevtime.tv_sec,               (long int) prevtime.tv_usec);    else if (curtime.tv_sec < prevtime.tv_sec ||
curtime.tv_sec > prevtime.tv_sec + MAX_SKIP)        fprintf(stderr, "out of order tv_sec: %ld %ld, prev %ld %ld\n",
          (long int) curtime.tv_sec,                (long int) curtime.tv_usec,                (long int)
prevtime.tv_sec,               (long int) prevtime.tv_usec);    else if (curtime.tv_usec < prevtime.tv_usec &&
  curtime.tv_sec == prevtime.tv_sec)        fprintf(stderr, "out of order tv_usec: %ld %ld, prev %ld %ld\n",
   (long int) curtime.tv_sec,                (long int) curtime.tv_usec,                (long int) prevtime.tv_sec,
          (long int) prevtime.tv_usec);
 
    prevtime = curtime;}
return 0;
}

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Nigel J. Andrews"

Date:

29 November 2003, 00:03:38

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> > On an Intel Linux 2.4.18 I get them quite often, 25 in 1'45", but they
> > are all just a microsecond.
> 
> What do you mean by "just a microsecond"?

I mean it's always a "out of order tv_usec..." line and the difference is
1us. That is a.out gives:

out of order tv_usec: 1070065862 374978, prev 1070065862 374979
out of order tv_usec: 1070065867 814300, prev 1070065867 814301
out of order tv_usec: 1070065868 794176, prev 1070065868 794177
out of order tv_usec: 1070065871 553831, prev 1070065871 553832

etc.


> 
> Attached is a tightened-up test program that will only complain if the
> value of gettimeofday goes backward (at all) or forward by more than
> 10 seconds (adjustable as MAX_SKIP).  This should be suitable to run on
> moderately loaded machines where the test program might occasionally not
> get dispatched for a few seconds.

I don't think that would show any for me. I've only got two skips showing from
a longer run on my FreeBSD 3.3 and those differ in the seconds counter by only
2. Time between the occurances is about 1100s.


--
Nigel

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

29 November 2003, 00:06:03

Darcy Buskermolen <darcy@wavefire.com> writes:
> After 30+ minutes I'm now starting to see the same problems on my 5.2 testbed
> as well. I'm going to fire this test up on my FreeBSD 3.x and 2.x boxen and 
> see if' it's there too. I can confirm that this not a SMP issue as that it's 
> happeing on both UP and SMP boxen for me.

Good, that's one variable eliminated.

Looking at my own data, I notice that when the error happens, the
elapsed time shown between the immediately preceding and following
okay-looking timestamps is always significant (at least 100 msec
and often a second or more).  That is kind of a lot for a tight loop
containing one simple kernel call, no?  I am suspicious that the failure
occurs when gettimeofday() is called just as the process is losing
control of the CPU (due to using up its timeslice or whatever).  When
control eventually returns, the process gets a reading that is neither
pre-loss-of-CPU nor post-regain-of-CPU, but some unholy combination that
nets out as a time about 15 min in the past.  Just a theory, but it fits
some of the available facts.

Can anyone think of a significant interpretation for the number 695
seconds?  That's got to be an important clue ...
        regards, tom lane

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

29 November 2003, 00:09:09

"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> On Fri, 28 Nov 2003, Tom Lane wrote:
>> "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> On an Intel Linux 2.4.18 I get them quite often, 25 in 1'45", but they
> are all just a microsecond.
>> 
>> What do you mean by "just a microsecond"?

> I mean it's always a "out of order tv_usec..." line and the difference is
> 1us. That is a.out gives:

> out of order tv_usec: 1070065862 374978, prev 1070065862 374979
> out of order tv_usec: 1070065867 814300, prev 1070065867 814301

Fascinating.  I'd call that a bug too, but evidently one with a
different mechanism than the BSD issue we are chasing.

FWIW, I have not seen any failures in a fair amount of runtime on
a 2.4.18 (Red Hat 8.0) kernel here, running on a Dell P4.  What is
the hardware platform you're using?
        regards, tom lane

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Neil Conway

Date:

29 November 2003, 00:13:46

Tom Lane <tgl@sss.pgh.pa.us> writes:
> Whoa.  Try the following test program.

I tried this on two uniprocessor x86 machines:

(1)

$ uname -a
Linux tokyo 2.6.0-test10 #3 Mon Nov 24 13:43:54 EST 2003 i686 GNU/Linux

This machine produces this output, after running the test app for
about 30 minutes:

out of order tv_sec: 1070067959 929846, prev 1070067957 813066
out of order tv_usec: 1070068017 80145, prev 1070068017 80384
out of order tv_usec: 1070068017 104138, prev 1070068017 104368

(2)

$ uname -a
FreeBSD home.samurai.com 4.9-RELEASE FreeBSD 4.9-RELEASE #0: Thu Nov  6 21:25:41 EST 2003
bryanf@joe.samurai.com:/usr/obj/usr/src/sys/SAMURAI i386
 

This machine produces no output after running the test app for about
30 minutes.

-Neil

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Nigel J. Andrews"

Date:

29 November 2003, 00:14:01

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> > On Fri, 28 Nov 2003, Tom Lane wrote:
> >> "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> > On an Intel Linux 2.4.18 I get them quite often, 25 in 1'45", but they
> > are all just a microsecond.
> >> 
> >> What do you mean by "just a microsecond"?
> 
> > I mean it's always a "out of order tv_usec..." line and the difference is
> > 1us. That is a.out gives:
> 
> > out of order tv_usec: 1070065862 374978, prev 1070065862 374979
> > out of order tv_usec: 1070065867 814300, prev 1070065867 814301
> 
> Fascinating.  I'd call that a bug too, but evidently one with a
> different mechanism than the BSD issue we are chasing.
> 
> FWIW, I have not seen any failures in a fair amount of runtime on
> a 2.4.18 (Red Hat 8.0) kernel here, running on a Dell P4.  What is
> the hardware platform you're using?

Ah, I have made a mistake. It's only a 2.2.18 kernal. Dual SMP P-III, perhaps
that's the issue there.

And on the FreeBSD system I've got this: 

$ time ./a.out 2>&1 | tee a.txt
out of order tv_sec: 1070066197 273140, prev 1070066195 721010
out of order tv_usec: 1070066197 273140, prev 1070066195 721010
out of order tv_sec: 1070067322 116061, prev 1070067320 440490
out of order tv_usec: 1070067322 116061, prev 1070067320 440490
out of order tv_sec: 1070067833 514969, prev 1070067831 755019
out of order tv_usec: 1070067833 514969, prev 1070067831 755019
^C

real    38m53.026s
user    6m13.953s
sys     32m6.589s

So not very often there.


--
Nigel

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Marc G. Fournier"

Date:

29 November 2003, 00:22:20

On Fri, 28 Nov 2003, Darcy Buskermolen wrote:

> I've been running this code on a pair of FreeBSD (i386) boxen, for some time
> now, one of which is a 4.8-STABLE, the other is a 5.2-BETA.
>
> In 10 minutes of execution the 4.8 box has only shown the following:
>
> out of order tv_sec: 1070063393 115226, prev 1070063391 381087
> out of order tv_usec: 1070063393 115226, prev 1070063391 381087

I'm suspecting that the issue is amplified by load on the server itself
...

> the 5.2 box, hasn't outputed anything.  I'm goignt o run this test on
> some other boxen I have and see what/ifany results are happening.

Similar results here with a 5.2 box ...

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

29 November 2003, 00:23:00

"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> Ah, I have made a mistake. It's only a 2.2.18 kernal. Dual SMP P-III, perhaps
> that's the issue there.

Hm.  I seem to recall there were still lots of SMP bugs in the 2.2.*
kernels.

> And on the FreeBSD system I've got this: 

> $ time ./a.out 2>&1 | tee a.txt
> out of order tv_sec: 1070066197 273140, prev 1070066195 721010
> out of order tv_usec: 1070066197 273140, prev 1070066195 721010
> out of order tv_sec: 1070067322 116061, prev 1070067320 440490
> out of order tv_usec: 1070067322 116061, prev 1070067320 440490
> out of order tv_sec: 1070067833 514969, prev 1070067831 755019
> out of order tv_usec: 1070067833 514969, prev 1070067831 755019
> ^C

AFAICT the above is a legal trace, indicating only that the test program
sometimes lost control for more than a second at a time.  The "revised"
version of the test will not complain about this.

Time going backwards by even one microsecond, however, is generally
considered Bad News, unless you're actively manipulating the system
clock setting.

One variable I didn't think to ask about is whether you are running
NTP.  In my experience an ntp daemon that has achieved lock will never
step the clock back by even 1 usec (it's supposed to use much more
subtle methods than that to manage the clock ;-)) but maybe under
unstable conditions such things could happen.  The machines I have
tested here all run NTP.
        regards, tom lane

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Marc G. Fournier"

Date:

29 November 2003, 00:23:16

On Fri, 28 Nov 2003, Tom Lane wrote:

> Could it be a hardware problem on Marc's box?  Or specific to some other
> aspect of that installation (Marc, is pgsql74.hub.org multi-CPU, for
> example?)

All the servers are SMP ... pgsql74 is on a Dual-Xeon ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

29 November 2003, 00:26:02

"Marc G. Fournier" <scrappy@postgresql.org> writes:
> I'm suspecting that the issue is amplified by load on the server itself

That would fit right in with my idea that the failure occurs when the
process loses its timeslice partway through gettimeofday().  Heavier
load would presumably make it more likely to lose the CPU.

BTW, I can't get top(1) to work on either pgsql74 or svr1, any idea why?
        regards, tom lane

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Marc G. Fournier"

Date:

29 November 2003, 00:34:11

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > I'm suspecting that the issue is amplified by load on the server itself
>
> That would fit right in with my idea that the failure occurs when the
> process loses its timeslice partway through gettimeofday().  Heavier
> load would presumably make it more likely to lose the CPU.

I'm posting to the freebsd lists about it, see if they have anything to
say ...

>
> BTW, I can't get top(1) to work on either pgsql74 or svr1, any idea why?

top is one of those things that do not work within a VM ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Nigel J. Andrews"

Date:

29 November 2003, 00:47:02

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> 
> > $ time ./a.out 2>&1 | tee a.txt
> > out of order tv_sec: 1070066197 273140, prev 1070066195 721010
> > out of order tv_usec: 1070066197 273140, prev 1070066195 721010
> > out of order tv_sec: 1070067322 116061, prev 1070067320 440490
> > out of order tv_usec: 1070067322 116061, prev 1070067320 440490
> > out of order tv_sec: 1070067833 514969, prev 1070067831 755019
> > out of order tv_usec: 1070067833 514969, prev 1070067831 755019
> > ^C
> 
> AFAICT the above is a legal trace, indicating only that the test program
> sometimes lost control for more than a second at a time.  The "revised"
> version of the test will not complain about this.

Doh! I was looking for time going backwards so that's what I saw, even though I
already new the same complaint would be made if time skipped forward a second
or so.

> One variable I didn't think to ask about is whether you are running
> NTP.  In my experience an ntp daemon that has achieved lock will never
> step the clock back by even 1 usec (it's supposed to use much more
> subtle methods than that to manage the clock ;-)) but maybe under
> unstable conditions such things could happen.  The machines I have
> tested here all run NTP.

Yes, all NTPed. The FreeBSD system, the one that doesn't skip back *smacks
forehead*, is the local time server but is generally lightly loaded. Not to say
that there's not going to be the odd job it has to run which could make the
system busy and time time away from this test. It's just difficult to say that
every so often there's some task that makes the system unresponsive for a
couple of seconds, which is what this is saying.

Anyway, I like your suggestion as the timeslicing being a possible
cause. Seems a little strange though, presumably all the numbers going into the
timeofday calc are gathered by the one system call at a place past where the
thread could have had it's time sliced away and before the next. However, I
have no idea how the BSD kernel operates so can't do more than assume and
guess.

--
Nigel

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Marc G. Fournier"

Date:

29 November 2003, 01:07:48

On Fri, 28 Nov 2003, Marc G. Fournier wrote:

> On Fri, 28 Nov 2003, Tom Lane wrote:
>
> > "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > > I'm suspecting that the issue is amplified by load on the server itself
> >
> > That would fit right in with my idea that the failure occurs when the
> > process loses its timeslice partway through gettimeofday().  Heavier
> > load would presumably make it more likely to lose the CPU.
>
> I'm posting to the freebsd lists about it, see if they have anything to
> say ...

Just got my first response on this, and, in fact, this is 'acceptable
behaviour' to a certain extent ... there is a kernel tuneable called
kern.timecounter.method, that is either 1 or 0:
    0    sacrifice precision in favor of faster code execution
    1    return the more precise time (the same as calling microtime() or         nanotime()).
    The intent of the getmicrotime() and getnanotime() functions is to    enforce the user's preference for timer
accuracyversus execution time.
 

Now, reading the man page, there seems to be a 'microtime' vs
'getmicrotime' function ... these might be preferable to using
gettimeofday on FreeBSD?
----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

29 November 2003, 01:10:09

"Marc G. Fournier" <scrappy@postgresql.org> writes:
> Just got my first response on this, and, in fact, this is 'acceptable
> behaviour' to a certain extent ... there is a kernel tuneable called
> kern.timecounter.method, that is either 1 or 0:

15 minutes of error is acceptable?
        regards, tom lane

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Marc G. Fournier"

Date:

29 November 2003, 01:15:36

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > Just got my first response on this, and, in fact, this is 'acceptable
> > behaviour' to a certain extent ... there is a kernel tuneable called
> > kern.timecounter.method, that is either 1 or 0:
>
> 15 minutes of error is acceptable?

Apparently *shrug*  That was just the first answer back though, so we'll
see what happens ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Tom Lane

Date:

29 November 2003, 01:25:45

"Marc G. Fournier" <scrappy@postgresql.org> writes:
>> 15 minutes of error is acceptable?

> Apparently *shrug*  That was just the first answer back though, so we'll
> see what happens ...

BTW, it seems from this thread that the BIND guys have seen even more
bogus behavior than this from gettimeofday() on BSD.  Their attitude is
clearly "BSD is broken":

http://www.isc.org/ml-archives/bind-users/2001/01/msg01246.html

        regards, tom lane

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Marc G. Fournier"

Date:

29 November 2003, 01:34:59

On Fri, 28 Nov 2003, Neil Conway wrote:

> $ uname -a
> FreeBSD home.samurai.com 4.9-RELEASE FreeBSD 4.9-RELEASE #0: Thu Nov  6 21:25:41 EST 2003
bryanf@joe.samurai.com:/usr/obj/usr/src/sys/SAMURAI i386
 
>
> This machine produces no output after running the test app for about
> 30 minutes.

what is kern.timecounter.method set to?

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Neil Conway

Date:

29 November 2003, 03:52:48

"Marc G. Fournier" <scrappy@postgresql.org> writes:
> On Fri, 28 Nov 2003, Neil Conway wrote:
>> $ uname -a
>> FreeBSD home.samurai.com 4.9-RELEASE [...]
>
> what is kern.timecounter.method set to?

$ sysctl kern.timecounter.method 
kern.timecounter.method: 0

FYI, I ran the test program for another 30 minutes (an hour in total)
without any output.

-Neil

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

"Marc G. Fournier"

Date:

29 November 2003, 03:55:29

On Fri, 28 Nov 2003, Neil Conway wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > On Fri, 28 Nov 2003, Neil Conway wrote:
> >> $ uname -a
> >> FreeBSD home.samurai.com 4.9-RELEASE [...]
> >
> > what is kern.timecounter.method set to?
>
> $ sysctl kern.timecounter.method
> kern.timecounter.method: 0
>
> FYI, I ran the test program for another 30 minutes (an hour in total)
> without any output.

Great, I love consistent results :)  Some have 5.x that do it, some don't
... 4.x does it on all my machines, but not yours ...

What kinda processor are you running?  Intel or AMD?  Mine are all Intel
...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Neil Conway

Date:

29 November 2003, 04:22:06

"Marc G. Fournier" <scrappy@postgresql.org> writes:
> What kinda processor are you running?

From dmesg:

Timecounter "i8254"  frequency 1193182 Hz
CPU: Intel Celeron (631.29-MHz 686-class CPU) Origin = "GenuineIntel"  Id = 0x686  Stepping = 6
Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>

-Neil

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Gavin Sherry

Date:

29 November 2003, 11:10:25

I ran Tom's second script for two hours on an AMD Duron on Linux 2.4.18.
No problems.

Gavin

Re: gettimeofday() goes backwards on FreeBSD 4.9

From

Kurt Roeckx

Date:

29 November 2003, 12:15:34

On Fri, Nov 28, 2003 at 08:22:30PM -0500, Tom Lane wrote:
> 
> One variable I didn't think to ask about is whether you are running
> NTP.  In my experience an ntp daemon that has achieved lock will never
> step the clock back by even 1 usec (it's supposed to use much more
> subtle methods than that to manage the clock ;-)) but maybe under
> unstable conditions such things could happen.  The machines I have
> tested here all run NTP.

ntpd will slew the clock in case the difference is small enough.
Which means it will tell the kernel to slow down time a little.

In case the offset is too large (0.128 seconds) it will tell the
kernel to step the clock.

ntpd also has an option to "always slew".  But it's broken on
atleast Linux 2.4.

Note that the kernel uses values from different places to
calculate what the current time is, and they're not all running
on the same frequency.  I wouldn't worry about a 1 us step back.

Kurt