last_statrequest is in the future - Mailing list pgsql-hackers

From Tom Lane
Subject last_statrequest is in the future
Msg-id 22006.1269445154@sss.pgh.pa.us
List pgsql-hackers
Well, I didn't actually think that this patch
http://archives.postgresql.org/pgsql-committers/2010-03/msg00181.php
would yield much insight, but lookee what we have here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguar&dt=2010-03-24%2004:00:07

[4ba99150.5099:483] LOG:  statement: VACUUM ANALYZE num_exp_add;
[4ba99145.5071:1] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:2] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:3] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:4] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:5] LOG:  last_statrequest is in the future, resetting
...
[4ba99145.5071:497] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:498] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:499] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:500] LOG:  last_statrequest is in the future, resetting
[4ba99150.5099:484] WARNING:  pgstat wait timeout

There are multiple occurrences of "pgstat wait timeout" in the
postmaster log (some evidently from autovacuum, because they don't show
up as regression diffs), and every one of them is associated with a
bunch of "last_statrequest is in the future" bleats.
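
One way to check that association is to tally messages per process. The sketch below is illustrative only: it assumes the bracketed prefix has the form [hex start time.hex pid:line number], which is how the excerpt above appears to be laid out, and the names LINE_RE, summarize and SAMPLE are made up for this example.

```python
import re
from collections import Counter

# Assumed layout: [<hex start time>.<hex pid>:<line number>] LEVEL:  message
LINE_RE = re.compile(r"^\[([0-9a-f]+)\.([0-9a-f]+):(\d+)\] (\w+):\s+(.*)$")


def summarize(log_text):
    """Count how often each (pid, message) pair occurs in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip blank lines and elisions such as "..."
        _start, pid_hex, _lineno, _level, message = match.groups()
        counts[(int(pid_hex, 16), message)] += 1
    return counts


# A few lines copied from the excerpt above.
SAMPLE = """
[4ba99150.5099:483] LOG:  statement: VACUUM ANALYZE num_exp_add;
[4ba99145.5071:1] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:2] LOG:  last_statrequest is in the future, resetting
...
[4ba99150.5099:484] WARNING:  pgstat wait timeout
"""

if __name__ == "__main__":
    for (pid, message), n in sorted(summarize(SAMPLE).items()):
        print(pid, n, message)
```

Under that assumption about the prefix, the "resetting" lines and the timeout warning carry different pid fields (5071 versus 5099), which fits one process complaining while another one waits.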

So at least on jaguar, it seems that the reason for this behavior is
that the system clock is significantly skewed between the stats
collector process and the backends, to the point where stats updates
generated by the collector will never appear new enough to satisfy the
requesting backends.  I think I'm going to go back and modify the code
to show the actual numbers involved so we can see just how bad it is ---
but the skew must be more than five seconds or we'd not be seeing this
failure.  That seems to me to put it in the class of "system bug".
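
To make the five-second argument concrete, here is a toy model of the polling, not the actual pgstat code. The constants are assumptions inferred from the log above (500 "resetting" lines before one timeout, and a five-second limit, hence 10 ms per poll); the model also assumes the collector rewrites the file at every poll and ignores any staleness allowance the real backend might apply. The name simulate_request is made up for this sketch.

```python
POLL_INTERVAL_MS = 10  # assumed delay between backend polls
MAX_POLLS = 500        # assumed poll limit: 500 * 10 ms = 5 seconds


def simulate_request(skew_ms):
    """Model one backend waiting for fresh stats.

    skew_ms is how far the backend's clock runs ahead of the collector's.
    Returns (got_fresh_stats, future_requests), where future_requests counts
    polls whose requested timestamp lay in the collector's future.
    """
    # The backend stamps its request with its own clock. Measured on the
    # collector's clock, that instant is skew_ms ahead of time zero.
    min_ts = skew_ms
    future_requests = 0
    for poll in range(MAX_POLLS):
        collector_now = poll * POLL_INTERVAL_MS
        file_ts = collector_now  # the file is stamped with the collector's clock
        if file_ts >= min_ts:
            return True, future_requests
        # From the collector's point of view the request is in the future.
        future_requests += 1
    return False, future_requests  # corresponds to "pgstat wait timeout"


if __name__ == "__main__":
    for skew in (0, 100, 6000):
        print(skew, simulate_request(skew))
```

In this model a skew of a few hundred milliseconds only delays the backend, while a skew of five seconds or more exhausts all 500 polls, giving 500 future-dated requests followed by a timeout, the same shape as the log excerpt.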

Should we redesign the stats signaling logic to work around this,
or just hope we can nag kernel people into fixing it?
        regards, tom lane

