> On one of my servers stat collector went crasy on idle system.
>>> It constantly write around 50MB/s on idle database (no activity at all).
>>>
>>
> OK there are results of some entry-level gdb kung-fu:
>
> problem point is:
> /*
> * Write the stats file if a new request has arrived that is
> not
> * satisfied by existing file.
> */
> if (last_statwrite < last_statrequest)
> pgstat_write_statsfile(false);
>
> on some loop iteration difference between last_statrequest and
> last_statwrite is around 23000-29000 (e.g. 20-30ms).
>
> Hm may be there some clock drift between CPU cores?
>
What's made this situation interesting that the fact the code never got
inside
/*
* If there is clock skew between backends and the collector, we
could
* receive a stats request time that's in the future. If so,
complain
* and reset last_statrequest. Resetting ensures that no inquiry
* message can cause more than one stats file write to occur.
*/
if (last_statrequest > last_statwrite)
...
block.
Watchpoint on the (last_statwrite < last_statrequest) condition repetable
fire ONLY inside:
pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
{
if (msg->inquiry_time > last_statrequest)
last_statrequest = msg->inquiry_time;
}
function.
Now just some musings/theory (I might be dead wrong of course).
Lets suppose there are no clock skew actually and check execution flow.
Main loop executed:
if (last_statwrite < last_statrequest)
pgstat_write_statsfile(false);
and updated last_statwrite value.
However, it took some time because of file size.
During that period (while pgstat busy with writes) some backend issued
PGSTAT_MTYPE_INQUIRY request which will be succesfuly accepted on the next
code lines at
len = recv(pgStatSock, (char *) &msg,...
And as a result the last_statrequest will be updated to the
msg->inquiry_time value.
And this value will be definitely > than last_statwrite (because this
request issued after pgstat worker stared writing new copy of the stat
file).
So these could be case when this situation could happen without time
travel/clock drift.
Kind Regards,
Maksym