RE: stats.sql might fail due to shared buffers also used by parallel tests - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: stats.sql might fail due to shared buffers also used by parallel tests
Date
Msg-id OSCPR01MB14966BDD12141F158687AB1BCF557A@OSCPR01MB14966.jpnprd01.prod.outlook.com
In response to Re: stats.sql might fail due to shared buffers also used by parallel tests  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-hackers
Dear Alexander,

> > So according to me, I suspect the following causes
> > 1) The time difference between 'prev_stats_reset' and current
> > 'stats_reset' value is less than 1 microseconds.
> > 'stats_reset' is of type 'timestamp with time zone' and the content of
> > it is like: '2025-06-30 21:01:07.925253+05:30'. So if the time
> > difference between 'prev_stats_reset' and current 'stats_reset' is
> > less than 1 microseconds. The query 'SELECT :'prev_stats_reset' <
> > stats_reset FROM pg_stat_subscription_stats WHERE subname =
> > 'regress_testsub'' might return 'false' instead of 'true'.
> > But I was not able to reproduce such a scenario after multiple
> > testing. Even in high end machines, it takes at least a few
> > microseconds. Also there are multiple cases where we did similar
> > timestamp comparison in 'stats.sql' as well. And, I didn't find any
> > other failure related to such case. So, I feel this is not possible.
> 
> Did you try that on Windows (hamerkop is a Windows animal)? IIUC,
> GetCurrentTimestamp() -> gettimeofday() implemented on Windows via
> GetSystemTimePreciseAsFileTime(), and it has 100ns resolution,

Hmm. I'm not familiar with the Windows environment, but I have doubts about that.

GetSystemTimePreciseAsFileTime() returns a FILETIME structure, which represents
the time in UTC as a count of 100-nanosecond intervals [1]. The Stack Overflow
answer seems to refer to that. However, the documentation for
GetSystemTimePreciseAsFileTime() says that the resolution is < 1 us [2]. Also,
the MS doc [3] does not say that GetSystemTimePreciseAsFileTime() returns
monotonically increasing values. Another API, QueryPerformanceCounter(), seems
to be monotonic.

A somewhat old document [4] also raises the possibility:

```
Consecutive calls may return the same result. The call time is less than the
smallest increment of the system time. The granularity is in the sub-microsecond
regime. The function may be used for time measurements but some care has to be
taken: Time differences may be ZERO.
```
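
To illustrate the concern, here is a minimal sketch in Python (chosen only for
brevity). It assumes the clock reading is a count of 100 ns FILETIME ticks that
gets truncated to whole microseconds before being stored as a timestamp. The
helper names and tick values are hypothetical, and this is not the PostgreSQL
source. It shows that two readings taken within the same microsecond, or two
identical readings, make a strict '<' comparison return false.

```python
# Hypothetical illustration only; not PostgreSQL source code.
FILETIME_TICKS_PER_USEC = 10  # one FILETIME tick is 100 ns


def filetime_to_usec(ticks: int) -> int:
    """Truncate a 100 ns tick count to whole microseconds."""
    return ticks // FILETIME_TICKS_PER_USEC


def reset_looks_newer(prev_ticks: int, cur_ticks: int) -> bool:
    """Mimic the test's strict check: prev_stats_reset < stats_reset."""
    return filetime_to_usec(prev_ticks) < filetime_to_usec(cur_ticks)


# Made-up tick values, chosen only to show the truncation effect.
prev = 133_000_000_000_000_000
same_usec = prev + 3    # 300 ns later, still in the same microsecond
next_usec = prev + 10   # exactly 1 us later

print(reset_looks_newer(prev, same_usec))  # False
print(reset_looks_newer(prev, next_usec))  # True
print(reset_looks_newer(prev, prev))       # False (identical readings)
```

Whether two such readings can really be that close in the failing test is a
separate question; the sketch only shows what the comparison does if they are.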

Also, what if the system clock is modified during the test via NTP?
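
A rough sketch of that scenario, again in Python with made-up numbers: if the
wall clock is stepped backward between the two readings, the later reading can
compare as older. The step size and variable names are assumptions for
illustration (NTP often slews the clock gradually instead of stepping it). The
last lines contrast this with a monotonic clock, which is defined never to go
backward.

```python
import time


def strictly_later(before: int, after: int) -> bool:
    """Same strict comparison the test relies on."""
    return before < after


# Simulated wall-clock readings in microseconds (made-up values).
wall_before = 1_000_000
elapsed_usec = 500        # real time that passed between the two readings
clock_step_usec = -2_000  # hypothetical backward clock correction

wall_after = wall_before + elapsed_usec + clock_step_usec

# The second reading was taken later, but it compares as earlier.
print(strictly_later(wall_before, wall_after))  # False

# A monotonic clock cannot go backward, although two consecutive
# readings may still be equal.
m1 = time.monotonic_ns()
m2 = time.monotonic_ns()
print(m1 <= m2)  # True
```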

> > 2) pg_stat_reset_subscription_stats(oid) function did not reset the stats.
> > We have a shared hash 'pgStatLocal.shared_hash'. If the entry
> > reference (for the subscription) is not found while executing
> > 'pg_stat_reset_subscription_stats(oid)'. It  may not be able to reset
> > the stats. Maybe somehow this shared hash is getting dropped..
> > Also, it could be failing due to the same reason as Alexander has
> 
> I don't think 2) is relevant here, because shared buffers shouldn't affect
> subscription's statistics.

To confirm: we are not considering the possibility that pgstat_get_entry_ref()
returns NULL, right?

[1]: https://learn.microsoft.com/en-us/windows/win32/api/minwinbase/ns-minwinbase-filetime
[2]: https://learn.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getsystemtimepreciseasfiletime
[3]: https://learn.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps
[4]: http://www.windowstimestamp.com/description#:~:text=2.1.4.2.%C2%A0%C2%A0Desktop%20Applications%3A%20GetSystemTimePreciseAsFileTime()

Best regards,
Hayato Kuroda
FUJITSU LIMITED
 

