On Thu, Jan 26, 2023 at 3:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
> > I think we might risk overflowing "long" when all the wakeup times are
> > DT_NOEND:
>
> > * This is typically used to calculate a wait timeout for WaitLatch()
> > * or a related function. The choice of "long" as the result type
> > * is to harmonize with that. It is caller's responsibility that the
> > * input timestamps not be so far apart as to risk overflow of "long"
> > * (which'd happen at about 25 days on machines with 32-bit "long").
>
> > Maybe we can adjust that function or create a new one to deal with this.
>
> It'd probably be reasonable to file down that sharp edge by instead
> specifying that TimestampDifferenceMilliseconds will clamp overflowing
> differences to LONG_MAX. Maybe there should be a clamp on the underflow
> side too ... but should it be to LONG_MIN or to zero?

That got me curious... Why did WaitLatch() use long in the first
place? I see that it was in Heikki's original sketch[1], but I can't
think of any implementation reason for it.  Note that with the current
implementation of WaitLatch() et al, a LONG_MAX timeout would trip
WaitEventSetWait()'s assertion that the timeout is <= INT_MAX, so a
LONG_MAX clamp isn't right without further clamping.  Then internally,
WaitEventSetWaitBlock() takes an int, so there is an implicit cast to
int.  If I had to guess, I'd say the reasons for "long" in the API are
lost, and the WES rewrite used "int" because that's what poll() and
epoll_wait() wanted.
[1] https://www.postgresql.org/message-id/flat/4C72E85C.3000201%2540enterprisedb.com