Re: [BUG]: the walsender does not update its IO statistics until it exits - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [BUG]: the walsender does not update its IO statistics until it exits
Msg-id erpzwxoptqhuptdrtehqydzjapvroumkhh7lc6poclbhe7jk7l@l3yfsq5q4pw7
In response to Re: [BUG]: the walsender does not update its IO statistics until it exits  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hi,

On 2025-02-26 15:37:10 +0900, Michael Paquier wrote:
> That's bad, worse for a logical WAL sender, because it means that we
> have no idea what kind of I/O happens in this process until it exits,
> and logical WAL senders could loop forever, since v16 where we've
> begun tracking I/O.

FWIW, I think medium term we need to work on splitting stats flushing into two
separate kinds of flushes:
1) non-transactional stats, which should be flushed at a regular interval,
   unless a process is completely idle
2) transactional stats, which can only be flushed at transaction boundaries,
   because before the transaction boundary we don't know if e.g. newly
   inserted rows should be counted as live or dead

So far we have some timer logic for 2), but basically no support for 1).
That means we have weird ad-hoc logic in various kinds of
non-plain-connection processes, and that logic will often have holes, as
Bertrand noticed here.

I think it's also bad that we don't have a solution for 1), even just for
normal connections. If a backend causes a lot of IO we might want to know
about that long before the long-running transaction commits.

I suspect the right design here would be to have a generalized form of the
timeout mechanism we have for 2).

For that we'd first need to make sure that pgstat_report_stat() can be safely
called inside a transaction.  The second part would be to redesign the
IdleStatsUpdateTimeoutPending mechanism so that it is triggered independently
of idleness, without introducing unacceptable overhead - I think that's doable.
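
To make the split concrete, here is a rough standalone sketch of the two kinds
of flushes. It is not PostgreSQL code: the PendingStats struct, the function
names and the 1000ms interval are all made up for illustration, and the
"shared" counters just stand in for whatever other sessions can see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flush interval, not an actual PostgreSQL setting. */
#define FLUSH_INTERVAL_MS 1000

/* Hypothetical stand-in for one process's pending and published stats. */
typedef struct PendingStats
{
    uint64_t io_bytes_pending;      /* 1) non-transactional, flushable any time */
    uint64_t rows_inserted_pending; /* 2) transactional, outcome not yet known */
    uint64_t io_bytes_shared;       /* published counters */
    uint64_t live_rows_shared;
    uint64_t dead_rows_shared;
    bool     in_xact;
    int64_t  last_flush_ms;
} PendingStats;

static void
begin_xact(PendingStats *s)
{
    assert(!s->in_xact);
    s->in_xact = true;
}

static void
record_io(PendingStats *s, uint64_t bytes)
{
    s->io_bytes_pending += bytes;
}

static void
record_insert(PendingStats *s, uint64_t rows)
{
    assert(s->in_xact);
    s->rows_inserted_pending += rows;
}

/*
 * Kind 1): may run inside a transaction. Does nothing if the process has no
 * pending IO (i.e. it is idle) or if the interval has not elapsed yet.
 * Returns true if something was published.
 */
static bool
maybe_flush_nontransactional(PendingStats *s, int64_t now_ms)
{
    if (s->io_bytes_pending == 0)
        return false;
    if (now_ms - s->last_flush_ms < FLUSH_INTERVAL_MS)
        return false;

    s->io_bytes_shared += s->io_bytes_pending;
    s->io_bytes_pending = 0;
    s->last_flush_ms = now_ms;
    return true;
}

/*
 * Kind 2): only at the transaction boundary, because only now do we know
 * whether the inserted rows count as live (commit) or dead (abort).
 */
static void
end_xact(PendingStats *s, bool committed)
{
    assert(s->in_xact);
    if (committed)
        s->live_rows_shared += s->rows_inserted_pending;
    else
        s->dead_rows_shared += s->rows_inserted_pending;
    s->rows_inserted_pending = 0;
    s->in_xact = false;
}
```

With something shaped like this, a process that stays inside one transaction
for a long time would still publish its IO counters every interval, while the
row counts stay pending until end_xact() says which way they go.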



Greetings,

Andres Freund
