Re: [BUG]: the walsender does not update its IO statistics until it exits - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: [BUG]: the walsender does not update its IO statistics until it exits
Date	February 26 13:08:17
Msg-id	erpzwxoptqhuptdrtehqydzjapvroumkhh7lc6poclbhe7jk7l@l3yfsq5q4pw7
In response to	Re: [BUG]: the walsender does not update its IO statistics until it exits (Michael Paquier <michael@paquier.xyz>)
Responses	Re: [BUG]: the walsender does not update its IO statistics until it exits Re: [BUG]: the walsender does not update its IO statistics until it exits
List	pgsql-hackers

Hi,

On 2025-02-26 15:37:10 +0900, Michael Paquier wrote:
> That's bad, worse for a logical WAL sender, because it means that we
> have no idea what kind of I/O happens in this process until it exits,
> and logical WAL senders could loop forever, since v16 where we've
> begun tracking I/O.

FWIW, I think in the medium term we need to work on splitting stats flushing
into two separate kinds of flushes:
1) non-transactional stats, which should be flushed at a regular interval,
   unless a process is completely idle
2) transaction stats, which can only be flushed at transaction boundaries,
   because before the transaction boundary we don't know if e.g. newly
   inserted rows should be counted as live or dead

So far we have some timer logic for 2), but we have basically no support for
1), which means we have weird ad-hoc logic in various kinds of
non-plain-connection processes. That logic will often have holes, as Bertrand
noticed here.

I think it's also bad that we don't have a solution for 1), even just for
normal connections. If a backend causes a lot of IO we might want to know
about that long before the long-running transaction commits.

I suspect the right design here would be to have a generalized form of the
timeout mechanism we have for 2).

For that we'd need to make sure that pgstat_report_stat() can be safely called
inside a transaction.  The second part would be to redesign the
IdleStatsUpdateTimeoutPending mechanism so it is triggered independently of
idleness, without introducing unacceptable overhead - I think that's doable.
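
As a rough illustration of the split described above, here is a standalone toy
in C. It is not PostgreSQL code and does not use pgstat_report_stat() or
IdleStatsUpdateTimeoutPending; every name in it (ToyStats, toy_maybe_flush(),
TOY_FLUSH_INTERVAL_MS, and so on) is hypothetical, and the one-second interval
is an arbitrary assumption. It only shows the intended behavior: IO counters
are flushed on a timer even inside a transaction, while row counters wait for
the transaction boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flush interval; the value is an assumption for this sketch. */
#define TOY_FLUSH_INTERVAL_MS 1000

/* Toy model of one backend's stats. All names here are hypothetical. */
typedef struct ToyStats
{
    /* backend-local pending counters */
    uint64_t pending_io_reads;        /* non-transactional */
    uint64_t pending_tuples_inserted; /* transactional */

    /* stand-ins for the counters other sessions can see */
    uint64_t shared_io_reads;
    uint64_t shared_live_tuples;
    uint64_t shared_dead_tuples;

    bool     in_transaction;
    int64_t  last_flush_ms;
} ToyStats;

/* Kind 1: may run at any time, including inside a transaction. */
static void
toy_flush_nontransactional(ToyStats *s, int64_t now_ms)
{
    s->shared_io_reads += s->pending_io_reads;
    s->pending_io_reads = 0;
    s->last_flush_ms = now_ms;
}

/* Kind 2: only valid once the transaction outcome is known. */
static void
toy_flush_transactional(ToyStats *s, bool committed)
{
    if (committed)
        s->shared_live_tuples += s->pending_tuples_inserted;
    else
        s->shared_dead_tuples += s->pending_tuples_inserted;
    s->pending_tuples_inserted = 0;
}

/*
 * Timer-driven check, independent of idleness and of transaction state.
 * Returns true if a flush happened. A process with nothing pending
 * (completely idle) does no work.
 */
static bool
toy_maybe_flush(ToyStats *s, int64_t now_ms)
{
    assert(s != NULL);

    if (now_ms - s->last_flush_ms < TOY_FLUSH_INTERVAL_MS)
        return false;
    if (s->pending_io_reads == 0)
        return false;

    /* Deliberately leaves pending_tuples_inserted untouched. */
    toy_flush_nontransactional(s, now_ms);
    return true;
}

static void
toy_begin_transaction(ToyStats *s)
{
    s->in_transaction = true;
}

/* Transaction boundary: both kinds of stats can be flushed here. */
static void
toy_end_transaction(ToyStats *s, bool committed, int64_t now_ms)
{
    s->in_transaction = false;
    toy_flush_transactional(s, committed);
    toy_flush_nontransactional(s, now_ms);
}
```

In this toy, a long-running transaction that accumulates IO would have its IO
counters published by toy_maybe_flush() once the interval elapses, while its
inserted rows stay pending until toy_end_transaction() decides whether they
count as live or dead.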

Greetings,

Andres Freund
