Hi,
On 2024-06-03 11:11:46 +0000, Bertrand Drouvot wrote:
> The main argument is that we currently don’t have writes counters for relations.
> The reason is that we don’t have the relation OID when writing buffers out.
> Tracking writes per relfilenode would allow us to track/consolidate writes per
> relation (example in the v1 patch and in the message up-thread).
>
> I think that adding instrumentation in this area (writes counters) could be
> beneficial (like it is for the ones we currently have for reads).
>
> Second argument is that this is also beneficial for the "Split index and
> table statistics into different types of stats" thread (mentioned in the previous
> message). It would allow us to avoid additional branches in some situations (like
> the one mentioned by Andres in the link I provided up-thread).

I think there's another *very* significant benefit:

Right now physical replication doesn't populate statistics fields like
n_dead_tup, which can be a huge issue after failovers, because there's little
information about what autovacuum needs to do.

Auto-analyze can *partially* fix this at times, if it's lucky enough to see
enough dead tuples - but that's not a given, and even when it works, the result
is often wildly inaccurate.

Once we put things like n_dead_tup into per-relfilenode stats, we can populate
them during WAL replay. Thus, after a promotion, autovacuum has much better
data.

This is also important when we crash: we've been talking about storing a
snapshot of the stats alongside each REDO pointer. Combined with updating
stats during crash recovery, we'll have accurate dead-tuple stats once recovery
has finished.
Greetings,
Andres Freund