Re: n_ins_since_vacuum stats for aborted transactions - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: n_ins_since_vacuum stats for aborted transactions
Date
Msg-id CAKFQuwbk6HfVGmE-Ko7zVWqpYxv33ZSfkTC=528d4YQPKYjSFQ@mail.gmail.com
In response to Re: n_ins_since_vacuum stats for aborted transactions  (Sami Imseih <samimseih@gmail.com>)
Responses Re: n_ins_since_vacuum stats for aborted transactions
List pgsql-hackers
On Wednesday, April 9, 2025, Sami Imseih <samimseih@gmail.com> wrote:

>> What is the use case for that behavior?  Perhaps you have one, but until you make it explicit, it is hard for others to get behind your proposal.
>
> The point is to ensure that the pg_stats fields that autovacuum uses are supplied the correct values for the different thresholds they need to calculate, as described here [0]
>
> [0] https://www.postgresql.org/message-id/CAA5RZ0uDyGW1omWqWkxyW8NB1qzsKmXhnoMtzTBeRzSd4DMatQ%40mail.gmail.com


Except there isn’t some singular provably correct value here.  Today’s behavior (considering dead tuples) is neither intrinsically wrong nor intrinsically correct, and the same goes for what you propose (ignoring the dead tuples).  The fact that those dead tuples get removed regardless is a point in favor of counting them when deciding what to do.  It is also the long-standing behavior.  You need to make a compelling argument to change to your preference.

Inserting tuples in a transaction that then aborts, leaving them dead, moves the counters closer to both autovacuum thresholds.  There is no reason why that should be prohibited.  I can see the argument for making one threshold count dead tuples only and the other live tuples only - but I don’t favor that design.
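
To make the "both thresholds" point concrete, here is a rough sketch of the two trigger conditions in their simplified documented form (base threshold plus scale factor times reltuples). The class and function names are hypothetical, the defaults are the stock values of autovacuum_vacuum_threshold, autovacuum_vacuum_scale_factor, autovacuum_vacuum_insert_threshold and autovacuum_vacuum_insert_scale_factor, and the way aborted inserts are counted is only a model of the behavior discussed in this thread, not the server code.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical stand-in for the per-table counters autovacuum looks at."""
    reltuples: float
    n_dead_tup: int = 0
    n_ins_since_vacuum: int = 0


def vacuum_threshold(reltuples, base=50, scale=0.2):
    # Dead-tuple trigger: base threshold + scale factor * reltuples
    return base + scale * reltuples


def insert_threshold(reltuples, base=1000, scale=0.2):
    # Insert trigger: insert base threshold + insert scale factor * reltuples
    return base + scale * reltuples


def record_aborted_insert(stats, n):
    # Model of today's behavior as described in this thread: tuples inserted
    # by an aborted transaction count as dead AND as inserts since vacuum.
    stats.n_dead_tup += n
    stats.n_ins_since_vacuum += n


def needs_autovacuum(stats):
    # Either condition on its own is enough to schedule a vacuum.
    return (stats.n_dead_tup > vacuum_threshold(stats.reltuples)
            or stats.n_ins_since_vacuum > insert_threshold(stats.reltuples))
```

With reltuples = 10000 the dead-tuple threshold comes out at 2050 and the insert threshold at 3000, so a single aborted load of a few thousand rows advances both counters and trips the dead-tuple condition first.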

David J.
