Re: New criteria for autovacuum - Mailing list pgsql-hackers
From | Sami Imseih |
---|---|
Subject | Re: New criteria for autovacuum |
Date | |
Msg-id | CAA5RZ0sPur3KUt9JXW9J88Lcjosdz73m91SdJnXrFgat7Lqf1Q@mail.gmail.com Whole thread Raw |
In response to | Re: New criteria for autovacuum (Konstantin Knizhnik <knizhnik@garret.ru>) |
List | pgsql-hackers |
> From logical point of view I agree with you: taken in account number of inserted tuples makes sense if it allows to markpage as all-visible. > So `ins_since_vacuum` should be better renamed to `ins_all_visible_since_vacuum` and count only all-visible tuples. > If newly inserted tuple is not visible to all, then it should not be accounted in statistic and trigger autovacuum. cumulative stats are not updated for an in-flight transaction, so effectively, these rows are not accounted for by vacuum. I did just realize however that it is strange that a rolledback session will accumulate n_tup_ins and n_ins_since_vacuum # session 1 test=# create table t (id int); CREATE TABLE test=# begin; BEGIN test=*# insert into t values (1); INSERT 0 1 test=*# insert into t values (1); INSERT 0 1 test=*# insert into t values (1); INSERT 0 1 # session 2 test=# select n_tup_ins, n_tup_upd, n_tup_del, n_live_tup, n_dead_tup, n_ins_since_vacuum from pg_stat_all_tables where relname = 't'; n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | n_ins_since_vacuum -----------+-----------+-----------+------------+------------+-------------------- 0 | 0 | 0 | 0 | 0 | 0 (1 row) # session 1 ROLLBACK # session 2 test=# select n_tup_ins, n_tup_upd, n_tup_del, n_live_tup, n_dead_tup, n_ins_since_vacuum from pg_stat_all_tables where relname = 't'; n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | n_ins_since_vacuum -----------+-----------+-----------+------------+------------+-------------------- 3 | 0 | 0 | 0 | 3 | 3 (1 row) ROLLBACKs should not be the norm, but this looks like a bug to me as it may trigger vacuum based on insert threshold more often. Thoughts? > So I still think that maintaining count of heap visibility check is the best alternative. > It quite easy to implement, adds almost no overhead and this information indicates efficiency > of index-only scan. So it seems to be useful even if not used by autovacuum. I agree. It could aid in a user that wants to schedule some manual vacuum for tables that need very optimal index-only scans, or to per-table tune autovac for those types of tables. I actually think we should add a few columns as mentioned here [0] >> That being said, long-running transactions are a problem for >> autovacuum in general. Even if you track this stat you are proposing >> about heap fetches by index only scans, you won't know if the long >> running transaction is over and thus if it makes sense to try and >> trigger an autovacuum for that table again anyway. > This. It would be really useful to have some kind of a system for > figuring out when -- in terms of XIDs -- we ought to vacuum which > table. I think that's a hard problem, but it would help a lot of > people. hmm, isn't that what anti-wraparound autovac does? or, I may have missed the point here completely [0] https://www.postgresql.org/message-id/CAA5RZ0t1U38qtVAmg3epjh5RBbpT4VRB_Myfp0oGm_73w-UNRA%40mail.gmail.com -- Sami Imseih Amazon Web Services (AWS)
pgsql-hackers by date: