Thread: Re: add vacuum starttime columns
the user to monitor vacuum to ensure that vacuums
are completing on a relation at expected intervals.
I am not sure what value a start time will provide.
Can you provide a reason for this?
autovacuum
or auto analyze
on tables are not triggered in time, leading to suboptimal execution plans and some performance issues. When analyzing such problems, we often need to trace back to when autovacuum
or auto analyze
was triggered for the corresponding table. However, if the log_autovacuum_min_duration
parameter is not configured (as I’ve encountered in many cases where this parameter is either not set or has an inappropriate value), we cannot determine the trigger time and duration of the operation.Our idea is to directly query the pg_stat_all_tables
view to obtain the start time of (auto)vacuum/(auto)analyze for a table. This would help us monitor the duration of vacuum/analyze and determine whether it is necessary to tune and speed up the vacuum/analyze process.
Of course, to observe the duration of vacuum operations, we can configure the log_autovacuum_min_duration
parameter, but if there are many tables in the database, the vacuum entries in the logs might be quite numerous, making it difficult to analyze.
Additionally, we are currently considering whether it would be possible to add a last_(auto)vacuum_start
field to the pg_stat_all_tables
view. For tables where a vacuum operation is in progress, the last_(auto)vacuum
field may not be updated, and it may not be possible to estimate the vacuum duration using just these two fields.
However, this could still indicate whether a vacuum is in progress (if last_(auto)vacuum_start
is more recent than last_(auto)vacuum
, it means a vacuum is ongoing). While it is possible to monitor vacuum activity through the pg_stat_progress_vacuum
view, this view itself does not record timestamps, so additional views might be needed, which would be less convenient than querying pg_stat_all_tables
directly.
Therefore, we personally believe that adding such fields would be beneficial for monitoring the execution of (auto)vacuum and (auto)analyze.
Thanks
The last_(auto)vacuum is useful because it allows
the user to monitor vacuum to ensure that vacuums
are completing on a relation at expected intervals.
I am not sure what value a start time will provide.
Can you provide a reason for this?Regards,Sami Imseih
> However, if the log_autovacuum_min_duration parameter is not configured (as I’ve encountered in many cases where this parameteris either not set or has an inappropriate value), we cannot determine the trigger time and duration of the operation. log_autovacuum_min_duration logs when the vacuum completes and not when it starts. It does provide the elapsed time in the log as you say. > Our idea is to directly query the pg_stat_all_tables view to obtain the start time of (auto)vacuum/(auto)analyze for atable. This would help us monitor the duration of vacuum/analyze and determine whether it is necessary to tune and speedup the vacuum/analyze process. This is the crux of the question. I agree that pg_stat_all_tables is missing information to allow someone to determine if the vacuums for a table are becoming progressively longer to complete or to verify the vacuum is faster after some tuning was performed. The pg_stat_all_tables.vacuum(autovacuum)_count does not help answer these questions and one must then enable extra logging. > but if there are many tables in the database, the vacuum entries in the logs might be quite numerous, making it difficultto analyze. Agree > However, this could still indicate whether a vacuum is in progress (if last_(auto)vacuum_start is more recent than last_(auto)vacuum,it means a vacuum is ongoing). While it is possible to monitor vacuum activity through the pg_stat_progress_vacuumview, this view itself does not record timestamps, so additional views might be needed, which wouldbe less convenient than querying pg_stat_all_tables directly. The progress view does a good job of providing the phase of the vacuum which at least tells you if your vacuum is not "stuck" on a single operation. It is not really good at providing information to predict when the vacuum will complete. It is very difficult to actually make such a prediction. IMO, I think this situation could be improved, but I am not sure if a start_time is the best answer. Let's suppose you set the start_time at the beginning of the vacuum, then if the vacuum for some reason terminates early, then you are left with some inconsistent data. Thinking about this, it will be good to have cumulative statistics for elapsed times of a all vacuums on a relation ( since last reset ). This way, someone who is monitoring pg_stat_all_tables can divide the total accumulated elapsed time by the vacuum counts. This can give a good idea of how long vacuum is taking on a table and with enough sampling over time, one can derive patterns of vacuum slowing down, etc. Of course, the time can only be accumulated when the vacuum completes, which should be good enough. Regards, Sami Imseih
Of course, to observe the duration of vacuum operations, we can configure the
log_autovacuum_min_duration
parameter, but if there are many tables in the database, the vacuum entries in the logs might be quite numerous, making it difficult to analyze.
> Also, the log files give you historical overview that the pg_stat views simply cannot provide, > in addition to the actual details of what was vacuumed and why. Logfiles have the ability to provide more details and they have their place. However, one must also think about how much logging they want to enable and this limits how much they can learn about the system. A user also needs to come up with a custom solution to parse and process data. This is why we provide other (auto)vacuum/(auto)analyze metrics in pg_stat views. logs alone are not very practical to learn about such an important activity in the cluster. Being able to quickly answer "how long are vacuum/autovacuums are taking per table?" and being able to trend this information without enabling additional logging is a good user experience in my mind. > For what it's worth, log_autovacuum_min_duration is one of the few parameters that I always > recommend be turned on at the highest level (i.e. log it all). The extra log verbosity is well worth it. If we can avoid this additional logging, the better IMO. I would like to see us actually expose more vacuum level metrics in pg_stats, so maybe this is the time to think about potentially dedicated views for vacuum/analyze. Regards, Sami
worth mentioning here that there is discussion for tracking vacuum history in this recent thread [1], and this includes both start_time and end_time of the vacuum from what I can tell. postgres=# select * from vacuum_history ; start_time | end_time | dbid | relid | is_autovacuum | ..... It will also be good to see if there are other opinions for accumulating the elapsed time for vacuums in pg_stat_all_tables. [1] https://www.postgresql.org/message-id/flat/b68ab452-c41f-4d04-893f-eaab84f1855b%40vondra.me Regards, Sami