Thread: Vacuum statistics

Vacuum statistics

From
Alena Rybakina
Date:
Hello, everyone!

I think we don't have enough information to analyze vacuum functionality.

Needless to say that the vacuum is the most important process for a 
database system. It prevents problems like table and index bloating and 
emergency freezing if we have a wraparound problem. Furthermore, it 
keeps the visibility map up to date. On the other hand, because of 
incorrectly adjusted aggressive settings of autovacuum it can consume a 
lot of computing resources that lead to all queries to the system 
running longer.

Nowadays the vacuum gathers statistical information about tables, but it 
is important not for optimizer only.

Because the vacuum is an automation process, there are a lot of settings 
that determine their aggressive functionality to other objects of the 
database. Besides, sometimes it is important to set a correct parameter 
for the specified table, because of its dynamic changes.

An administrator of a database needs to set the settings of autovacuum 
to have a balance between the vacuum's useful action in the database 
system on the one hand, and the overhead of its workload on the other. 
However, it is not enough for him to decide on vacuum functionality 
through statistical information about the number of vacuum passes 
through tables and operational data from progress_vacuum, because it is 
available only during vacuum operation and does not provide a strategic 
overview over the considered period.

To sum up, an automation vacuum has a strategic behavior because the 
frequency of its functionality and resource consumption depends on the 
workload of the database. Its workload on the database is minimal for an 
append-only table and it is a maximum for the table with a 
high-frequency updating. Furthermore, there is a high dependence of the 
vacuum load on the number and volume of indexes. Because of the absence 
of the visibility map for indexes, the vacuum scans the index 
completely, and the worst situation when it needs to do it during a 
bloating index situation in a small table.


I suggest gathering information about vacuum resource consumption for 
processing indexes and tables and storing it in the table and index 
relationships (for example, PgStat_StatTabEntry structure like it has 
realized for usual statistics). It will allow us to determine how well 
the vacuum is configured and evaluate the effect of overhead on the 
system at the strategic level, the vacuum has gathered this information 
already, but this valuable information doesn't store it.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Vacuum statistics

From
Alena Rybakina
Date:
On 30.05.2024 10:33, Alena Rybakina wrote:
>
> I suggest gathering information about vacuum resource consumption for 
> processing indexes and tables and storing it in the table and index 
> relationships (for example, PgStat_StatTabEntry structure like it has 
> realized for usual statistics). It will allow us to determine how well 
> the vacuum is configured and evaluate the effect of overhead on the 
> system at the strategic level, the vacuum has gathered this 
> information already, but this valuable information doesn't store it.
>
My colleagues and I have prepared a patch that can help to solve this 
problem.

We are open to feedback.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: Vacuum statistics

From
Andrei Zubkov
Date:
Hi,

Th, 30/05/2024 at 10:33 -0700, Alena Rybakina wrote:
> I suggest gathering information about vacuum resource consumption for
> processing indexes and tables and storing it in the table and index
> relationships (for example, PgStat_StatTabEntry structure like it has
> realized for usual statistics). It will allow us to determine how
> well
> the vacuum is configured and evaluate the effect of overhead on the
> system at the strategic level, the vacuum has gathered this
> information
> already, but this valuable information doesn't store it.
>
It seems a little bit unclear to me, so let me explain a little the
point of a proposition.

As the vacuum process is a backend it has a workload instrumentation.
We have all the basic counters available such as a number of blocks
read, hit and written, time spent on I/O, WAL stats and so on.. Also,
we can easily get some statistics specific to vacuum activity i.e.
number of tuples removed, number of blocks removed, number of VM marks
set and, of course the most important metric - time spent on vacuum
operation.

All those statistics must be stored by the Cumulative Statistics System
on per-relation basis. I mean individual cumulative counters for every
table and every index in the database.

Such counters will provide us a clear view about vacuum workload on
individual objects of the database, providing means to measure the
efficiency of performed vacuum fine tuning.
--
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Vacuum statistics

From
Dilip Kumar
Date:
On Thu, May 30, 2024 at 11:57 PM Alena Rybakina
<lena.ribackina@yandex.ru> wrote:
>
> On 30.05.2024 10:33, Alena Rybakina wrote:
> >
> > I suggest gathering information about vacuum resource consumption for
> > processing indexes and tables and storing it in the table and index
> > relationships (for example, PgStat_StatTabEntry structure like it has
> > realized for usual statistics). It will allow us to determine how well
> > the vacuum is configured and evaluate the effect of overhead on the
> > system at the strategic level, the vacuum has gathered this
> > information already, but this valuable information doesn't store it.
> >
> My colleagues and I have prepared a patch that can help to solve this
> problem.
>
> We are open to feedback.

I was reading through the patch here are some initial comments.

--
+typedef struct LVExtStatCounters
+{
+ TimestampTz time;
+ PGRUsage ru;
+ WalUsage walusage;
+ BufferUsage bufusage;
+ int64 VacuumPageMiss;
+ int64 VacuumPageHit;
+ int64 VacuumPageDirty;
+ double VacuumDelayTime;
+ PgStat_Counter blocks_fetched;
+ PgStat_Counter blocks_hit;
+} LVExtStatCounters;


I noticed that you are storing both pgBufferUsage and
VacuumPage(Hit/Miss/Dirty) stats. Aren't these essentially the same?
It seems they both exist in the system because some code, like
heap_vacuum_rel(), uses pgBufferUsage, while do_analyze_rel() still
relies on the old counters. And there is already a patch to remove
those old counters.


--
+static Datum
+pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int ncolumns)
+{

I don't think you need this last parameter (ncolumns) we can anyway
fetch that from tupledesc, so adding an additional parameter
just for checking doesn't look good to me.

--
+ /* Tricky turn here: enforce pgstat to think that our database us dbid */
+
+ MyDatabaseId = dbid;

typo
/think that our database us dbid/think that our database has dbid

Also, remove the blank line between the comment and the next code
block that is related to that comment.


--
  VacuumPageDirty = 0;
+ VacuumDelayTime = 0.;

There is an extra "." after 0


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Vacuum statistics

From
Alena Rybakina
Date:

Hi! Thank you for your interest in this topic!

On 07.06.2024 09:46, Dilip Kumar wrote:
On Thu, May 30, 2024 at 11:57 PM Alena Rybakina
<lena.ribackina@yandex.ru> wrote:
On 30.05.2024 10:33, Alena Rybakina wrote:
I suggest gathering information about vacuum resource consumption for
processing indexes and tables and storing it in the table and index
relationships (for example, PgStat_StatTabEntry structure like it has
realized for usual statistics). It will allow us to determine how well
the vacuum is configured and evaluate the effect of overhead on the
system at the strategic level, the vacuum has gathered this
information already, but this valuable information doesn't store it.

My colleagues and I have prepared a patch that can help to solve this
problem.

We are open to feedback.
I was reading through the patch here are some initial comments.

--
+typedef struct LVExtStatCounters
+{
+ TimestampTz time;
+ PGRUsage ru;
+ WalUsage walusage;
+ BufferUsage bufusage;
+ int64 VacuumPageMiss;
+ int64 VacuumPageHit;
+ int64 VacuumPageDirty;
+ double VacuumDelayTime;
+ PgStat_Counter blocks_fetched;
+ PgStat_Counter blocks_hit;
+} LVExtStatCounters;


I noticed that you are storing both pgBufferUsage and
VacuumPage(Hit/Miss/Dirty) stats. Aren't these essentially the same?
It seems they both exist in the system because some code, like
heap_vacuum_rel(), uses pgBufferUsage, while do_analyze_rel() still
relies on the old counters. And there is already a patch to remove
those old counters.
I agree with you and I have fixed it.

--
+static Datum
+pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int ncolumns)
+{

I don't think you need this last parameter (ncolumns) we can anyway
fetch that from tupledesc, so adding an additional parameter
just for checking doesn't look good to me.
To be honest, I'm not sure if ncolumns should be deleted at all, because the pg_stats_vacuum function is used to display three different types of statistics: for tables, indexes, and databases. We use this parameter to pass information about the number of parameters (or how many statistics we expect) depending on the type of statistics. For example, table vacuum statistics contain 27 parameters, while indexes and databases contain 19 and 15 parameters, respectively. You can see that the pg_stats_vacuum function contains an Assert that checks that the expected number of tupledesc parameters matches the actual number.

Assert(tupdesc->natts == ncolumns);


Perhaps I can convert it to a local parameter and determine its value already in the function, for example:

pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int ncolumns)
{

int columns = 0;

switch (type)

{

case PGSTAT_EXTVAC_HEAP:

ncolumns = EXTVACHEAPSTAT_COLUMNS;

break;

case PGSTAT_EXTVAC_INDEX:

ncolumns = EXTVACINDEXSTAT_COLUMNS;

break;

case PGSTAT_EXTVAC_DB:

ncolumns = EXTVACDBSTAT_COLUMNS;

break; }

...

}

What do you think?

--
+ /* Tricky turn here: enforce pgstat to think that our database us dbid */
+
+ MyDatabaseId = dbid;

typo
/think that our database us dbid/think that our database has dbid

Also, remove the blank line between the comment and the next code
block that is related to that comment.


--  VacuumPageDirty = 0;
+ VacuumDelayTime = 0.;

There is an extra "." after 0


Thank you, I fixed it.


In addition to these changes, I fixed the problem with displaying vacuum statistics for databases: I found an error in defining the pg_stats_vacuum_database system view. In addition, I rewrote the tests and converted them into a regression test. In addition, I have divided the test to test the functionality of the output of vacuum statistics into two tests: one of them checks the functionality of tables and databases, and the other - indexes. This is caused by a problem with the vacuum functionality when the table contains an index. You can find more information about this here: [0] and [1].

I attached the diff to this letter.

[0] https://www.postgresql.org/message-id/d1ca3a1d-7ead-41a7-bfd0-5b66ad97b1cd%40yandex.ru

[1] https://www.postgresql.org/message-id/CAH2-Wznv94Q_Td8OS8bAN7fYLpfU6CGgjn6Xau5eJ_sDxEGeBA%40mail.gmail.com


I am currently working on dividing this patch into three parts to simplify the review process: one of them will contain code for collecting vacuum statistics on tables, the second on indexes and the last on databases. I also write the documentation.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: Vacuum statistics

From
Alena Rybakina
Date:

Hi!

On 11.06.2024 16:09, Alena Rybakina wrote:
On 08.06.2024 09:30, Alena Rybakina wrote:

I am currently working on dividing this patch into three parts to simplify the review process: one of them will contain code for collecting vacuum statistics on tables, the second on indexes and the last on databases.

I have divided the patch into three: the first patch contains code for the functionality of collecting and storage for tables, the second one for indexes and the last one for databases.
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Vacuum statistics

From
Alena Rybakina
Date:

Hi!

On 11.06.2024 16:09, Alena Rybakina wrote:
On 08.06.2024 09:30, Alena Rybakina wrote:

I am currently working on dividing this patch into three parts to simplify the review process: one of them will contain code for collecting vacuum statistics on tables, the second on indexes and the last on databases.

I have divided the patch into three: the first patch contains code for the functionality of collecting and storage for tables, the second one for indexes and the last one for databases.
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: Vacuum statistics

From
Alena Rybakina
Date:

I have written the documentary and attached the patch.

On 08.06.2024 09:30, Alena Rybakina wrote:

I am currently working on dividing this patch into three parts to simplify the review process: one of them will contain code for collecting vacuum statistics on tables, the second on indexes and the last on databases. I also write the documentation.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: Vacuum statistics

From
Masahiko Sawada
Date:
On Fri, May 31, 2024 at 4:19 AM Andrei Zubkov <zubkov@moonset.ru> wrote:
>
> Hi,
>
> Th, 30/05/2024 at 10:33 -0700, Alena Rybakina wrote:
> > I suggest gathering information about vacuum resource consumption for
> > processing indexes and tables and storing it in the table and index
> > relationships (for example, PgStat_StatTabEntry structure like it has
> > realized for usual statistics). It will allow us to determine how
> > well
> > the vacuum is configured and evaluate the effect of overhead on the
> > system at the strategic level, the vacuum has gathered this
> > information
> > already, but this valuable information doesn't store it.
> >
> It seems a little bit unclear to me, so let me explain a little the
> point of a proposition.
>
> As the vacuum process is a backend it has a workload instrumentation.
> We have all the basic counters available such as a number of blocks
> read, hit and written, time spent on I/O, WAL stats and so on.. Also,
> we can easily get some statistics specific to vacuum activity i.e.
> number of tuples removed, number of blocks removed, number of VM marks
> set and, of course the most important metric - time spent on vacuum
> operation.

I've not reviewed the patch closely but it sounds helpful for users. I
would like to add a statistic, the high-water mark of memory usage of
dead tuple TIDs. Since the amount of memory used by TidStore is hard
to predict, I think showing the high-water mark would help users to
predict how much memory they set to maintenance_work_mem.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Vacuum statistics

From
Andrei Zubkov
Date:
Hello!

On Thu, 27/06/2024 at 10:39 +0900, Masahiko Sawada:
> On Fri, May 31, 2024 at 4:19 AM Andrei Zubkov <zubkov@moonset.ru>
> wrote:
> > As the vacuum process is a backend it has a workload
> > instrumentation.
> > We have all the basic counters available such as a number of blocks
> > read, hit and written, time spent on I/O, WAL stats and so on..
> > Also,
> > we can easily get some statistics specific to vacuum activity i.e.
> > number of tuples removed, number of blocks removed, number of VM
> > marks
> > set and, of course the most important metric - time spent on vacuum
> > operation.
>
> I've not reviewed the patch closely but it sounds helpful for users.
> I
> would like to add a statistic, the high-water mark of memory usage of
> dead tuple TIDs. Since the amount of memory used by TidStore is hard
> to predict, I think showing the high-water mark would help users to
> predict how much memory they set to maintenance_work_mem.
>
Thank you for your interest on this patch. I've understand your idea.
The obvious goal of it is to avoid expensive index multi processing
during vacuum of the heap. Provided statistics in the patch contain the
index_vacuum_count counter for each table which can be compared to the
pg_stat_all_tables.vacuum_count to detect specific relation index
multi-passes. Previous setting of maintenance_work_mem is known. Usage
of TidStore should be proportional to the amount of dead-tuples vacuum
workload on the table, so as the first evaluation we can take the
number of index passes per one heap pass as a maintenance_work_mem
multiplier.

But there is a better way. Once we detected the index multiprocessing
we can lower the vacuum workload for the heap pass making vacuum a
little bit more aggressive for this particular relation. I mean, in
such case increasing maintenance_work_mem is not only decision.

Suggested high-water mark statistic can't be used as cumulative
statistic - any high-water mark statistic as maximim-like statistic is
valid for certain time period thus should be reset on some kind of
schedule. Without resets it should reach 100% once under the heavy load
and stay there forever.

Said that such high-water mark seems a little bit unclear and
complicated for the DBA. It seems redundant to me right now. I can see
the main value of such statistic is to avoid too large
maintenance_work_mem setting. But I can't see really dramatic
consequences of that. Maybe I've miss something..

--
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Vacuum statistics

From
Ilia Evdokimov
Date:
Hello, everyone!


Thank you for your interesting patch with extended information 
statistics about autovacuum.

Do you consider not to create new table in pg_catalog but to save 
statistics in existing table? I mean pg_class or 
pg_stat_progress_analyze, pg_stat_progress_vacuum?

P.S. If I sent this mail twice, I'm sorry :)


Regards

Ilia Evdokimov,

Tantor Labs.



Re: Vacuum statistics

From
Andrei Zubkov
Date:
Hi, Ilia!

> Do you consider not to create new table in pg_catalog but to save
> statistics in existing table? I mean pg_class or
> pg_stat_progress_analyze, pg_stat_progress_vacuum?
>
Thank you for your interest on our patch!

*_progress views is not our case. They hold online statistics while
vacuum is in progress. Once work is done on a table the entry is gone
from those views. Idea of this patch is the opposite - it doesn't
provide online statistics but it accumulates statistics about rosources
consumed by all vacuum passes over all relations. It's much closer to
the pg_stat_all_tables than pg_stat_progress_vacuum.

It seems pg_class is not the right place because it is not a statistic
view - it holds the current relation state and haven't anything about
the relation workload.

Maybe the pg_stat_all_tables is the right place but I have several
thoughts about why it is not:
- Some statistics provided by this patch is really vacuum specific. I
don't think we want them in the relation statistics view.
- Postgres is extreamly extensible. I'm sure someday there will be
table AMs that does not need the vacuum at all.

Right now vacuum specific workload views seems optimal choice to me.

Regards,

--
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Vacuum statistics

From
Kirill Reshke
Date:
On Wed, 12 Jun 2024 at 11:38, Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>
> Hi!
>
> On 11.06.2024 16:09, Alena Rybakina wrote:
>
> On 08.06.2024 09:30, Alena Rybakina wrote:
>
> I am currently working on dividing this patch into three parts to simplify the review process: one of them will
containcode for collecting vacuum statistics on tables, the second on indexes and the last on databases.
 
>
> I have divided the patch into three: the first patch contains code for the functionality of collecting and storage
fortables, the second one for indexes and the last one for databases.
 
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
Hi!
Few suggestions on this patch-set

1)
> +{ oid => '4701',
> +  descr => 'pg_stats_vacuum_tables return stats values',
> +  proname => 'pg_stats_vacuum_tables', provolatile => 's', prorettype => 'record',proisstrict => 'f',
> +  proretset => 't',
> +  proargtypes => 'oid oid',
> +  proallargtypes =>

During development, OIDs should be picked up from range 8000-9999.
Same for pg_stats_vacuum_database & pg_stats_vacuum_indexes

Also, why are these function naming schemes like
pg_stats_vacuum_*something*, not pg_stat_vacuum_*something*, like
pg_stat_replication etc?

2) In 0003:
> +  proargnames =>
'{dboid,dboid,db_blks_read,db_blks_hit,total_blks_dirtied,total_blks_written,wal_records,wal_fpi,wal_bytes,blk_read_time,blk_write_time,delay_time,system_time,user_time,total_time,interrupts}',

Repeated dboid arg name is strange. Is it done this way to make
pg_stats_vacuum function call in more unified fashion? I don't see any
other place within postgresql core with similar approach, so I doubt
it is correct.

3) 0001 patch vacuum_tables_statistics test creates
statistic_vacuum_database1, but does not use it. 0003 do.
Also I'm not sure if these additional checks on the second database
adds much value. Can you justify this please?

Other places look more or less fine to me.
However, I'll maybe post some additional nit-picky comments on the
next patch version.

-- 
Best regards,
Kirill Reshke



Re: Vacuum statistics

From
Alena Rybakina
Date:

Hi! Thank you for inquiring about our topic!

On 10.08.2024 23:57, Kirill Reshke wrote:
Hi!
Few suggestions on this patch-set

1)
+{ oid => '4701',
+  descr => 'pg_stats_vacuum_tables return stats values',
+  proname => 'pg_stats_vacuum_tables', provolatile => 's', prorettype => 'record',proisstrict => 'f',
+  proretset => 't',
+  proargtypes => 'oid oid',
+  proallargtypes =>
During development, OIDs should be picked up from range 8000-9999.
Same for pg_stats_vacuum_database & pg_stats_vacuum_indexes

Also, why are these function naming schemes like
pg_stats_vacuum_*something*, not pg_stat_vacuum_*something*, like
pg_stat_replication etc?
To be honest, when I named it, I missed this aspect. I thought about the plural vacuum statistics we show, so I named them. I fixed it.
2) In 0003:
+  proargnames => '{dboid,dboid,db_blks_read,db_blks_hit,total_blks_dirtied,total_blks_written,wal_records,wal_fpi,wal_bytes,blk_read_time,blk_write_time,delay_time,system_time,user_time,total_time,interrupts}',
Repeated dboid arg name is strange. Is it done this way to make
pg_stats_vacuum function call in more unified fashion? I don't see any
other place within postgresql core with similar approach, so I doubt
it is correct.
Both parameters are required for input and output. We are trying to find statistics for a specific database if the database oid was specified by the user or display statistics for all objects, but we need to display which database these statistics are for. I corrected the name of the first parameter.
3) 0001 patch vacuum_tables_statistics test creates
statistic_vacuum_database1, but does not use it. 0003 do.
Also I'm not sure if these additional checks on the second database
adds much value. Can you justify this please?

The statistic_vacuum_database1 needs us to check the visible of statistics from another database (statistic_vacuum_database) as they are after the manipulation with tables in another database, and after deleting the vestat table . In the latter case, we need to be sure that all the table statistics are not visible to us.

So, I agree that it should be added only in the latest version of the patch, where we add vacuum statistics for databases. I fixed it.

Other places look more or less fine to me.
However, I'll maybe post some additional nit-picky comments on the
next patch version.
We are glad any feedback and review, so feel free to do it)

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: Vacuum statistics

From
Ilia Evdokimov
Date:

On 10.8.24 22:37, Andrei Zubkov wrote:

Hi, Ilia!

Do you consider not to create new table in pg_catalog but to save 
statistics in existing table? I mean pg_class or 
pg_stat_progress_analyze, pg_stat_progress_vacuum?

Thank you for your interest on our patch!

*_progress views is not our case. They hold online statistics while
vacuum is in progress. Once work is done on a table the entry is gone
from those views. Idea of this patch is the opposite - it doesn't
provide online statistics but it accumulates statistics about rosources
consumed by all vacuum passes over all relations. It's much closer to
the pg_stat_all_tables than pg_stat_progress_vacuum.

It seems pg_class is not the right place because it is not a statistic
view - it holds the current relation state and haven't anything about
the relation workload.

Maybe the pg_stat_all_tables is the right place but I have several
thoughts about why it is not:
- Some statistics provided by this patch is really vacuum specific. I
don't think we want them in the relation statistics view.
- Postgres is extreamly extensible. I'm sure someday there will be
table AMs that does not need the vacuum at all.

Right now vacuum specific workload views seems optimal choice to me.

Regards,


Agreed. They are not god places to store such statistics.


I have some suggestions:

  1. pgstatfuncs.c in functions tuplestore_put_for_database() and tuplestore_put_for_relation you can remove 'nulls' array if you're sure that columns cannot be NULL.
  2. These functions are almost the same and I would think of writing one function depending of type 'ExtVacReportType'

Re: Vacuum statistics

From
Ilia Evdokimov
Date:
And I have one suggestion for pg_stat_vacuum_database: I suppose we 
should add database's name column after 'dboid' column because it is 
difficult to read statistics without database's name. We could call it 
'datname' just like in 'pg_stat_database' view.

Regards,

Ilia Evdokimov,
Tantor Labs LCC.




Re: Vacuum statistics

From
Ilia Evdokimov
Date:


On 15.8.24 11:49, Alena Rybakina wrote:

I have some suggestions:

  1. pgstatfuncs.c in functions tuplestore_put_for_database() and tuplestore_put_for_relation you can remove 'nulls' array if you're sure that columns cannot be NULL.
We need to use this for tuplestore_putvalues function. With this function, we fill the table with the values of the statistics.

Ah, right! I'm sorry.



  1. These functions are almost the same and I would think of writing one function depending of type 'ExtVacReportType'
I'm not sure that I fully understand what you mean. Can you explain it more clearly, please?


Ah, I didn't notice that the size of all three tables is different. Therefore, it won't be possible to write one function instead of two to avoid code duplication. My mistake.



Re: Vacuum statistics

From
jian he
Date:
in pg_stats_vacuum
    if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
    {
        Oid                    relid = PG_GETARG_OID(1);

        /* Load table statistics for specified database. */
        if (OidIsValid(relid))
        {
            tabentry = fetch_dbstat_tabentry(dbid, relid);
            if (tabentry == NULL || tabentry->vacuum_ext.type != type)
                /* Table don't exists or isn't an heap relation. */
                PG_RETURN_NULL();

            tuplestore_put_for_relation(relid, rsinfo, tabentry);
        }
        else
        {
       }


So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
it seems you didn't check "relid" 's relkind,
you may need to use get_rel_relkind.



Re: Vacuum statistics

From
Ilia Evdokimov
Date:
Are you certain that all tables are included in `pg_stat_vacuum_tables`? 
I'm asking because of the following:


SELECT count(*) FROM pg_stat_all_tables ;
  count
-------
    108
(1 row)

SELECT count(*) FROM pg_stat_vacuum_tables ;
  count
-------
     20
(1 row)

-- 
Regards,
Ilia Evdokimov,
Tantor Labs LCC.




Re: Vacuum statistics

From
Alena Rybakina
Date:

Hi! Thank you very much for your review! Sorry for my late response I was overwhelmed by tasks.

On 16.08.2024 14:12, jian he wrote:
On Thu, Aug 15, 2024 at 4:49 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Hi!

I've applied all the v5 patches.
0002 and 0003 have white space errors.

+      <para>
+        Number of times blocks of this index were already found
+        in the buffer cache by vacuum operations, so that a read was
not necessary
+        (this only includes hits in the
+        &project; buffer cache, not the operating system's file system cache)
+      </para></entry>

+        Number of times blocks of this table were already found
+        in the buffer cache by vacuum operations, so that a read was
not necessary
+        (this only includes hits in the
+        &project; buffer cache, not the operating system's file system cache)
+      </para></entry>

"&project;"
represents a sgml file placeholder name as "project" and puts all the
content of "project.sgml" to system-views.sgml.
but you don't have "project.sgml". you may check
doc/src/sgml/filelist.sgml or doc/src/sgml/ref/allfiles.sgml
for usage of "&place_holder;".
so you can change it to "project", otherwise doc cannot build.


src/backend/commands/dbcommands.c
we have:    /*     * If built with appropriate switch, whine when regression-testing     * conventions for database names are violated.  But don't complain during     * initdb.     */
#ifdef ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS    if (IsUnderPostmaster && strstr(dbname, "regression") == NULL)        elog(WARNING, "databases created by regression test cases
should have names including \"regression\"");
#endif
so in src/test/regress/sql/vacuum_tables_and_db_statistics.sql you
need to change dbname:
CREATE DATABASE statistic_vacuum_database;
CREATE DATABASE statistic_vacuum_database1;


+  <para>
+   The view <structname>pg_stat_vacuum_indexes</structname> will contain
+   one row for each index in the current database (including TOAST
+   table indexes), showing statistics about vacuuming that specific index.
+  </para>
TOAST should
<acronym>TOAST</acronym>



+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
maybe change to            ereport(ERROR,                    (errcode(ERRCODE_DATATYPE_MISMATCH),                     errmsg("return type must be a row type")));
Later I found out "InitMaterializedSRF(fcinfo, 0);" already did all
the work. much of the code can be gotten rid of.
please check attached.
I agree with your suggestions for improving the code. I will add this in the next version of the patch.

#define EXTVACHEAPSTAT_COLUMNS    27
#define EXTVACIDXSTAT_COLUMNS    19
#define EXTVACDBSTAT_COLUMNS    15
#define EXTVACSTAT_COLUMNS Max(EXTVACHEAPSTAT_COLUMNS, EXTVACIDXSTAT_COLUMNS)

static Oid CurrentDatabaseId = InvalidOid;
we already defined MyDatabaseId in src/include/miscadmin.h,
Why do we need "static Oid CurrentDatabaseId = InvalidOid;"?
also src/backend/utils/adt/pgstatfuncs.c already included "miscadmin.h".
Hmm, Tom Lane added "misc admin.h", or I didn't notice something. Could you point this out, please?

We used the Current Database Id to output statistics on tables from another database, so we need to replace it with a different default value. But I want to rewrite this patch to display table statistics only for the current database, that is, this part will be removed in the future. In my opinion, it would be more correct.
the following code one function has 2 return statements?
------------------------------------------------------------------------
/* * Get the vacuum statistics for the heap tables. */
Datum
pg_stat_vacuum_tables(PG_FUNCTION_ARGS)
{    return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_HEAP, EXTVACHEAPSTAT_COLUMNS);
    PG_RETURN_NULL();
}

/* * Get the vacuum statistics for the indexes. */
Datum
pg_stat_vacuum_indexes(PG_FUNCTION_ARGS)
{    return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_INDEX, EXTVACIDXSTAT_COLUMNS);
    PG_RETURN_NULL();
}

/* * Get the vacuum statistics for the database. */
Datum
pg_stat_vacuum_database(PG_FUNCTION_ARGS)
{        return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_DB, EXTVACDBSTAT_COLUMNS);
    PG_RETURN_NULL();
}
You are right - the second return is superfluous. I'll fix it.
------------------------------------------------------------------------
in pg_stats_vacuum:    if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)    {        Oid                    relid = PG_GETARG_OID(1);
        /* Load table statistics for specified database. */        if (OidIsValid(relid))        {            tabentry = fetch_dbstat_tabentry(dbid, relid);            if (tabentry == NULL || tabentry->vacuum_ext.type != type)                /* Table don't exists or isn't an heap relation. */                PG_RETURN_NULL();
            tuplestore_put_for_relation(relid, tupstore, tupdesc,
tabentry, ncolumns);        }        else        {         ...        }
}
I don't understand the ELSE branch. the IF branch means the input
dboid, heap/index oid is correct.
the ELSE branch means table reloid is invalid = 0.
I am not sure 100% what the ELSE Branch means.
Also there are no comments explaining why.
experiments seem to show that  when reloid is 0, it will print out all
the vacuum statistics
for all the tables in the current database. If so, then only super
users can call pg_stats_vacuum?
but the table owner should be able to call pg_stats_vacuum for that
specific table.
If any reloid has not been set by the user, we output statistics for all objects - tables or indexes.In this part of the code, we find all the suitable objects from the snapshot, if they belong to the index or table type of objects.
/* Type of ExtVacReport */
typedef enum ExtVacReportType
{    PGSTAT_EXTVAC_INVALID = 0,    PGSTAT_EXTVAC_HEAP = 1,    PGSTAT_EXTVAC_INDEX = 2,    PGSTAT_EXTVAC_DB = 3,
} ExtVacReportType;
generally "HEAP" means table and index, maybe "PGSTAT_EXTVAC_HEAP" would be term

No, Heap means something like a table in a relationship database, or its alternative name is Heap.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Vacuum statistics

From
Alena Rybakina
Date:

We check it there: "tabentry->vacuum_ext.type != type". Or were you talking about something else?

On 19.08.2024 12:32, jian he wrote:
in pg_stats_vacuum    if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)    {        Oid                    relid = PG_GETARG_OID(1);
        /* Load table statistics for specified database. */        if (OidIsValid(relid))        {            tabentry = fetch_dbstat_tabentry(dbid, relid);            if (tabentry == NULL || tabentry->vacuum_ext.type != type)                /* Table don't exists or isn't an heap relation. */                PG_RETURN_NULL();
            tuplestore_put_for_relation(relid, rsinfo, tabentry);        }        else        {       }


So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
it seems you didn't check "relid" 's relkind,
you may need to use get_rel_relkind.
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Vacuum statistics

From
Alena Rybakina
Date:
I think you've counted the above system tables from the database, but 
I'll double-check it. Thank you for your review!

On 19.08.2024 19:28, Ilia Evdokimov wrote:
> Are you certain that all tables are included in 
> `pg_stat_vacuum_tables`? I'm asking because of the following:
>
>
> SELECT count(*) FROM pg_stat_all_tables ;
>  count
> -------
>    108
> (1 row)
>
> SELECT count(*) FROM pg_stat_vacuum_tables ;
>  count
> -------
>     20
> (1 row)
>
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Vacuum statistics

From
Kirill Reshke
Date:
On Thu, 22 Aug 2024 at 07:48, jian he <jian.universality@gmail.com> wrote:
>
> On Wed, Aug 21, 2024 at 6:37 AM Alena Rybakina
> <a.rybakina@postgrespro.ru> wrote:
> >
> > We check it there: "tabentry->vacuum_ext.type != type". Or were you talking about something else?
> >
> > On 19.08.2024 12:32, jian he wrote:
> >
> > in pg_stats_vacuum
> >     if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
> >     {
> >         Oid                    relid = PG_GETARG_OID(1);
> >
> >         /* Load table statistics for specified database. */
> >         if (OidIsValid(relid))
> >         {
> >             tabentry = fetch_dbstat_tabentry(dbid, relid);
> >             if (tabentry == NULL || tabentry->vacuum_ext.type != type)
> >                 /* Table don't exists or isn't an heap relation. */
> >                 PG_RETURN_NULL();
> >
> >             tuplestore_put_for_relation(relid, rsinfo, tabentry);
> >         }
> >         else
> >         {
> >        }
> >
> >
> > So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
> > it seems you didn't check "relid" 's relkind,
> > you may need to use get_rel_relkind.
> >
> > --
>
> hi.
> I mentioned some points at [1],
> Please check the attached patchset to address these issues.
>
> there are four occurrences of "CurrentDatabaseId", i am still confused
> with usage of CurrentDatabaseId.
>
> also please don't  top-post, otherwise the archive, like [2] is not
> easier to read for future readers.
> generally you quote first, then reply.
>
> [1] https://postgr.es/m/CACJufxHb_YGCp=pVH6DZcpk9yML+SueffPeaRbX2LzXZVahd_w@mail.gmail.com
> [2] https://postgr.es/m/78394e29-a900-4af4-b5ce-d6eb2d263fad@postgrespro.ru

Hi, your points are valid.
Regarding 0003, I also wanted to object database naming in a
regression test during my review but for some reason didn't.Now, as
soon as we already need to change it, I suggest we also change
regression_statistic_vacuum_db1 to something less generic. Maybe
regression_statistic_vacuum_db_unaffected.



--
Best regards,
Kirill Reshke



Re: Vacuum statistics

From
Alexander Korotkov
Date:
On Wed, Aug 21, 2024 at 1:39 AM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>
> I think you've counted the above system tables from the database, but
> I'll double-check it. Thank you for your review!
>
> On 19.08.2024 19:28, Ilia Evdokimov wrote:
> > Are you certain that all tables are included in
> > `pg_stat_vacuum_tables`? I'm asking because of the following:
> >
> >
> > SELECT count(*) FROM pg_stat_all_tables ;
> >  count
> > -------
> >    108
> > (1 row)
> >
> > SELECT count(*) FROM pg_stat_vacuum_tables ;
> >  count
> > -------
> >     20
> > (1 row)
> >

I'd like to do some review a well.

+   MyDatabaseId = dbid;
+
+   PG_TRY();
+   {
+       tabentry = pgstat_fetch_stat_tabentry(relid);
+       MyDatabaseId = storedMyDatabaseId;
+   }
+   PG_CATCH();
+   {
+       MyDatabaseId = storedMyDatabaseId;
+   }
+   PG_END_TRY();

I think this is generally wrong to change MyDatabaseId, especially if you have to wrap it with PG_TRY()/PG_CATCH().  I think, instead we need proper API changes, i.e. make pgstat_fetch_stat_tabentry() and others take dboid as an argument.

+/*
+ * Get the vacuum statistics for the heap tables.
+ */
+Datum
+pg_stat_vacuum_tables(PG_FUNCTION_ARGS)
+{
+   return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_HEAP, EXTVACHEAPSTAT_COLUMNS);
+
+   PG_RETURN_NULL();
+}

The PG_RETURN_NULL() is unneeded after another return statement.  However, does pg_stats_vacuum() need to return anything?  What about making its return type void?

@@ -874,4 +874,38 @@ pgstat_get_custom_snapshot_data(PgStat_Kind kind)
   return pgStatLocal.snapshot.custom_data[idx];
 }
 
+/* hash table for statistics snapshots entry */
+typedef struct PgStat_SnapshotEntry
+{
+  PgStat_HashKey key;
+  char     status;        /* for simplehash use */
+  void     *data;         /* the stats data itself */
+} PgStat_SnapshotEntry;

It would be nice to preserve encapsulation and don't expose pgstat_snapshot hash in the headers.  I see there is only one usage of it outside of pgstat.c: pg_stats_vacuum().  

+        Oid                  storedMyDatabaseId = MyDatabaseId;
+
+        pgstat_update_snapshot(PGSTAT_KIND_RELATION);
+        MyDatabaseId = storedMyDatabaseId;

This manipulation with storedMyDatabaseId looks pretty useless.  It seems to be intended to change MyDatabaseId, while I'm not fan of this as I mentioned above.

+static PgStat_StatTabEntry *
+fetch_dbstat_tabentry(Oid dbid, Oid relid)
+{
+  Oid                  storedMyDatabaseId = MyDatabaseId;
+  PgStat_StatTabEntry  *tabentry = NULL;
+
+  if (OidIsValid(CurrentDatabaseId) && CurrentDatabaseId == dbid)
+     /* Quick path when we read data from the same database */
+     return pgstat_fetch_stat_tabentry(relid);
+
+  pgstat_clear_snapshot();

It looks scary to reset the whole snapshot each time we access another database.  Need to also mention that the CurrentDatabaseId machinery isn't implemented.

New functions pg_stat_vacuum_tables(), pg_stat_vacuum_indexes(), pg_stat_vacuum_database() are SRFs.  When zero Oid is passed they report all the objects.  However, it seems they aren't intended to be used directly.  Instead, there are views with the same names.  These views always call them with particular Oids, therefore SRFs always return one row.  Then why bother with SRF?  They could return plain records instead.

Also, as I mentioned above patchset makes a lot of trouble accessing statistics of relations of another database.  But that seems to be useless given corresponding views allow to see only relations of the current database.  Even if you call functions directly, what is the value of this information given that you don't know the relation oids in another database?  So, I think if we will give up and limit access to the relations of the current database patch will become simpler and clearer.

------
Regards,
Alexander Korotkov
Supabase

Re: Vacuum statistics

From
Alena Rybakina
Date:
On 22.08.2024 05:47, jian he wrote:
On Wed, Aug 21, 2024 at 6:37 AM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
We check it there: "tabentry->vacuum_ext.type != type". Or were you talking about something else?

On 19.08.2024 12:32, jian he wrote:

in pg_stats_vacuum    if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)    {        Oid                    relid = PG_GETARG_OID(1);
        /* Load table statistics for specified database. */        if (OidIsValid(relid))        {            tabentry = fetch_dbstat_tabentry(dbid, relid);            if (tabentry == NULL || tabentry->vacuum_ext.type != type)                /* Table don't exists or isn't an heap relation. */                PG_RETURN_NULL();
            tuplestore_put_for_relation(relid, rsinfo, tabentry);        }        else        {       }


So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
it seems you didn't check "relid" 's relkind,
you may need to use get_rel_relkind.

--
hi.
I mentioned some points at [1],
Please check the attached patchset to address these issues.

Thank you for your work! I checked the patches and added your suggested changes to the new version of the patch here [0]. In my opinion, nothing was missing, but please take a look.

[0] https://www.postgresql.org/message-id/c4e4e305-7119-4183-b49a-d7092f4efba3%40postgrespro.ru


there are four occurrences of "CurrentDatabaseId", i am still confused
with usage of CurrentDatabaseId.

It needed to be used because of scanning objects from the other database, so we change the id of dbid temporary to achieve it.

You should snow that every part of this code was deleted. Now we can check information about tables and indexes from the current database.

also please don't  top-post, otherwise the archive, like [2] is not
easier to read for future readers.
generally you quote first, then reply.

[1] https://postgr.es/m/CACJufxHb_YGCp=pVH6DZcpk9yML+SueffPeaRbX2LzXZVahd_w@mail.gmail.com
[2] https://postgr.es/m/78394e29-a900-4af4-b5ce-d6eb2d263fad@postgrespro.ru
Ok, no problem.
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Vacuum statistics

From
Alena Rybakina
Date:
On 22.08.2024 07:29, Kirill Reshke wrote:
On Thu, 22 Aug 2024 at 07:48, jian he <jian.universality@gmail.com> wrote:
On Wed, Aug 21, 2024 at 6:37 AM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
We check it there: "tabentry->vacuum_ext.type != type". Or were you talking about something else?

On 19.08.2024 12:32, jian he wrote:

in pg_stats_vacuum    if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)    {        Oid                    relid = PG_GETARG_OID(1);
        /* Load table statistics for specified database. */        if (OidIsValid(relid))        {            tabentry = fetch_dbstat_tabentry(dbid, relid);            if (tabentry == NULL || tabentry->vacuum_ext.type != type)                /* Table don't exists or isn't an heap relation. */                PG_RETURN_NULL();
            tuplestore_put_for_relation(relid, rsinfo, tabentry);        }        else        {       }


So for functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
it seems you didn't check "relid" 's relkind,
you may need to use get_rel_relkind.

--
hi.
I mentioned some points at [1],
Please check the attached patchset to address these issues.

there are four occurrences of "CurrentDatabaseId", i am still confused
with usage of CurrentDatabaseId.

also please don't  top-post, otherwise the archive, like [2] is not
easier to read for future readers.
generally you quote first, then reply.

[1] https://postgr.es/m/CACJufxHb_YGCp=pVH6DZcpk9yML+SueffPeaRbX2LzXZVahd_w@mail.gmail.com
[2] https://postgr.es/m/78394e29-a900-4af4-b5ce-d6eb2d263fad@postgrespro.ru
Hi, your points are valid.
Regarding 0003, I also wanted to object database naming in a
regression test during my review but for some reason didn't.Now, as
soon as we already need to change it, I suggest we also change
regression_statistic_vacuum_db1 to something less generic. Maybe
regression_statistic_vacuum_db_unaffected.

Hi! I fixed it in the new version of the patch [0]. Feel free to review it!

To be honest, I still doubt that the current database names (regression_statistic_vacuum_db and regression_statistic_vacuum_db1) can be easily generated, but if you insist on renaming, I will do it.

[0] https://www.postgresql.org/message-id/c4e4e305-7119-4183-b49a-d7092f4efba3%40postgrespro.ru

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Vacuum statistics

From
jian he
Date:
On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>
> Hi, all!
>
> I have attached the new version of the code and the diff files
> (minor-vacuum.no-cbot).
>

hi.

still have white space issue when using "git apply",
you may need to use "git diff --check" to find out where.


 /* ----------
diff --git a/src/test/regress/expected/opr_sanity.out
b/src/test/regress/expected/opr_sanity.out
index 5d72b970b03..7026de157e4 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -32,11 +32,12 @@ WHERE p1.prolang = 0 OR p1.prorettype = 0 OR
        prokind NOT IN ('f', 'a', 'w', 'p') OR
        provolatile NOT IN ('i', 's', 'v') OR
        proparallel NOT IN ('s', 'r', 'u');
- oid  |        proname
-------+------------------------
+ oid  |         proname
+------+-------------------------
  8001 | pg_stat_vacuum_tables
  8002 | pg_stat_vacuum_indexes
-(2 rows)
+ 8003 | pg_stat_vacuum_database
+(3 rows)


looking at src/test/regress/sql/opr_sanity.sql:

-- **************** pg_proc ****************
-- Look for illegal values in pg_proc fields.

SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prolang = 0 OR p1.prorettype = 0 OR
       p1.pronargs < 0 OR
       p1.pronargdefaults < 0 OR
       p1.pronargdefaults > p1.pronargs OR
       array_lower(p1.proargtypes, 1) != 0 OR
       array_upper(p1.proargtypes, 1) != p1.pronargs-1 OR
       0::oid = ANY (p1.proargtypes) OR
       procost <= 0 OR
       CASE WHEN proretset THEN prorows <= 0 ELSE prorows != 0 END OR
       prokind NOT IN ('f', 'a', 'w', 'p') OR
       provolatile NOT IN ('i', 's', 'v') OR
       proparallel NOT IN ('s', 'r', 'u');

that means
 oid  |         proname
------+-------------------------
 8001 | pg_stat_vacuum_tables
 8002 | pg_stat_vacuum_indexes
 8003 | pg_stat_vacuum_database


These above functions, pg_proc.prorows should > 0 when
pg_proc.proretset is true.
I think that's the opr_sanity test's intention.
so you may need to change pg_proc.dat.

BTW the doc says:
prorows float4, Estimated number of result rows (zero if not proretset)



segmentation fault cases:
select * from pg_stat_vacuum_indexes(0);
select * from pg_stat_vacuum_tables(0);


+ else if (type == PGSTAT_EXTVAC_DB)
+ {
+ PgStatShared_Database   *dbentry;
+ PgStat_EntryRef   *entry_ref;
+ Oid dbid = PG_GETARG_OID(0);
+
+ if (OidIsValid(dbid))
+ {
+ entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_DATABASE,
+ dbid, InvalidOid, false);
+ dbentry = (PgStatShared_Database *) entry_ref->shared_stats;
+
+ if (dbentry == NULL)
+ /* Table doesn't exist or isn't a heap relation */
+ return;
+
+ tuplestore_put_for_database(dbid, rsinfo, dbentry);
+ pgstat_unlock_entry(entry_ref);
+ }
+ }
didn't error out when dbid is invalid?



pg_stat_vacuum_tables
pg_stat_vacuum_indexes
pg_stat_vacuum_database
these functions didn't verify the only input argument oid's kind.
for example:

create table s(a int primary key) with (autovacuum_enabled = off);
create view sv as select * from s;
vacuum s;
select * from pg_stat_vacuum_tables('sv'::regclass::oid);
select * from pg_stat_vacuum_indexes('sv'::regclass::oid);
select * from pg_stat_vacuum_database('sv'::regclass::oid);

above all these 3 examples should error out? because  sv is view.

in src/backend/catalog/system_views.sql
for view creation of pg_stat_vacuum_indexes
you can change to

WHERE
  db.datname = current_database() AND
  rel.oid = stats.relid AND
  ns.oid = rel.relnamespace
AND rel.relkind = 'i':



pg_stat_vacuum_tables  in in src/backend/catalog/system_views.sql
you can change to

WHERE
  db.datname = current_database() AND
  rel.oid = stats.relid AND
  ns.oid = rel.relnamespace
AND rel.relkind = 'r':



Re: Vacuum statistics

From
Masahiko Sawada
Date:
Hi,

On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>
> Hi! Thank you for your review!
>
> On 05.09.2024 15:47, jian he wrote:
>
> On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>
> Hi, all!
>
> I have attached the new version of the code and the diff files
> (minor-vacuum.no-cbot).

Thank you for updating the patches. I've reviewed the 0001 patch and
have two comments.

I think we can split the 0001 patch into two parts: adding
pg_stat_vacuum_tables system views that shows the vacuum statistics
that we are currently collecting such as scanned_pages and
removed_pages, and another one is to add new statistics to collect
such as vacrel->set_all_visible_pages and visibility map updates.

I'm concerned that a pg_stat_vacuum_tables view has some duplicated
statistics that we already collect in different ways. For instance,
total_blks_{read,hit,dirtied,written} are already tracked at
system-level by pg_stat_io, and per-relation block I/O statistics can
be collected using pg_stat_statements. Having duplicated statistics
consumes more memory for pgstat and could confuse users if these
statistics are not consistent. I think it would be better to avoid
collecting duplicated statistics in different places.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Vacuum statistics

From
Melanie Plageman
Date:
On Fri, Sep 27, 2024 at 2:16 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi,
>
> On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
> >
> > Hi! Thank you for your review!
> >
> > On 05.09.2024 15:47, jian he wrote:
> >
> > On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
> >
> > Hi, all!
> >
> > I have attached the new version of the code and the diff files
> > (minor-vacuum.no-cbot).
>
> Thank you for updating the patches. I've reviewed the 0001 patch and
> have two comments.

I took a very brief look at this and was wondering if it was worth
having a way to make the per-table vacuum statistics opt-in (like a
table storage parameter) in order to decrease the shared memory
footprint of storing the stats.

- Melanie



Re: Vacuum statistics

From
Andrei Zubkov
Date:
Hi,

On Fri, 2024-09-27 at 11:15 -0700, Masahiko Sawada wrote:
> I'm concerned that a pg_stat_vacuum_tables view has some duplicated
> statistics that we already collect in different ways. For instance,
> total_blks_{read,hit,dirtied,written} are already tracked at
> system-level by pg_stat_io,

pg_stat_vacuum_tables.total_blks_{read,hit,dirtied,written} tracks
blocks used by vacuum in different ways while vacuuming this particular
table while pg_stat_io tracks blocks used by vacuum on the cluster
level.

> and per-relation block I/O statistics can
> be collected using pg_stat_statements.

This is impossible. pg_stat_statements tracks block statistics on a
statement level. One statement could touch many tables and many
indexes, and all used database blocks will be counted by the
pg_stat_statements counters on a statement-level. Autovacuum statistics
won't be accounted by the pg_stat_statements. After all,
pg_stat_statements won't hold the statements statistics forever. Under
pressure of new statements the statement eviction can happen and
statistics will be lost.

All of the above is addressed by relation-level vacuum statistics held
in the Cumulative Statistics System proposed by this patch.
--
regards, Andrei Zubkov
Postgres Professional




Re: Vacuum statistics

From
Masahiko Sawada
Date:
On Fri, Sep 27, 2024 at 12:19 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Fri, Sep 27, 2024 at 2:16 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Hi,
> >
> > On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
> > >
> > > Hi! Thank you for your review!
> > >
> > > On 05.09.2024 15:47, jian he wrote:
> > >
> > > On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
> > >
> > > Hi, all!
> > >
> > > I have attached the new version of the code and the diff files
> > > (minor-vacuum.no-cbot).
> >
> > Thank you for updating the patches. I've reviewed the 0001 patch and
> > have two comments.
>
> I took a very brief look at this and was wondering if it was worth
> having a way to make the per-table vacuum statistics opt-in (like a
> table storage parameter) in order to decrease the shared memory
> footprint of storing the stats.

I'm not sure how users can select tables that enable vacuum statistics
as I think they basically want to have statistics for all tables, but
I see your point. Since the size of PgStat_TableCounts approximately
tripled by this patch (112 bytes to 320 bytes), it might be worth
considering ways to reduce the number of entries or reducing the size
of vacuum statistics.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Vacuum statistics

From
Alena Rybakina
Date:
Hi! Thank you for your interesting for this patch!
I took a very brief look at this and was wondering if it was worth
having a way to make the per-table vacuum statistics opt-in (like a
table storage parameter) in order to decrease the shared memory
footprint of storing the stats.
I'm not sure how users can select tables that enable vacuum statistics
as I think they basically want to have statistics for all tables, but
I see your point. Since the size of PgStat_TableCounts approximately
tripled by this patch (112 bytes to 320 bytes), it might be worth
considering ways to reduce the number of entries or reducing the size
of vacuum statistics.

The main purpose of these statistics is to see abnormal behavior of vacuum in relation to a table or the database as a whole.

For example, there may be a situation where vacuum has started to run more often and spends a lot of resources on processing a certain index, but the size of the index does not change significantly. Moreover, the table in which this index is located can be much smaller in size. This may be because the index is bloated and needs to be reindexed. 

This is exactly what vacuum statistics can show - we will see that compared to other objects, vacuum processed more blocks and spent more time on this index.

Perhaps the vacuum parameters for the index should be set more aggressively to avoid this in the future.

I suppose that if we turn off statistics collection for a certain object, we can miss it. In addition, the user may not enable the parameter for the object in time, because he will forget about it. 

As for the second option, now I cannot say which statistics can be removed, to be honest. So far, they all seem necessary.

-- 
Regards,
Alena Rybakina
Postgres Professional

Re: Vacuum statistics

From
Ilia Evdokimov
Date:


On 08.10.2024 19:18, Alena Rybakina wrote:
Made a rebase on a fresh master branch.
-- 
Regards,
Alena Rybakina
Postgres Professional

Thank you for rebasing.

I have noticed that when I create a table or an index on this table, there is no information about the table or index in pg_stat_vacuum_tables and pg_stat_vacuum_indexes until we perform a VACUUM.

Example:

CREATE TABLE t (i INT, j INT);
INSERT INTO t SELECT i/10, i/100 FROM  GENERATE_SERIES(1,1000000) i;
SELECT * FROM pg_stat_vacuum_tables WHERE relname = 't';
....
(0 rows)
CREATE INDEX ON t (i);
SELECT * FROM pg_stat_vacuum_indexes WHERE relname = 't_i_idx';
...
(0 rows)

I can see the entries after running VACUUM or executing autovacuum. or when autovacuum is executed. I would suggest adding a line about the relation even if it has not yet been processed by vacuum. Interestingly, this issue does not occur with pg_stat_vacuum_database:

CREATE DATABASE example_db;
SELECT * FROM pg_stat_vacuum_database WHERE dbname = 'example_db';
dboid |       dbname | ...
 ...      | example_db | ...
(1 row)

BTW, I recommend renaming the view pg_stat_vacuum_database to pg_stat_vacuum_databaseS  for consistency with pg_stat_vacuum_tables and pg_stat_vacuum_indexes

--
Regards,
Ilia Evdokimov,
Tantor Labs LLC.

Re: Vacuum statistics

From
Alena Rybakina
Date:

Hi!

On 16.10.2024 13:31, Ilia Evdokimov wrote:


On 08.10.2024 19:18, Alena Rybakina wrote:
Made a rebase on a fresh master branch.
-- 
Regards,
Alena Rybakina
Postgres Professional

Thank you for rebasing.

I have noticed that when I create a table or an index on this table, there is no information about the table or index in pg_stat_vacuum_tables and pg_stat_vacuum_indexes until we perform a VACUUM.

Example:

CREATE TABLE t (i INT, j INT);
INSERT INTO t SELECT i/10, i/100 FROM  GENERATE_SERIES(1,1000000) i;
SELECT * FROM pg_stat_vacuum_tables WHERE relname = 't';
....
(0 rows)
CREATE INDEX ON t (i);
SELECT * FROM pg_stat_vacuum_indexes WHERE relname = 't_i_idx';
...
(0 rows)

I can see the entries after running VACUUM or executing autovacuum. or when autovacuum is executed. I would suggest adding a line about the relation even if it has not yet been processed by vacuum. Interestingly, this issue does not occur with pg_stat_vacuum_database:

CREATE DATABASE example_db;
SELECT * FROM pg_stat_vacuum_database WHERE dbname = 'example_db';
dboid |       dbname | ...
 ...      | example_db | ...
(1 row)

BTW, I recommend renaming the view pg_stat_vacuum_database to pg_stat_vacuum_databaseS  for consistency with pg_stat_vacuum_tables and pg_stat_vacuum_indexes

Thanks for the review. I'm investigating this. I agree with the renaming, I will do it in the next version of the patch.

-- 
Regards,
Alena Rybakina
Postgres Professional

Re: Vacuum statistics

From
Andrei Zubkov
Date:
Hi Ilia,

On Wed, 2024-10-16 at 13:31 +0300, Ilia Evdokimov wrote:
> BTW, I recommend renaming the view pg_stat_vacuum_database to
> pg_stat_vacuum_databaseS  for consistency with pg_stat_vacuum_tables
> and pg_stat_vacuum_indexes

Such renaming doesn't seems correct to me because
pg_stat_vacuum_database is consistent with pg_stat_database view, while
pg_stat_vacuum_tables is consistent with pg_stat_all_tables.

This inconsistency is in Postgres views, so it should be changed
synchronously.
--
regards, Andrei Zubkov




Re: Vacuum statistics

From
Alexander Korotkov
Date:
On Sun, Aug 25, 2024 at 6:59 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
> I didn't understand correctly - did you mean that we don't need SRF if
> we need to display statistics for a specific object?
>
> Otherwise, we need this when we display information on all database
> objects (tables or indexes):
>
> while ((entry = ScanStatSnapshot(pgStatLocal.snapshot.stats, &hashiter))
> != NULL)
> {
>      CHECK_FOR_INTERRUPTS();
>
>      tabentry = (PgStat_StatTabEntry *) entry->data;
>
>      if (tabentry != NULL && tabentry->vacuum_ext.type == type)
>          tuplestore_put_for_relation(relid, rsinfo, tabentry);
> }
>
> I know we can construct a HeapTuple object containing a TupleDesc,
> values, and nulls for a particular object, but I'm not sure we can
> augment it while looping through multiple objects.
>
> /* Initialise attributes information in the tuple descriptor */
>
>   tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_STATS_COLS);
>
> ...
>
> PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
>
>
> If I missed something or misunderstood, can you explain in more detail?

Actually, I mean why do we need a possibility to return statistics for
all tables/indexes in one function call?  User anyway is supposed to
use pg_stat_vacuum_indexes/pg_stat_vacuum_tables view, which do
function calls one per relation.  I suppose we can get rid of
possibility to get all the objects in one function call and just
return a tuple from the functions like other pgstatfuncs.c functions
do.

------
Regards,
Alexander Korotkov
Supabase



Re: Vacuum statistics

From
Alexander Korotkov
Date:
On Sun, Sep 29, 2024 at 12:22 AM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
> Hi! Thank you for your interesting for this patch!
>
> I took a very brief look at this and was wondering if it was worth
> having a way to make the per-table vacuum statistics opt-in (like a
> table storage parameter) in order to decrease the shared memory
> footprint of storing the stats.
>
> I'm not sure how users can select tables that enable vacuum statistics
> as I think they basically want to have statistics for all tables, but
> I see your point. Since the size of PgStat_TableCounts approximately
> tripled by this patch (112 bytes to 320 bytes), it might be worth
> considering ways to reduce the number of entries or reducing the size
> of vacuum statistics.
>
> The main purpose of these statistics is to see abnormal behavior of vacuum in relation to a table or the database as
awhole. 
>
> For example, there may be a situation where vacuum has started to run more often and spends a lot of resources on
processinga certain index, but the size of the index does not change significantly. Moreover, the table in which this
indexis located can be much smaller in size. This may be because the index is bloated and needs to be reindexed. 
>
> This is exactly what vacuum statistics can show - we will see that compared to other objects, vacuum processed more
blocksand spent more time on this index. 
>
> Perhaps the vacuum parameters for the index should be set more aggressively to avoid this in the future.
>
> I suppose that if we turn off statistics collection for a certain object, we can miss it. In addition, the user may
notenable the parameter for the object in time, because he will forget about it. 

I agree with this point.  Additionally, in order to benefit from
gatherting vacuum statistics only for some relations in terms of
space, we need to handle variable-size stat entries.  That would
greatly increase the complexity.

> As for the second option, now I cannot say which statistics can be removed, to be honest. So far, they all seem
necessary.

Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
tripled in space.  That a huge change from having no statistics on
vacuum to have it in much more detail than everything else we
currently have.  I think the feasible way might be to introduce some
most demanded statistics first then see how it goes.

------
Regards,
Alexander Korotkov
Supabase



Re: Vacuum statistics

From
Jim Nasby
Date:


On Oct 28, 2024, at 2:07 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:

I suppose that if we turn off statistics collection for a certain object, we can miss it. In addition, the user may not enable the parameter for the object in time, because he will forget about it.

I agree with this point.  Additionally, in order to benefit from
gatherting vacuum statistics only for some relations in terms of
space, we need to handle variable-size stat entries.  That would
greatly increase the complexity.

Could vacuum stats be treated as a separate category instead of adding it to PgStat_TableCounts?

As for the second option, now I cannot say which statistics can be removed, to be honest. So far, they all seem necessary.

Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
tripled in space.  That a huge change from having no statistics on
vacuum to have it in much more detail than everything else we
currently have.  I think the feasible way might be to introduce some
most demanded statistics first then see how it goes.

Looking at the stats I do think the WAL stats are probably not helpful. First, there’s nothing users can do to tune how much WAL is generated by vacuum. Second, this introduces the risk of users saying “Wow, vacuum is creating a lot of WAL! I’m going to turn it down!”, which is most likely to make matters worse. There’s already a lot of stuff that goes into WAL without any detailed logging; if we ever wanted to provide a comprehensive view of what data is in WAL that should be handled separately.

The rest of the stats all look important. In fact, I think there’s even more stats that could be included (such as all frozen/visible pages skipped) - even more reason to look at having separate controls for tracking vacuum stats. There’s also an argument to be made for tracking autovac separately from manual vacuum. So long-term we might want to look at other ways to handle these stats, not only because of the large number of stats, but because they would be updated very infrequently compared to other stats counters. Ironically, the old stats system would probably have been more than sufficient for these stats. Tracking them in a real table might also be an option.

Is there a reason some fields are omitted from pg_stat_vacuum_database? While some stats are certainly more interesting at the per-relation level, I can’t really think of any that don’t make sense at the database level as well.

Looking at the per table/index stats, I strongly dislike the use of the term “delete” - it is a recipe for confusion with row deletion.. A much better term is “remove” or “removed”. I realize the term “delete” is used in places in vacuum logging, but IMO we should fix that as well instead of doubling-down on it.

I think “interrupts” is also a very confusing name - those fields should just be called “errors”.

I realize “relname” is being used for consistency with pg_stat_all_(tables|indexes), but I’m not sure it makes sense to double-down on that. Especially in pg_stat_vacuum_indexes, where it’s not completely clear whether relname is referring to the table or the index. I’m also inclined to say that the name of the table should be included in pg_stat_vacuum_indexes.

For all the views the docs should clarify that total_blks_written means blocks written by vacuum, as opposed to the background writer. Similarly they should clarify the difference between rel_blks_(read|hit) and total_blks_(read|hit). In the case of pg_stat_vacuum_indexes it’d be better if rel_blks_(read|hit) were called index_blks_(read|hit). Although… if total_blks_* is actually the count across the table and all the indexes I don’t know that we even need that counter. I realize that not ever vacuum even looks at the indexes, but if we’re going to go into that level of detail then we would (at minimum) need to count the number of times a vacuum completely skipped scanning the indexes.

Having rev_all_(frozen|visible)_pages in the same view as vacuum stats will confuse users into thinking that vacuum is clearing the bits. Those fields really belong in pg_stat_all_tables.

Sadly index_vacuum_count is may not useful at all at present. At minimum you’d need to know the number of times vacuum had run in total. I realize that’s in pg_stat_all_tables, but that doesn’t help if vacuum stats are tracked or reset separately. At minimum the docs should mention them. They also need to clarify if index_vacuum_count is incremented per-index or per-pass (hopefully the later). Assuming it’s per-pass, a better name for the field would be index_vacuum_passes, index_passes, index_pass_count, or similar. But even with that we still need a counter for the number of vacuums where index processing was skipped.

Other items
First, thanks to everyone that’s put work into this patch - it’s a big step forward. I certainly don’t want the perfect to be the enemy of the good, but since the size of these stats entries has already come up as a concern I want to consider use cases that would still not be covered by this patch. I’m not suggesting these need to be added now, but IMHO they’re logical next steps (that would also mean more counters). The cases below would probably mean at least doubling the number of vacuum-related counters, at least at the table level.

First, there’s still gaps in trying to track HOT; most notably a counter for how many updates would never be HOT eligible because they modify indexes. pg_stat_all_tables.n_tup_newpage_upd is really limited without that info.

There should also be stats about unused line pointers - in degenerate cases the lp array can consume a significant portion of heap storage.

Monitoring bloat would be a lot more accurate if vacuum reported total tuple length for each run along with the total number of tuples it looked at. Having that info would make it trivial to calculate average tuple size, which could then be applied to reltuples and relpages to calculate how much space would being lost to bloat.

Autovacuum will self-terminate if it would block another process (unless it’s an aggressive vacuum) - that’s definitely something that should be tracked. Not just the number of times that happens, but also stats about how much work was lost because of this.

Shrinking a relation (what vacuum calls truncation, which is very confusing with the truncate command) is a rather complex process that currently has no visibility.

Tuning vacuum_freeze_min_age (and the MXID variant) is rather complicated. We maybe have enough stats on whether it could be set lower, but there’s no visibility on how the settings affect how often vacuum decides to be aggressive. At minimum, we should have stats on when vacuum is aggressive, especially since it significantly changes the behavior of autovac.

I saw someone else already mentioned tuning vacuum memory usage, but I’ll mention it again. Even if the issues with index_vacuum_count are fixed that still only tells you if you have a problem; it doesn’t give you a great idea of how much more memory you need. The best you can do is assuming you need (number of passes - 1) * current memory.

Speaking of which… there should be stats on any time vacuum decided on it’s own to skip index processing due to wraparound proximity.

I’m sure there’s some other use cases that I’m not thinking of.

Re: Vacuum statistics

From
Alena Rybakina
Date:
On 28.10.2024 16:40, Alexander Korotkov wrote:
On Sun, Aug 25, 2024 at 6:59 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
I didn't understand correctly - did you mean that we don't need SRF if
we need to display statistics for a specific object?

Otherwise, we need this when we display information on all database
objects (tables or indexes):

while ((entry = ScanStatSnapshot(pgStatLocal.snapshot.stats, &hashiter))
!= NULL)
{     CHECK_FOR_INTERRUPTS();
     tabentry = (PgStat_StatTabEntry *) entry->data;
     if (tabentry != NULL && tabentry->vacuum_ext.type == type)         tuplestore_put_for_relation(relid, rsinfo, tabentry);
}

I know we can construct a HeapTuple object containing a TupleDesc,
values, and nulls for a particular object, but I'm not sure we can
augment it while looping through multiple objects.

/* Initialise attributes information in the tuple descriptor */
  tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_STATS_COLS);

...

PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));


If I missed something or misunderstood, can you explain in more detail?
Actually, I mean why do we need a possibility to return statistics for
all tables/indexes in one function call?  User anyway is supposed to
use pg_stat_vacuum_indexes/pg_stat_vacuum_tables view, which do
function calls one per relation.  I suppose we can get rid of
possibility to get all the objects in one function call and just
return a tuple from the functions like other pgstatfuncs.c functions
do.

I haven’t thought about this before and agree with you. Thanks for the clarification! I'll fix the patch this evening and release the updated version.
-- 
Regards,
Alena Rybakina
Postgres Professional

Re: Vacuum statistics

From
Andrei Zubkov
Date:
Hi,

Thanks for your attention to our patch!

On Mon, 2024-10-28 at 16:03 -0500, Jim Nasby wrote:
> > Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
> > tripled in space.  That a huge change from having no statistics on
> > vacuum to have it in much more detail than everything else we
> > currently have.  I think the feasible way might be to introduce
> > some
> > most demanded statistics first then see how it goes.
>
> Looking at the stats I do think the WAL stats are probably not
> helpful. First, there’s nothing users can do to tune how much WAL is
> generated by vacuum. Second, this introduces the risk of users saying
> “Wow, vacuum is creating a lot of WAL! I’m going to turn it down!”,
> which is most likely to make matters worse. There’s already a lot of
> stuff that goes into WAL without any detailed logging; if we ever
> wanted to provide a comprehensive view of what data is in WAL that
> should be handled separately.

Yes, there is nothing we can directly do with WAL generated by vacuum,
but WAL generation is the part of vacuum work, and it will indirectly
affected by the changes of vacuum settings. So, WAL statistics is one
more dimension of vacuum workload. Also WAL stat is universal metric
which is measured cluster-wide and on the statement-level with
pg_stat_statements. Vacuum WAL counters will explain the part of
difference between those metrics. Besides vacuum WAL counters can be
used to locate abnormal vacuum behavior caused by a bug or the data
corruption. I think if the DBA is smart enough to look at vacuum WAL
generated stats and to understand what it means, the decision to
disable the autovacuum due to its WAL generation is unlikely.

Anyway I think some stats can be excluded to save some memory. The
first candidates are the system_time and user_time fields. Those are
very valuable, but are measured by the rusage stats, which won't be
available on all platforms. I think total_time and delay_time would be
sufficient.
The second is the interrupts field. It is needed for monitoring to know
do we have them or not, so tracking them on the database level will do
the trick. Interrupt is quite rare event, so once the monitoring system
will catch one the DBA can go to the server log for the details.

It seems there is another way. If the vacuum stats doesn't seems to be
mandatory in all systems, maybe we should add some hooks to the vacuum
so that vacuum statistics tracking can be done in an extension. I don't
think it is a good idea, because vacuum stats seems to me as mandatory
as the vacuum process itself.

> Is there a reason some fields are omitted
> from pg_stat_vacuum_database? While some stats are certainly more
> interesting at the per-relation level, I can’t really think of any
> that don’t make sense at the database level as well.

Some of the metrics are table-specific, some index-specific, so we
moved to the database level metrics more or less specific to the whole
database. Can you tell what stats you want to see at the database
level?

> Looking at the per table/index stats, I strongly dislike the use of
> the term “delete” - it is a recipe for confusion with row deletion..
> A much better term is “remove” or “removed”. I realize the term
> “delete” is used in places in vacuum logging, but IMO we should fix
> that as well instead of doubling-down on it.

Yes, this point was discussed in our team, and it seems confusing to me
too. We decided to name it as it is named in the code and to get
feedback from the community. Now we get one. Thank you. Now we should
discuss it and choose the best one. My personal choice is "removed".



> I realize “relname” is being used for consistency with
> pg_stat_all_(tables|indexes), but I’m not sure it makes sense to
> double-down on that. Especially in pg_stat_vacuum_indexes, where it’s
> not completely clear whether relname is referring to the table or the
> index. I’m also inclined to say that the name of the table should be
> included in pg_stat_vacuum_indexes.

Agreed. Table name is needed in the index view.


> For all the views the docs should clarify that total_blks_written
> means blocks written by vacuum, as opposed to the background Ywriter.

We have the "Number of database blocks written by vacuum operations
performed on this table" in the docs now. Do you mean we should
specifically note the vacuum process here?

> Similarly they should clarify the difference between
> rel_blks_(read|hit) and total_blks_(read|hit). In the case of
> pg_stat_vacuum_indexes it’d be better if rel_blks_(read|hit) were
> called index_blks_(read|hit). Although… if total_blks_* is actually
> the count across the table and all the indexes I don’t know that we
> even need that counter. I realize that not ever vacuum even looks at
> the indexes, but if we’re going to go into that level of detail then
> we would (at minimum) need to count the number of times a vacuum
> completely skipped scanning the indexes.

It is not clear to me enough. The stats described just as it is -
rel_blocks_* tracks blocks of the current heap, and total_* is for the
whole database blocks - not just tables and indexes, vacuum do some
work (quite a little) in the catalog and this work is counted here too.
Usually this stat won't be helpful, but maybe we can catch unusual
vacuum behavior using this stat.


> Having rev_all_(frozen|visible)_pages in the same view as vacuum
> stats will confuse users into thinking that vacuum is clearing the
> bits. Those fields really belong in pg_stat_all_tables.

Agreed.

> Sadly index_vacuum_count is may not useful at all at present. At
> minimum you’d need to know the number of times vacuum had run in
> total. I realize that’s in pg_stat_all_tables, but that doesn’t help
> if vacuum stats are tracked or reset separately.

I'm in doubt - is it really possible to reset the vacuum stats
independent of pg_stat_all_tables?

> At minimum the docs should mention them. They also need to clarify
> if index_vacuum_count is incremented per-index or per-pass (hopefully
> the later). Assuming it’s per-pass, a better name for the field would
> be index_vacuum_passes, index_passes, index_pass_count, or similar.
> But even with that we still need a counter for the number of vacuums
> where index processing was skipped.

Agreed, the "index_passes" looks good to me, and index processing skip
counter looks good.


> First, there’s still gaps in trying to track HOT; most notably a
> counter for how many updates would never be HOT eligible because they
> modify indexes. pg_stat_all_tables.n_tup_newpage_upd is really
> limited without that info.

Nice catch, I'll think about it. Those are not directly connected to
the vacuum workload but those are important.

> There should also be stats about unused line pointers - in degenerate
> cases the lp array can consume a significant portion of heap storage.
>
> Monitoring bloat would be a lot more accurate if vacuum reported
> total tuple length for each run along with the total number of tuples
> it looked at. Having that info would make it trivial to calculate
> average tuple size, which could then be applied to reltuples and
> relpages to calculate how much space would being lost to bloat.

Yes, bloat tracking is in our plans. Right now it is not clear enough
how to do it in the most reliable and convenient way.

> Autovacuum will self-terminate if it would block another process
> (unless it’s an aggressive vacuum) - that’s definitely something that
> should be tracked. Not just the number of times that happens, but
> also stats about how much work was lost because of this.

Agreed.

> Shrinking a relation (what vacuum calls truncation, which is very
> confusing with the truncate command) is a rather complex process that
> currently has no visibility.

In this patch table truncation can be seen in the "pages_removed" field
of "pg_stat_vacuum_tables" at least as the cumulative number of removed
pages. It is not clear enough, but it is visible.


> Tuning vacuum_freeze_min_age (and the MXID variant) is rather
> complicated. We maybe have enough stats on whether it could be set
> lower, but there’s no visibility on how the settings affect how often
> vacuum decides to be aggressive. At minimum, we should have stats on
> when vacuum is aggressive, especially since it significantly changes
> the behavior of autovac.

When you say "agressive" do you mean the number of times when the
vacuum was processing the table with the FREEZE intention? I think this
is needed too.

> I saw someone else already mentioned tuning vacuum memory usage, but
> I’ll mention it again. Even if the issues with index_vacuum_count are
> fixed that still only tells you if you have a problem; it doesn’t
> give you a great idea of how much more memory you need. The best you
> can do is assuming you need (number of passes - 1) * current memory.

Do you think such approach is insufficient? It seems we do not need
byte-to-byte accuracy here.

> Speaking of which… there should be stats on any time vacuum decided
> on it’s own to skip index processing due to wraparound proximity.

Maybe we should just count the number of times when the vacuum was
started to prevent wraparound?

Jim, thank you for such detailed review of our patch!

--
regards, Andrei Zubkov
Postgres Professional



Re: Vacuum statistics

From
Jim Nasby
Date:
On Oct 29, 2024, at 7:40 AM, Andrei Zubkov <zubkov@moonset.ru> wrote:
>
> Hi,
>
> Thanks for your attention to our patch!
>
> On Mon, 2024-10-28 at 16:03 -0500, Jim Nasby wrote:
>>> Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
>>> tripled in space.  That a huge change from having no statistics on
>>> vacuum to have it in much more detail than everything else we
>>> currently have.  I think the feasible way might be to introduce
>>> some
>>> most demanded statistics first then see how it goes.
>>
>> Looking at the stats I do think the WAL stats are probably not
>> helpful. First, there’s nothing users can do to tune how much WAL is
>> generated by vacuum. Second, this introduces the risk of users saying
>> “Wow, vacuum is creating a lot of WAL! I’m going to turn it down!”,
>> which is most likely to make matters worse. There’s already a lot of
>> stuff that goes into WAL without any detailed logging; if we ever
>> wanted to provide a comprehensive view of what data is in WAL that
>> should be handled separately.
>
> Yes, there is nothing we can directly do with WAL generated by vacuum,
> but WAL generation is the part of vacuum work, and it will indirectly
> affected by the changes of vacuum settings. So, WAL statistics is one
> more dimension of vacuum workload. Also WAL stat is universal metric
> which is measured cluster-wide and on the statement-level with
> pg_stat_statements. Vacuum WAL counters will explain the part of
> difference between those metrics. Besides vacuum WAL counters can be
> used to locate abnormal vacuum behavior caused by a bug or the data
> corruption. I think if the DBA is smart enough to look at vacuum WAL
> generated stats and to understand what it means, the decision to
> disable the autovacuum due to its WAL generation is unlikely.

I’m generally for more stats rather than less - really just a question of how much we’re worried about stats overhead.

> Anyway I think some stats can be excluded to save some memory. The
> first candidates are the system_time and user_time fields. Those are
> very valuable, but are measured by the rusage stats, which won't be
> available on all platforms. I think total_time and delay_time would be
> sufficient.

Yeah, I considered throwing those under the bus. I agree they’re only marginally useful.

> The second is the interrupts field. It is needed for monitoring to know
> do we have them or not, so tracking them on the database level will do
> the trick. Interrupt is quite rare event, so once the monitoring system
> will catch one the DBA can go to the server log for the details.
Just to confirm… by “interrupt” you mean vacuum encountered an error?

> It seems there is another way. If the vacuum stats doesn't seems to be
> mandatory in all systems, maybe we should add some hooks to the vacuum
> so that vacuum statistics tracking can be done in an extension. I don't
> think it is a good idea, because vacuum stats seems to me as mandatory
> as the vacuum process itself.
I’d actually like hooks for all stats, so people can develop different ways of storing/aggregating them. But I agree
that’sa separate discussion. 

>> Is there a reason some fields are omitted
>> from pg_stat_vacuum_database? While some stats are certainly more
>> interesting at the per-relation level, I can’t really think of any
>> that don’t make sense at the database level as well.
>
> Some of the metrics are table-specific, some index-specific, so we
> moved to the database level metrics more or less specific to the whole
> database. Can you tell what stats you want to see at the database
> level?

Here’s the thing with pg_stat_vacuum_database; it’s the only way to see everything in the whole cluster. So I think the
betterquestion is what metrics simply don’t make sense at that level? And I don’t really see any that don’t. 

>> For all the views the docs should clarify that total_blks_written
>> means blocks written by vacuum, as opposed to the background Ywriter.
>
> We have the "Number of database blocks written by vacuum operations
> performed on this table" in the docs now. Do you mean we should
> specifically note the vacuum process here?

The reason the stat is confusing is because it doesn’t have the meaning that the name implies. Most people that see
thiswill think it’s actually measuring blocks dirtied, or at least something closer to that. It definitely hides the
factthat many of the dirtied blocks could actually be written by the bgwriter. So an improvement to the docs would be
“Numberof blocks written directly by vacuum or auto vacuum. Blocks that are dirtied by a vacuum process can be written
outby another process.” 

Which makes me realize… I think vacuum only counts a block as dirtied if it was previously clean? If so the docs for
thatmetric need to clarify that vacuum might modify a block but not count it as having been dirtied. 

>> Similarly they should clarify the difference between
>> rel_blks_(read|hit) and total_blks_(read|hit). In the case of
>> pg_stat_vacuum_indexes it’d be better if rel_blks_(read|hit) were
>> called index_blks_(read|hit). Although… if total_blks_* is actually
>> the count across the table and all the indexes I don’t know that we
>> even need that counter. I realize that not ever vacuum even looks at
>> the indexes, but if we’re going to go into that level of detail then
>> we would (at minimum) need to count the number of times a vacuum
>> completely skipped scanning the indexes.
>
> It is not clear to me enough. The stats described just as it is -
> rel_blocks_* tracks blocks of the current heap, and total_* is for the
> whole database blocks - not just tables and indexes, vacuum do some
> work (quite a little) in the catalog and this work is counted here too.
> Usually this stat won't be helpful, but maybe we can catch unusual
> vacuum behavior using this stat.

Ok, so this just needs to be clarified in the docs by explicitly stating what is and isn’t part of the metric. It would
alsobe better not to use the term “rel” since most people don’t immediately know what that means. So,
table_blks_(read|hit)or index_blks_(read|hit). 

Also, “total” is still not clear to me, at least in the context of pg_stat_vacuum_indexes. Is that different from
pg_stat_vacuum_tables.total_blks_*?If so, how? If it’s the same then IMO it should just be removed from
pg_stat_vacuum_indexes.

>> Sadly index_vacuum_count is may not useful at all at present. At
>> minimum you’d need to know the number of times vacuum had run in
>> total. I realize that’s in pg_stat_all_tables, but that doesn’t help
>> if vacuum stats are tracked or reset separately.
>
> I'm in doubt - is it really possible to reset the vacuum stats
> independent of pg_stat_all_tables?

Most stats can be independently reset, so I was thinking these wouldn’t be an exception. If that’s not the case then I
thinkthe docs need to mention pg_stat_all_tables.(auto)vacuum_count, since it’s in a completely different view. Or
betteryet, include the vacuum/analyze related stats that are in pg_stat_all_tables in pg_stat_vacuum_tables. 

BTW, have you thought about what stats should be added for ANALYZE? That’s obviously not as critical as vacuum, but
maybeworth considering as part of this... 

>> First, there’s still gaps in trying to track HOT; most notably a
>> counter for how many updates would never be HOT eligible because they
>> modify indexes. pg_stat_all_tables.n_tup_newpage_upd is really
>> limited without that info.
>
> Nice catch, I'll think about it. Those are not directly connected to
> the vacuum workload but those are important.

Just to re-iterate: I don’t think this patch has to boil the ocean and try to handle all these extra use cases.

>> There should also be stats about unused line pointers - in degenerate
>> cases the lp array can consume a significant portion of heap storage.
>>
>> Monitoring bloat would be a lot more accurate if vacuum reported
>> total tuple length for each run along with the total number of tuples
>> it looked at. Having that info would make it trivial to calculate
>> average tuple size, which could then be applied to reltuples and
>> relpages to calculate how much space would being lost to bloat.
>
> Yes, bloat tracking is in our plans. Right now it is not clear enough
> how to do it in the most reliable and convenient way.
>
>> Autovacuum will self-terminate if it would block another process
>> (unless it’s an aggressive vacuum) - that’s definitely something that
>> should be tracked. Not just the number of times that happens, but
>> also stats about how much work was lost because of this.
>
> Agreed.
>
>> Shrinking a relation (what vacuum calls truncation, which is very
>> confusing with the truncate command) is a rather complex process that
>> currently has no visibility.
>
> In this patch table truncation can be seen in the "pages_removed" field
> of "pg_stat_vacuum_tables" at least as the cumulative number of removed
> pages. It is not clear enough, but it is visible.

Ahh, good point. I think it’s probably worth adding a counter (to this patch) for how many times vacuum actually
decidedto do page removal, because it’s (presumably) a pretty rare event. Without that counter it’s very hard to make
anysense of the number of pages removed (other than being able to see some were removed, at least once). 

>> Tuning vacuum_freeze_min_age (and the MXID variant) is rather
>> complicated. We maybe have enough stats on whether it could be set
>> lower, but there’s no visibility on how the settings affect how often
>> vacuum decides to be aggressive. At minimum, we should have stats on
>> when vacuum is aggressive, especially since it significantly changes
>> the behavior of autovac.
>
> When you say "agressive" do you mean the number of times when the
> vacuum was processing the table with the FREEZE intention? I think this
> is needed too.

Yes. I intentionally use the term “aggressive” (as the code does) to avoid confusion with the FREEZE option (which as
I’msure you know simply forces some GUCs to 0). Further complicating this is that auto vac will report this as “to
preventwraparound”… 

In any case… I’m actually leaning towards there should be a complete second set of counters for aggressive vacuums,
becauseof how differently they work. :( 

>> I saw someone else already mentioned tuning vacuum memory usage, but
>> I’ll mention it again. Even if the issues with index_vacuum_count are
>> fixed that still only tells you if you have a problem; it doesn’t
>> give you a great idea of how much more memory you need. The best you
>> can do is assuming you need (number of passes - 1) * current memory.
>
> Do you think such approach is insufficient? It seems we do not need
> byte-to-byte accuracy here.

Byte-for-byte, no. But I do wonder if there’s any way to do better than some multiple of what *_work_mem was set to.

And setting that aside, another significant problem is that you can’t actually do anything here without actually
knowingwhat memory setting was used, which is definitely not a given. Off-hand I don’t see anyway this can actually be
tuned(at all) with nothing but counters. :( 

Definitely out of scope for this patch though :)

>> Speaking of which… there should be stats on any time vacuum decided
>> on it’s own to skip index processing due to wraparound proximity.
>
> Maybe we should just count the number of times when the vacuum was
> started to prevent wraparound?

Unfortunately even that isn’t simple… auto vac and manual vac have different GUCs, and of course there’s the FREEZE
option.And then there’s the issue that MXIDs are handled completely separately. 

Even ignoring all of that… by default an aggressive vacuum won’t skip indexes. That only happens when you hit
vacuum_(multixact_)failsafe_age.

BTW, something I’ve been mulling over is what stats related to cleanup might be tracked at a system level. I’m thinking
alongthe lines of how often heap_prune_page or the index marking code come across a dead tuple they can’t do anything
aboutyet because it’s still visible. While you could track that per-relation, I’m not sure how helpful that actually is
sinceit’s really a long-running transaction problem. 

Similarly, it’d be nice if we had stats about how often all of the auto vac workers were occupied; something that’s
alsoglobal in nature. 




Re: Vacuum statistics

From
Alena Rybakina
Date:

Hi!

On 29.10.2024 14:02, Alena Rybakina wrote:
On 28.10.2024 16:40, Alexander Korotkov wrote:
On Sun, Aug 25, 2024 at 6:59 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
I didn't understand correctly - did you mean that we don't need SRF if
we need to display statistics for a specific object?

Otherwise, we need this when we display information on all database
objects (tables or indexes):

while ((entry = ScanStatSnapshot(pgStatLocal.snapshot.stats, &hashiter))
!= NULL)
{     CHECK_FOR_INTERRUPTS();
     tabentry = (PgStat_StatTabEntry *) entry->data;
     if (tabentry != NULL && tabentry->vacuum_ext.type == type)         tuplestore_put_for_relation(relid, rsinfo, tabentry);
}

I know we can construct a HeapTuple object containing a TupleDesc,
values, and nulls for a particular object, but I'm not sure we can
augment it while looping through multiple objects.

/* Initialise attributes information in the tuple descriptor */
  tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_STATS_COLS);

...

PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));


If I missed something or misunderstood, can you explain in more detail?
Actually, I mean why do we need a possibility to return statistics for
all tables/indexes in one function call?  User anyway is supposed to
use pg_stat_vacuum_indexes/pg_stat_vacuum_tables view, which do
function calls one per relation.  I suppose we can get rid of
possibility to get all the objects in one function call and just
return a tuple from the functions like other pgstatfuncs.c functions
do.

I haven’t thought about this before and agree with you. Thanks for the clarification! I'll fix the patch this evening and release the updated version.

I updated the patches as per your suggestion. You can see it here [0].

[0] https://www.postgresql.org/message-id/85b963fe-5977-43aa-9241-75b862abcc69%40postgrespro.ru

Re: Vacuum statistics

From
Ilia Evdokimov
Date:


On 22.10.2024 22:30, Alena Rybakina wrote:

Hi!

On 16.10.2024 14:01, Alena Rybakina wrote:

Thank you for rebasing.

I have noticed that when I create a table or an index on this table, there is no information about the table or index in pg_stat_vacuum_tables and pg_stat_vacuum_indexes until we perform a VACUUM.

Example:

CREATE TABLE t (i INT, j INT);
INSERT INTO t SELECT i/10, i/100 FROM  GENERATE_SERIES(1,1000000) i;
SELECT * FROM pg_stat_vacuum_tables WHERE relname = 't';
....
(0 rows)
CREATE INDEX ON t (i);
SELECT * FROM pg_stat_vacuum_indexes WHERE relname = 't_i_idx';
...
(0 rows)

I can see the entries after running VACUUM or executing autovacuum. or when autovacuum is executed. I would suggest adding a line about the relation even if it has not yet been processed by vacuum. Interestingly, this issue does not occur with pg_stat_vacuum_database:

CREATE DATABASE example_db;
SELECT * FROM pg_stat_vacuum_database WHERE dbname = 'example_db';
dboid |       dbname | ...
 ...      | example_db | ...
(1 row)

BTW, I recommend renaming the view pg_stat_vacuum_database to pg_stat_vacuum_databaseS  for consistency with pg_stat_vacuum_tables and pg_stat_vacuum_indexes

Thanks for the review. I'm investigating this. I agree with the renaming, I will do it in the next version of the patch.

I fixed it. I added the left outer join to the vacuum views and for converting the coalesce function from NULL to null values.

I also fixed the code in getting database statistics - we can get it through the existing pgstat_fetch_stat_dbentry function and fixed couple of comments. 

I attached a diff file, as well as new versions of patches.

-- 
Regards,
Alena Rybakina
Postgres Professional

Thank you for fixing it.

1) I have found some typos in the test output files (out-files) when running 'make check' and 'make check-world'. These typos might cause minor discrepancies in test results. You may already be aware of them, but I wanted to bring them to your attention in case they haven't been noticed. I believe these can be fixed quickly.

2) Additionally, I observed that when we create a table and insert some rows, executing the VACUUM FULL command does not update the information in the 'pg_stat_get_vacuum_tables' However, running the VACUUM command does update this information as expected. This seems inconsistent, and it might be a bug.

Example:
CREATE TABLE t (i INT, j INT) WITH (autovacuum_enabled = false);
INSERT INTO t SELECT i/10, i/100 FROM  GENERATE_SERIES(1,1000000) i;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                       0 | ......
(1 row)

VACUUM FULL;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                       0 | ......
(1 row)

VACUUM;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                 4425 | ......
(1 row)

Regards,
Ilia Evdokimov,
Tantor Labs LLC.

Re: Vacuum statistics

From
Alena Rybakina
Date:
Hi! Thank you for review!

On 07.11.2024 17:49, Ilia Evdokimov wrote:
>
> Thank you for fixing it.
>
> 1) I have found some typos in the test output files (out-files) when 
> running 'make check' and 'make check-world'. These typos might cause 
> minor discrepancies in test results. You may already be aware of them, 
> but I wanted to bring them to your attention in case they haven't been 
> noticed. I believe these can be fixed quickly.
>
Yes, I'll fix it)
>
> 2) Additionally, I observed that when we create a table and insert 
> some rows, executing the VACUUM FULL command does not update the 
> information in the 'pg_stat_get_vacuum_tables' However, running the 
> VACUUM command does update this information as expected. This seems 
> inconsistent, and it might be a bug.
>
> Example:
> CREATE TABLE t (i INT, j INT) WITH (autovacuum_enabled = false);
> INSERT INTO t SELECT i/10, i/100 FROM GENERATE_SERIES(1,1000000) i;
> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
> schema | relname |    relid | total_blks_read | .........
> -----------+------------+---------+----------------------+---------
>    public | t            | 21416 |                       0 | ......
> (1 row)
>
> VACUUM FULL;
> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
> schema | relname |    relid | total_blks_read | .........
> -----------+------------+---------+----------------------+---------
>    public | t            | 21416 |                       0 | ......
> (1 row)
>
> VACUUM;
> SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
> schema | relname |    relid | total_blks_read | .........
> -----------+------------+---------+----------------------+---------
>    public | t            | 21416 |                 4425 | ......
> (1 row)
>
>
vacuum full operation doesn't call a vacuum operation, so we can't 
collect statistics for it. Furthermore, this is a different operation 
than vacuum because it completely rebuilds the table and indexes, so it 
looks like your previous table and its indexes were completely removed. 
To sum up, I think it isn't a bug that the statistics aren't showing here.

-- 
Regards,
Alena Rybakina
Postgres Professional




Re: Vacuum statistics

From
Jim Nasby
Date:

On Nov 2, 2024, at 7:22 AM, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:

The second is the interrupts field. It is needed for monitoring to know
do we have them or not, so tracking them on the database level will do
the trick. Interrupt is quite rare event, so once the monitoring system
will catch one the DBA can go to the server log for the details.
Just to confirm… by “interrupt” you mean vacuum encountered an error?
Yes it is.
In that case I feel rather strongly that we should label that as “errors”. “Interrupt” could mean a few different things, but “error” is very clear.

I updated patches. I excluded system and user time statistics and save number of interrupts only for database. I removed the ability to get statistics for all tables, now they can only be obtained for an oid table [0], as suggested here. I also renamed the statistics from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for indexes and databases. I noticed that that’s what they’re mostly called. Ready for discussion.

I think it’s better that the views follow the existing naming conventions (which don’t include “_get_”; only the functions have that in their names). Assuming that, the only question becomes pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.

Re: Vacuum statistics

From
Ilia Evdokimov
Date:


On 08.11.2024 22:23, Alena Rybakina wrote:
Hi! Thank you for review!

On 07.11.2024 17:49, Ilia Evdokimov wrote:

Thank you for fixing it.

1) I have found some typos in the test output files (out-files) when running 'make check' and 'make check-world'. These typos might cause minor discrepancies in test results. You may already be aware of them, but I wanted to bring them to your attention in case they haven't been noticed. I believe these can be fixed quickly.

Yes, I'll fix it)

2) Additionally, I observed that when we create a table and insert some rows, executing the VACUUM FULL command does not update the information in the 'pg_stat_get_vacuum_tables' However, running the VACUUM command does update this information as expected. This seems inconsistent, and it might be a bug.

Example:
CREATE TABLE t (i INT, j INT) WITH (autovacuum_enabled = false);
INSERT INTO t SELECT i/10, i/100 FROM GENERATE_SERIES(1,1000000) i;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                       0 | ......
(1 row)

VACUUM FULL;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                       0 | ......
(1 row)

VACUUM;
SELECT * FROM pg_stat_get_vacuum_tables WHERE relname = 't';
schema | relname |    relid | total_blks_read | .........
-----------+------------+---------+----------------------+---------
   public | t            | 21416 |                 4425 | ......
(1 row)


vacuum full operation doesn't call a vacuum operation, so we can't collect statistics for it. Furthermore, this is a different operation than vacuum because it completely rebuilds the table and indexes, so it looks like your previous table and its indexes were completely removed. To sum up, I think it isn't a bug that the statistics aren't showing here.

Ah, you're right. This table does contain the statistics for it. Everything is okay then. Sorry for the confusion.

Regards,
Ilia Evdokimov,
Tantor Labs LLC.

Re: Vacuum statistics

From
Kirill Reshke
Date:
On Wed, 13 Nov 2024 at 21:21, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>
> Hi! Thank you for your contribution to this thread!
>
> On 13.11.2024 03:24, Jim Nasby wrote:
>
> On Nov 10, 2024, at 2:09 PM, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>
>
> On 08.11.2024 22:34, Jim Nasby wrote:
>
>
> On Nov 2, 2024, at 7:22 AM, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>
> The second is the interrupts field. It is needed for monitoring to know
> do we have them or not, so tracking them on the database level will do
> the trick. Interrupt is quite rare event, so once the monitoring system
> will catch one the DBA can go to the server log for the details.
>
> Just to confirm… by “interrupt” you mean vacuum encountered an error?
>
> Yes it is.
>
> In that case I feel rather strongly that we should label that as “errors”. “Interrupt” could mean a few different
things,but “error” is very clear. 
>
> I updated patches. I excluded system and user time statistics and save number of interrupts only for database. I
removedthe ability to get statistics for all tables, now they can only be obtained for an oid table [0], as suggested
here.I also renamed the statistics from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for indexes
anddatabases. I noticed that that’s what they’re mostly called. Ready for discussion. 
>
> I think it’s better that the views follow the existing naming conventions (which don’t include “_get_”; only the
functionshave that in their names). Assuming that, the only question becomes pg_stat_vacuum_* vs pg_stat_*_vacuum.
Giventhe existing precedent of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*. 
>
> I have fixed it.
>
>
> I’ve reviewed and made some cosmetic changes to patch 1, though of note it looks like an effort has been made to keep
stat_reset_timestampat the end of PgStat_StatDBEntry, so I re-arranged that. I also removed some obviously dead code.
Itappears that pgstat_update_snapshot(), InitSnapshotIterator() and ScanStatSnapshot() are also dead, but I’ve left it
inincase I’m missing something. The tests are also failing for me because a number of psql variables aren’t set. 
>
> Thank you! Yes, I have deleted them.
>
>
> I do think we should separate out the counts for deleted but still visible tuples vs tuples where we couldn’t get a
cleanuplock (in other words, recently_dead_tuples and missed_dead_tuples from LVRelState). I realize that’s a departure
fromhow some of the existing reporting works, but IMO combining them together isn’t a pattern we should be repeating
sincethey mean completely different things. Towards that end I did remove missed_dead_tuples from the reporting, and
renamedExtVacReport.dead_tuples to recently_dead_tuples, but I stopped short of creating a separate entry for
missed_dead_tuples.Note that while recently_dead_tuples is really a global thing (so only needs to be reported at a
global(or at most per-database) level, but missed_dead_tuples should really be at a per-table level. 
>
> I am willing to agree with your idea. But we need to think about how clearly describe them in the documentation.
>
>
> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>
> Thank you)
>
> And I agree with your changes. And included them in patches.
>
> ---
> Regards,
> Alena Rybakina
> Postgres Professional

Hello!
After a brief glance, I think this patch set is good.
But there isn't any more time in the current CF to commit this :(.
So I moved to the next CF.

I also like the 0001 commit message. This commit message is quite
large and easy to understand. Actually, it might be too big. Perhaps
rather of being a commit message, the final paragraph (pages_frozen -
number of pages that..) need to be a part of the document. Perhaps
delete the explanation on pages_frozen that we have in 0004?


--
Best regards,
Kirill Reshke



Re: Vacuum statistics

From
Alexander Korotkov
Date:
Hi, Alena!

On Wed, Nov 13, 2024 at 6:21 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>
> Thank you)
>
> And I agree with your changes. And included them in patches.

Thank you for the updated patchset.  Some points from me.

* I've read the previous discussion on how important to keep all these
fields regarding vacuum statistics including points by Andrei and Jim.
It still worrying me that statistics volume is going to burst in about
3 times, but I don't have a particular proposal on how to make more
granular approach.  I wonder if you could propose something.
* Previously PGSTAT_FILE_FORMAT_ID got increased by 1. Your 0001 patch
increases it by 2.  It's minor note, but I'd like to keep the
tradition.
* Commit message for 0001 looks nice, but commit messages of 0002,
0003, and 0004 look messy.  Could you please, rearrange them.
* The distinction between 0001 and 0002 is not clear. The first line
of 0001 is "Machinery for grabbing an extended vacuum statistics on
heap relations", the first line of 0002 is "Machinery for grabbing an
extended vacuum statistics on heap and index relations."  I guess 0001
should be about heap relations while 0002 should be about just index
relations.  Is this correct?
* I guess this statistics should work for any table AM, based on what
has been done in relation_vacuum() interface method.  If that's
correct, we need to get rid of "heap" terminology and use "table"
instead.
* 0004 should be pure documentation patch, but it seems containing
changes to isolation tests.  Please, move them into a more appropriate
place.

------
Regards,
Alexander Korotkov
Supabase



Re: Vacuum statistics

From
Ilia Evdokimov
Date:
In my opinion, the patches are semantically correct. However, not all 
dead code has been removed - I'm referring to pgstat_update_snapshot(). 
Also, the tests need to be fixed.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.




Re: Vacuum statistics

From
Alena Rybakina
Date:
On 02.12.2024 11:27, Alexander Korotkov wrote:
Hi, Alena!

On Wed, Nov 13, 2024 at 6:21 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Updated 0001-v13 attached, as well as the diff between v12 and v13.

Thank you)

And I agree with your changes. And included them in patches.
Thank you for the updated patchset.  Some points from me.

* I've read the previous discussion on how important to keep all these
fields regarding vacuum statistics including points by Andrei and Jim.
It still worrying me that statistics volume is going to burst in about
3 times, but I don't have a particular proposal on how to make more
granular approach.  I wonder if you could propose something.
* Previously PGSTAT_FILE_FORMAT_ID got increased by 1. Your 0001 patch
increases it by 2.  It's minor note, but I'd like to keep the
tradition.
* Commit message for 0001 looks nice, but commit messages of 0002,
0003, and 0004 look messy.  Could you please, rearrange them.
* The distinction between 0001 and 0002 is not clear. The first line
of 0001 is "Machinery for grabbing an extended vacuum statistics on
heap relations", the first line of 0002 is "Machinery for grabbing an
extended vacuum statistics on heap and index relations."  I guess 0001
should be about heap relations while 0002 should be about just index
relations.  Is this correct?
* I guess this statistics should work for any table AM, based on what
has been done in relation_vacuum() interface method.  If that's
correct, we need to get rid of "heap" terminology and use "table"
instead.
* 0004 should be pure documentation patch, but it seems containing
changes to isolation tests.  Please, move them into a more appropriate
place.


Thank you for your valuable feedback, I am already carefully processing your comments and will update the patches soon.

I will think about what can be done to address the problem of increasing the volume of statistics; perhaps it will be possible to implement a guc that, when enabled, will accumulate additional information on vacuum statistics. For example, this way you can group statistics by buffers and vacuum statistics.
-- 
Regards,
Alena Rybakina
Postgres Professional

Re: Vacuum statistics

From
Alena Rybakina
Date:

Hi! Thank you for your review!

On 30.11.2024 07:48, Kirill Reshke wrote:
Hello!
After a brief glance, I think this patch set is good.
But there isn't any more time in the current CF to commit this :(.
So I moved to the next CF.
I agree with you. Thank you!)
I also like the 0001 commit message. This commit message is quite
large and easy to understand. Actually, it might be too big. Perhaps
rather of being a commit message, the final paragraph (pages_frozen -
number of pages that..) need to be a part of the document. Perhaps
delete the explanation on pages_frozen that we have in 0004?
To be honest, I don't quite understand what you're suggesting. Are you suggesting moving the explanation about the pages_frozen from the commit message to the documentation or fixing something in the documentation about the pages_frozen? Can you please explain?
-- 
Regards,
Alena Rybakina
Postgres Professional

Re: Vacuum statistics

From
Alena Rybakina
Date:
On 02.12.2024 17:46, Ilia Evdokimov wrote:
> In my opinion, the patches are semantically correct. However, not all 
> dead code has been removed - I'm referring to 
> pgstat_update_snapshot(). Also, the tests need to be fixed.
>
Thank you, I'll fix it

-- 
Regards,
Alena Rybakina
Postgres Professional




Re: Vacuum statistics

From
Alena Rybakina
Date:
Hi!

On 02.12.2024 17:46, Ilia Evdokimov wrote:
> In my opinion, the patches are semantically correct. However, not all 
> dead code has been removed - I'm referring to 
> pgstat_update_snapshot(). Also, the tests need to be fixed.
>
>
I fixed it [0]. Thank you!

[0] 
https://www.postgresql.org/message-id/86f76aa5-1ab5-4e2e-9b15-405051852a2a%40postgrespro.ru

-- 
Regards,
Alena Rybakina
Postgres Professional