Re: Vacuum statistics - Mailing list pgsql-hackers

From Alena Rybakina
Subject Re: Vacuum statistics
Date
Msg-id eddfa7fc-fd50-45e2-9a67-6f0dc121d801@postgrespro.ru
Whole thread Raw
In response to Vacuum statistics  (Alena Rybakina <lena.ribackina@yandex.ru>)
List pgsql-hackers

Hi! Thank you very much for your review! Sorry for my late response I was overwhelmed by tasks.

On 16.08.2024 14:12, jian he wrote:
On Thu, Aug 15, 2024 at 4:49 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Hi!

I've applied all the v5 patches.
0002 and 0003 have white space errors.

+      <para>
+        Number of times blocks of this index were already found
+        in the buffer cache by vacuum operations, so that a read was
not necessary
+        (this only includes hits in the
+        &project; buffer cache, not the operating system's file system cache)
+      </para></entry>

+        Number of times blocks of this table were already found
+        in the buffer cache by vacuum operations, so that a read was
not necessary
+        (this only includes hits in the
+        &project; buffer cache, not the operating system's file system cache)
+      </para></entry>

"&project;"
represents a sgml file placeholder name as "project" and puts all the
content of "project.sgml" to system-views.sgml.
but you don't have "project.sgml". you may check
doc/src/sgml/filelist.sgml or doc/src/sgml/ref/allfiles.sgml
for usage of "&place_holder;".
so you can change it to "project", otherwise doc cannot build.


src/backend/commands/dbcommands.c
we have:    /*     * If built with appropriate switch, whine when regression-testing     * conventions for database names are violated.  But don't complain during     * initdb.     */
#ifdef ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS    if (IsUnderPostmaster && strstr(dbname, "regression") == NULL)        elog(WARNING, "databases created by regression test cases
should have names including \"regression\"");
#endif
so in src/test/regress/sql/vacuum_tables_and_db_statistics.sql you
need to change dbname:
CREATE DATABASE statistic_vacuum_database;
CREATE DATABASE statistic_vacuum_database1;


+  <para>
+   The view <structname>pg_stat_vacuum_indexes</structname> will contain
+   one row for each index in the current database (including TOAST
+   table indexes), showing statistics about vacuuming that specific index.
+  </para>
TOAST should
<acronym>TOAST</acronym>



+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
maybe change to            ereport(ERROR,                    (errcode(ERRCODE_DATATYPE_MISMATCH),                     errmsg("return type must be a row type")));
Later I found out "InitMaterializedSRF(fcinfo, 0);" already did all
the work. much of the code can be gotten rid of.
please check attached.
I agree with your suggestions for improving the code. I will add this in the next version of the patch.

#define EXTVACHEAPSTAT_COLUMNS    27
#define EXTVACIDXSTAT_COLUMNS    19
#define EXTVACDBSTAT_COLUMNS    15
#define EXTVACSTAT_COLUMNS Max(EXTVACHEAPSTAT_COLUMNS, EXTVACIDXSTAT_COLUMNS)

static Oid CurrentDatabaseId = InvalidOid;
we already defined MyDatabaseId in src/include/miscadmin.h,
Why do we need "static Oid CurrentDatabaseId = InvalidOid;"?
also src/backend/utils/adt/pgstatfuncs.c already included "miscadmin.h".
Hmm, Tom Lane added "misc admin.h", or I didn't notice something. Could you point this out, please?

We used the Current Database Id to output statistics on tables from another database, so we need to replace it with a different default value. But I want to rewrite this patch to display table statistics only for the current database, that is, this part will be removed in the future. In my opinion, it would be more correct.
the following code one function has 2 return statements?
------------------------------------------------------------------------
/* * Get the vacuum statistics for the heap tables. */
Datum
pg_stat_vacuum_tables(PG_FUNCTION_ARGS)
{    return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_HEAP, EXTVACHEAPSTAT_COLUMNS);
    PG_RETURN_NULL();
}

/* * Get the vacuum statistics for the indexes. */
Datum
pg_stat_vacuum_indexes(PG_FUNCTION_ARGS)
{    return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_INDEX, EXTVACIDXSTAT_COLUMNS);
    PG_RETURN_NULL();
}

/* * Get the vacuum statistics for the database. */
Datum
pg_stat_vacuum_database(PG_FUNCTION_ARGS)
{        return pg_stats_vacuum(fcinfo, PGSTAT_EXTVAC_DB, EXTVACDBSTAT_COLUMNS);
    PG_RETURN_NULL();
}
You are right - the second return is superfluous. I'll fix it.
------------------------------------------------------------------------
in pg_stats_vacuum:    if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)    {        Oid                    relid = PG_GETARG_OID(1);
        /* Load table statistics for specified database. */        if (OidIsValid(relid))        {            tabentry = fetch_dbstat_tabentry(dbid, relid);            if (tabentry == NULL || tabentry->vacuum_ext.type != type)                /* Table don't exists or isn't an heap relation. */                PG_RETURN_NULL();
            tuplestore_put_for_relation(relid, tupstore, tupdesc,
tabentry, ncolumns);        }        else        {         ...        }
}
I don't understand the ELSE branch. the IF branch means the input
dboid, heap/index oid is correct.
the ELSE branch means table reloid is invalid = 0.
I am not sure 100% what the ELSE Branch means.
Also there are no comments explaining why.
experiments seem to show that  when reloid is 0, it will print out all
the vacuum statistics
for all the tables in the current database. If so, then only super
users can call pg_stats_vacuum?
but the table owner should be able to call pg_stats_vacuum for that
specific table.
If any reloid has not been set by the user, we output statistics for all objects - tables or indexes.In this part of the code, we find all the suitable objects from the snapshot, if they belong to the index or table type of objects.
/* Type of ExtVacReport */
typedef enum ExtVacReportType
{    PGSTAT_EXTVAC_INVALID = 0,    PGSTAT_EXTVAC_HEAP = 1,    PGSTAT_EXTVAC_INDEX = 2,    PGSTAT_EXTVAC_DB = 3,
} ExtVacReportType;
generally "HEAP" means table and index, maybe "PGSTAT_EXTVAC_HEAP" would be term

No, Heap means something like a table in a relationship database, or its alternative name is Heap.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

Previous
From: Jelte Fennema-Nio
Date:
Subject: Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
Next
From: Alena Rybakina
Date:
Subject: Re: Vacuum statistics