Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Autovacuum on partitioned table (autoanalyze) |
Date | |
Msg-id | 20210804.142254.670315945770222525.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: Autovacuum on partitioned table (autoanalyze) (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
At Thu, 29 Jul 2021 18:03:55 -0700, Andres Freund <andres@anarazel.de> wrote in > And if one instead inverts the order of pgstat_report_analyze() and > pgstat_report_anl_ancestors() one gets a slightly different problem: A manual > ANALYZE of the partition root results in the partition root having a non-zero > changes_since_analyze afterwards. expand_vacuum() causes child partitions to be > added to the list of relations, which *first* causes the partition root to be > analyzed, and *then* partitions. The partitions then report their > changes_since_analyze upwards. For the last behavior, as Andres suggested, the scan order need to be reversed (or to be in the same order with autovacuum). Since find_all_inheritors scans breadth-first so just reversing the result works. The breadth-first is currently not in the contract of the interface of the function. I suppose we can add such a contract? Finally, I ended up with the attached. - reverse the relation order within a tree - reverse the order of pgstat_report_analyze and pgstat_report_analyze. Inheritance expansion is performed per-tree basis so it works fine even if multiple relations are given to vacuum(). > I don't think the code as is is fit for v14. It looks like it was rewritten > with a new approach just before the freeze ([1]), and as far as I can tell the > concerns I quoted above weren't even discussed in the whole thread. Alvaro, > any comments? > > Greetings, > > Andres Freund > > [1] https://www.postgresql.org/message-id/20210408032235.GA6842%40alvherre.pgsql FYI: this bahaves as the follows. CREATE TABLE p (a int) PARTITION BY RANGE (a); CREATE TABLE c1 PARTITION OF p FOR VALUES FROM (0) TO (200) PARTITION BY RANGE(a); CREATE TABLE c11 PARTITION OF c1 FOR VALUES FROM (0) TO (100); CREATE TABLE c12 PARTITION OF c1 FOR VALUES FROM (100) TO (200); CREATE TABLE c2 PARTITION OF p FOR VALUES FROM (200) TO (400) PARTITION BY RANGE(a); CREATE TABLE c21 PARTITION OF c2 FOR VALUES FROM (200) TO (300); CREATE TABLE c22 PARTITION OF c2 FOR VALUES FROM (300) TO (400); INSERT INTO p (SELECT a FROM generate_series(0, 400 - 1) a, generate_series(0, 10) b); INSERT INTO p (SELECT 200 FROM generate_series(0, 99)); SELECT relid, relname, n_mod_since_analyze FROM pg_stat_user_tables ORDER BY relid; relid | relname | n_mod_since_analyze -------+---------+--------------------- 16426 | p | 0 16429 | c1 | 0 16432 | c11 | 0 16435 | c12 | 0 16438 | c2 | 0 16441 | c21 | 100 16444 | c22 | 0 16447 | sa | 0 (8 rows) After "ANALYZE c21;" relid | relname | n_mod_since_analyze -------+---------+--------------------- 16426 | p | 100 16429 | c1 | 0 16432 | c11 | 0 16435 | c12 | 0 16438 | c2 | 100 16441 | c21 | 0 16444 | c22 | 0 16447 | sa | 0 After "ANALYZE c2;" relid | relname | n_mod_since_analyze -------+---------+--------------------- 16426 | p | 100 16429 | c1 | 0 16432 | c11 | 0 16435 | c12 | 0 16438 | c2 | 0 16441 | c21 | 0 16444 | c22 | 0 16447 | sa | 0 After "ANALYZE p;" (all zero) However, this gives a strange-looking side-effect, which affected regression results. =# VACUUM ANALYZE p(a, a); ERROR: column "a" of relation "c22" appears more than once (Prevously it complained about p.) regards. -- Kyotaro Horiguchi NTT Open Source Software Center From 16f7602f1b7755f288c508f1e57e0eae3c305813 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Wed, 4 Aug 2021 13:40:59 +0900 Subject: [PATCH] Fix changes_since_analyze's motion on manual analyze on partitioned tables The analyze-stats machinery assumed bottom-to-top-ordered relation scans but actually it was in the opposite order in manual ANALYZE. Addition to that the current code tries to propagate changes_since_analyze to parents after stats reporting which resets the number to propagate. As the result, when doing manual ANALYZE on a partition, changes_since_analyze vanishes instead of being propagated to parents. On the other hand when doing that on a partitioned tables, the leaf relations end up with having bogus stats values. To fix this, reverse the order relations on running manual ANALYZE and move stats-reporting after stats-propagation. --- src/backend/catalog/pg_inherits.c | 3 ++- src/backend/commands/analyze.c | 26 +++++++++++++------------- src/backend/commands/vacuum.c | 25 +++++++++++++++---------- src/test/regress/expected/vacuum.out | 18 +++++++++--------- 4 files changed, 39 insertions(+), 33 deletions(-) diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c index ae990d4877..3451578580 100644 --- a/src/backend/catalog/pg_inherits.c +++ b/src/backend/catalog/pg_inherits.c @@ -239,7 +239,8 @@ find_inheritance_children_extended(Oid parentrelId, bool omit_detached, /* * find_all_inheritors - * Returns a list of relation OIDs including the given rel plus - * all relations that inherit from it, directly or indirectly. + * all relations that inherit from it, directly or indirectly, in + * breadth-first ordering. * Optionally, it also returns the number of parents found for * each such relation within the inheritance tree rooted at the * given rel. diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c index 0c9591415e..414da6630b 100644 --- a/src/backend/commands/analyze.c +++ b/src/backend/commands/analyze.c @@ -682,19 +682,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params, in_outer_xact); } - /* - * Now report ANALYZE to the stats collector. For regular tables, we do - * it only if not doing inherited stats. For partitioned tables, we only - * do it for inherited stats. (We're never called for not-inherited stats - * on partitioned tables anyway.) - * - * Reset the changes_since_analyze counter only if we analyzed all - * columns; otherwise, there is still work for auto-analyze to do. - */ - if (!inh || onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) - pgstat_report_analyze(onerel, totalrows, totaldeadrows, - (va_cols == NIL)); - /* * If this is a manual analyze of all columns of a permanent leaf * partition, and not doing inherited stats, also let the collector know @@ -711,6 +698,19 @@ do_analyze_rel(Relation onerel, VacuumParams *params, pgstat_report_anl_ancestors(RelationGetRelid(onerel)); } + /* + * Now report ANALYZE to the stats collector. For regular tables, we do + * it only if not doing inherited stats. For partitioned tables, we only + * do it for inherited stats. (We're never called for not-inherited stats + * on partitioned tables anyway.) + * + * Reset the changes_since_analyze counter only if we analyzed all + * columns; otherwise, there is still work for auto-analyze to do. + */ + if (!inh || onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) + pgstat_report_analyze(onerel, totalrows, totaldeadrows, + (va_cols == NIL)); + /* * If this isn't part of VACUUM ANALYZE, let index AMs do cleanup. * diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 5c4bc15b44..0a378487fb 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -826,13 +826,15 @@ expand_vacuum_rel(VacuumRelation *vrel, int options) ReleaseSysCache(tuple); /* - * If it is, make relation list entries for its partitions. Note that - * the list returned by find_all_inheritors() includes the passed-in - * OID, so we have to skip that. There's no point in taking locks on - * the individual partitions yet, and doing so would just add - * unnecessary deadlock risk. For this last reason we do not check - * yet the ownership of the partitions, which get added to the list to - * process. Ownership will be checked later on anyway. + * If it is, make relation list entries for its partitions in a + * bottom-to-top manner. Note that the list returned by + * find_all_inheritors() is in top-to-bottom ordering and includes the + * passed-in OID, so we have to reverse the order and skip the + * passed-in OID. There's no point in taking locks on the individual + * partitions yet, and doing so would just add unnecessary deadlock + * risk. For this last reason we do not check yet the ownership of the + * partitions, which get added to the list to process. Ownership will + * be checked later on anyway. */ if (include_parts) { @@ -852,9 +854,12 @@ expand_vacuum_rel(VacuumRelation *vrel, int options) * later. */ oldcontext = MemoryContextSwitchTo(vac_context); - vacrels = lappend(vacrels, makeVacuumRelation(NULL, - part_oid, - vrel->va_cols)); + + /* Make the order reversed */ + vacrels = lcons(makeVacuumRelation(NULL, + part_oid, + vrel->va_cols), + vacrels); MemoryContextSwitchTo(oldcontext); } } diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out index 3e70e4c788..5b9726225e 100644 --- a/src/test/regress/expected/vacuum.out +++ b/src/test/regress/expected/vacuum.out @@ -196,9 +196,9 @@ VACUUM (FULL) vacparted; VACUUM (FREEZE) vacparted; -- check behavior with duplicate column mentions VACUUM ANALYZE vacparted(a,b,a); -ERROR: column "a" of relation "vacparted" appears more than once +ERROR: column "a" of relation "vacparted1" appears more than once ANALYZE vacparted(a,b,b); -ERROR: column "b" of relation "vacparted" appears more than once +ERROR: column "b" of relation "vacparted1" appears more than once -- partitioned table with index CREATE TABLE vacparted_i (a int primary key, b varchar(100)) PARTITION BY HASH (a); @@ -239,7 +239,7 @@ ANALYZE vacparted (b), vactst; ANALYZE vactst, does_not_exist, vacparted; ERROR: relation "does_not_exist" does not exist ANALYZE vactst (i), vacparted (does_not_exist); -ERROR: column "does_not_exist" of relation "vacparted" does not exist +ERROR: column "does_not_exist" of relation "vacparted1" does not exist ANALYZE vactst, vactst; BEGIN; -- ANALYZE behaves differently inside a transaction block ANALYZE vactst, vactst; @@ -319,24 +319,24 @@ WARNING: skipping "pg_authid" --- only superuser can vacuum it -- independently. VACUUM vacowned_parted; WARNING: skipping "vacowned_parted" --- only table or database owner can vacuum it -WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it WARNING: skipping "vacowned_part2" --- only table or database owner can vacuum it +WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it VACUUM vacowned_part1; WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it VACUUM vacowned_part2; WARNING: skipping "vacowned_part2" --- only table or database owner can vacuum it ANALYZE vacowned_parted; WARNING: skipping "vacowned_parted" --- only table or database owner can analyze it -WARNING: skipping "vacowned_part1" --- only table or database owner can analyze it WARNING: skipping "vacowned_part2" --- only table or database owner can analyze it +WARNING: skipping "vacowned_part1" --- only table or database owner can analyze it ANALYZE vacowned_part1; WARNING: skipping "vacowned_part1" --- only table or database owner can analyze it ANALYZE vacowned_part2; WARNING: skipping "vacowned_part2" --- only table or database owner can analyze it VACUUM (ANALYZE) vacowned_parted; WARNING: skipping "vacowned_parted" --- only table or database owner can vacuum it -WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it WARNING: skipping "vacowned_part2" --- only table or database owner can vacuum it +WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it VACUUM (ANALYZE) vacowned_part1; WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it VACUUM (ANALYZE) vacowned_part2; @@ -389,22 +389,22 @@ ALTER TABLE vacowned_parted OWNER TO regress_vacuum; ALTER TABLE vacowned_part1 OWNER TO CURRENT_USER; SET ROLE regress_vacuum; VACUUM vacowned_parted; -WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it WARNING: skipping "vacowned_part2" --- only table or database owner can vacuum it +WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it VACUUM vacowned_part1; WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it VACUUM vacowned_part2; WARNING: skipping "vacowned_part2" --- only table or database owner can vacuum it ANALYZE vacowned_parted; -WARNING: skipping "vacowned_part1" --- only table or database owner can analyze it WARNING: skipping "vacowned_part2" --- only table or database owner can analyze it +WARNING: skipping "vacowned_part1" --- only table or database owner can analyze it ANALYZE vacowned_part1; WARNING: skipping "vacowned_part1" --- only table or database owner can analyze it ANALYZE vacowned_part2; WARNING: skipping "vacowned_part2" --- only table or database owner can analyze it VACUUM (ANALYZE) vacowned_parted; -WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it WARNING: skipping "vacowned_part2" --- only table or database owner can vacuum it +WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it VACUUM (ANALYZE) vacowned_part1; WARNING: skipping "vacowned_part1" --- only table or database owner can vacuum it VACUUM (ANALYZE) vacowned_part2; -- 2.27.0
pgsql-hackers by date: