Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Autovacuum on partitioned table (autoanalyze)
Date
Msg-id 20200915.190135.1065752215377064117.horikyota.ntt@gmail.com
Whole thread Raw
Responses Re: Autovacuum on partitioned table (autoanalyze)  (yuzuko <yuzukohosoya@gmail.com>)
List pgsql-hackers
At Tue, 25 Aug 2020 14:28:20 +0200, Daniel Gustafsson <daniel@yesql.se> wrote in 
> > I attach the latest patch that solves the above Werror.
> > Could you please check it again?
> 
> This version now pass the tests in the Travis pipeline as can be seen in the
> link below, and is ready to be reviewed in the upcoming commitfest:
> 
>     http://cfbot.cputube.org/yuzuko-hosoya.html

At Mon, 6 Jul 2020 19:35:37 +0900, yuzuko <yuzukohosoya@gmail.com> wrote in 
> I think there are other approaches like Tom's idea that Justin previously
> referenced, but this patch works the same way as previous patches.
> (tracks updated/inserted/deleted tuples and checks whether the partitioned
> tables needs auto-analyze, same as nonpartitioned tables)
> Because I wanted to be able to analyze partitioned tables by autovacuum
> as a first step, and I think this approach is the simplest way to do it.

I'm not sure if anything bad happen if parent and children are not
agree on statistics.

The requirement suggested here seems to be:

- We want to update parent's stats when any of its children gets its
  stats updated. This is curucial especially for time-series
  partitioning.

- However, we don't want analyze the whole-tree every time any of the
  children was analyzed.

To achieve the both, stats-merging seems to the optimal solution.

Putting that aside, I had a brief look on the latest patch.

     /* We only count stats for things that have storage */
-    if (!RELKIND_HAS_STORAGE(relkind))
+    if (!RELKIND_HAS_STORAGE(relkind) ||
+        relkind == RELKIND_PARTITIONED_TABLE)
     {
         rel->pgstat_info = NULL;

RELKIND_HAS_STORAGE(RELKIND_PARTITIONED_TABLE) is already false.
Maybe you wanted to do "&& relkind !=" instead:p


+        /*
+         * If this relation is partitioned, we store all ancestors' oid
+         * to propagate its changed_tuples to their parents when this
+         * transaction is committed.
+         */
+        if (rel->rd_rel->relispartition && pgstat_info->ancestors == NULL)

If the relation was detached then attached to another partition within
a transaction, the ancestor list would get stale and the succeeding
modification to the relation propagates into wrong ancestors.

I think vacuum time is more appropriate to modify ancestors stats. It
seems to me that what Alvalo pointed isthe list-order-susceptible
manner of collecting children's modified tuples.


+        ? 0  /* partitioned tables don't have any data, so it's 0 */

If the comment is true, we shouldn't have non-zero t_changed_tuples,
too. I think the reason for the lines is something different.

# Oops! Time's up now.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: "lchch1990@sina.cn"
Date:
Subject: Re: Unnecessary delay in streaming replication due to replay lag
Next
From: Daniel Gustafsson
Date:
Subject: Re: ECPG: proposal for new DECLARE STATEMENT