Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers
From | yuzuko |
---|---|
Subject | Re: Autovacuum on partitioned table (autoanalyze) |
Date | |
Msg-id | CAKkQ50_D=1LbYLD9qtWaKxPvRq3T_qGWfQov5eH5rycbMEq-pA@mail.gmail.com Whole thread Raw |
In response to | Re: Autovacuum on partitioned table (autoanalyze) (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Autovacuum on partitioned table (autoanalyze)
|
List | pgsql-hackers |
Horiguchi-san, Thank you for reviewing. On Tue, Sep 15, 2020 at 7:01 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Tue, 25 Aug 2020 14:28:20 +0200, Daniel Gustafsson <daniel@yesql.se> wrote in > > > I attach the latest patch that solves the above Werror. > > > Could you please check it again? > > > > This version now pass the tests in the Travis pipeline as can be seen in the > > link below, and is ready to be reviewed in the upcoming commitfest: > > > > http://cfbot.cputube.org/yuzuko-hosoya.html > > At Mon, 6 Jul 2020 19:35:37 +0900, yuzuko <yuzukohosoya@gmail.com> wrote in > > I think there are other approaches like Tom's idea that Justin previously > > referenced, but this patch works the same way as previous patches. > > (tracks updated/inserted/deleted tuples and checks whether the partitioned > > tables needs auto-analyze, same as nonpartitioned tables) > > Because I wanted to be able to analyze partitioned tables by autovacuum > > as a first step, and I think this approach is the simplest way to do it. > > I'm not sure if anything bad happen if parent and children are not > agree on statistics. > > The requirement suggested here seems to be: > > - We want to update parent's stats when any of its children gets its > stats updated. This is curucial especially for time-series > partitioning. > > - However, we don't want analyze the whole-tree every time any of the > children was analyzed. > > To achieve the both, stats-merging seems to the optimal solution. > > Putting that aside, I had a brief look on the latest patch. > > /* We only count stats for things that have storage */ > - if (!RELKIND_HAS_STORAGE(relkind)) > + if (!RELKIND_HAS_STORAGE(relkind) || > + relkind == RELKIND_PARTITIONED_TABLE) > { > rel->pgstat_info = NULL; > > RELKIND_HAS_STORAGE(RELKIND_PARTITIONED_TABLE) is already false. > Maybe you wanted to do "&& relkind !=" instead:p > Oh, indeed. I'll fix it. > > + /* > + * If this relation is partitioned, we store all ancestors' oid > + * to propagate its changed_tuples to their parents when this > + * transaction is committed. > + */ > + if (rel->rd_rel->relispartition && pgstat_info->ancestors == NULL) > > If the relation was detached then attached to another partition within > a transaction, the ancestor list would get stale and the succeeding > modification to the relation propagates into wrong ancestors. > > I think vacuum time is more appropriate to modify ancestors stats. It > seems to me that what Alvalo pointed isthe list-order-susceptible > manner of collecting children's modified tuples. > I proposed a patch that modified ancestors stats when vacuuming previously. In that time, having been pointed out by Alvaro and Amit, I tried to update the parents' changes_since_analyze in every ANALYZE. However, in that case, the problem mentioned in [1] occurred, but I could not find a way to avoid it. I think that it can be solved by updating the parents' changes_since_analyze only in the case of auto analyze, but what do you think? > > + ? 0 /* partitioned tables don't have any data, so it's 0 */ > > If the comment is true, we shouldn't have non-zero t_changed_tuples, > too. I think the reason for the lines is something different. > Yes, surely. I think updating the values of live_tuples and dead_tuples is confusing for users. I'll consider another comment. [1] https://www.postgresql.org/message-id/CAKkQ50-bwFEDMBGb1JmDXffXsiU8xk-hN6kJK9CKjdBa7r%3DHdw%40mail.gmail.com -- Best regards, Yuzuko Hosoya NTT Open Source Software Center
pgsql-hackers by date: