Thread: Revive num_dead_tuples column of pg_stat_progress_vacuum
Hi, Commit 667e65aac3 changed num_dead_tuples and max_dead_tuples columns to dead_tuple_bytes and max_dead_tuple_bytes columns, respectively. But at PGConf.dev, I got feedback from multiple people that num_dead_tuples information still can provide meaning insights for users to understand the vacuum progress. One use case is to compare num_dead_tuples to pg_stat_all_tables.n_dead_tup column. I've attached the patch to revive num_dead_tuples column back to the pg_stat_progress_vacuum view. This requires to bump catalog version. We're post-beta1 but it should be okay as it's only for PG17. Feedback is very welcome. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
On Mon, Jun 3, 2024 at 5:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > I've attached the patch to revive num_dead_tuples column back to the > pg_stat_progress_vacuum view. This requires to bump catalog version. > We're post-beta1 but it should be okay as it's only for PG17. > > Feedback is very welcome. Can we rename this to num_dead_item_ids (or something similar) in passing? That way we'll avoid confusing the number of dead tuples removed from the table by VACUUM (which includes dead heap-only tuples, but excludes any preexisting LP_DEAD items left behind by opportunistic pruning) with the number of dead item identifiers. As you know, TIDStore stores TIDs that refer to dead item identifiers in the heap, which is often very different to the number of dead tuples removed by VACUUM. The VACUUM log output has reported on dead item identifiers separately since 14. This seems like a good opportunity to bring pg_stat_progress_vacuum in line. -- Peter Geoghegan
On 2024-Jun-03, Masahiko Sawada wrote: > Commit 667e65aac3 changed num_dead_tuples and max_dead_tuples columns > to dead_tuple_bytes and max_dead_tuple_bytes columns, respectively. > But at PGConf.dev, I got feedback from multiple people that > num_dead_tuples information still can provide meaning insights for > users to understand the vacuum progress. One use case is to compare > num_dead_tuples to pg_stat_all_tables.n_dead_tup column. +1. > @@ -2887,7 +2887,9 @@ dead_items_add(LVRelState *vacrel, BlockNumber blkno, OffsetNumber *offsets, > TidStoreSetBlockOffsets(dead_items, blkno, offsets, num_offsets); > vacrel->dead_items_info->num_items += num_offsets; > > - /* update the memory usage report */ > + /* update the progress information */ > + pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES, > + vacrel->dead_items_info->num_items); > pgstat_progress_update_param(PROGRESS_VACUUM_DEAD_TUPLE_BYTES, > TidStoreMemoryUsage(dead_items)); > } You could use pgstat_progress_update_multi_param here. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "The important things in the world are problems with society that we don't understand at all. The machines will become more complicated but they won't be more complicated than the societies that run them." (Freeman Dyson)
> On 4 Jun 2024, at 00:26, Masahiko Sawada <sawada.mshk@gmail.com> wrote: Thank you! Vacuum enhancement is a really good step forward, and this small change would help a lot of observability tools. > On 4 Jun 2024, at 00:49, Peter Geoghegan <pg@bowt.ie> wrote: > > Can we rename this to num_dead_item_ids (or something similar) in > passing? I do not insist, but many tools will have to adapt to this change [0,1]. However, most of tools will have to deal with removedmax_dead_tuples anyway [2], so this is not that big problem. Best regards, Andrey Borodin. [0] https://github.com/jalexandre0/pgmetrics/blob/61cb150f6d1e535513a6a07fc0c6d751fbed517e/collector/collect.go#L1289 [1] https://github.com/DataDog/integrations-core/blob/250553f3b91d25de28add93afeb929e2abf67e53/postgres/datadog_checks/postgres/util.py#L517 [2] https://github.com/search?q=num_dead_tuples+language%3AGo&type=code
On Wed, Jun 5, 2024 at 7:19 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote: > > > > > On 4 Jun 2024, at 00:26, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Thank you! Vacuum enhancement is a really good step forward, and this small change would help a lot of observability tools. > > > > On 4 Jun 2024, at 00:49, Peter Geoghegan <pg@bowt.ie> wrote: > > > > Can we rename this to num_dead_item_ids (or something similar) in > > passing? > > I do not insist, but many tools will have to adapt to this change [0,1]. However, most of tools will have to deal withremoved max_dead_tuples anyway [2], so this is not that big problem. True, this incompatibility would not be a big problem. num_dead_item_ids seems good to me. I've updated the patch that incorporated the comment from Álvaro[1]. Regards, [1] https://www.postgresql.org/message-id/202406041535.pmyty3ci4pfd%40alvherre.pgsql -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Jun 7, 2024 at 10:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Jun 5, 2024 at 7:19 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote: > > > > > > > > > On 4 Jun 2024, at 00:26, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > Thank you! Vacuum enhancement is a really good step forward, and this small change would help a lot of observabilitytools. > > > > > > > On 4 Jun 2024, at 00:49, Peter Geoghegan <pg@bowt.ie> wrote: > > > > > > Can we rename this to num_dead_item_ids (or something similar) in > > > passing? > > > > I do not insist, but many tools will have to adapt to this change [0,1]. However, most of tools will have to deal withremoved max_dead_tuples anyway [2], so this is not that big problem. > > True, this incompatibility would not be a big problem. > > num_dead_item_ids seems good to me. I've updated the patch that > incorporated the comment from Álvaro[1]. I'm going to push the v2 patch in a few days if there is no further comment. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Thu, Jun 6, 2024 at 9:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > num_dead_item_ids seems good to me. I've updated the patch that > incorporated the comment from Álvaro[1]. Great, thank you. -- Peter Geoghegan
On Tue, Jun 11, 2024 at 10:38 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Jun 7, 2024 at 10:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Wed, Jun 5, 2024 at 7:19 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote: > > > > > > > > > > > > > On 4 Jun 2024, at 00:26, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > Thank you! Vacuum enhancement is a really good step forward, and this small change would help a lot of observabilitytools. > > > > > > > > > > On 4 Jun 2024, at 00:49, Peter Geoghegan <pg@bowt.ie> wrote: > > > > > > > > Can we rename this to num_dead_item_ids (or something similar) in > > > > passing? > > > > > > I do not insist, but many tools will have to adapt to this change [0,1]. However, most of tools will have to deal withremoved max_dead_tuples anyway [2], so this is not that big problem. > > > > True, this incompatibility would not be a big problem. > > > > num_dead_item_ids seems good to me. I've updated the patch that > > incorporated the comment from Álvaro[1]. > > I'm going to push the v2 patch in a few days if there is no further comment. > I was about to push the patch but let me confirm just in case: is it okay to bump the catversion even after post-beta1? This patch reintroduces a previously-used column to pg_stat_progress_vacuum so it requires bumping the catversion. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Masahiko Sawada <sawada.mshk@gmail.com> writes: > I was about to push the patch but let me confirm just in case: is it > okay to bump the catversion even after post-beta1? Yes, that happens somewhat routinely. regards, tom lane
On Thu, Jun 13, 2024 at 08:38:05PM -0400, Tom Lane wrote: > Masahiko Sawada <sawada.mshk@gmail.com> writes: >> I was about to push the patch but let me confirm just in case: is it >> okay to bump the catversion even after post-beta1? > > Yes, that happens somewhat routinely. Up to RC, even after beta2. This happens routinely every year because tweaks are always required for what got committed. And that's OK to do so now. -- Michael
Attachment
On Fri, Jun 14, 2024 at 9:41 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Jun 13, 2024 at 08:38:05PM -0400, Tom Lane wrote: > > Masahiko Sawada <sawada.mshk@gmail.com> writes: > >> I was about to push the patch but let me confirm just in case: is it > >> okay to bump the catversion even after post-beta1? > > > > Yes, that happens somewhat routinely. > > Up to RC, even after beta2. This happens routinely every year because > tweaks are always required for what got committed. And that's OK to > do so now. Thank you both for confirmation. I'll push it shortly. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Fri, Jun 14, 2024 at 9:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Jun 14, 2024 at 9:41 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Thu, Jun 13, 2024 at 08:38:05PM -0400, Tom Lane wrote: > > > Masahiko Sawada <sawada.mshk@gmail.com> writes: > > >> I was about to push the patch but let me confirm just in case: is it > > >> okay to bump the catversion even after post-beta1? > > > > > > Yes, that happens somewhat routinely. > > > > Up to RC, even after beta2. This happens routinely every year because > > tweaks are always required for what got committed. And that's OK to > > do so now. > > Thank you both for confirmation. I'll push it shortly. > Pushed. Thank you for giving feedback and reviewing the patch! Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Thu, Jun 13, 2024 at 9:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Fri, Jun 14, 2024 at 9:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Jun 14, 2024 at 9:41 AM Michael Paquier <michael@paquier.xyz> wrote: > > > On Thu, Jun 13, 2024 at 08:38:05PM -0400, Tom Lane wrote: > > > > Masahiko Sawada <sawada.mshk@gmail.com> writes: > > > >> I was about to push the patch but let me confirm just in case: is it > > > >> okay to bump the catversion even after post-beta1? > > > > > > > > Yes, that happens somewhat routinely. > > > > > > Up to RC, even after beta2. This happens routinely every year because > > > tweaks are always required for what got committed. And that's OK to > > > do so now. > > > > Thank you both for confirmation. I'll push it shortly. > > > > Pushed. Thank you for giving feedback and reviewing the patch! > One minor side effect of this change is the original idea of comparing pg_stat_progress.num_dead_tuples to pg_stat_all_tables.n_dead_tup column becomes less obvious. I presume the release notes for pg_stat_progress_vacuum will be updated to also include this column name change as well, so maybe that's enough for folks to figure things out? At least I couldn't find anywhere in the docs where we have described the relationship between these columns before. Thoughts? Robert Treat https://xzilla.net
On Sat, Jun 15, 2024 at 8:47 PM Robert Treat <rob@xzilla.net> wrote: > > On Thu, Jun 13, 2024 at 9:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Jun 14, 2024 at 9:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > On Fri, Jun 14, 2024 at 9:41 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Thu, Jun 13, 2024 at 08:38:05PM -0400, Tom Lane wrote: > > > > > Masahiko Sawada <sawada.mshk@gmail.com> writes: > > > > >> I was about to push the patch but let me confirm just in case: is it > > > > >> okay to bump the catversion even after post-beta1? > > > > > > > > > > Yes, that happens somewhat routinely. > > > > > > > > Up to RC, even after beta2. This happens routinely every year because > > > > tweaks are always required for what got committed. And that's OK to > > > > do so now. > > > > > > Thank you both for confirmation. I'll push it shortly. > > > > > > > Pushed. Thank you for giving feedback and reviewing the patch! > > > > One minor side effect of this change is the original idea of comparing > pg_stat_progress.num_dead_tuples to pg_stat_all_tables.n_dead_tup > column becomes less obvious. I presume the release notes for > pg_stat_progress_vacuum will be updated to also include this column > name change as well, so maybe that's enough for folks to figure things > out? The release note has been updated, and I think it would help users understand the change. > At least I couldn't find anywhere in the docs where we have > described the relationship between these columns before. Thoughts? It would be a good idea to improve the documentation, but I think that we cannot simply compare these two numbers since the numbers that these fields count are slightly different. For instance, pg_stat_all_tables.n_dead_tup includes the number of dead tuples that are going to be HOT-pruned. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Tue, Jun 18, 2024 at 8:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > At least I couldn't find anywhere in the docs where we have > > described the relationship between these columns before. Thoughts? > > It would be a good idea to improve the documentation, but I think that > we cannot simply compare these two numbers since the numbers that > these fields count are slightly different. For instance, > pg_stat_all_tables.n_dead_tup includes the number of dead tuples that > are going to be HOT-pruned. This isn't a small difference. It's actually a huge one in many cases: https://www.postgresql.org/message-id/CAH2-WzkkGT2Gt4XauS5eQOQi4mVvL5X49hBTtWccC8DEqeNfKA@mail.gmail.com Practically speaking they're just two different things, with hardly any fixed relationship. -- Peter Geoghegan