Thread: REINDEX not updating partition progress

REINDEX not updating partition progress

From
Ilya Gladyshev
Date:

Hi,

While working on CIC for partitioned tables [1], I noticed that REINDEX for partitioned tables is not tracking keeping progress of partitioned tables, so I'm creating a separate thread for this fix as suggested. 

The way REINDEX for partitioned tables works now ReindexMultipleInternal treats every partition as an independent table without keeping track of partition_done/total counts, because this is just generic code for processing multiple tables. The patch addresses that by passing down the knowledge a flag to distinguish partitions from independent tables in ReindexMultipleInternal and its callees. I also noticed that the partitioned CREATE INDEX progress tracking could also benefit from progress_index_partition_done function that zeroizes params in addition to incrementing the counter, so I applied it there as well.

[1] https://www.postgresql.org/message-id/55cfae76-2ffa-43ed-a7e7-901bffbebee4%40gmail.com

Attachment

Re: REINDEX not updating partition progress

From
Michael Paquier
Date:
On Fri, Jul 12, 2024 at 11:07:49PM +0100, Ilya Gladyshev wrote:
> While working on CIC for partitioned tables [1], I noticed that REINDEX for
> partitioned tables is not tracking keeping progress of partitioned tables,
> so I'm creating a separate thread for this fix as suggested.

This limitation is not a bug, because we already document that
partitions_total and partitions_done are 0 during a REINDEX.  Compared
to CREATE INDEX, a REINDEX is a bit wilder as it would work on all the
indexes for all the partitions, providing this information makes
sense.

Agreed that this could be better, but what's now on HEAD is not wrong
either.

> The way REINDEX for partitioned tables works now ReindexMultipleInternal
> treats every partition as an independent table without keeping track of
> partition_done/total counts, because this is just generic code for
> processing multiple tables. The patch addresses that by passing down the
> knowledge a flag to distinguish partitions from independent tables
> in ReindexMultipleInternal and its callees. I also noticed that the
> partitioned CREATE INDEX progress tracking could also benefit from
> progress_index_partition_done function that zeroizes params in addition to
> incrementing the counter, so I applied it there as well.

Still it does not do it because we know that the fields are going to
be overwritten pretty quickly anyway, no?  For now, we could just do
without progress_index_partition_done(), I guess, keeping it to the
incrementation of PROGRESS_CREATEIDX_PARTITIONS_DONE.  There's an
argument that this makes the code slightly easier to follow, with less
wrappers around the progress reporting.

+    int            progress_params[3] = {
+        PROGRESS_CREATEIDX_COMMAND,
+        PROGRESS_CREATEIDX_PHASE,
+        PROGRESS_CREATEIDX_PARTITIONS_TOTAL
+    };
+    int64        progress_values[3];
+    Oid            heapId = relid;

Rather than having new code blocks, let's use a style consistent with
DefineIndex() where we have the pgstat_progress_update_multi_param(),
with a single {} block.

Adding the counter increment at the end of the loop in
ReindexMultipleInternal() is a good idea.  It considers both the
concurrent and non-concurrent cases.

+   progress_values[2] = list_length(partitions);
+   pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX, heapId);

Hmm.  Is setting the relid only once for pg_stat_progress_create_index
the best choice there is?  Could it be better to report the partition
OID instead?  Let's remember that indexes may have been attached with
names inconsistent with the partitioned table's index.  It is a bit
confusing to stick to the relid all the partitioned table all the
time, for all the indexes of all the partitions reindexed.  Could it
be better to actually introduce an entirely new field to the progress
table?  What I would look for is more information:
1) the partitioned table OID on which the REINDEX runs
2) the partition table being processed
3) the index OID being processed (old and new for concurrent case).

The patch sets 1) to the OID of the partitioned table, lacks 2) and
sets 3) each time an index is rebuilt.
--
Michael

Attachment

Re: REINDEX not updating partition progress

From
Ilya Gladyshev
Date:


19 июля 2024 г., в 04:17, Michael Paquier <michael@paquier.xyz> написал(а):

On Fri, Jul 12, 2024 at 11:07:49PM +0100, Ilya Gladyshev wrote:
While working on CIC for partitioned tables [1], I noticed that REINDEX for
partitioned tables is not tracking keeping progress of partitioned tables,
so I'm creating a separate thread for this fix as suggested.

This limitation is not a bug, because we already document that
partitions_total and partitions_done are 0 during a REINDEX.  Compared
to CREATE INDEX, a REINDEX is a bit wilder as it would work on all the
indexes for all the partitions, providing this information makes
sense.

Agreed that this could be better, but what's now on HEAD is not wrong
either.

Thanks for pointing it out, I didn’t notice it. You’re right, this not a bug, but an improvement. Will remove the bits from the documentation as well.


Still it does not do it because we know that the fields are going to
be overwritten pretty quickly anyway, no?  For now, we could just do
without progress_index_partition_done(), I guess, keeping it to the
incrementation of PROGRESS_CREATEIDX_PARTITIONS_DONE.  There's an
argument that this makes the code slightly easier to follow, with less
wrappers around the progress reporting.

The use-case that I thought this would improve is REINDEX CONCURRENT, where data from later stages of the previous partitions would linger for quite a while before it gets to the same stage of the current partition. I don’t think this is of big importance, so I’m ok with making code simpler and leaving it out.


+ int progress_params[3] = {
+ PROGRESS_CREATEIDX_COMMAND,
+ PROGRESS_CREATEIDX_PHASE,
+ PROGRESS_CREATEIDX_PARTITIONS_TOTAL
+ };
+ int64 progress_values[3];
+ Oid heapId = relid;

Rather than having new code blocks, let's use a style consistent with
DefineIndex() where we have the pgstat_progress_update_multi_param(),
with a single {} block.

Fixed.

Adding the counter increment at the end of the loop in
ReindexMultipleInternal() is a good idea.  It considers both the
concurrent and non-concurrent cases.

Another reason to do so is that reindex_relation calls itself recursively, so it'd require some additional mechanism so that it wouldn't increment multiple times per partition.

+   progress_values[2] = list_length(partitions);
+   pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX, heapId);

Hmm.  Is setting the relid only once for pg_stat_progress_create_index
the best choice there is?  Could it be better to report the partition
OID instead?

Technically, we could do this, but it looks like no other commands do it and currently there’s no API to do it without erasing the rest of progress information.

 Let's remember that indexes may have been attached with
names inconsistent with the partitioned table's index.  It is a bit
confusing to stick to the relid all the partitioned table all the
time, for all the indexes of all the partitions reindexed.

I agree that it’s a bit confusing especially because documentation gives no hints about it. Judging by documentation, I would expect relid and index_relid to correspond to the same table. However, if we say that this field represents the oid of the relation on which the command was invoked, then I think it does make sense. Edited documentation to say that in the new patch.

 Could it
be better to actually introduce an entirely new field to the progress
table?  What I would look for is more information:
1) the partitioned table OID on which the REINDEX runs
2) the partition table being processed
3) the index OID being processed (old and new for concurrent case).

The patch sets 1) to the OID of the partitioned table, lacks 2) and
sets 3) each time an index is rebuilt.

Michael

I like this approach more, but I’m not sure whether adding another field for partition oid is worth it, since we already have index_relid that we can use to join with pg_index to get that. On the other hand, index_relid is missing during regular CREATE INDEX, so this new field could be useful to indicate which table is being indexed in this case. I'm on the fence about this, attached this as a separate patch, if you think it's a good idea.

Thank you for the review!

Attachment

Re: REINDEX not updating partition progress

From
Ilya Gladyshev
Date:
Forgot to update partition_relid in reindex_index in the second patch. Fixed in attachment.



21 июля 2024 г., в 01:49, Ilya Gladyshev <ilya.v.gladyshev@gmail.com> написал(а):



19 июля 2024 г., в 04:17, Michael Paquier <michael@paquier.xyz> написал(а):

On Fri, Jul 12, 2024 at 11:07:49PM +0100, Ilya Gladyshev wrote:
While working on CIC for partitioned tables [1], I noticed that REINDEX for
partitioned tables is not tracking keeping progress of partitioned tables,
so I'm creating a separate thread for this fix as suggested.

This limitation is not a bug, because we already document that
partitions_total and partitions_done are 0 during a REINDEX.  Compared
to CREATE INDEX, a REINDEX is a bit wilder as it would work on all the
indexes for all the partitions, providing this information makes
sense.

Agreed that this could be better, but what's now on HEAD is not wrong
either.

Thanks for pointing it out, I didn’t notice it. You’re right, this not a bug, but an improvement. Will remove the bits from the documentation as well.


Still it does not do it because we know that the fields are going to
be overwritten pretty quickly anyway, no?  For now, we could just do
without progress_index_partition_done(), I guess, keeping it to the
incrementation of PROGRESS_CREATEIDX_PARTITIONS_DONE.  There's an
argument that this makes the code slightly easier to follow, with less
wrappers around the progress reporting.

The use-case that I thought this would improve is REINDEX CONCURRENT, where data from later stages of the previous partitions would linger for quite a while before it gets to the same stage of the current partition. I don’t think this is of big importance, so I’m ok with making code simpler and leaving it out.


+ int progress_params[3] = {
+ PROGRESS_CREATEIDX_COMMAND,
+ PROGRESS_CREATEIDX_PHASE,
+ PROGRESS_CREATEIDX_PARTITIONS_TOTAL
+ };
+ int64 progress_values[3];
+ Oid heapId = relid;

Rather than having new code blocks, let's use a style consistent with
DefineIndex() where we have the pgstat_progress_update_multi_param(),
with a single {} block.

Fixed.

Adding the counter increment at the end of the loop in
ReindexMultipleInternal() is a good idea.  It considers both the
concurrent and non-concurrent cases.

Another reason to do so is that reindex_relation calls itself recursively, so it'd require some additional mechanism so that it wouldn't increment multiple times per partition.

+   progress_values[2] = list_length(partitions);
+   pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX, heapId);

Hmm.  Is setting the relid only once for pg_stat_progress_create_index
the best choice there is?  Could it be better to report the partition
OID instead? 

Technically, we could do this, but it looks like no other commands do it and currently there’s no API to do it without erasing the rest of progress information.

 Let's remember that indexes may have been attached with
names inconsistent with the partitioned table's index.  It is a bit
confusing to stick to the relid all the partitioned table all the
time, for all the indexes of all the partitions reindexed.

I agree that it’s a bit confusing especially because documentation gives no hints about it. Judging by documentation, I would expect relid and index_relid to correspond to the same table. However, if we say that this field represents the oid of the relation on which the command was invoked, then I think it does make sense. Edited documentation to say that in the new patch.

 Could it
be better to actually introduce an entirely new field to the progress
table?  What I would look for is more information:
1) the partitioned table OID on which the REINDEX runs
2) the partition table being processed
3) the index OID being processed (old and new for concurrent case).

The patch sets 1) to the OID of the partitioned table, lacks 2) and
sets 3) each time an index is rebuilt.

Michael

I like this approach more, but I’m not sure whether adding another field for partition oid is worth it, since we already have index_relid that we can use to join with pg_index to get that. On the other hand, index_relid is missing during regular CREATE INDEX, so this new field could be useful to indicate which table is being indexed in this case. I'm on the fence about this, attached this as a separate patch, if you think it's a good idea.

Thank you for the review!

<v2-0002-partition_relid-column-for-create-index-progress.patch><v2-0001-make-REINDEX-track-partition-progress.patch>

Attachment

Re: REINDEX not updating partition progress

From
Michael Paquier
Date:
On Sun, Jul 21, 2024 at 11:41:43AM +0100, Ilya Gladyshev wrote:
> Forgot to update partition_relid in reindex_index in the second patch. Fixed in attachment.

        <structfield>relid</structfield> <type>oid</type>
       </para>
       <para>
-       OID of the table on which the index is being created.
+       OID of the table on which the command was run.
       </para></entry>

Hmm.  I am not sure if we really need to change the definition of this
field, because it can have the same meaning when using a REINDEX on a
partitioned table, pointing to the parent table (the partition) of the
index currently rebuilt.

Hence, rather than a partition_relid, could a partitioned_relid
reflect better the situation, set only when issuing a REINDEX on a
partitioned relation?

+   if (relkind == RELKIND_PARTITIONED_INDEX)
+   {
+       heapId = IndexGetRelation(relid, true);
+   }
Shouldn't we report the partitioned index OID rather than its parent
table when running a REINDEX on a partitioned index?
--
Michael

Attachment

Re: REINDEX not updating partition progress

From
Ilya Gladyshev
Date:


25 июля 2024 г., в 09:55, Michael Paquier <michael@paquier.xyz> написал(а):

On Sun, Jul 21, 2024 at 11:41:43AM +0100, Ilya Gladyshev wrote:
Forgot to update partition_relid in reindex_index in the second patch. Fixed in attachment.

       <structfield>relid</structfield> <type>oid</type>
      </para>
      <para>
-       OID of the table on which the index is being created.
+       OID of the table on which the command was run.
      </para></entry>

Hmm.  I am not sure if we really need to change the definition of this
field, because it can have the same meaning when using a REINDEX on a
partitioned table, pointing to the parent table (the partition) of the
index currently rebuilt.

Hence, rather than a partition_relid, could a partitioned_relid
reflect better the situation, set only when issuing a REINDEX on a
partitioned relation?

I'm not quite happy with the documentation update, but I think the approach for partitioned tables in this patch makes sense. I checked what other commands, that deal with partitions, (CREATE INDEX and ANALYZE) do, and they put a root partitioned table in "relid". ANALYZE has a separate column for the id of partition named current_child_table_relid, so I think it makes sense to have REINDEX do the same.

In addition, the current API for progress tracking doesn't have a way of updating "relid" without wiping out all other fields (that's what pgstat_progress_start_command does). This can definitely be changed, but that's another thing that made me not think in this direction.

+   if (relkind == RELKIND_PARTITIONED_INDEX)
+   {
+       heapId = IndexGetRelation(relid, true);
+   }
Shouldn't we report the partitioned index OID rather than its parent
table when running a REINDEX on a partitioned index?

Michael

It’s used to update the "relid" field of the progress report. It’s the one that’s described in docs currently as "OID of the table on which the index is being created.", so I think it’s correct.