Re: REINDEX not updating partition progress - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: REINDEX not updating partition progress |
Date | |
Msg-id | Zpnaz_gHMyf0xsso@paquier.xyz Whole thread Raw |
In response to | REINDEX not updating partition progress (Ilya Gladyshev <ilya.v.gladyshev@gmail.com>) |
Responses |
Re: REINDEX not updating partition progress
|
List | pgsql-hackers |
On Fri, Jul 12, 2024 at 11:07:49PM +0100, Ilya Gladyshev wrote: > While working on CIC for partitioned tables [1], I noticed that REINDEX for > partitioned tables is not tracking keeping progress of partitioned tables, > so I'm creating a separate thread for this fix as suggested. This limitation is not a bug, because we already document that partitions_total and partitions_done are 0 during a REINDEX. Compared to CREATE INDEX, a REINDEX is a bit wilder as it would work on all the indexes for all the partitions, providing this information makes sense. Agreed that this could be better, but what's now on HEAD is not wrong either. > The way REINDEX for partitioned tables works now ReindexMultipleInternal > treats every partition as an independent table without keeping track of > partition_done/total counts, because this is just generic code for > processing multiple tables. The patch addresses that by passing down the > knowledge a flag to distinguish partitions from independent tables > in ReindexMultipleInternal and its callees. I also noticed that the > partitioned CREATE INDEX progress tracking could also benefit from > progress_index_partition_done function that zeroizes params in addition to > incrementing the counter, so I applied it there as well. Still it does not do it because we know that the fields are going to be overwritten pretty quickly anyway, no? For now, we could just do without progress_index_partition_done(), I guess, keeping it to the incrementation of PROGRESS_CREATEIDX_PARTITIONS_DONE. There's an argument that this makes the code slightly easier to follow, with less wrappers around the progress reporting. + int progress_params[3] = { + PROGRESS_CREATEIDX_COMMAND, + PROGRESS_CREATEIDX_PHASE, + PROGRESS_CREATEIDX_PARTITIONS_TOTAL + }; + int64 progress_values[3]; + Oid heapId = relid; Rather than having new code blocks, let's use a style consistent with DefineIndex() where we have the pgstat_progress_update_multi_param(), with a single {} block. Adding the counter increment at the end of the loop in ReindexMultipleInternal() is a good idea. It considers both the concurrent and non-concurrent cases. + progress_values[2] = list_length(partitions); + pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX, heapId); Hmm. Is setting the relid only once for pg_stat_progress_create_index the best choice there is? Could it be better to report the partition OID instead? Let's remember that indexes may have been attached with names inconsistent with the partitioned table's index. It is a bit confusing to stick to the relid all the partitioned table all the time, for all the indexes of all the partitions reindexed. Could it be better to actually introduce an entirely new field to the progress table? What I would look for is more information: 1) the partitioned table OID on which the REINDEX runs 2) the partition table being processed 3) the index OID being processed (old and new for concurrent case). The patch sets 1) to the OID of the partitioned table, lacks 2) and sets 3) each time an index is rebuilt. -- Michael
Attachment
pgsql-hackers by date: