This is a new thread about possible refactoring, following commit a6642b3ae
("Add support for partitioned tables and indexes in REINDEX").
Currently, ReindexPartitions() calls ReindexMultipleInternal(), which loops
around ReindexRelationConcurrently(), similar to what's done by
ReindexMultipleTables().
Contrast that with ReindexTable(), which calls ReindexRelationConcurrently()
with the table's OID; that call then handles all of the table's indexes at once:
|postgres=# REINDEX TABLE CONCURRENTLY onek;
|DEBUG: 00000: building index "onek_unique1_ccnew" on table "onek" serially
|LOCATION: index_build, index.c:2853
|DEBUG: 00000: index "onek_unique1_ccnew" can safely use deduplication
|LOCATION: _bt_allequalimage, nbtutils.c:2742
|DEBUG: 00000: building index "onek_unique2_ccnew" on table "onek" serially
|LOCATION: index_build, index.c:2853
|DEBUG: 00000: index "onek_unique2_ccnew" can safely use deduplication
|LOCATION: _bt_allequalimage, nbtutils.c:2742
|DEBUG: 00000: building index "onek_hundred_ccnew" on table "onek" serially
|LOCATION: index_build, index.c:2853
|DEBUG: 00000: index "onek_hundred_ccnew" can safely use deduplication
|LOCATION: _bt_allequalimage, nbtutils.c:2742
|DEBUG: 00000: building index "onek_stringu1_ccnew" on table "onek" serially
|LOCATION: index_build, index.c:2853
|DEBUG: 00000: index "onek_stringu1_ccnew" can safely use deduplication
|LOCATION: _bt_allequalimage, nbtutils.c:2742
|DEBUG: 00000: validate_index found 1000 heap tuples, 1000 index tuples; inserted 0 missing tuples
|LOCATION: validate_index, index.c:3281
|DEBUG: 00000: validate_index found 1000 heap tuples, 1000 index tuples; inserted 0 missing tuples
|LOCATION: validate_index, index.c:3281
|DEBUG: 00000: validate_index found 1000 heap tuples, 1000 index tuples; inserted 0 missing tuples
|LOCATION: validate_index, index.c:3281
|DEBUG: 00000: validate_index found 1000 heap tuples, 1000 index tuples; inserted 0 missing tuples
|LOCATION: validate_index, index.c:3281
|REINDEX
Should we refactor ReindexRelationConcurrently() into two functions? One would
build the list of indexes on a relation, and the other would concurrently
reindex a given list of indexes. ReindexPartitions() would then make a list of
the leaf indexes of a partitioned index (or of the leaf indexes of every
partitioned index of a partitioned table) and reindex those indexes all at
once. For CIC, we could call the same worker from DefineIndex().
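Roughly what I have in mind is the following (a sketch only, handling just the
plain-table case and skipping all the permission and validity checks; the
worker's name matches the proposal below, but its signature is only
illustrative):

    /* Sketch only: illustrative, not the attached patch. */

    /*
     * Worker: concurrently rebuild an explicit list of index OIDs, running
     * each phase (build, validate, swap, drop) across the whole list
     * before waiting on lockers.
     */
    static bool ReindexIndexesConcurrently(List *indexIds, int options);

    /*
     * The existing entry point becomes a thin wrapper: collect the indexes
     * of the given table, then hand them to the worker.
     */
    static bool
    ReindexRelationConcurrently(Oid relationOid, int options)
    {
        Relation    rel;
        List       *indexIds;

        rel = table_open(relationOid, ShareUpdateExclusiveLock);
        indexIds = RelationGetIndexList(rel);
        table_close(rel, NoLock);

        return ReindexIndexesConcurrently(indexIds, options);
    }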
The reason is that a concurrent reindex must wait for long-running
transactions, and if we call it in a loop, then we wait for long-running
transactions N times. I can imagine scenarios where it's easy for a DBA to
schedule a maintenance window, run REINDEX CONCURRENTLY, and restart
long-running processes to allow the reindex to proceed. But it might be
infeasible to restart those processes every 5 minutes for 3 hours to allow the
reindex to proceed on each partition in turn.
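Schematically (this contrasts only the control flow, not the real code, and the
function name is just for illustration), the current loop pays the full set of
waits once per partition:

    static void
    reindex_partitions_today(List *partitionOids, int options)
    {
        ListCell   *lc;

        foreach(lc, partitionOids)
        {
            /*
             * Each call runs every concurrent phase for a single partition,
             * including the waits for old snapshots and lock holders, so a
             * long-running transaction is waited on once per partition.
             */
            ReindexRelationConcurrently(lfirst_oid(lc), options);
        }
    }

With the refactoring, ReindexPartitions() would instead collect all leaf index
OIDs up front and make a single ReindexIndexesConcurrently(allLeafIndexIds,
options) call, so each wait happens once per phase rather than once per
partition.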
ReindexMultipleTables() deliberately avoids processing everything at once in
order to avoid deadlocks, which makes great sense for REINDEX
SCHEMA/DATABASE/SYSTEM. But I wonder whether that reasoning really applies to
partitioned tables.
We currently have this:
ReindexIndex()
=> ReindexPartitions()
=> ReindexRelationConcurrently()
=> reindex_index()
ReindexTable()
=> ReindexPartitions()
=> ReindexRelationConcurrently()
=> reindex_relation()
ReindexPartitions()
=> ReindexMultipleInternal()
=> ReindexRelationConcurrently()
=> reindex_relation()
=> reindex_index()
And I'm proposing to consider this:
ReindexIndex()
=> ReindexPartitions()
=> ReindexIndexesConcurrently()
=> reindex_index()
ReindexTable()
=> ReindexPartitions()
=> ReindexRelationConcurrently() - this would be just a wrapper to collect the list of index OIDs
=> reindex_relation()
ReindexPartitions() - this exists mainly to make a list of all leaf indexes on all children of a partitioned table
=> ReindexIndexesConcurrently() - this processes all the indexes at once
=> ReindexMultipleInternal() - this loops around everything
=> ReindexRelationConcurrently()
=> reindex_index()
=> reindex_relation()
ReindexRelationConcurrently()
=> ReindexIndexesConcurrently() - this is the worker function factored out of ReindexRelationConcurrently()
I'm not sure if I'm missing any opportunities to simplify...
So then a partitioned table is processed similarly to REINDEX TABLE (rather
than REINDEX DATABASE):
|postgres=# REINDEX TABLE CONCURRENTLY hash_parted;
|DEBUG: building index "hpart0_a_idx_ccnew" on table "hpart0" serially
|DEBUG: index "hpart0_a_idx_ccnew" can safely use deduplication
|DEBUG: building index "hpart1_a_idx_ccnew" on table "hpart1" serially
|DEBUG: index "hpart1_a_idx_ccnew" can safely use deduplication
|DEBUG: building index "hpart2_a_idx_ccnew" on table "hpart2" serially
|DEBUG: index "hpart2_a_idx_ccnew" can safely use deduplication
|DEBUG: building index "hpart3_a_idx_ccnew" on table "hpart3" serially
|DEBUG: index "hpart3_a_idx_ccnew" can safely use deduplication
|DEBUG: validate_index found 2489 heap tuples, 2489 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2527 heap tuples, 2527 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2530 heap tuples, 2530 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2453 heap tuples, 2453 index tuples; inserted 0 missing tuples
|REINDEX
And, if there are multiple indexes:
|postgres=# REINDEX TABLE CONCURRENTLY hash_parted;
|DEBUG: building index "hpart0_a_idx_ccnew" on table "hpart0" serially
|DEBUG: index "hpart0_a_idx_ccnew" can safely use deduplication
|DEBUG: building index "hpart0_a_idx1_ccnew" on table "hpart0" serially
|DEBUG: index "hpart0_a_idx1_ccnew" can safely use deduplication
|DEBUG: building index "hpart1_a_idx_ccnew" on table "hpart1" serially
|DEBUG: index "hpart1_a_idx_ccnew" can safely use deduplication
|DEBUG: building index "hpart1_a_idx1_ccnew" on table "hpart1" serially
|DEBUG: index "hpart1_a_idx1_ccnew" can safely use deduplication
|DEBUG: building index "hpart2_a_idx_ccnew" on table "hpart2" serially
|DEBUG: index "hpart2_a_idx_ccnew" can safely use deduplication
|DEBUG: building index "hpart2_a_idx1_ccnew" on table "hpart2" serially
|DEBUG: index "hpart2_a_idx1_ccnew" can safely use deduplication
|DEBUG: building index "hpart3_a_idx_ccnew" on table "hpart3" serially
|DEBUG: index "hpart3_a_idx_ccnew" can safely use deduplication
|DEBUG: building index "hpart3_a_idx1_ccnew" on table "hpart3" serially
|DEBUG: index "hpart3_a_idx1_ccnew" can safely use deduplication
|DEBUG: validate_index found 2489 heap tuples, 2489 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2489 heap tuples, 2489 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2527 heap tuples, 2527 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2527 heap tuples, 2527 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2530 heap tuples, 2530 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2530 heap tuples, 2530 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2453 heap tuples, 2453 index tuples; inserted 0 missing tuples
|DEBUG: validate_index found 2453 heap tuples, 2453 index tuples; inserted 0 missing tuples
|REINDEX
I think the usual scenario is to have 100-1000 partitions and 1-10 indexes per
partition. It seems to me that, at a minimum, all the leaf indexes of a given
partitioned index should be processed simultaneously.
Also, it occurred to me that CIC for regular tables could be simplified to do
the same thing that I'm doing for CIC on a partitioned table: create an INVALID
catalog entry, and then reindex it concurrently. This seems to work, with the
only difference I see being that REINDEX leaves behind ccnew indexes.
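In other words, the concurrent path of DefineIndex() would boil down to
something like this (sketch only, glossing over the transaction boundaries and
session locks that CIC needs; the flags shown are the existing index_create()
flags, but the call is heavily elided):

    /* Create the catalog entry only: invalid and not yet built. */
    indexRelationId = index_create(rel, indexRelationName, ...,
                                   flags | INDEX_CREATE_SKIP_BUILD |
                                   INDEX_CREATE_INVALID,
                                   ...);

    /*
     * Once the invalid entry is visible to other sessions, let the
     * concurrent-reindex machinery build and validate it, which also
     * marks it valid at the end.
     */
    ReindexIndexesConcurrently(list_make1_oid(indexRelationId), options);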
Attached is what I'm thinking of.