回复:回复:回复:回复:how to create index concurrently on partitioned table - Mailing list pgsql-hackers
From | 李杰(慎追) |
---|---|
Subject | 回复:回复:回复:回复:how to create index concurrently on partitioned table |
Date | |
Msg-id | 9267dd21-c816-4ba2-bf91-d0a9c0b799fe.adger.lj@alibaba-inc.com Whole thread Raw |
In response to | Re: 回复:回复:回复:how to create index concurrently on partitioned table (Michael Paquier <michael@paquier.xyz>) |
Responses |
回复:回复:回复:回复:how to create index concurrently on partitioned table
|
List | pgsql-hackers |
> Not sure I am following. In the case of REINDEX, it seems to me that
> the calls to validate_index() and index_concurrently_build() can
> happen in a separate transaction for each index, as long as all the
> calls to index_concurrently_swap() are grouped together in the same
> transaction to make sure that index partition trees are switched
> consistently when all entries are swapped from an invalid state to a
> valid state, because the swapping phase is also when we attach a fresh
> index to a partition tree. See also index_concurrently_create_copy()
> where we don't set parentIndexRelid for the lower call to
> index_create(). It would be good of course to check that when
> swapping we have the code to handle that for a lot of indexes at
> once.
> the calls to validate_index() and index_concurrently_build() can
> happen in a separate transaction for each index, as long as all the
> calls to index_concurrently_swap() are grouped together in the same
> transaction to make sure that index partition trees are switched
> consistently when all entries are swapped from an invalid state to a
> valid state, because the swapping phase is also when we attach a fresh
> index to a partition tree. See also index_concurrently_create_copy()
> where we don't set parentIndexRelid for the lower call to
> index_create(). It would be good of course to check that when
> swapping we have the code to handle that for a lot of indexes at
> once.
Let's look at this example:
A partition table has five partitions,
parttable: part1, part2, part3, part3 ,part5
We simply abstract the following definitions:
phase 1: index_create(), it is only registered in catalogs.
phase 2: index_concurrently_build(), Build the indexes.
phase 3: validate_index(), insert any missing index entries, mark the index as valid.
(schema 1)
```
StartTransaction one
parttable phase 1
part 1 phase 1
part 2 phase 1
part 3 phase 1
part 4 phase 1
part 5 phase 1
CommitTransaction
StartTransaction two
parttable phase 2
part 1 phase 2 part 2 phase 2
part 3 phase 2 (error occurred )
part 4 phase 2
part 5 phase 2
CommitTransaction
StartTransaction three
parttable phase 3
part 1 phase 3
part 2 phase 3
part 3 phase 3
part 4 phase 4
part 5 phase 5
CommitTransaction...
```
Now, the following steps cannot continue due to an error in Transaction two .
so, Transaction two roll back, Transaction three haven't started.
All of our indexes are invalid. In this way,
we ensure the strong consistency of indexes in the partition tree.
However, we need to rebuild all indexes when reindex.
(schema 2)
```
StartTransaction one
parttable phase 1
part 1 phase 1
part 2 phase 1
part 3 phase 1
part 4 phase 1
part 5 phase 1
CommitTransaction
StartTransaction two
part 1 phase 2 part 1 phase 3
CommitTransaction
StartTransaction three
part 2 phase 3
CommitTransaction
StartTransaction four
part 3 phase 3
CommitTransaction
StartTransaction five
part 4 phase 3
StartTransaction six
part 5 phase 3
CommitTransaction
StartTransaction seven
parttable phase 2 parttable phase 3
CommitTransaction
```
Now, the following steps cannot continue due to an error in Transaction four .
so, Transaction four roll back, Transactions behind Transaction 3 have not started
The indexes of the p1 and p2 partitions are available. Other indexes are invalid.
In reindex, we can ignore the rebuild of p1 and p2.
This seems better, although it seems to be inconsistent.
Do you think that scheme is more suitable for CIC?
Thank you very much,
Regards, Adger
------------------------------------------------------------------发件人:Michael Paquier <michael@paquier.xyz>发送时间:2020年6月18日(星期四) 10:41收件人:李杰(慎追) <adger.lj@alibaba-inc.com>抄 送:Justin Pryzby <pryzby@telsasoft.com>; pgsql-hackers <pgsql-hackers@lists.postgresql.org>; 曾文旌(义从) <wenjing.zwj@alibaba-inc.com>; Alvaro Herrera <alvherre@2ndquadrant.com>主 题:Re: 回复:回复:回复:how to create index concurrently on partitioned tableOn Wed, Jun 17, 2020 at 10:22:28PM +0800, 李杰(慎追) wrote:
> However, I found a problem. If there are many partitions,
> we may need to handle too many missing index entries when
> validate_index(). Especially for the first partition, the time may
> have been long and many entries are missing. In this case, why
> don't we put the second and third phase together into a transaction
> for each partition?
Not sure I am following. In the case of REINDEX, it seems to me that
the calls to validate_index() and index_concurrently_build() can
happen in a separate transaction for each index, as long as all the
calls to index_concurrently_swap() are grouped together in the same
transaction to make sure that index partition trees are switched
consistently when all entries are swapped from an invalid state to a
valid state, because the swapping phase is also when we attach a fresh
index to a partition tree. See also index_concurrently_create_copy()
where we don't set parentIndexRelid for the lower call to
index_create(). It would be good of course to check that when
swapping we have the code to handle that for a lot of indexes at
once.
--
Michael
pgsql-hackers by date: