Re: tablecmds: fix bug where index rebuild loses replica identity on partitions - Mailing list pgsql-hackers
| From | Xuneng Zhou |
|---|---|
| Subject | Re: tablecmds: fix bug where index rebuild loses replica identity on partitions |
| Date | |
| Msg-id | CABPTF7X-J4VCZbZfsCujhh4OE3tR73xws43ur2R1PFSMd21tpg@mail.gmail.com Whole thread Raw |
| In response to | Re: tablecmds: fix bug where index rebuild loses replica identity on partitions (Chao Li <li.evan.chao@gmail.com>) |
| List | pgsql-hackers |
Hi,
On Mon, Mar 23, 2026 at 3:57 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
> > On Mar 21, 2026, at 18:29, Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> > On Thu, Mar 19, 2026 at 1:07 PM Chao Li <li.evan.chao@gmail.com> wrote:
> >>
> >>
> >>
> >>> On Feb 26, 2026, at 14:59, Chao Li <li.evan.chao@gmail.com> wrote:
> >>>
> >>>
> >>>
> >>>> On Jan 28, 2026, at 10:49, Chao Li <li.evan.chao@gmail.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Jan 27, 2026, at 16:30, Chao Li <li.evan.chao@gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Jan 27, 2026, at 15:59, Chao Li <li.evan.chao@gmail.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On Jan 27, 2026, at 15:39, Michael Paquier <michael@paquier.xyz> wrote:
> >>>>>>>
> >>>>>>> On Tue, Jan 27, 2026 at 01:13:32PM +0800, Chao Li wrote:
> >>>>>>>> I found this bug while working on a related patch [1].
> >>>>>>>>
> >>>>>>>> When ALTER TABLE ... ALTER COLUMN TYPE causes an index rebuild, and
> >>>>>>>> that index is used as REPLICA IDENTITY on a partitioned table, the
> >>>>>>>> replica identity marking on partitions can be silently lost after the
> >>>>>>>> rebuild.
> >>>>>>>
> >>>>>>> I am slightly confused by the tests included in the proposed patch.
> >>>>>>> On HEAD, if I undo the proposed changes of tablecmds.c, the tests
> >>>>>>> pass. If I run the tests of the patch with the changes of
> >>>>>>> tablecmds.c, the tests also pass.
> >>>>>>
> >>>>>> Oops, that isn’t supposed to be so. I’ll check the test.
> >>>>>>
> >>>>>
> >>>>> Okay, I see the problem is here:
> >>>>> ```
> >>>>> +CREATE UNIQUE INDEX test_replica_identity_partitioned_pkey ON test_replica_identity_partitioned (id);
> >>>>> ```
> >>>>>
> >>>>> I missed to add column “val” into the index, so that alter type of val didn’t cause index rebuild.
> >>>>>
> >>>>> Ideally, it’s better to also verify that index OIDs should have changed before and after alter column type, but I haven’t figured out how to do so. Do you have an idea?
> >>>>
> >>>> I just updated the test to store index OIDs before and after rebuild into 2 temp tables, so that we can compare the OIDs to verify rebuild happens and replica identity preserved.
> >>>>
> >>>> I tried to port the test to master branch, and the test failed. From the test diff file, we can see replica identity lost on 3 leaf partitions:
> >>>> ```
> >>>> @@ -360,9 +360,9 @@
> >>>> ORDER BY b.index_name;
> >>>> index_name | rebuilt | ri_lost
> >>>> ---------------------------------------------------+---------+---------
> >>>> - test_replica_identity_partitioned_p1_id_val_idx | t | f
> >>>> - test_replica_identity_partitioned_p2_1_id_val_idx | t | f
> >>>> - test_replica_identity_partitioned_p2_2_id_val_idx | t | f
> >>>> + test_replica_identity_partitioned_p1_id_val_idx | t | t
> >>>> + test_replica_identity_partitioned_p2_1_id_val_idx | t | t
> >>>> + test_replica_identity_partitioned_p2_2_id_val_idx | t | t
> >>>> test_replica_identity_partitioned_p2_id_val_idx | t | f
> >>>> test_replica_identity_partitioned_pkey | t | f
> >>>> (5 rows)
> >>>> ```
> >>>>
> >>>> With this patch, the test passes and all replica identity are preserved.
> >>>>
> >>>> PFA v3:
> >>>> * Enhanced the test.
> >>>> * A small change in find_partition_replica_identity_indexes(): if we will not update a partition, then unlock it.
> >>>>
> >>>> Best regards,
> >>>> --
> >>>> Chao Li (Evan)
> >>>> HighGo Software Co., Ltd.
> >>>> https://www.highgo.com/
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> <v3-0001-tablecmds-fix-bug-where-index-rebuild-loses-repli.patch>
> >>>
> >>> The CF asked for a rebase, thus rebased as v4.
> >>>
> >
> >
> > Hi, I reproduced this with the test case, and the patch appears
> > to resolve it.
> >
> > Some comments on v5:
>
> Thanks a lot for your review.
>
> >
> > -- Whether it makes sense to use a single list of pair structs instead
> > of two parallel OID lists (replicaIdentityIndexOids +
> > replicaIdentityTableOids) to avoid accidental desync.
>
> I don’t think that helps much. The current code of rebuilding index uses two lists changedIndexOids and changedIndexDefs. So, this patch matches the pattern of the existing code.
>
> >
> > -- It would be better to make lock handling in
> > find_partition_replica_identity_indexes() consistent
> > (relation_open(..., NoLock) if child is already locked, and avoid
> > mixed relation_close(..., lockmode)/NoLock behavior).
>
> That’s because if we are going to update a partition, then we need to hold the lock on the partition.
There is one locking cleanup in find_partition_replica_identity_indexes().
find_inheritance_children(relId, lockmode) already acquires lockmode on
every partition it returns, so I think the later relation_open() should use
NoLock, not lockmode. For the same reason, all relation_close() calls in
this function should use NoLock as well.
Today the code does:
partRel = relation_open(partRelOid, lockmode);
...
relation_close(partRel, lockmode);
That does not cause a correctness issue, because the lock manager
reference-counts same-transaction acquisitions, so the lock remains held
either way. But it is misleading: it suggests that relation_open() is where
the partition lock is taken, and that the early relation_close(..., lockmode)
is intentionally releasing it. Neither is actually true here, because the lock
was already acquired by find_inheritance_children().
So I think this should be adjusted to:
partRel = relation_open(partRelOid, NoLock);
and all close sites in this function should be:
relation_close(partRel, NoLock);
The comment on the early-close path should also be updated, since it is not
really unlocking the partition. Something like "No matching partition index;
just close the relcache entry" would match the actual behavior better.
> >
> > -- Some typos in comments/tests (partion/parition).
> >
>
> Fixed.
>
> PFA v6: fixed a typo in comment.
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
>
--
Best,
Xuneng
On Mon, Mar 23, 2026 at 3:57 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
> > On Mar 21, 2026, at 18:29, Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> > On Thu, Mar 19, 2026 at 1:07 PM Chao Li <li.evan.chao@gmail.com> wrote:
> >>
> >>
> >>
> >>> On Feb 26, 2026, at 14:59, Chao Li <li.evan.chao@gmail.com> wrote:
> >>>
> >>>
> >>>
> >>>> On Jan 28, 2026, at 10:49, Chao Li <li.evan.chao@gmail.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Jan 27, 2026, at 16:30, Chao Li <li.evan.chao@gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Jan 27, 2026, at 15:59, Chao Li <li.evan.chao@gmail.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On Jan 27, 2026, at 15:39, Michael Paquier <michael@paquier.xyz> wrote:
> >>>>>>>
> >>>>>>> On Tue, Jan 27, 2026 at 01:13:32PM +0800, Chao Li wrote:
> >>>>>>>> I found this bug while working on a related patch [1].
> >>>>>>>>
> >>>>>>>> When ALTER TABLE ... ALTER COLUMN TYPE causes an index rebuild, and
> >>>>>>>> that index is used as REPLICA IDENTITY on a partitioned table, the
> >>>>>>>> replica identity marking on partitions can be silently lost after the
> >>>>>>>> rebuild.
> >>>>>>>
> >>>>>>> I am slightly confused by the tests included in the proposed patch.
> >>>>>>> On HEAD, if I undo the proposed changes of tablecmds.c, the tests
> >>>>>>> pass. If I run the tests of the patch with the changes of
> >>>>>>> tablecmds.c, the tests also pass.
> >>>>>>
> >>>>>> Oops, that isn’t supposed to be so. I’ll check the test.
> >>>>>>
> >>>>>
> >>>>> Okay, I see the problem is here:
> >>>>> ```
> >>>>> +CREATE UNIQUE INDEX test_replica_identity_partitioned_pkey ON test_replica_identity_partitioned (id);
> >>>>> ```
> >>>>>
> >>>>> I missed to add column “val” into the index, so that alter type of val didn’t cause index rebuild.
> >>>>>
> >>>>> Ideally, it’s better to also verify that index OIDs should have changed before and after alter column type, but I haven’t figured out how to do so. Do you have an idea?
> >>>>
> >>>> I just updated the test to store index OIDs before and after rebuild into 2 temp tables, so that we can compare the OIDs to verify rebuild happens and replica identity preserved.
> >>>>
> >>>> I tried to port the test to master branch, and the test failed. From the test diff file, we can see replica identity lost on 3 leaf partitions:
> >>>> ```
> >>>> @@ -360,9 +360,9 @@
> >>>> ORDER BY b.index_name;
> >>>> index_name | rebuilt | ri_lost
> >>>> ---------------------------------------------------+---------+---------
> >>>> - test_replica_identity_partitioned_p1_id_val_idx | t | f
> >>>> - test_replica_identity_partitioned_p2_1_id_val_idx | t | f
> >>>> - test_replica_identity_partitioned_p2_2_id_val_idx | t | f
> >>>> + test_replica_identity_partitioned_p1_id_val_idx | t | t
> >>>> + test_replica_identity_partitioned_p2_1_id_val_idx | t | t
> >>>> + test_replica_identity_partitioned_p2_2_id_val_idx | t | t
> >>>> test_replica_identity_partitioned_p2_id_val_idx | t | f
> >>>> test_replica_identity_partitioned_pkey | t | f
> >>>> (5 rows)
> >>>> ```
> >>>>
> >>>> With this patch, the test passes and all replica identity are preserved.
> >>>>
> >>>> PFA v3:
> >>>> * Enhanced the test.
> >>>> * A small change in find_partition_replica_identity_indexes(): if we will not update a partition, then unlock it.
> >>>>
> >>>> Best regards,
> >>>> --
> >>>> Chao Li (Evan)
> >>>> HighGo Software Co., Ltd.
> >>>> https://www.highgo.com/
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> <v3-0001-tablecmds-fix-bug-where-index-rebuild-loses-repli.patch>
> >>>
> >>> The CF asked for a rebase, thus rebased as v4.
> >>>
> >
> >
> > Hi, I reproduced this with the test case, and the patch appears
> > to resolve it.
> >
> > Some comments on v5:
>
> Thanks a lot for your review.
>
> >
> > -- Whether it makes sense to use a single list of pair structs instead
> > of two parallel OID lists (replicaIdentityIndexOids +
> > replicaIdentityTableOids) to avoid accidental desync.
>
> I don’t think that helps much. The current code of rebuilding index uses two lists changedIndexOids and changedIndexDefs. So, this patch matches the pattern of the existing code.
>
> >
> > -- It would be better to make lock handling in
> > find_partition_replica_identity_indexes() consistent
> > (relation_open(..., NoLock) if child is already locked, and avoid
> > mixed relation_close(..., lockmode)/NoLock behavior).
>
> That’s because if we are going to update a partition, then we need to hold the lock on the partition.
There is one locking cleanup in find_partition_replica_identity_indexes().
find_inheritance_children(relId, lockmode) already acquires lockmode on
every partition it returns, so I think the later relation_open() should use
NoLock, not lockmode. For the same reason, all relation_close() calls in
this function should use NoLock as well.
Today the code does:
partRel = relation_open(partRelOid, lockmode);
...
relation_close(partRel, lockmode);
That does not cause a correctness issue, because the lock manager
reference-counts same-transaction acquisitions, so the lock remains held
either way. But it is misleading: it suggests that relation_open() is where
the partition lock is taken, and that the early relation_close(..., lockmode)
is intentionally releasing it. Neither is actually true here, because the lock
was already acquired by find_inheritance_children().
So I think this should be adjusted to:
partRel = relation_open(partRelOid, NoLock);
and all close sites in this function should be:
relation_close(partRel, NoLock);
The comment on the early-close path should also be updated, since it is not
really unlocking the partition. Something like "No matching partition index;
just close the relcache entry" would match the actual behavior better.
> >
> > -- Some typos in comments/tests (partion/parition).
> >
>
> Fixed.
>
> PFA v6: fixed a typo in comment.
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
>
--
Best,
Xuneng
pgsql-hackers by date: