Thread: Confused comment about drop replica identity index
Hi hackers,

While doing development based on PG, I found that the following comment in src/include/catalog/pg_class.h has a small problem.

/*
 * an explicitly chosen candidate key's columns are used as replica identity.
 * Note this will still be set if the index has been dropped; in that case it
 * has the same meaning as 'd'.
 */
#define REPLICA_IDENTITY_INDEX 'i'

The last sentence confuses me a little:
[......in that case it has the same meaning as 'd'.]

Currently, pg-doc does not describe this case clearly.

But if I drop a relation's replica identity index as the comment describes, the behavior is not the same as the default.

For example, execute the following SQL:
create table tbl (col1 int primary key, col2 int not null);
create unique INDEX ON tbl(col2);
alter table tbl replica identity using INDEX tbl_col2_idx;
drop index tbl_col2_idx;
create publication pub for table tbl;
delete from tbl;

Actual result:
ERROR: cannot delete from table "tbl" because it does not have a replica identity and publishes deletes
HINT: To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.

Result expected from the comment:
DELETE 0

I found that the function CheckCmdReplicaIdentity does not consider the case described in the comment: when the relation's replica identity index is found to be InvalidOid, an error is reported.

Is the comment not accurate enough? Or do we need to adjust the code to match the comment?

Regards,
Wang wei
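The check described in the report can be modelled in a few lines. This is only a simplified sketch written for this thread, not the PostgreSQL source: `RelModel`, `CmdKind` and `check_cmd_replica_identity` are hypothetical stand-ins, and the returned strings are shortened placeholders rather than the server's real messages. It assumes the check passes when the replica identity is FULL ('f') or a valid replica identity index has been resolved, and otherwise fails only if the table publishes the operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical command kinds, standing in for the executor's command types. */
typedef enum { CMD_INSERT, CMD_UPDATE, CMD_DELETE } CmdKind;

/* Hypothetical, simplified stand-in for the relation state involved. */
typedef struct
{
    char relreplident;  /* 'd', 'n', 'f' or 'i' */
    Oid  replidindex;   /* resolved replica identity index, or InvalidOid */
    bool pubupdate;     /* table is in a publication that publishes updates */
    bool pubdelete;     /* table is in a publication that publishes deletes */
} RelModel;

/* Returns NULL when the command is allowed, else a placeholder error text. */
const char *
check_cmd_replica_identity(const RelModel *rel, CmdKind cmd)
{
    assert(rel != NULL);

    /* Only UPDATE and DELETE need the old row to be identifiable. */
    if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
        return NULL;

    /* FULL, or a usable index, is enough. */
    if (rel->relreplident == 'f' || rel->replidindex != InvalidOid)
        return NULL;

    if (cmd == CMD_UPDATE && rel->pubupdate)
        return "no replica identity and publishes updates";
    if (cmd == CMD_DELETE && rel->pubdelete)
        return "no replica identity and publishes deletes";
    return NULL;
}
```

With `relreplident = 'i'`, `replidindex = InvalidOid` and `pubdelete = true`, the function returns an error text for `CMD_DELETE`, which matches the ERROR shown in the report rather than the `DELETE 0` the comment would suggest.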
On Tue, Dec 14, 2021 at 6:08 PM wangw.fnst@fujitsu.com <wangw.fnst@fujitsu.com> wrote:
>
> Hi hackers,
>
> When I doing development based by PG, I found the following comment have a
> little problem in file src/include/catalog/pg_class.h.
>
> /*
>  * an explicitly chosen candidate key's columns are used as replica identity.
>  * Note this will still be set if the index has been dropped; in that case it
>  * has the same meaning as 'd'.
>  */
> #define REPLICA_IDENTITY_INDEX 'i'
>
> The last sentence makes me a little confused :
> [......in that case it as the same meaning as 'd'.]
>
> Now, pg-doc didn't have a clear style to describe this.
>
>
> But if I drop relation's replica identity index like the comment, the action
> is not as same as default.
>
> For example:
> Execute the following SQL:
> create table tbl (col1 int primary key, col2 int not null);
> create unique INDEX ON tbl(col2);
> alter table tbl replica identity using INDEX tbl_col2_idx;
> drop index tbl_col2_idx;
> create publication pub for table tbl;
> delete from tbl;
>
> Actual result:
> ERROR: cannot delete from table "tbl" because it does not have a replica identity and publishes deletes
> HINT: To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.

I think I see where the confusion is. The table has a primary key, so when the replica identity index is dropped you expect, per the comment in the code, that the primary key will be used as the replica identity, since that is what 'd' (default) means.

>
> Expected result in comment:
> DELETE 0
>
>
> I found that in the function CheckCmdReplicaIdentity, the operation described
> in the comment is not considered,
> When relation's replica identity index is found to be InvalidOid, an error is
> reported.

This code in RelationGetIndexList() does not match that comment.

    if (replident == REPLICA_IDENTITY_DEFAULT && OidIsValid(pkeyIndex))
        relation->rd_replidindex = pkeyIndex;
    else if (replident == REPLICA_IDENTITY_INDEX && OidIsValid(candidateIndex))
        relation->rd_replidindex = candidateIndex;
    else
        relation->rd_replidindex = InvalidOid;

>
> Are the comment here not accurate enough?
> Or we need to adjust the code according to the comments?
>

The code comment is one thing, but I think the PG documentation does not cover the use case you tried either. What happens when a replica identity index is dropped is not covered in the ALTER TABLE
https://www.postgresql.org/docs/13/sql-altertable.html
or DROP INDEX
https://www.postgresql.org/docs/14/sql-dropindex.html
documentation.

--
Best Wishes,
Ashutosh Bapat
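To make the mismatch concrete, the branch quoted above can be lifted into a standalone function. This is a sketch for illustration only: `resolve_replident_index` and `check_reported_scenario` are hypothetical names, the `Oid` typedef and macros are local stand-ins, and the OID value is made up. The branch logic itself is copied from the snippet quoted in this message.

```c
#include <assert.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)
#define OidIsValid(oid) ((oid) != InvalidOid)

#define REPLICA_IDENTITY_DEFAULT 'd'
#define REPLICA_IDENTITY_NOTHING 'n'
#define REPLICA_IDENTITY_FULL    'f'
#define REPLICA_IDENTITY_INDEX   'i'

/* Same three-way branch as the quoted snippet, returning the chosen index. */
Oid
resolve_replident_index(char replident, Oid pkeyIndex, Oid candidateIndex)
{
    if (replident == REPLICA_IDENTITY_DEFAULT && OidIsValid(pkeyIndex))
        return pkeyIndex;
    else if (replident == REPLICA_IDENTITY_INDEX && OidIsValid(candidateIndex))
        return candidateIndex;
    return InvalidOid;
}

/* The scenario from the report: 'i' is still set, but the index is gone. */
void
check_reported_scenario(void)
{
    Oid pkey = 16390;           /* made-up OID of the primary key index */

    /* 'i' with a dropped index does NOT fall back to the primary key ... */
    assert(resolve_replident_index(REPLICA_IDENTITY_INDEX, pkey, InvalidOid)
           == InvalidOid);
    /* ... whereas 'd' would have picked it. */
    assert(resolve_replident_index(REPLICA_IDENTITY_DEFAULT, pkey, InvalidOid)
           == pkey);
}
```

Because the `'i'` branch never consults `pkeyIndex`, a table left in state `'i'` with no surviving index ends up with no replica identity index at all, even when a primary key exists.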
On Tue, Dec 14, 2021 at 07:10:49PM +0530, Ashutosh Bapat wrote:
> This code in RelationGetIndexList() is not according to that comment.
>
> if (replident == REPLICA_IDENTITY_DEFAULT && OidIsValid(pkeyIndex))
> relation->rd_replidindex = pkeyIndex;
> else if (replident == REPLICA_IDENTITY_INDEX && OidIsValid(candidateIndex))
> relation->rd_replidindex = candidateIndex;
> else
> relation->rd_replidindex = InvalidOid;

Yeah, the comment is wrong. If the index of a REPLICA_IDENTITY_INDEX is dropped, I recall that the behavior is the same as REPLICA_IDENTITY_NOTHING.

> Comment in code is one thing, but I think PG documentation is not
> covering the use case you tried. What happens when a replica identity
> index is dropped has not been covered either in ALTER TABLE
> https://www.postgresql.org/docs/13/sql-altertable.html or DROP INDEX
> https://www.postgresql.org/docs/14/sql-dropindex.html documentation.

Not sure about the DROP INDEX page, but I'd be fine with mentioning that in the ALTER TABLE page in the paragraph related to REPLICA IDENTITY. While on it, I would be tempted to switch this stuff to use a list of <variablelist> for all the option values. That would be much easier to read.

[ ... thinks a bit ... ]

FWIW, this brings back some memories, as of this thread:
https://www.postgresql.org/message-id/20200522035028.GO2355@paquier.xyz

See also commit fe7fd4e from August 2020, where some tests have been added. I recall seeing this incorrect comment from last year's thread and it may have been mentioned in one of the surrounding threads. Maybe I just let it go back then. I don't know.
--
Michael
On Tue, Dec 15, 2021 at 11:25AM, Michael Paquier wrote:
> Yeah, the comment is wrong. If the index of a REPLICA_IDENTITY_INDEX is
> dropped, I recall that the behavior is the same as REPLICA_IDENTITY_NOTHING.

Thank you for your response. I agree that the comment is wrong.

> Not sure about the DROP INDEX page, but I'd be fine with mentioning that in the
> ALTER TABLE page in the paragraph related to REPLICA IDENTITY. While on it, I
> would be tempted to switch this stuff to use a list of <variablelist> for all the option
> values. That would be much easier to read.

Yeah, if we can add some details to pg-doc and the code comments, I think it will be friendlier to PG users and developers.

Regards,
Wang wei
On Wed, Dec 15, 2021 at 09:18:26AM +0000, wangw.fnst@fujitsu.com wrote:
> Yeah, if we can add some details to pg-doc and code comments, I think it will
> be more friendly to PG users and developers.

Would you like to write a patch to address all that?

Thanks,
--
Michael
On Tue, Dec 16, 2021 at 06:40AM, Michael Paquier wrote:
> Would you like to write a patch to address all that?

OK, I will push it soon.

Regards,
Wang wei
On 2021-Dec-15, Michael Paquier wrote:
> On Tue, Dec 14, 2021 at 07:10:49PM +0530, Ashutosh Bapat wrote:
> > This code in RelationGetIndexList() is not according to that comment.
> >
> > if (replident == REPLICA_IDENTITY_DEFAULT && OidIsValid(pkeyIndex))
> > relation->rd_replidindex = pkeyIndex;
> > else if (replident == REPLICA_IDENTITY_INDEX && OidIsValid(candidateIndex))
> > relation->rd_replidindex = candidateIndex;
> > else
> > relation->rd_replidindex = InvalidOid;
>
> Yeah, the comment is wrong. If the index of a REPLICA_IDENTITY_INDEX
> is dropped, I recall that the behavior is the same as
> REPLICA_IDENTITY_NOTHING.

Hmm, so if a table has REPLICA IDENTITY INDEX and there is a publication with an explicit column list, then we need to forbid the DROP INDEX for that index.

I wonder why don't we just forbid DROP INDEX of an index that's been defined as replica identity. It seems quite silly an operation to allow.

--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
"Ed is the standard text editor."
http://groups.google.com/group/alt.religion.emacs/msg/8d94ddab6a9b0ad3
On Thu, Dec 16, 2021 at 03:08:46PM -0300, Alvaro Herrera wrote:
> Hmm, so if a table has REPLICA IDENTITY INDEX and there is a publication
> with an explicit column list, then we need to forbid the DROP INDEX for
> that index.

Hmm. I have not followed this thread very closely.

> I wonder why don't we just forbid DROP INDEX of an index that's been
> defined as replica identity. It seems quite silly an operation to
> allow.

The commit logs talk about b23b0f55 here for this code, to ease the handling of relcache entries for rd_replidindex. 07cacba is the origin of the logic (see RelationGetIndexList). Andres?

I don't think that this is really an argument against putting more restrictions as anything that deals with an index drop, including the internal ones related to constraints, would need to go through index_drop(), and new features may want more restrictions in place as you say.

Now, I don't see a strong argument in changing this behavior either (aka I have not looked at what this implies for the new publication types), and we still need to do something for the comment/docs in existing branches, anyway. So I would still fix this gap as a first step, then deal with the rest on HEAD as necessary.
--
Michael
On Thu, Dec 16, 2021, at 8:55 PM, Michael Paquier wrote:
> On Thu, Dec 16, 2021 at 03:08:46PM -0300, Alvaro Herrera wrote:
> > Hmm, so if a table has REPLICA IDENTITY INDEX and there is a publication
> > with an explicit column list, then we need to forbid the DROP INDEX for
> > that index.
>
> Hmm. I have not followed this thread very closely.
>
> > I wonder why don't we just forbid DROP INDEX of an index that's been
> > defined as replica identity. It seems quite silly an operation to
> > allow.
It would avoid pilot errors.
> The commit logs talk about b23b0f55 here for this code, to ease the
> handling of relcache entries for rd_replidindex. 07cacba is the
> origin of the logic (see RelationGetIndexList). Andres?
>
> I don't think that this is really an argument against putting more
> restrictions as anything that deals with an index drop, including the
> internal ones related to constraints, would need to go through
> index_drop(), and new features may want more restrictions in place as
> you say.
>
> Now, I don't see a strong argument in changing this behavior either
> (aka I have not looked at what this implies for the new publication
> types), and we still need to do something for the comment/docs in
> existing branches, anyway. So I would still fix this gap as a first
> step, then deal with the rest on HEAD as necessary.
I've never understood the weak dependency between the REPLICA IDENTITY and the
index used by it. I'm afraid we will receive complaints about this unexpected
behavior (my logical replication setup is broken because I dropped an index) as
new logical replication features are added. Row filtering imposes some
restrictions on UPDATEs and DELETEs (an error message is returned and the
replication stops) if a column used in the expression isn't part of the REPLICA
IDENTITY anymore.
It seems we already have some code in RangeVarCallbackForDropRelation() that
deals with a system index error condition. We could save a syscall and provide
a test for indisreplident there.
If this restriction is undesirable, we should at least document this choice and
probably emit a WARNING for DROP INDEX.
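The two options in this message (refuse the drop, or allow it with a WARNING) can be sketched as one decision function. This is a toy model, not a patch: `IndexModel`, `DropVerdict` and `check_drop_index` are hypothetical names, and the `indisreplident` field only mirrors the idea of the catalog flag mentioned above. Where such a test would really live in the DROP INDEX code path is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Possible outcomes of the hypothetical pre-drop check. */
typedef enum
{
    DROP_OK,                /* index is not a replica identity: just drop it */
    DROP_WARN_REPLIDENT,    /* drop is allowed, but the user is warned */
    DROP_REJECT_REPLIDENT   /* drop is refused */
} DropVerdict;

/* Hypothetical, simplified view of an index. */
typedef struct
{
    const char *name;
    bool        indisreplident;  /* index is marked as replica identity */
} IndexModel;

/*
 * Decide what DROP INDEX should do.  "forbid" selects between the two
 * policies discussed in the thread: error out, or only warn.
 */
DropVerdict
check_drop_index(const IndexModel *idx, bool forbid)
{
    assert(idx != NULL);

    if (!idx->indisreplident)
        return DROP_OK;
    return forbid ? DROP_REJECT_REPLIDENT : DROP_WARN_REPLIDENT;
}
```

For the index from the original report, `check_drop_index(&idx, true)` yields `DROP_REJECT_REPLIDENT`, while the same call with `false` yields `DROP_WARN_REPLIDENT`; an ordinary index is unaffected under either policy.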
On Tue, Dec 16, 2021 at 10:27AM, Michael Paquier wrote:
> On Tue, Dec 16, 2021 at 06:40AM, Michael Paquier wrote:
> > Would you like to write a patch to address all that?
>
> OK, I will push it soon.

Here is a patch to correct the wrong comment about REPLICA_IDENTITY_INDEX and improve the pg-doc.

Regards,
Wang wei
On Mon, Dec 20, 2021 at 03:46:13AM +0000, wangw.fnst@fujitsu.com wrote:
> Here is a patch to correct wrong comment about
> REPLICA_IDENTITY_INDEX, And improve the pg-doc.

That's mostly fine. I have made some adjustments as per the attached.

+ The default for non-system tables. Records the old values of the columns
+ of the primary key, if any. The default for non-system tables.
The same sentence is repeated twice.

+ Records no information about the old row.(This is the default for system tables.)
For consistency with the rest, this could drop the parenthesis for the second sentence.

+ <term><literal>USING INDEX index_name</literal></term>
This should use <replaceable> as markup for index_name.

Pondering more about this thread, I don't think we should change the existing behavior in the back-branches, but I don't have any arguments about doing such changes on HEAD to help the features being worked on, either. So I'd like to apply and back-patch the attached, as a first step, to fix the inconsistency.
--
Michael
On Mon, Dec 20, 2021, at 8:11 AM, Michael Paquier wrote:
> On Mon, Dec 20, 2021 at 03:46:13AM +0000, wangw.fnst@fujitsu.com wrote:
> > Here is a patch to correct wrong comment about
> > REPLICA_IDENTITY_INDEX, And improve the pg-doc.
>
> That's mostly fine. I have made some adjustments as per the
> attached.
Your patch looks good to me.
> Pondering more about this thread, I don't think we should change the
> existing behavior in the back-branches, but I don't have any arguments
> about doing such changes on HEAD to help the features being worked
> on, either. So I'd like to apply and back-patch the attached, as a
> first step, to fix the inconsistency.
What do you think about the attached patch? It forbids the DROP INDEX. We might
add a detail message but I didn't in this patch.
On Tue, Dec 20, 2021 at 19:11PM, Michael Paquier wrote:
> That's mostly fine. I have made some adjustments as per the attached.

Thanks for reviewing.

> + The default for non-system tables. Records the old values of the columns
> + of the primary key, if any. The default for non-system tables.
> The same sentence is repeated twice.
>
> + Records no information about the old row.(This is the
> default for system tables.)
> For consistency with the rest, this could drop the parenthesis for the second
> sentence.
>
> + <term><literal>USING INDEX index_name</literal></term>
> This should use <replaceable> as markup for index_name.

The change looks good to me.

Regards,
Wang wei
On Tues, Dec 21, 2021 8:47 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Mon, Dec 20, 2021 at 11:57:32AM -0300, Euler Taveira wrote:
> > What do you think about the attached patch? It forbids the DROP INDEX.
> > We might add a detail message but I didn't in this patch.
>
> Yeah. I'd agree about doing something like that on HEAD, and that would help
> with some of the logirep-related patch currently being worked on, as far as I
> understood.

Hi,

I think forbidding DROP INDEX might not completely solve this problem, because a user could still use another command to delete the index, for example ALTER TABLE DROP COLUMN. After the column is dropped, the index on it is also dropped.

Besides, a user can also ALTER REPLICA IDENTITY USING INDEX "primary key", and in this case, when they ALTER TABLE DROP CONSTR "PRIMARY KEY", the replica identity index will also be dropped.

Best regards,
Hou zj
On Thu, Dec 30, 2021 at 06:45:30AM +0000, houzj.fnst@fujitsu.com wrote:
> I think forbids DROP INDEX might not completely solve this problem. Because
> user could still use other command to delete the index, for example: ALTER
> TABLE DROP COLUMN. After dropping the column, the index on it will also be
> dropped.
>
> Besides, user can also ALTER REPLICA IDENTITY USING INDEX "primary key", and in
> this case, when they ALTER TABLE DROP CONSTR "PRIMARY KEY", the replica
> identity index will also be dropped.

Indexes related to any other object type, like constraints, are dropped as part of index_drop() as per the handling of dependencies. So, by putting a restriction there, any commands would take this code path, and fail when trying to drop an index used as a replica identity. Why would that be logically a problem? We may want errors with more context for such cases, though, as complaining about an object not directly known by the user when triggering a different command, like a constraint index, could be confusing.
--
Michael
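The argument in this reply, that one restriction in the common drop routine covers every command, can be illustrated with a small model. Everything here is hypothetical and written only for this thread: `model_index_drop` stands in for the shared routine, `run_drop` for the different commands that reach it, and the error wording is invented to show the "more context" idea (naming the object the user actually targeted as well as the index).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified view of an index. */
typedef struct
{
    const char *name;
    bool        indisreplident;  /* index is marked as replica identity */
    bool        dropped;
} IndexModel;

/* Single choke point: refuses to drop a replica identity index. */
bool
model_index_drop(IndexModel *idx)
{
    assert(idx != NULL);

    if (idx->indisreplident)
        return false;
    idx->dropped = true;
    return true;
}

/* The commands from the thread that can end up dropping an index. */
typedef enum { VIA_DROP_INDEX, VIA_DROP_COLUMN, VIA_DROP_CONSTRAINT } DropPath;

/*
 * Every path funnels into model_index_drop().  On failure, the message
 * names the object the user targeted, not just the index.
 */
bool
run_drop(DropPath path, const char *target, IndexModel *idx,
         char *errbuf, size_t errlen)
{
    static const char *const what[] = {"index", "column", "constraint"};

    if (model_index_drop(idx))
        return true;
    snprintf(errbuf, errlen,
             "cannot drop %s \"%s\": index \"%s\" is used as replica identity",
             what[path], target, idx->name);
    return false;
}
```

In this model, a `VIA_DROP_COLUMN` request on `col2` fails in the same place as a direct `VIA_DROP_INDEX`, and the message mentions both `col2` and the index, which is the kind of extra context the reply suggests for indirect drops.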