Thread: Question about behavior of deletes with REPLICA IDENTITY NOTHING
Hello, We recently noticed some behavior that seems reasonable but also surprised our engineers based on the docs. If we have this setup: create table items(i int); insert into items(i) values (1); create publication test_pub for all tables; Then when we: delete from items where i = 1; we get: ERROR: cannot delete from table "items" because it does not have a replica identity and publishes deletes HINT: To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE. Fair enough. But if we do this: alter table items replica identity nothing; because the docs [1] say that NOTHING means "Records no information about the old row." We still get the same error when we try the DELETE again. The publication docs [2] say "A published table must have a replica identity configured in order to be able to replicate UPDATE and DELETE operations, so that appropriate rows to update or delete can be identified on the subscriber side." We interpreted the intersection of these two docs to imply that if you explicitly configured NOTHING that the publication would simply not log anything about the original row. Part of the confusion I think was fed by reading "must have a replica identity set" as "have selected one of the options via ALTER TABLE REPLICA IDENTITY" -- i.e., as meaning that a setting has been configured rather than being about a subset of those possible configuration values/a specific key existing on the table. I'm wondering if this might be a surprise to anyone else, and if so, is there a minor docs tweak that might avoid the confusion? Thanks, James Coleman 1: https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY 2: https://www.postgresql.org/docs/current/logical-replication-publication.html
On Wed, 2024-02-07 at 15:12 -0500, James Coleman wrote: > We recently noticed some behavior that seems reasonable but also > surprised our engineers based on the docs. > > If we have this setup: > create table items(i int); > insert into items(i) values (1); > create publication test_pub for all tables; > > Then when we: > delete from items where i = 1; > > we get: > ERROR: cannot delete from table "items" because it does not have a > replica identity and publishes deletes > HINT: To enable deleting from the table, set REPLICA IDENTITY using > ALTER TABLE. > > Fair enough. But if we do this: > alter table items replica identity nothing; > > because the docs [1] say that NOTHING means "Records no information > about the old row." We still get the same error when we try the DELETE > again. Well, "REPLICA IDENTITY NOTHING" is the same as "has no replica identity". So is "REPLICA IDENTITY DEFAULT" if there is no primary key, or "REPLICA IDENTITY USING INDEX ..." if the index is dropped. See "pg_class": the column "relreplident" is not nullable. Yours, Laurenz Albe
On Wed, Feb 7, 2024 at 3:22 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Wed, 2024-02-07 at 15:12 -0500, James Coleman wrote: > > We recently noticed some behavior that seems reasonable but also > > surprised our engineers based on the docs. > > > > If we have this setup: > > create table items(i int); > > insert into items(i) values (1); > > create publication test_pub for all tables; > > > > Then when we: > > delete from items where i = 1; > > > > we get: > > ERROR: cannot delete from table "items" because it does not have a > > replica identity and publishes deletes > > HINT: To enable deleting from the table, set REPLICA IDENTITY using > > ALTER TABLE. > > > > Fair enough. But if we do this: > > alter table items replica identity nothing; > > > > because the docs [1] say that NOTHING means "Records no information > > about the old row." We still get the same error when we try the DELETE > > again. > > Well, "REPLICA IDENTITY NOTHING" is the same as "has no replica identity". > So is "REPLICA IDENTITY DEFAULT" if there is no primary key, or > "REPLICA IDENTITY USING INDEX ..." if the index is dropped. > > See "pg_class": the column "relreplident" is not nullable. Right, I think the confusing point for us is that the docs for NOTHING ("Records no information about the old row") imply you can decide you don't have to record anything if you don't want to do so, but the publication feature is effectively overriding that and asserting that you can't make that choice. Regards, James Coleman
On Thu, Feb 8, 2024 at 9:04 AM James Coleman <jtc331@gmail.com> wrote: > > On Wed, Feb 7, 2024 at 3:22 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > > > On Wed, 2024-02-07 at 15:12 -0500, James Coleman wrote: > > > We recently noticed some behavior that seems reasonable but also > > > surprised our engineers based on the docs. > > > > > > If we have this setup: > > > create table items(i int); > > > insert into items(i) values (1); > > > create publication test_pub for all tables; > > > > > > Then when we: > > > delete from items where i = 1; > > > > > > we get: > > > ERROR: cannot delete from table "items" because it does not have a > > > replica identity and publishes deletes > > > HINT: To enable deleting from the table, set REPLICA IDENTITY using > > > ALTER TABLE. > > > > > > Fair enough. But if we do this: > > > alter table items replica identity nothing; > > > > > > because the docs [1] say that NOTHING means "Records no information > > > about the old row." We still get the same error when we try the DELETE > > > again. > > > > Well, "REPLICA IDENTITY NOTHING" is the same as "has no replica identity". > > So is "REPLICA IDENTITY DEFAULT" if there is no primary key, or > > "REPLICA IDENTITY USING INDEX ..." if the index is dropped. > > > > See "pg_class": the column "relreplident" is not nullable. > > Right, I think the confusing point for us is that the docs for NOTHING > ("Records no information about the old row") imply you can decide you > don't have to record anything if you don't want to do so, but the > publication feature is effectively overriding that and asserting that > you can't make that choice. > Hi, I can see how the current docs could be interpreted in a way that was not intended. ~~~ To emphasise the DEFAULT behaviour that Laurenze described, I felt there could be another sentence about DEFAULT, the same as there is already for the USING INDEX case. BEFORE [1] Records the old values of the columns of the primary key, if any. This is the default for non-system tables. SUGGESTION Records the old values of the columns of the primary key, if any. This is the default for non-system tables. If there is no primary key, the behavior is the same as NOTHING. ~~~ If that is done, then would a publication docs tweak like the one below clarify things sufficiently? BEFORE [2] If a table without a replica identity is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will cause an error on the publisher. SUGGESTION If a table without a replica identity (or with replica identity behavior equivalent to NOTHING) is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will cause an error on the publisher. ====== [1] https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY [2] https://www.postgresql.org/docs/current/logical-replication-publication.html Kind Regards, Peter Smith. Fujitsu Australia
On Wed, Feb 7, 2024 at 6:04 PM Peter Smith <smithpb2250@gmail.com> wrote: > > On Thu, Feb 8, 2024 at 9:04 AM James Coleman <jtc331@gmail.com> wrote: > > > > On Wed, Feb 7, 2024 at 3:22 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > > > > > On Wed, 2024-02-07 at 15:12 -0500, James Coleman wrote: > > > > We recently noticed some behavior that seems reasonable but also > > > > surprised our engineers based on the docs. > > > > > > > > If we have this setup: > > > > create table items(i int); > > > > insert into items(i) values (1); > > > > create publication test_pub for all tables; > > > > > > > > Then when we: > > > > delete from items where i = 1; > > > > > > > > we get: > > > > ERROR: cannot delete from table "items" because it does not have a > > > > replica identity and publishes deletes > > > > HINT: To enable deleting from the table, set REPLICA IDENTITY using > > > > ALTER TABLE. > > > > > > > > Fair enough. But if we do this: > > > > alter table items replica identity nothing; > > > > > > > > because the docs [1] say that NOTHING means "Records no information > > > > about the old row." We still get the same error when we try the DELETE > > > > again. > > > > > > Well, "REPLICA IDENTITY NOTHING" is the same as "has no replica identity". > > > So is "REPLICA IDENTITY DEFAULT" if there is no primary key, or > > > "REPLICA IDENTITY USING INDEX ..." if the index is dropped. > > > > > > See "pg_class": the column "relreplident" is not nullable. > > > > Right, I think the confusing point for us is that the docs for NOTHING > > ("Records no information about the old row") imply you can decide you > > don't have to record anything if you don't want to do so, but the > > publication feature is effectively overriding that and asserting that > > you can't make that choice. > > > > Hi, I can see how the current docs could be interpreted in a way that > was not intended. > > ~~~ > > To emphasise the DEFAULT behaviour that Laurenze described, I felt > there could be another sentence about DEFAULT, the same as there is > already for the USING INDEX case. > > BEFORE [1] > Records the old values of the columns of the primary key, if any. This > is the default for non-system tables. > > SUGGESTION > Records the old values of the columns of the primary key, if any. This > is the default for non-system tables. If there is no primary key, the > behavior is the same as NOTHING. > > ~~~ > > If that is done, then would a publication docs tweak like the one > below clarify things sufficiently? > > BEFORE [2] > If a table without a replica identity is added to a publication that > replicates UPDATE or DELETE operations then subsequent UPDATE or > DELETE operations will cause an error on the publisher. > > SUGGESTION > If a table without a replica identity (or with replica identity > behavior equivalent to NOTHING) is added to a publication that > replicates UPDATE or DELETE operations then subsequent UPDATE or > DELETE operations will cause an error on the publisher. > > ====== > [1] https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY > [2] https://www.postgresql.org/docs/current/logical-replication-publication.html > > Kind Regards, > Peter Smith. > Fujitsu Australia Thanks for looking at this! Yes, both of those changes together would make this unambiguous (and, I think, easier to mentally parse). Thanks, James Coleman
On Thu, Feb 8, 2024 at 11:12 AM James Coleman <jtc331@gmail.com> wrote: > > On Wed, Feb 7, 2024 at 6:04 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Thu, Feb 8, 2024 at 9:04 AM James Coleman <jtc331@gmail.com> wrote: > > > > > > On Wed, Feb 7, 2024 at 3:22 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > > > > > > > On Wed, 2024-02-07 at 15:12 -0500, James Coleman wrote: > > > > > We recently noticed some behavior that seems reasonable but also > > > > > surprised our engineers based on the docs. > > > > > > > > > > If we have this setup: > > > > > create table items(i int); > > > > > insert into items(i) values (1); > > > > > create publication test_pub for all tables; > > > > > > > > > > Then when we: > > > > > delete from items where i = 1; > > > > > > > > > > we get: > > > > > ERROR: cannot delete from table "items" because it does not have a > > > > > replica identity and publishes deletes > > > > > HINT: To enable deleting from the table, set REPLICA IDENTITY using > > > > > ALTER TABLE. > > > > > > > > > > Fair enough. But if we do this: > > > > > alter table items replica identity nothing; > > > > > > > > > > because the docs [1] say that NOTHING means "Records no information > > > > > about the old row." We still get the same error when we try the DELETE > > > > > again. > > > > > > > > Well, "REPLICA IDENTITY NOTHING" is the same as "has no replica identity". > > > > So is "REPLICA IDENTITY DEFAULT" if there is no primary key, or > > > > "REPLICA IDENTITY USING INDEX ..." if the index is dropped. > > > > > > > > See "pg_class": the column "relreplident" is not nullable. > > > > > > Right, I think the confusing point for us is that the docs for NOTHING > > > ("Records no information about the old row") imply you can decide you > > > don't have to record anything if you don't want to do so, but the > > > publication feature is effectively overriding that and asserting that > > > you can't make that choice. > > > > > > > Hi, I can see how the current docs could be interpreted in a way that > > was not intended. > > > > ~~~ > > > > To emphasise the DEFAULT behaviour that Laurenze described, I felt > > there could be another sentence about DEFAULT, the same as there is > > already for the USING INDEX case. > > > > BEFORE [1] > > Records the old values of the columns of the primary key, if any. This > > is the default for non-system tables. > > > > SUGGESTION > > Records the old values of the columns of the primary key, if any. This > > is the default for non-system tables. If there is no primary key, the > > behavior is the same as NOTHING. > > > > ~~~ > > > > If that is done, then would a publication docs tweak like the one > > below clarify things sufficiently? > > > > BEFORE [2] > > If a table without a replica identity is added to a publication that > > replicates UPDATE or DELETE operations then subsequent UPDATE or > > DELETE operations will cause an error on the publisher. > > > > SUGGESTION > > If a table without a replica identity (or with replica identity > > behavior equivalent to NOTHING) is added to a publication that > > replicates UPDATE or DELETE operations then subsequent UPDATE or > > DELETE operations will cause an error on the publisher. > > > > ====== > > [1] https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY > > [2] https://www.postgresql.org/docs/current/logical-replication-publication.html > > > > Kind Regards, > > Peter Smith. > > Fujitsu Australia > > Thanks for looking at this! > > Yes, both of those changes together would make this unambiguous (and, > I think, easier to mentally parse). > OK, here then is a patch to do like that. ====== Kind Regards, Peter Smith. Fujitsu Australia
Attachment
On Thu, 2024-02-08 at 13:40 +1100, Peter Smith wrote: > - how to set the replica identity. If a table without a replica identity is > + how to set the replica identity. If a table without a replica identity > + (or with replica identity behavior the same as <literal>NOTHING</literal>) is > added to a publication that replicates <command>UPDATE</command> > or <command>DELETE</command> operations then > subsequent <command>UPDATE</command> or <command>DELETE</command> I had the impression that the root of the confusion was the perceived difference between "REPLICA IDENTITY NOTHING" and "no replica identity", and that change doesn't improve that. How about: If a table without a replica identity (explicitly set to <literal>NOTHING</literal>, or set to a primary key or index that doesn't exist) is added ... Yours, Laurenz Albe
On Thu, Feb 8, 2024 at 9:57 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Thu, 2024-02-08 at 13:40 +1100, Peter Smith wrote: > > - how to set the replica identity. If a table without a replica identity is > > + how to set the replica identity. If a table without a replica identity > > + (or with replica identity behavior the same as <literal>NOTHING</literal>) is > > added to a publication that replicates <command>UPDATE</command> > > or <command>DELETE</command> operations then > > subsequent <command>UPDATE</command> or <command>DELETE</command> > > I had the impression that the root of the confusion was the perceived difference > between "REPLICA IDENTITY NOTHING" and "no replica identity", and that change > doesn't improve that. > > How about: > > If a table without a replica identity (explicitly set to <literal>NOTHING</literal>, > or set to a primary key or index that doesn't exist) is added ... Another possibility is just to improve the documentation of various options as follows. DEFAULT If there is a primary key, record the old values of the columns of the primary key. Otherwise it acts as NOTHING. This is the default for non-system tables. USING INDEX index_name Records the old values of the columns covered by the named index, that must be unique, not partial, not deferrable, and include only columns marked NOT NULL. If this index is dropped, the behavior is the same as NOTHING. FULL Records the old values of all columns in the row. NOTHING Records no information about the old row. This is equivalent to having no replica identity. This is the default for system tables. -- Best Wishes, Ashutosh Bapat
On Wed, Feb 7, 2024 at 11:27 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Thu, 2024-02-08 at 13:40 +1100, Peter Smith wrote: > > - how to set the replica identity. If a table without a replica identity is > > + how to set the replica identity. If a table without a replica identity > > + (or with replica identity behavior the same as <literal>NOTHING</literal>) is > > added to a publication that replicates <command>UPDATE</command> > > or <command>DELETE</command> operations then > > subsequent <command>UPDATE</command> or <command>DELETE</command> > > I had the impression that the root of the confusion was the perceived difference > between "REPLICA IDENTITY NOTHING" and "no replica identity", and that change > doesn't improve that. > > How about: > > If a table without a replica identity (explicitly set to <literal>NOTHING</literal>, > or set to a primary key or index that doesn't exist) is added ... I think that would work also. I was reading the initial suggestion as "(or with replica identity behavior the same as..." as defining what "without a replica identity" meant, which would avoid the confusion. But your proposal is more explicit and more succinct, so I think it's the better option of the two. Regards, James Coleman
On Thu, Feb 8, 2024 at 4:47 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Thu, Feb 8, 2024 at 9:57 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > > > On Thu, 2024-02-08 at 13:40 +1100, Peter Smith wrote: > > > - how to set the replica identity. If a table without a replica identity is > > > + how to set the replica identity. If a table without a replica identity > > > + (or with replica identity behavior the same as <literal>NOTHING</literal>) is > > > added to a publication that replicates <command>UPDATE</command> > > > or <command>DELETE</command> operations then > > > subsequent <command>UPDATE</command> or <command>DELETE</command> > > > > I had the impression that the root of the confusion was the perceived difference > > between "REPLICA IDENTITY NOTHING" and "no replica identity", and that change > > doesn't improve that. > > > > How about: > > > > If a table without a replica identity (explicitly set to <literal>NOTHING</literal>, > > or set to a primary key or index that doesn't exist) is added ... > > Another possibility is just to improve the documentation of various > options as follows. > > DEFAULT > > If there is a primary key, record the old values of the columns of the > primary key. Otherwise it acts as NOTHING. This is the default for > non-system tables. > > USING INDEX index_name > > Records the old values of the columns covered by the named index, that > must be unique, not partial, not deferrable, and include only columns > marked NOT NULL. If this index is dropped, the behavior is the same as > NOTHING. > > FULL > > Records the old values of all columns in the row. > > NOTHING > > Records no information about the old row. This is equivalent to having > no replica identity. This is the default for system tables. This is the simplest change, and it does solve the confusion, so I'd be happy with it also. The other proposals have the benefit of having all the information necessary on the publications page rather than requiring the user to refer to the ALTER TABLE REPLICA IDENTITY page to understand what's meant. Regards, James Coleman
While revisiting some old threads, I found this one that seemed to reach a conclusion, but then it seemed nothing happened. After multiple suggestions AFAICT James preferred the docs [1] modification suggested [2] by Laurenz. Should we make a CF entry for this with the status RfC, or was the whole thing abandoned for some reason? ====== [1] https://www.postgresql.org/docs/devel/logical-replication-publication.html [2] https://www.postgresql.org/message-id/CAAaqYe91iO3dfUnVmBs4M-4aUX_zHmPN72ELE7c_8qAO_toPmA%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On Mon, Dec 16, 2024 at 6:17 PM Peter Smith <smithpb2250@gmail.com> wrote: > > While revisiting some old threads, I found this one that seemed to > reach a conclusion, but then it seemed nothing happened. > > After multiple suggestions AFAICT James preferred the docs [1] > modification suggested [2] by Laurenz. > > Should we make a CF entry for this with the status RfC, or was the > whole thing abandoned for some reason? > > ====== > [1] https://www.postgresql.org/docs/devel/logical-replication-publication.html > [2] https://www.postgresql.org/message-id/CAAaqYe91iO3dfUnVmBs4M-4aUX_zHmPN72ELE7c_8qAO_toPmA%40mail.gmail.com > > Kind Regards, > Peter Smith. > Fujitsu Australia Yes, I would appreciate it moving forward. Did you want to create the CF entry or should I? Regards, James Coleman
On Wed, Dec 18, 2024 at 2:50 AM James Coleman <jtc331@gmail.com> wrote: > > On Mon, Dec 16, 2024 at 6:17 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > While revisiting some old threads, I found this one that seemed to > > reach a conclusion, but then it seemed nothing happened. > > > > After multiple suggestions AFAICT James preferred the docs [1] > > modification suggested [2] by Laurenz. > > > > Should we make a CF entry for this with the status RfC, or was the > > whole thing abandoned for some reason? > > > > ====== > > [1] https://www.postgresql.org/docs/devel/logical-replication-publication.html > > [2] https://www.postgresql.org/message-id/CAAaqYe91iO3dfUnVmBs4M-4aUX_zHmPN72ELE7c_8qAO_toPmA%40mail.gmail.com > > > > Kind Regards, > > Peter Smith. > > Fujitsu Australia > > Yes, I would appreciate it moving forward. > > Did you want to create the CF entry or should I? > I did it. See here: https://commitfest.postgresql.org/51/5445/ marked it as Ready for Committer. I was unsure whether to record the author as you (the thread author) or as Laurenz (the favoured patch author) so I just left some fields blank. ====== Kind Regards, Peter Smith. Fujitsu Australia
On Thu, Feb 8, 2024 at 7:24 PM James Coleman <jtc331@gmail.com> wrote: > > On Thu, Feb 8, 2024 at 4:47 AM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > > > > On Thu, Feb 8, 2024 at 9:57 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > > > > > On Thu, 2024-02-08 at 13:40 +1100, Peter Smith wrote: > > > > - how to set the replica identity. If a table without a replica identity is > > > > + how to set the replica identity. If a table without a replica identity > > > > + (or with replica identity behavior the same as <literal>NOTHING</literal>) is > > > > added to a publication that replicates <command>UPDATE</command> > > > > or <command>DELETE</command> operations then > > > > subsequent <command>UPDATE</command> or <command>DELETE</command> > > > > > > I had the impression that the root of the confusion was the perceived difference > > > between "REPLICA IDENTITY NOTHING" and "no replica identity", and that change > > > doesn't improve that. > > > > > > How about: > > > > > > If a table without a replica identity (explicitly set to <literal>NOTHING</literal>, > > > or set to a primary key or index that doesn't exist) is added ... > > > > Another possibility is just to improve the documentation of various > > options as follows. > > > > DEFAULT > > > > If there is a primary key, record the old values of the columns of the > > primary key. Otherwise it acts as NOTHING. This is the default for > > non-system tables. > > > > USING INDEX index_name > > > > Records the old values of the columns covered by the named index, that > > must be unique, not partial, not deferrable, and include only columns > > marked NOT NULL. If this index is dropped, the behavior is the same as > > NOTHING. > > > > FULL > > > > Records the old values of all columns in the row. > > > > NOTHING > > > > Records no information about the old row. This is equivalent to having > > no replica identity. This is the default for system tables. > > This is the simplest change, and it does solve the confusion, so I'd > be happy with it also. The other proposals have the benefit of having > all the information necessary on the publications page rather than > requiring the user to refer to the ALTER TABLE REPLICA IDENTITY page > to understand what's meant. > There is no harm in having it at both places (publications page and ALTER TABLE REPLICA IDENTITY page). Would someone be interested in preparing a patch with the changes agreed upon? -- With Regards, Amit Kapila.
On Thu, Feb 8, 2024 at 9:57 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Thu, 2024-02-08 at 13:40 +1100, Peter Smith wrote: > > - how to set the replica identity. If a table without a replica identity is > > + how to set the replica identity. If a table without a replica identity > > + (or with replica identity behavior the same as <literal>NOTHING</literal>) is > > added to a publication that replicates <command>UPDATE</command> > > or <command>DELETE</command> operations then > > subsequent <command>UPDATE</command> or <command>DELETE</command> > > I had the impression that the root of the confusion was the perceived difference > between "REPLICA IDENTITY NOTHING" and "no replica identity", and that change > doesn't improve that. > > How about: > > If a table without a replica identity (explicitly set to <literal>NOTHING</literal>, > or set to a primary key or index that doesn't exist) is added ... > Is it correct to say "set to a primary key or index that doesn't exist"? Because when it is set to the primary key then it should work. I think Peter's proposal along with Ashutosh's proposal is the simpler approach to clarify things in this area but I am fine if others find some other way of updating docs better. -- With Regards, Amit Kapila.