Re: Improve logical replication usability when tables lack primary keys - Mailing list pgsql-hackers

From GRANT ZHOU
Subject Re: Improve logical replication usability when tables lack primary keys
Date
Msg-id CA+FXcm-YZXJpc6E7XEDTv9Yaic=U7Dwnjj4znxJ4gCxUZMcXww@mail.gmail.com
Whole thread Raw
In response to Re: Improve logical replication usability when tables lack primary keys  ("Euler Taveira" <euler@eulerto.com>)
List pgsql-hackers
On Wed, Dec 17, 2025 at 12:50 PM Euler Taveira <euler@eulerto.com> wrote:
The ship has sailed a long time ago (version 9.4 to be precise -- commit
07cacba983ef). The row identifier property was defined as an SQL command (ALTER
TABLE ... REPLICA IDENTITY) and *not* a publication property. IMO that's the
correct design because row identifier is a table property. Extend this concept
to a publication property is the wrong direction. It is confusing and complex.

Thanks for the detailed history. I completely understand and respect that Replica Identity is designed as a table property.

Each table needs to say what's its row identifier. The user created a table
without primary key. Well, create a primary key. There are dozens of thousands
of objects. Use a script.

However, I’d like to share a user perspective regarding the "use a script" approach. The main value of FOR TABLES IN SCHEMA is in-database automation. If users still need to maintain external scripts to monitor and ALTER new tables to prevent replication errors, it significantly diminishes the value of that automation.

Additionally, tables without Primary Keys are valid SQL and extremely common in enterprise environments (e.g., audit logs, data warehousing). In large-scale deployments, enforcing PKs on every single table isn't always practical. 

I would suggest a way to disallow or add a warning
message while creating the publication or adding new tables, however, the FOR
ALL TABLES and FOR TABLES IN SCHEMA were mentioned. There isn't a reliable way
to guarantee that a publication with UPDATE and/or DELETE option contains only
tables with pk, RI FULL or RI USING INDEX. The fact that there is no rows in the
pg_publication_rel for these clauses, makes validating the CREATE/ALTER
PUBLICATION commands more difficult. (I prefer deterministic commands and when I
saw an object definition saying "including objects created in the future", my
first question is: what's the drawbacks and caveats?)

I don't think the current behavior is lacking documentation; the REPLICA
IDENTITY concept is explicitly in the logical replication chapter [1].

I think the goal of this proposal is not to change the underlying table property design, but rather to seek a mechanism (like a Publication option) to ensure this automation functions safely without external intervention. It is simply about allowing the database to handle these valid, common scenarios gracefully when automation is enabled.

--
Grant Zhou
Highgo Software

pgsql-hackers by date:

Previous
From: Sami Imseih
Date:
Subject: Re: Support named (destination) portals in extended proto for psql meta commands.
Next
From: Michael Paquier
Date:
Subject: Re: [Proposal] Adding callback support for custom statistics kinds