Re: pg_publication_tables: return NULL attnames when no column list is specified - Mailing list pgsql-hackers
| From | Roberto Mello |
|---|---|
| Subject | Re: pg_publication_tables: return NULL attnames when no column list is specified |
| Date | |
| Msg-id | CAKz==b+30jUre6sZE85=afnn66FaQ926=4M1YobfsKoCJ=Mb+g@mail.gmail.com Whole thread |
| In response to | Re: pg_publication_tables: return NULL attnames when no column list is specified (Amit Kapila <amit.kapila16@gmail.com>) |
| List | pgsql-hackers |
On Tue, Mar 31, 2026 at 2:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 31, 2026 at 12:02 AM Roberto Mello <roberto.mello@gmail.com> wrote:
>
> The subscriber receives WAL from both publications for the same table.
> Which column set should it apply? It cannot apply both as they disagree on
> whether email is included. This is exactly the situation the "cannot use
> different column lists" check was designed to prevent.
>
I think we need to consider the cases where current permissive
behavior helps. For example, consider the cases where the schema is
I'm sorry but you completely dismissed the points that I brought up, addressing
none of the concerns, and went on a tangent about "permissive behavior".
Permissive behavior is fine and dandy until it bites you in the @$$, and there's
a line where permissive is wrong, bad, or dangerous. I pointed out why I think
the current behavior is wrong, potentially dangerous. MySQL is king of permissive
behavior but I don't think that's much to brag about. PostgreSQL historically
favored correctness over permissiveness.
The deeper issue is that the "permissive" behavior isn't actually permissive: it's
silently inconsistent. The two publications have different replication contracts stored
silently inconsistent. The two publications have different replication contracts stored
in the catalog (prattrs = NULL vs prattrs = {1,2}), and pgoutput.c honors that difference
at WAL decode time. The old code just hid the difference from the check that was
supposed to detect it.
static. Now let us consider another case where a user would actually
need to define such publication combinations for a subscription. One
of the more common ways this conflict happens is accidental: User has
pub_1 for TABLE t (col1, col2). User later decides to replicate the
entire database to a new subscription and creates pub_2 FOR ALL
TABLES. User adds pub_2 to the subscription. Currently, the user can
add pub_2 and then later realize they need to drop pub_1 to clean
How would they "realize" that if the view is showing them the exact
same data for the two publications?
things up. If ALTER SUBSCRIPTION blocks this, the user is stuck in a
Catch-22 where they can't add the "All Tables" publication because a
single specific table has a column list. They would have to drop the
specific publication first, potentially losing replication coverage
for that table during the gap.
would continue to work:
-- Starting from sub has pub_1 with explicit column list
ALTER PUBLICATION pub_1 DROP TABLE t; -- on publisher
ALTER SUBSCRIPTION sub REFRESH PUBLICATION; -- on subscriber
-- Now add pub_2 FOR ALL TABLES
ALTER SUBSCRIPTION sub ADD PUBLICATION pub_2;
Or even simpler, just drop the column list from pub_1 before adding pub_2:
ALTER PUBLICATION pub_1 SET TABLE t; -- removes the column list
ALTER SUBSCRIPTION sub ADD PUBLICATION pub_2;
There's no gap in replication coverage in either case because the subscription is still
active throughout. The table remains subscribed via pub_1 until pub_2 takes over.
For the static schema case: if the schema never changes, the two publications do produce
identical column sets and there's no practical problem. But the publication system doesn't
know the schema is static. It has to handle the general case. And the general case is that
NULL means "all current and future columns" while an explicit list means "exactly these,
nothing more." Those are different things (contracts) and being "permissive" with interpretation
of what those contracts mean in my head versus someone else's could be dangerous.
That said, if the consensus is that the migration path is too disruptive, a middle ground
Now, considering the other cases where replication later ERRORs out
(like the one you mentioned) when we allow such combinations, we can
give a WARNING at the time subscriber DDLs when they lead to such
combinations.
A WARNING at ALTER SUBSCRIPTION time would be better than silence, but it still
allows the subscriber into a state where a future ALTER TABLE ADD COLUMN on the
publisher will cause replication to break. At that point the user gets the ERROR anyway,
but now it's during replication rather than at setup time, which is harder to diagnose and
recover from. Erroring early, when the user is actively making the change and can fix it,
is the safer default.
That said, if the consensus is that the migration path is too disruptive, a middle ground
could be to ERROR by default but provide a subscription-level option to downgrade it to
WARNING. For the reasons I pointed out, I think the right call is to keep the ERROR as
the default though, since it catches a real inconsistency.
Roberto Mello
Snowflake
pgsql-hackers by date: