Home > mailing lists

Re: pg_publication_tables: return NULL attnames when no column list is specified - Mailing list pgsql-hackers

From	Roberto Mello
Subject	Re: pg_publication_tables: return NULL attnames when no column list is specified
Date	April 1 00:54:51
Msg-id	CAKz==b+30jUre6sZE85=afnn66FaQ926=4M1YobfsKoCJ=Mb+g@mail.gmail.com Whole thread
In response to	Re: pg_publication_tables: return NULL attnames when no column list is specified (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-hackers

Tree view

On Tue, Mar 31, 2026 at 2:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Mar 31, 2026 at 12:02 AM Roberto Mello <roberto.mello@gmail.com> wrote:
>
> The subscriber receives WAL from both publications for the same table.
> Which column set should it apply? It cannot apply both as they disagree on
> whether email is included. This is exactly the situation the "cannot use
> different column lists" check was designed to prevent.
>

I think we need to consider the cases where current permissive
behavior helps. For example, consider the cases where the schema is

I'm sorry but you completely dismissed the points that I brought up, addressing

none of the concerns, and went on a tangent about "permissive behavior".

Permissive behavior is fine and dandy until it bites you in the @$$, and there's

a line where permissive is wrong, bad, or dangerous. I pointed out why I think

the current behavior is wrong, potentially dangerous. MySQL is king of permissive

behavior but I don't think that's much to brag about. PostgreSQL historically

favored correctness over permissiveness.

The deeper issue is that the "permissive" behavior isn't actually permissive: it's
silently inconsistent. The two publications have different replication contracts stored

in the catalog (prattrs = NULL vs prattrs = {1,2}), and pgoutput.c honors that difference

at WAL decode time. The old code just hid the difference from the check that was

supposed to detect it.

static. Now let us consider another case where a user would actually
need to define such publication combinations for a subscription. One
of the more common ways this conflict happens is accidental: User has
pub_1 for TABLE t (col1, col2). User later decides to replicate the
entire database to a new subscription and creates pub_2 FOR ALL
TABLES. User adds pub_2 to the subscription. Currently, the user can
add pub_2 and then later realize they need to drop pub_1 to clean

How would they "realize" that if the view is showing them the exact

same data for the two publications?

things up. If ALTER SUBSCRIPTION blocks this, the user is stuck in a
Catch-22 where they can't add the "All Tables" publication because a
single specific table has a column list. They would have to drop the
specific publication first, potentially losing replication coverage
for that table during the gap.

I don't think it actually traps the user. The sequence that works today and

would continue to work:

-- Starting from sub has pub_1 with explicit column list
ALTER PUBLICATION pub_1 DROP TABLE t; -- on publisher
ALTER SUBSCRIPTION sub REFRESH PUBLICATION; -- on subscriber
-- Now add pub_2 FOR ALL TABLES
ALTER SUBSCRIPTION sub ADD PUBLICATION pub_2;

Or even simpler, just drop the column list from pub_1 before adding pub_2:

ALTER PUBLICATION pub_1 SET TABLE t; -- removes the column list
ALTER SUBSCRIPTION sub ADD PUBLICATION pub_2;

There's no gap in replication coverage in either case because the subscription is still

active throughout. The table remains subscribed via pub_1 until pub_2 takes over.

For the static schema case: if the schema never changes, the two publications do produce

identical column sets and there's no practical problem. But the publication system doesn't

know the schema is static. It has to handle the general case. And the general case is that

NULL means "all current and future columns" while an explicit list means "exactly these,

nothing more." Those are different things (contracts) and being "permissive" with interpretation

of what those contracts mean in my head versus someone else's could be dangerous.

Now, considering the other cases where replication later ERRORs out
(like the one you mentioned) when we allow such combinations, we can
give a WARNING at the time subscriber DDLs when they lead to such
combinations.

A WARNING at ALTER SUBSCRIPTION time would be better than silence, but it still

allows the subscriber into a state where a future ALTER TABLE ADD COLUMN on the

publisher will cause replication to break. At that point the user gets the ERROR anyway,

but now it's during replication rather than at setup time, which is harder to diagnose and

recover from. Erroring early, when the user is actively making the change and can fix it,

is the safer default.

That said, if the consensus is that the migration path is too disruptive, a middle ground

could be to ERROR by default but provide a subscription-level option to downgrade it to

WARNING. For the reasons I pointed out, I think the right call is to keep the ERROR as

the default though, since it catches a real inconsistency.

Roberto Mello

Snowflake

pgsql-hackers by date:

From: Jeff Davis
Date: 01 April, 00:34:28
Subject: Re: Do we still need MULE_INTERNAL?

From: Andres Freund
Date: 01 April, 01:05:46
Subject: Re: Buffer locking is special (hints, checksums, AIO writes)

Re: pg_publication_tables: return NULL attnames when no column list is specified - Mailing list pgsql-hackers

Previous

Next