Re: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication - Mailing list pgsql-hackers
From | Dean |
---|---|
Subject | Re: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication |
Date | |
Msg-id | CALWmXtuyvdL5zyYKgnszEVjX-Ru7jmGpKhn8zobXRbpoWRFFSg@mail.gmail.com |
In response to | Re: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
RE: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication
Re: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication |
List | pgsql-hackers |
Unfortunately, neither column lists nor row filters can provide the level of control I'm proposing. These revised examples might help illustrate the use case for DRF:
Alice, Bob, and Eve subscribe to changes on a `friend_requests` table. Row-level security restricts CRUD access based on user IDs.
1. Per-subscriber column control: Bob makes a change to the table. Alice should receive the entire record, while Eve should receive only the timestamp and no other columns. Why DRF is needed: Column lists are static and apply equally to all subscribers, so we can't distinguish Alice's subscription from Eve's.
2. Per-subscriber event control: Bob DELETEs a row from the table. Alice should see the DELETE event, while Eve should not even be aware that an event occurred. Why DRF is needed: Row filters are deterministic, which makes them unsuitable for per-subscriber filtering based on session data.
The goal of DRF is to allow per-subscriber variations in change broadcasts, enabling granular control over what data is sent to each subscriber based on their session context.
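To make the two cases above concrete, here is a rough Python model of the behavior I have in mind. It is only a sketch of the semantics, not PostgreSQL code: the `Change` type, the filter functions, `broadcast`, and the `updated_at` column are all hypothetical names made up for illustration, and nothing here reflects an existing replication API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Change:
    """A decoded change (hypothetical model, not a PostgreSQL structure)."""
    op: str    # "INSERT", "UPDATE", or "DELETE"
    actor: str # user who made the change
    row: dict  # complete row data, available after decoding


# A deferred filter returns the columns to send to one subscriber,
# or None to suppress the event for that subscriber entirely.
DeferredFilter = Callable[[Change], Optional[dict]]


def full_record(change: Change) -> Optional[dict]:
    """Alice's filter: receive every column of every event."""
    return dict(change.row)


def timestamp_only_no_deletes(change: Change) -> Optional[dict]:
    """Eve's filter: never see DELETEs; otherwise only the timestamp."""
    if change.op == "DELETE":
        return None
    return {"updated_at": change.row["updated_at"]}


def broadcast(change: Change, filters: Dict[str, DeferredFilter]) -> Dict[str, dict]:
    """Apply each subscriber's filter to one decoded change."""
    outgoing = {}
    for subscriber, deferred_filter in filters.items():
        payload = deferred_filter(change)
        if payload is not None:
            outgoing[subscriber] = payload
    return outgoing
```

With `filters = {"alice": full_record, "eve": timestamp_only_no_deletes}`, an UPDATE by Bob yields the full row for Alice and only `updated_at` for Eve, while a DELETE by Bob yields a payload for Alice and nothing at all for Eve.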
Best,
Dean S
On Mon, Mar 17, 2025 at 4:32 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sun, Mar 16, 2025 at 12:59 AM Dean <ds.blue797@gmail.com> wrote:
>
> I'd like to propose an enhancement to PostgreSQL's logical replication system: Deferred Replica Filtering (DRF). The goal of this feature is to provide more granular control over which rows are replicated by applying publication filters after the WAL decoding process, before sending data to subscribers.
>
> Currently, PostgreSQL's logical replication filters apply deterministically. Deferred filtering, however, operates after the WAL has been decoded, giving it access to the complete row data and making filtering decisions based on mutable values. Additionally, record columns may be omitted by the filter.
>
> This opens up several possibilities for granular control. Consider the following examples:
> Alice and Bob subscribe to changes on a table with RLS enabled, allowing CRUD operations based on user IDs.
> 1. Alice needs to know the timestamp at which Bob updated the table. With DRF, we can omit all columns except for the timestamp.
> 2. Bob wants to track DELETEs on the table. Without DRF, Bob can see all columns on any deleted row, potentially exposing complete records he shouldn't be authorized to view. DRF can filter these rows out.
>
> Deferred replica filtering allows for session-specific, per-row, and per-column filtering - features currently not supported by existing replication filters, enhancing security and data privacy.
>
We provide column lists [1] and row filters [2]. Don't those meet
the need? If not, kindly let us know what exactly you need, with some
examples.
[1] - https://www.postgresql.org/docs/devel/logical-replication-col-lists.html
[2] - https://www.postgresql.org/docs/devel/logical-replication-row-filter.html
--
With Regards,
Amit Kapila.