Home > mailing lists

Re: row filtering for logical replication - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: row filtering for logical replication
Date	July 13, 2021 19:15:48
Msg-id	4d9e0142-0e80-daee-5236-154cd785828c@enterprisedb.com Whole thread Raw
In response to	Re: row filtering for logical replication (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: row filtering for logical replication
List	pgsql-hackers

Tree view

On 7/13/21 12:57 PM, Amit Kapila wrote:
> On Tue, Jul 13, 2021 at 10:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Mon, Jul 12, 2021 at 3:01 PM Tomas Vondra
>> <tomas.vondra@enterprisedb.com> wrote:
>>
>>> In terms of implementation, I think there are two basic options - either
>>> we can define a new "expression" type in gram.y, which would be a subset
>>> of a_expr etc. Or we can do it as some sort of expression walker, kinda
>>> like what the transform* functions do now.
>>>
>>
>> I think it is better to use some form of walker here rather than
>> extending the grammar for this. However, the question is do we need
>> some special kind of expression walker here or can we handle all
>> required cases via transformWhereClause() call as the patch is trying
>> to do. AFAIU, the main things we want to prohibit in the filter are:
>> (a) it doesn't refer to any relation other than catalog in where
>> clause, (b) it doesn't use UDFs in any way (in expressions, in
>> user-defined operators, user-defined types, etc.), (c) the columns
>> referred to in the filter should be part of PK or Replica Identity.
>> Now, if all such things can be detected by the approach patch has
>> taken then why do we need a special kind of expression walker? OTOH,
>> if we can't detect some of this then probably we can use a special
>> walker.
>>
>> I think in the long run one idea to allow UDFs is probably by
>> explicitly allowing users to specify whether the function is
>> publication predicate safe and if so, then we can allow such functions
>> in the filter clause.
>>
> 
> Another idea here could be to read the publication-related catalog
> with the latest snapshot instead of a historic snapshot. If we do that
> then if the user faces problems as described by Petr [1] due to
> missing dependencies via UDFs then she can Alter the Publication to
> remove/change the filter clause and after that, we would be able to
> recognize the updated filter clause and the system will be able to
> move forward.
> 
> I might be missing something but reading publication catalogs with
> non-historic snapshots shouldn't create problems as we use the
> historic snapshots are required to decode WAL.
> 

IMHO the best option for v1 is to just restrict the filters to 
known-safe expressions. That is, just built-in operators, no UDFs etc. 
Yes, it's not great, but both alternative proposals (allowing UDFs or 
using current snapshot) are problematic for various reasons.

Even with those restrictions the row filtering seems quite useful, and 
we can relax those restrictions later if we find acceptable compromise 
and/or decide it's worth the risk. Seems better than having to introduce 
new restrictions later.

> I think the problem described by Petr[1] is also possible today if the
> user drops the publication and there is a corresponding subscription,
> basically, the system will stuck with error: "ERROR:  publication
> "mypub" does not exist. I think allowing to use non-historic snapshots
> just for publications will resolve that problem as well.
> 
> [1] - https://www.postgresql.org/message-id/92e5587d-28b8-5849-2374-5ca3863256f1%402ndquadrant.com
> 

That seems like a completely different problem, TBH. For example the 
slot is dropped too, which means the WAL is likely gone etc.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-hackers by date:

From: Tomas Vondra
Date: 13 July 2021, 19:07:38
Subject: Re: row filtering for logical replication

From: Alvaro Herrera
Date: 13 July 2021, 19:21:17
Subject: Re: row filtering for logical replication

Re: row filtering for logical replication - Mailing list pgsql-hackers

Previous

Next