Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Column Filtering in Logical Replication |
Date | |
Msg-id | 5af7066b-15ac-1989-9afe-4e5dc1d60154@enterprisedb.com Whole thread Raw |
In response to | Re: Column Filtering in Logical Replication (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Column Filtering in Logical Replication
|
List | pgsql-hackers |
On 3/10/22 04:09, Amit Kapila wrote: > On Wed, Mar 9, 2022 at 3:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> On Mon, Mar 7, 2022 at 8:48 PM Tomas Vondra >> <tomas.vondra@enterprisedb.com> wrote: >> >>> OK, I reworked this to do the same thing as the row filtering patch. >>> >> >> Thanks, I'll check this. >> > > Some assorted comments: > ===================== > 1. We don't need to send a column list for the old tuple in case of an > update (similar to delete). It is not required to apply a column > filter for those cases because we ensure that RI must be part of the > column list for updates and deletes. I'm not sure which part of the code does this refer to? > 2. > + /* > + * Check if all columns referenced in the column filter are part of > + * the REPLICA IDENTITY index or not. > > I think this comment is reverse. The rule we follow here is that > attributes that are part of RI must be there in a specified column > list. This is used at two places in the patch. Yeah, you're right. Will fix. > 3. get_rel_sync_entry() > { > /* XXX is there a danger of memory leak here? beware */ > + oldctx = MemoryContextSwitchTo(CacheMemoryContext); > + for (int i = 0; i < nelems; i++) > ... > } > > Similar to the row filter, I think we need to use > entry->cache_expr_cxt to allocate this. There are other usages of > CacheMemoryContext in this part of the code but I think those need to > be also changed and we can do that as a separate patch. If we do the > suggested change then we don't need to separately free columns. I agree a shorter-lived context would be better than CacheMemoryContext, but "expr" seems to indicate it's for the expression only, so maybe we should rename that. But do we really want a memory context for every single entry? > 4. I think we don't need the DDL changes in AtExecDropColumn. Instead, > we can change the dependency of columns to NORMAL during publication > commands. I'll think about that. > 5. There is a reference to check_publication_columns but that function > is removed from the patch. Right, will fix. > 6. > /* > * If we know everything is replicated and the row filter is invalid > * for update and delete, there is no point to check for other > * publications. > */ > if (pubdesc->pubactions.pubinsert && pubdesc->pubactions.pubupdate && > pubdesc->pubactions.pubdelete && pubdesc->pubactions.pubtruncate && > !pubdesc->rf_valid_for_update && !pubdesc->rf_valid_for_delete) > break; > > /* > * If we know everything is replicated and the column filter is invalid > * for update and delete, there is no point to check for other > * publications. > */ > if (pubdesc->pubactions.pubinsert && pubdesc->pubactions.pubupdate && > pubdesc->pubactions.pubdelete && pubdesc->pubactions.pubtruncate && > !pubdesc->cf_valid_for_update && !pubdesc->cf_valid_for_delete) > break; > > Can we combine these two checks? > I was worried it'd get too complex / hard to understand, but I'll think about maybe simplifying the conditions a bit. > I feel this patch needs a more thorough review. > I won't object to more review, of course. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: