Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Column Filtering in Logical Replication
Date
Msg-id CAA4eK1+po5JU7u9GFdR4dCwti6z3-feK6-BNi8Zhr1-8RJ_Fyw@mail.gmail.com
Whole thread Raw
In response to Re: Column Filtering in Logical Replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Column Filtering in Logical Replication
List pgsql-hackers
On Fri, Mar 11, 2022 at 12:44 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 3/10/22 04:09, Amit Kapila wrote:
> > On Wed, Mar 9, 2022 at 3:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> On Mon, Mar 7, 2022 at 8:48 PM Tomas Vondra
> >> <tomas.vondra@enterprisedb.com> wrote:
> >>
> >>> OK, I reworked this to do the same thing as the row filtering patch.
> >>>
> >>
> >> Thanks, I'll check this.
> >>
> >
> > Some assorted comments:
> > =====================
> > 1. We don't need to send a column list for the old tuple in case of an
> > update (similar to delete). It is not required to apply a column
> > filter for those cases because we ensure that RI must be part of the
> > column list for updates and deletes.
>
> I'm not sure which part of the code does this refer to?
>

The below part:
@@ -464,11 +473,11 @@ logicalrep_write_update(StringInfo out,
TransactionId xid, Relation rel,
  pq_sendbyte(out, 'O'); /* old tuple follows */
  else
  pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldslot, binary);
+ logicalrep_write_tuple(out, rel, oldslot, binary, columns);
  }

I think here instead of columns, the patch needs to send NULL as it is
already doing in logicalrep_write_delete.

> > 2.
> > + /*
> > + * Check if all columns referenced in the column filter are part of
> > + * the REPLICA IDENTITY index or not.
> >
> > I think this comment is reverse. The rule we follow here is that
> > attributes that are part of RI must be there in a specified column
> > list. This is used at two places in the patch.
>
> Yeah, you're right. Will fix.
>
> > 3. get_rel_sync_entry()
> > {
> > /* XXX is there a danger of memory leak here? beware */
> > + oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> > + for (int i = 0; i < nelems; i++)
> > ...
> > }
> >
> > Similar to the row filter, I think we need to use
> > entry->cache_expr_cxt to allocate this. There are other usages of
> > CacheMemoryContext in this part of the code but I think those need to
> > be also changed and we can do that as a separate patch. If we do the
> > suggested change then we don't need to separately free columns.
>
> I agree a shorter-lived context would be better than CacheMemoryContext,
> but "expr" seems to indicate it's for the expression only, so maybe we
> should rename that.
>

Yeah, we can do that. How about rel_entry_cxt or something like that?
The idea is that eventually, we should move a few other things of
RelSyncEntry like attrmap where we are using CacheMemoryContext under
this context.

> But do we really want a memory context for every
> single entry?
>

Any other better idea?

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats