Re: Report bytes and transactions actually sent downtream - Mailing list pgsql-hackers

From shveta malik
Subject Re: Report bytes and transactions actually sent downtream
Date
Msg-id CAJpy0uDYdzwUeSBb2=+V7WQuFdc1wU_6hrqWgFOXByonWVE-yQ@mail.gmail.com
Whole thread Raw
In response to Re: Report bytes and transactions actually sent downtream  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers
On Wed, Oct 29, 2025 at 8:25 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Wed, Oct 29, 2025 at 9:14 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Oct 28, 2025 at 12:46 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Mon, Oct 27, 2025 at 4:47 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > Few comments:
> > > >
> > > > 1)
> > > > pgoutput_truncate:
> > > >
> > > > if (nrelids > 0)
> > > > {
> > > > OutputPluginPrepareWrite(ctx, true);
> > > > logicalrep_write_truncate(ctx->out,
> > > >   xid,
> > > >   nrelids,
> > > >   relids,
> > > >   change->data.truncate.cascade,
> > > >   change->data.truncate.restart_seqs);
> > > > OutputPluginWrite(ctx, true);
> > > > }
> > > > + else
> > > > + ctx->stats->filteredBytes += ReorderBufferChangeSize(change);
> > > > +
> > > >
> > > > It seems that filteredBytes are only counted for TRUNCATE when nrelids
> > > > is 0. Can nrelids only be 0 or same as nrelations?
> > > >
> > > > The below code makes me think that nrelids can be any number between 0
> > > > and nrelations, depending on which relations are publishable and which
> > > > supports publishing TRUNCATE. If that’s true, shouldn’t we count
> > > > filteredBytes in each such skipped case?
> > >
> > > IIIUC, you are suggesting that we should add
> > > ReorderBufferChangeSize(change) for every relation which is not part
> > > of the publication or whose truncate is not published.
> >
> > No, that will be wrong.
> >
> > > I think that
> > > won't be correct since it can lead to a situation where filtered bytes
> > > > total bytes which should never happen. Even if there is a single
> > > publishable relation whose truncate is published, the change should
> > > not be considered as filtered since something would be output
> > > downstream.
> >
> > Yes, the entire change should not be treated as filtered. The idea is
> > that, for example, if there are 20 relations belonging to different
> > publications and only one of them supports publishing TRUNCATE, then
> > when a TRUNCATE is triggered on all, the data for that one relation
> > should be counted as sent (which is currently happening based on
> > nrelids), while the data for the remaining 19 should be considered
> > filtered — which is not happening right now.
> >
> > > Otherwise filtered bytes as well as sent bytes both will
> > > be incremented causing an inconsistency (which would be hard to notice
> > > since total bytes - filtered bytes has something to do with the sent
> > > bytes but the exact correlation is hard to grasp in a formula).
> > >
> > > We may increment filteredBytes by sizeof(OID) for every relation we
> > > skip here OR by ReoderBufferChangeSize(change) if all the relations
> > > are filtered, but that's too much dependent on how the WAL record is
> > > encoded; and adding that dependency in an output plugin code seems
> > > hard to manage.
> > >
> >
> > Yes, that was the idea, to increment filteredBytes in this way. But I
> > see your point. I can’t think of a better solution at the moment. If
> > you also don’t have any better ideas, then at least adding a comment
> > in this function would be helpful. Right now, it looks like we
> > overlooked the fact that some relationships should contribute to
> > filteredBytes while others should go to sentBytes.
>
> I noticed that we do something similar while filtering columns. I
> think we need to add a comment in that code as well. How about
> something like below?
>
> diff --git a/src/backend/replication/pgoutput/pgoutput.c
> b/src/backend/replication/pgoutput/pgoutput.c
> index 4b35f2de6aa..f2d6e20a702 100644
> --- a/src/backend/replication/pgoutput/pgoutput.c
> +++ b/src/backend/replication/pgoutput/pgoutput.c
> @@ -1621,7 +1621,12 @@ pgoutput_change(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
>
>         OutputPluginPrepareWrite(ctx, true);
>
> -       /* Send the data */
> +       /*
> +        * Send the data. Even if we end up filtering some columns
> while sending the
> +        * message, we won't consider the change, as a whole, to be
> filtered out.
> +        * Instead the filtered columns will be reflected as a smaller sentBytes
> +        * count.
> +        */
>         switch (action)
>         {
>                 case REORDER_BUFFER_CHANGE_INSERT:
> @@ -1728,6 +1733,13 @@ pgoutput_truncate(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
>
> change->data.truncate.cascade,
>
> change->data.truncate.restart_seqs);
>                 OutputPluginWrite(ctx, true);
> +
> +               /*
> +                * Even if we filtered out some relations, we still
> send a TRUNCATE
> +                * message for the remaining relations. Since the
> change, as a whole, is
> +                * not filtered out, we don't count modify
> filteredBytes. The filtered
> +                * out relations will be reflected as a smaller sentBytes count.
> +                */
>         }
>         else
>                 ctx->stats->filteredBytes += ReorderBufferChangeSize(change);
>

> +                * not filtered out, we don't count modify filteredBytes. The filtered

Something is wrong in this sentence.

Also, regarding "The filtered out relations will be reflected as a
smaller sentBytes count."
Can you please point me to the code where it happens? From what I have
understood, pgoutput_truncate() completely skips the relations which
do not support publishing truncate. Then it sends 'BEGIN',  then
schema info of non-filtered relations and then TRUNCATE for
non-filtered relations (based on nrelids).

thanks
Shveta



pgsql-hackers by date:

Previous
From: Chao Li
Date:
Subject: Re: Optimize LISTEN/NOTIFY
Next
From: John Naylor
Date:
Subject: Re: tuple radix sort