Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding
Date
Msg-id CAFPTHDYb2ydVdAg8u0JqZctKYsp1L9BT-7y5g6WxiAm5OVHKbA@mail.gmail.com
Whole thread Raw
In response to Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Tue, Apr 22, 2025 at 5:00 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Ajin,
>
> Some review comments for patch v17-0001
>
> ======
> Commit message
>
> 1.
> Additionally, transactions containing WAL records (INTERNAL_SNAPSHOT,
> COMMAND_ID, or INVALIDATION) that modify the historical snapshot
> constructed during logical decoding are deemed unfilterable. This
> is necessary because constructing a correct historical snapshot
> for searching publication information requires processing these WAL record.
>
> ~
>
> /these WAL record./these WAL records./
>

Fixed.

> ======
> src/backend/replication/logical/reorderbuffer.c
>
> ReorderBufferFilterByRelFileLocator:
>
> 2.
> + if (found)
> + {
> + rb->try_to_filter_change = entry->filterable;
> + return entry->filterable;
> + }
> +
>
> Bad indentation.
>

Fixed.

> ======
> src/include/replication/reorderbuffer.h
>
> 3.
> +extern bool ReorderBufferCanFilterChanges(ReorderBuffer *rb);
> +
>
> Why is this extern here? This function is not implemented in patch 0001.
>
Fixed.

On Wed, Apr 23, 2025 at 1:11 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Ajin,
>
> Some review comments for patch v17-0002.
>
> ======
> src/backend/replication/logical/decode.c
>
> 1.
>  /*
> + * Check if filtering changes before decoding is supported and we're
> not suppressing filter
> + * changes currently.
> + */
> +static inline bool
> +FilterChangeIsEnabled(LogicalDecodingContext *ctx)
> +{
> + return (ctx->callbacks.filter_change_cb != NULL &&
> + ctx->reorder->try_to_filter_change);
> +}
> +
>
> I still have doubts about the need for/benefits of this to be a
> separate function.
>
> It is only called from *one* place within the other static function,
> FilterChange.
>
> Just putting that code inline seems just as readable as maintaining
> the separate function for it.
>
> SUGGESTION:
> static inline bool
> FilterChange(LogicalDecodingContext *ctx, XLogRecPtr origptr, TransactionId xid,
>              RelFileLocator *target_locator, ReorderBufferChangeType
> change_type)
> {
>   return
>     ctx->callbacks.filter_change_cb &&
>     ctx->reorder->try_to_filter_change &&
>     ReorderBufferFilterByRelFileLocator(ctx->reorder, xid, origptr,
> target_locator,
>                                         change_type));
> }
>
>

Fixed.

> ======
> src/backend/replication/logical/reorderbuffer.c
>
> RelFileLocatorCacheInvalidateCallback:
>
> 2.
>   if (hash_search(RelFileLocatorFilterCache,
> - &entry->key,
> - HASH_REMOVE,
> - NULL) == NULL)
> + &entry->key,
> + HASH_REMOVE,
> + NULL) == NULL)
>   elog(ERROR, "hash table corrupted");
>
> I think this whitespace change belongs back in patch 0001 where this
> function was introduced, not here in patch 0002.
>

Fixed.

> ~~~
>
> ReorderBufferFilterByRelFileLocator:
>
> 3.
> + /*
> + * Quick return if we already know that the relation is not to be
> + * decoded. These are for special relations that are unlogged and for
> + * sequences and catalogs.
> + */
> + if (entry->filterable)
> + return true;
>
> /These are for/This is for/

Fixed.

>
> ~~~
>
> 4.
>   if (RelationIsValid(relation))
>   {
> - entry->relid = RelationGetRelid(relation);
> + if (IsToastRelation(relation))
> + {
> + char   *toast_name = RelationGetRelationName(relation);
> + int     n PG_USED_FOR_ASSERTS_ONLY;
> +
> + n = sscanf(toast_name, "pg_toast_%u", &entry->relid);
> +
> + Assert(n == 1);
> + }
> + else
> + entry->relid = RelationGetRelid(relation);
> +
>   entry->filterable = false;
> + rb->try_to_filter_change = rb->filter_change(rb, entry->relid, change_type,
> +   true, &cache_valid);
>   RelationClose(relation);
>   }
>   else
>   {
>   entry->relid = InvalidOid;
> - entry->filterable = true;
> + rb->try_to_filter_change = entry->filterable = true;
>   }
>
>   rb->try_to_filter_change = entry->filterable;
> ~
>
> Something seems fishy here. AFAICT, the rb->try_to_filter_change will
> already be assigned in both the *if* and the *else* blocks. So, why is
> it being overwritten again outside that if/else?
>

Fixed.



On Wed, Apr 23, 2025 at 1:49 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Ajin.
>
> Here are some v17-0003 review comments (aka some v16-0003 comments
> that were not yet addressed or rejected)
>
> On Fri, Apr 11, 2025 at 4:05 PM Peter Smith <smithpb2250@gmail.com> wrote:
> ...
> > ======
> > Commit message
> >
> > 2. missing commit message
>
> Not yet addressed.

Fixed.

>
> > ~~~
> >
> > 8.
> > # Insert, delete, and update tests for restricted publication tables
> > $log_location = -s $node_publisher->logfile;
> > $node_publisher->safe_psql('postgres', "INSERT INTO insert_only_table
> > VALUES (1, 'to be inserted')");
> > $node_publisher->safe_psql('postgres', "UPDATE insert_only_table SET
> > data = 'updated' WHERE id = 1");
> > $logfile = slurp_file($node_publisher->logfile, $log_location);
> > ok($logfile =~ qr/Filtering UPDATE/,
> > 'unpublished UPDATE is filtered');
> >
> > $log_location = -s $node_publisher->logfile;
> > $node_publisher->safe_psql('postgres', "DELETE FROM insert_only_table
> > WHERE id = 1");
> > $logfile = slurp_file($node_publisher->logfile, $log_location);
> > ok($logfile =~ qr/Filtering DELETE/,
> > 'unpublished DELETE is filtered');
> >
> > $log_location = -s $node_publisher->logfile;
> > $node_publisher->safe_psql('postgres', "INSERT INTO delete_only_table
> > VALUES (1, 'to be deleted')");
> > $logfile = slurp_file($node_publisher->logfile, $log_location);
> > ok($logfile =~ qr/Filtering INSERT/,
> > 'unpublished INSERT is filtered');
> >
> > ~
> ...
> > 8b.
> > Change the comment or rearrange the tests so they are in the same
> > order as described
> >
>
> The comment was changed, and now says "Insert, update, and delete
> tests ..." but still, the "Filtering INSERT" test is last. IMO, that
> test should come first to match the comment.
>
> > ~

Fixed.

> >
> > 8c.
> > Looking at the expected logs I wondered if it might be nicer for the
> > pgoutput_filter_change doing this logging to also emit the relation
> > name.
> >
>

pgoutupt_filter does not  have relation name, I have added a debug
message in reorderbuffer.c and modified the test accordingly.

Updated patchset v18 addressing review comments.

regards,
Ajin Cherian
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: Some problems regarding the self-join elimination code
Next
From: Peter Eisentraut
Date:
Subject: Re: alphabetize long options in pg_dump[all] docs