Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: logical decoding and replication of sequences, take 2
Date
Msg-id CAExHW5tR+rmrarU1m0DCmhq0e7d00XDbdhcHJV+iGkcPTfo8hg@mail.gmail.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences, take 2  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: logical decoding and replication of sequences, take 2
Re: logical decoding and replication of sequences, take 2
List pgsql-hackers
On Thu, Dec 14, 2023 at 2:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 14, 2023 at 2:45 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Thu, Dec 14, 2023 at 12:37 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > I think you forgot to attach the patch.
> >
> > Sorry. Here it is.
> >
> > On Thu, Dec 14, 2023 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > >
> > > It looks like the solution works. But this is the only place where we
> > > process a change before SNAPSHOT reaches FULL. But this is also the
> > > only record which affects a decision to queue/not a following change.
> > > So it should be ok. The sequence_hash'es as separate for each
> > > transaction and they are cleaned when processing COMMIT record.
> > > >
> > >
> > > But it is possible that even commit or abort also happens before the
> > > snapshot reaches full state in which case the hash table will have
> > > stale or invalid (for aborts) entries. That will probably be cleaned
> > > at a later point by running_xact records.
> >
> > Why would cleaning wait till running_xact records? Won't txn entry
> > itself be removed when processing commit/abort record? At the same the
> > sequence hash will be cleaned as well.
> >
> > > Now, I think in theory, it
> > > is possible that the same RelFileLocator can again be allocated before
> > > we clean up the existing entry which can probably confuse the system.
> >
> > How? The transaction allocating the first time would be cleaned before
> > it happens the second time. So shouldn't matter.
> >
>
> It can only be cleaned if we process it but xact_decode won't allow us
> to process it and I don't think it would be a good idea to add another
> hack for sequences here. See below code:
>
> xact_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
> {
> SnapBuild  *builder = ctx->snapshot_builder;
> ReorderBuffer *reorder = ctx->reorder;
> XLogReaderState *r = buf->record;
> uint8 info = XLogRecGetInfo(r) & XLOG_XACT_OPMASK;
>
> /*
> * If the snapshot isn't yet fully built, we cannot decode anything, so
> * bail out.
> */
> if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
> return;

That may be true for a transaction which is decoded, but I think all
the transactions which are added to ReorderBuffer should be cleaned up
once they have been processed irrespective of whether they are
decoded/sent downstream or not. In this case I see the sequence hash
being cleaned up for the sequence related transaction in Hayato's
reproducer. See attached patch with a diagnostic change and the output
below (notice sequence cleanup called on transaction 767).
2023-12-14 21:06:36.756 IST [386957] LOG:  logical decoding found
initial starting point at 0/15B2F68
2023-12-14 21:06:36.756 IST [386957] DETAIL:  Waiting for transactions
(approximately 1) older than 767 to end.
2023-12-14 21:06:36.756 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');
2023-12-14 21:07:05.679 IST [386957] LOG:  XXX: smgr_decode. snapshot
is SNAPBUILD_BUILDING_SNAPSHOT
2023-12-14 21:07:05.679 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');
2023-12-14 21:07:05.679 IST [386957] LOG:  XXX: seq_decode. snapshot
is SNAPBUILD_BUILDING_SNAPSHOT
2023-12-14 21:07:05.679 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');
2023-12-14 21:07:05.679 IST [386957] LOG:  XXX: skipped
2023-12-14 21:07:05.679 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');
2023-12-14 21:07:05.710 IST [386957] LOG:  logical decoding found
initial consistent point at 0/15B3388
2023-12-14 21:07:05.710 IST [386957] DETAIL:  Waiting for transactions
(approximately 1) older than 768 to end.
2023-12-14 21:07:05.710 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');
2023-12-14 21:07:39.292 IST [386298] LOG:  checkpoint starting: time
2023-12-14 21:07:40.919 IST [386957] LOG:  XXX: seq_decode. snapshot
is SNAPBUILD_FULL_SNAPSHOT
2023-12-14 21:07:40.919 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');
2023-12-14 21:07:40.919 IST [386957] LOG:  XXX: the sequence is transactional
2023-12-14 21:07:40.919 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');
2023-12-14 21:07:40.919 IST [386957] LOG:  sequence cleanup called on
transaction 767
2023-12-14 21:07:40.919 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');
2023-12-14 21:07:40.919 IST [386957] LOG:  logical decoding found
consistent point at 0/15B3518
2023-12-14 21:07:40.919 IST [386957] DETAIL:  There are no running transactions.
2023-12-14 21:07:40.919 IST [386957] STATEMENT:  SELECT
pg_create_logical_replication_slot('slot', 'test_decoding');

We see similar output when pg_logical_slot_get_changes() is called.

I haven't found the code path from where the sequence cleanup gets
called. But it's being called. Am I missing something?

--
Best Wishes,
Ashutosh Bapat

Attachment

pgsql-hackers by date:

Previous
From: "Tristan Partin"
Date:
Subject: Re: Clean up find_typedefs and add support for Mac
Next
From: Jelte Fennema-Nio
Date:
Subject: Re: Add new for_each macros for iterating over a List that do not require ListCell pointer