RE: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: [PoC] pg_upgrade: allow to upgrade publisher node
Msg-id TYCPR01MB587049C4F11BF7EB4083D895F53FA@TYCPR01MB5870.jpnprd01.prod.outlook.com
In response to RE: [PoC] pg_upgrade: allow to upgrade publisher node  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
Dear hackers,

> Based on the above, we are considering delaying the shutdown of logical
> walsenders. The preliminary workflow is:
>
> 1. When a logical walsender receives the signal from the checkpointer, it
>    consumes all WAL records, changes its state to WALSNDSTATE_STOPPING, and
>    stops doing anything.
> 2. Then the checkpointer performs the shutdown checkpoint.
> 3. After that, the postmaster sends a signal to the walsenders, same as in the
>    current implementation.
> 4. Finally, the logical walsenders process the shutdown checkpoint record and
>    update the confirmed_lsn after the acknowledgement from the subscriber.
>    Note that logical walsenders don't have to send a shutdown checkpoint record
>    to the subscriber, but the following keep_alive will help us to advance the
>    confirmed_lsn.
> 5. Once all tasks are done, they exit.
>
> This mechanism ensures that the confirmed_lsn of active slots is the same as
> the current WAL location of the old publisher, so that the 0003 patch would
> become simpler. We would not have to calculate the acceptable difference
> anymore.
>
> One thing we must consider is that no WAL must be generated while decoding
> the shutdown checkpoint record, because that causes a PANIC. IIUC the record
> leads to SnapBuildSerializationPoint(), which just serializes the snapbuild or
> restores from it, so the change may be acceptable. Thoughts?

I've implemented the ideas from my previous proposal; PSA another patch set.
Patch 0001 introduces the state WALSNDSTATE_STOPPING to logical walsenders. The
workflow remains largely the same as described in my previous post, with the
following additions:

* A flag has been added to track whether all WAL has been flushed. The
  logical walsender can exit only after the flag is set, which ensures that all
  WAL is flushed before the walsender terminates.
* Cumulative statistics are now forcibly written before changing the state.
  The previous approach reported stats upon process exit, but the current one
  must report earlier because of the checkpointer's termination timing. See the
  comments in CheckpointerMain() and atop pgstat_before_server_shutdown().
* Slots are now saved to disk when the process ends.
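
To make the intended ordering easier to review, here is a minimal standalone
sketch of the shutdown sequence in C. It is not code from the patch: every
name (DemoWalSender, the demo_* functions, the DemoLsn type) is hypothetical,
only DEMO_STOPPING is meant to mirror WALSNDSTATE_STOPPING, and treating
"flushed" as "confirmed_lsn has reached the end of WAL" is my simplifying
assumption for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a WAL position; PostgreSQL itself uses XLogRecPtr. */
typedef uint64_t DemoLsn;

/* Hypothetical states; only DEMO_STOPPING mirrors WALSNDSTATE_STOPPING. */
typedef enum { DEMO_STREAMING, DEMO_STOPPING, DEMO_EXITED } DemoState;

typedef struct DemoWalSender
{
    DemoState state;
    DemoLsn   sent_lsn;        /* last WAL position sent to the subscriber */
    DemoLsn   confirmed_lsn;   /* last position acknowledged by the subscriber */
    bool      all_flushed;     /* the new flag: may we exit yet? */
    bool      stats_reported;  /* cumulative stats already written? */
    bool      slot_saved;      /* slot state saved to disk? */
} DemoWalSender;

/* Signal from the checkpointer: drain WAL, report stats, then stop. */
static void
demo_on_checkpointer_signal(DemoWalSender *ws, DemoLsn wal_end)
{
    ws->sent_lsn = wal_end;      /* consume all WAL generated so far */
    ws->stats_reported = true;   /* stats are written before the state change */
    ws->state = DEMO_STOPPING;
}

/* Signal from the postmaster, after the shutdown checkpoint was written. */
static void
demo_on_postmaster_signal(DemoWalSender *ws, DemoLsn wal_end)
{
    if (ws->state == DEMO_STOPPING)
        ws->sent_lsn = wal_end;  /* process the shutdown checkpoint record */
}

/* Acknowledgement (for example a keepalive reply) from the subscriber. */
static void
demo_on_subscriber_ack(DemoWalSender *ws, DemoLsn acked)
{
    if (acked > ws->confirmed_lsn && acked <= ws->sent_lsn)
        ws->confirmed_lsn = acked;
}

/* Exit only once the flag is set; save the slot just before exiting. */
static bool
demo_try_exit(DemoWalSender *ws, DemoLsn wal_end)
{
    if (ws->state != DEMO_STOPPING)
        return false;
    ws->all_flushed = (ws->confirmed_lsn == wal_end);
    if (!ws->all_flushed)
        return false;
    ws->slot_saved = true;
    ws->state = DEMO_EXITED;
    return true;
}
```

In this model, calling demo_try_exit() before the subscriber has acknowledged
the end of WAL returns false and leaves the walsender in the stopping state,
which is the property the flag is supposed to give us.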


Patch 0002 adds the --include-logical-replication-slots option to pg_upgrade;
it is unchanged from the previous set.

Patch 0003 adds a check function, which has become simpler.
The previous version calculated the "acceptable" difference between confirmed_lsn
and the current WAL position. This was necessary because shutdown records could
not be sent to subscribers, creating a disparity in these values. However, this
approach had drawbacks, such as needing adjustments if record sizes changed.

Now the record can be sent to subscribers, so that hack is no longer needed,
at least in the context of logical replication. Consistency is now maintained
by the logical walsenders, so it cannot be guaranteed for slots created by a
backend. We must consider what should be...
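
For comparison, the two styles of check could be sketched roughly as below.
This is only an illustration in standalone C, not the function in 0003: the
DemoSlot type, the demo_* function names, and the tolerance parameter are all
hypothetical, and LSNs are modeled as plain 64-bit integers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a WAL position; PostgreSQL itself uses XLogRecPtr. */
typedef uint64_t DemoLsn;

/* Hypothetical, simplified view of a logical replication slot. */
typedef struct DemoSlot
{
    const char *name;
    DemoLsn     confirmed_lsn;
} DemoSlot;

/*
 * Previous style: accept a slot if it lags the current WAL position by at
 * most "tolerance" bytes. The tolerance has to track record sizes, which is
 * the drawback mentioned above.
 */
static bool
demo_slot_ok_with_tolerance(const DemoSlot *slot, DemoLsn current,
                            DemoLsn tolerance)
{
    return slot->confirmed_lsn <= current &&
           current - slot->confirmed_lsn <= tolerance;
}

/* New style: the slot must have confirmed exactly the current position. */
static bool
demo_slot_ok_exact(const DemoSlot *slot, DemoLsn current)
{
    return slot->confirmed_lsn == current;
}

/* Return the index of the first slot failing the exact check, or -1. */
static int
demo_first_stale_slot(const DemoSlot *slots, size_t nslots, DemoLsn current)
{
    for (size_t i = 0; i < nslots; i++)
    {
        if (!demo_slot_ok_exact(&slots[i], current))
            return (int) i;
    }
    return -1;
}
```

A slot created by a backend and never consumed by a walsender would fail the
exact check in this model, which is the open question raised above.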

What do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

