Dear Amit, Andres,
Thank you for giving the decision! Basically I will follow your idea and make
a patch accordingly.
> Today, I discussed this problem with Andres at PGConf NYC and he
> suggested as following. To verify, if there is any pending unexpected
> WAL after shutdown, we can have an API like
> pg_logical_replication_slot_advance() which will simply process
> records without actually sending anything downstream. In this new API,
> we will start with each slot's restart_lsn location and try to process
> till the end of WAL, if we encounter any WAL that needs to be
> processed (like we need to send the decoded WAL downstream) we can
> return a false indicating that there is an unexpected WAL. The reason
> to start with restart_lsn is that it is the location that we use to
> start scanning the WAL anyway.
I felt the approach seems similar to Hou-san's suggestion[1], but we can avoid to
use test_decoding. I'm planning to do that the upgrading function decodes WALs
and check whether there are reorderbuffer changes.
> Then, we should also try to create slots before invoking pg_resetwal.
> The idea is that we can write a new binary mode function that will do
> exactly what pg_resetwal does to compute the next segment and use that
> location as a new location (restart_lsn) to create the slots in a new
> node. Then, pass it pg_resetwal by using the existing option '-l
> walfile'. As we don't have any API that takes restart_lsn as input, we
> can write a new API probably for binary mode to create slots that do
> take restart_lsn as input. This will ensure that there is no new WAL
> inserted by background processes between resetwal and the creation of
> slots.
It seems better because we can create every objects before pg_resetwal.
I will handle above two points and let's see how it work.
[1]:
https://www.postgresql.org/message-id/OS0PR01MB5716506A1A1B20EFBFA7B52994C1A%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED