RE: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: POC: enable logical decoding when wal_level = 'replica' without a server restart
Msg-id OSCPR01MB14966F1A97E4D4288187BD613F538A@OSCPR01MB14966.jpnprd01.prod.outlook.com
In response to Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
List pgsql-hackers
Dear Sawada-san,

Thanks for updating the patch. Here are my comments.

xlog_desc()
```
    else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
    {
        bool        enabled;

        memcpy(&enabled, rec, sizeof(bool));
        appendStringInfo(buf, enabled ? "true" : "false");
    }
```

Per commit 2075ba9, appendStringInfoString() should be used here, because the
appended string has no format arguments.
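In the patch itself the change would presumably be just
`appendStringInfoString(buf, enabled ? "true" : "false");`. To illustrate the
idea outside the server tree, here is a minimal standalone sketch. SketchBuf,
sketch_append_string() and sketch_desc_status_change() are hypothetical
stand-ins written only for this example, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's StringInfo, only for this sketch. */
typedef struct SketchBuf
{
    char        data[64];
    size_t      len;
} SketchBuf;

/* Append a constant string verbatim; no format string is parsed here. */
static void
sketch_append_string(SketchBuf *buf, const char *s)
{
    size_t      n = strlen(s);

    assert(buf->len + n < sizeof(buf->data));
    memcpy(buf->data + buf->len, s, n + 1); /* also copies the trailing NUL */
    buf->len += n;
}

/* Mirrors the quoted xlog_desc() branch: decode the bool, then append it. */
static void
sketch_desc_status_change(SketchBuf *buf, const char *rec)
{
    bool        enabled;

    memcpy(&enabled, rec, sizeof(bool));
    sketch_append_string(buf, enabled ? "true" : "false");
}
```

The point is only that a plain append is enough when the text is a constant.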

logicalctl.h
```
extern void UpdateNumberOfLogicalSlots(bool incr);
```

This function is not implemented.

UpdateLogicalDecodingStatus()
```
    elog(DEBUG1, "update logical decoding status to %d", new_status);
```

I would prefer to print true/false instead of 1/0. Thoughts?
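In the patch this would likely mean switching the format specifier to %s and
passing `new_status ? "true" : "false"`. A minimal standalone sketch of the
same formatting, using snprintf() instead of elog() so that it compiles outside
the server; sketch_format_status() is a hypothetical helper, not part of the
patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the debug message with a readable boolean.
 * Returns the number of characters snprintf() would have written.
 */
static int
sketch_format_status(char *out, size_t outlen, bool new_status)
{
    assert(out != NULL);

    return snprintf(out, outlen, "update logical decoding status to %s",
                    new_status ? "true" : "false");
}
```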

xlog_redo()
```
        /* Update the status on shared memory */
        memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
        UpdateLogicalDecodingStatus(logical_decoding, true);

        if (InRecovery && InHotStandby)
        {
            if (!logical_decoding)
            {
                /*
                 * Invalidate logical slots if we are in hot standby and the
                 * primary disabled the logical decoding.
                 */
                InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
                                                   0, InvalidOid,
                                                   InvalidTransactionId);

```

Assume that the logical_decoding value written in the WAL record is false here,
and that a logical replication slot is created just after that. In my
experiments, the following happened:

1. The startup process updated logical_decoding_enabled to false, at line 8652.
2. The slotsync worker started to sync. Surprisingly, it created a (second)
   logical slot and started logical decoding in fast_forward mode.
3. The startup process invalidated logical slots due to the wal_level. The slot
   created at step 2 was automatically dropped, because it was not sync-ready yet.
4. The startup process shut down the slotsync worker.
5. The startup process read the STATUS_CHANGE record again, which had the value
   "true", and requested a restart of the sync worker.
6. The restarted sync worker synchronized the slot again...

For me it works well, but it is a bit strange because 1) logical decoding is
started even when effective_wal_level is false, and 2) the synced slot is
dropped once with the message below:

```
LOG:  terminating process 1474448 to release replication slot "test2"
DETAIL:  Logical decoding on standby requires "wal_level" >= "logical" or at least one logical slot on the primary server.
CONTEXT:  WAL redo at 0/030000B8 for XLOG/LOGICAL_DECODING_STATUS_CHANGE: false
ERROR:  canceling statement due to conflict with recovery
DETAIL:  User was using a logical replication slot that must be invalidated.
```

Can we stop the sync worker before updating the status? IIUC, this is one
possible solution.
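To make the ordering question concrete, here is a toy model, not server code.
SketchState and all sketch_* functions are hypothetical and heavily simplified:
the model only counts how often a still-running sync worker could act while the
status is already false, under the two orderings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the state involved; not the real structs. */
typedef struct SketchState
{
    bool        decoding_enabled;
    bool        sync_worker_running;
    int         slots_created_while_disabled;
} SketchState;

/* A running worker may create a slot at any time; count the unwanted cases. */
static void
sketch_worker_tick(SketchState *s)
{
    assert(s != NULL);
    if (s->sync_worker_running && !s->decoding_enabled)
        s->slots_created_while_disabled++;
}

/* Observed order: the status is updated first, the worker is stopped later. */
static void
sketch_redo_disable_current(SketchState *s)
{
    s->decoding_enabled = false;
    sketch_worker_tick(s);      /* window in which the worker still runs */
    s->sync_worker_running = false;
}

/* Suggested order: stop the worker first, then update the status. */
static void
sketch_redo_disable_proposed(SketchState *s)
{
    s->sync_worker_running = false;
    sketch_worker_tick(s);
    s->decoding_enabled = false;
    sketch_worker_tick(s);
}
```

In this model the first ordering leaves a window for the worker and the second
does not. Whether the real code has other windows is a separate question.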

Best regards,
Hayato Kuroda
FUJITSU LIMITED

