RE: Logical Replication of sequences - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: Logical Replication of sequences
Date
Msg-id OSCPR01MB14966DA8CB749A0D4E9F3F7A7F50E2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
Dear Vignesh,

Thanks for updating the patch! Here are my comments.

01. SyncingRelationsState

```
 * SYNC_RELATIONS_STATE_NEEDS_REBUILD The subscription relations state is no
 * longer valid and the subscription relations should be rebuilt.
```

Can we follow the style like other lines? Like:

SYNC_RELATIONS_STATE_NEEDS_REBUILD indicates that the subscription relations
state is no longer valid and the subscription relations should be rebuilt.

02. FetchRelationStates()

```
        /* Fetch tables and sequences that are in non-ready state. */
        rstates = GetSubscriptionRelations(MySubscription->oid, true, true,
                                           false);
```

I think rstates should be list_free_deep()'d after the foreach().

03. LogicalRepSyncSequences

```
    char        slotname[NAMEDATALEN];
...
    snprintf(slotname, NAMEDATALEN, "pg_%u_sync_sequences_" UINT64_FORMAT,
             subid, GetSystemIdentifier());

    /*
     * Here we use the slot name instead of the subscription name as the
     * application_name, so that it is different from the leader apply worker,
     * so that synchronous replication can distinguish them.
     */
    LogRepWorkerWalRcvConn =
        walrcv_connect(MySubscription->conninfo, true, true,
                       must_use_password,
                       slotname, &err);
```

Hmm, IIUC the sequence sync worker does not require any replication slots.
I feel the variable name should be "application_name" and the comment can be updated.

04. LogicalRepSyncSequences

```
    /* Get the sequences that should be synchronized. */
    sequences = GetSubscriptionRelations(subid, false, true, false);
```

I think rstates should be list_free_deep()'d after the foreach_ptr().

05. LogicalRepSyncSequences

```
        /*
         * COPY FROM does not honor RLS policies.  That is not a problem for
         * subscriptions owned by roles with BYPASSRLS privilege (or
         * superuser, who has it implicitly), but other roles should not be
         * able to circumvent RLS.  Disallow logical replication into RLS
         * enabled relations for such roles.
         */
        if (check_enable_rls(RelationGetRelid(sequence_rel), InvalidOid, false) == RLS_ENABLED)
            ereport(ERROR,
                    errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                    errmsg("user \"%s\" cannot replicate into sequence with row-level security enabled: \"%s\"",
                           GetUserNameFromId(GetUserId(), true),
                           RelationGetRelationName(sequence_rel)));
```

Can we really set a row level security policy to sequences? I've tested but
I couldn't.

```
postgres=# CREATE SEQUENCE s;
CREATE SEQUENCE
postgres=# ALTER TABLE s ENABLE ROW LEVEL SECURITY ;
ERROR:  ALTER action ENABLE ROW SECURITY cannot be performed on relation "s"
DETAIL:  This operation is not supported for sequences.
```

06. copy_sequence

```
    appendStringInfo(&cmd, "SELECT c.oid, c.relkind\n"
                     "FROM pg_catalog.pg_class c\n"
                     "INNER JOIN pg_catalog.pg_namespace n\n"
                     "  ON (c.relnamespace = n.oid)\n"
                     "WHERE n.nspname = %s AND c.relname = %s",
                     quote_literal_cstr(nspname),
                     quote_literal_cstr(relname));
```

I feel the function is not so efficient because it can obtain only a tuple, i.e.,
information for one sequence at once. I think you basically copied from fetch_remote_table_info(),
but it was OK for it because the tablesync worker handles only a table.

Can you obtain all sequences at once and check whether each sequences match or not?

07. LogicalRepSyncSequences

```
            /* LOG all the sequences synchronized during current batch. */
            for (int i = 0; i < curr_batch_seq; i++)
            {
                SubscriptionRelState *done_seq;

                done_seq = (SubscriptionRelState *) lfirst(list_nth_cell(sequences_not_synced,
                                                                         (curr_seq - curr_batch_seq) + i));

                ereport(DEBUG1,
                        errmsg_internal("logical replication synchronization for subscription \"%s\", sequence \"%s\"
hasfinished",
 
                                        get_subscription_name(subid, false), get_rel_name(done_seq->relid)));
            }
```

The loop is needed only when the debug messages should be output the system.
Can we use message_level_is_interesting() to skip the loop for some cases?

08. pg_dump.c

```
/*
 * getSubscriptionTables
 *      Get information about subscription membership for dumpable tables. This
 *    will be used only in binary-upgrade mode for PG17 or later versions.
 */
void
getSubscriptionTables(Archive *fout)
```

I was bit confused of the pg_dump codes. I doubt that the pg_upgrade might not
be able to transfer pg_subscripion_rel entries of sequences, but it seemed to work
well because sequences are handled mostly same as normal tables on the pg_dump layer.
But based on other codes, the function should be "getSubscriptionRelations" and
comments should be also updated.

09. logical-replication.sgml

I feel we can add descriptions in "Publication" section. E.g.: 

Publications can also include sequences, but the behavior differs from a table
or a group of tables: users can synchronize current states of sequences at an
arbitrary timing. For more details, please see "Replicating Sequences".

10. pg_proc.dat

```
+{ oid => '6313',
+  descr => 'current on-disk sequence state',
+  proname => 'pg_sequence_state', provolatile => 'v',
+  prorettype => 'record', proargtypes => 'regclass',
+  proallargtypes => '{regclass,pg_lsn,int8,int8,bool}',
+  proargmodes => '{i,o,o,o,o}',
+  proargnames => '{seq_oid,page_lsn,last_value,log_cnt,is_called}',
+  prosrc => 'pg_sequence_state' },
```

I think this is not wrong, but according to the src/include/catalog/unused_oids,
the oid should be at 8000-9999 while developing:

```
$ ./unused_oids
...
Patches should use a more-or-less consecutive range of OIDs.
Best practice is to start with a random choice in the range 8000-9999.
Suggested random unused OID: 9293 (415 consecutive OID(s) available starting here)
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED


pgsql-hackers by date:

Previous
From: RECHTÉ Marc
Date:
Subject: Re: Logical replication timeout
Next
From: Andrey Borodin
Date:
Subject: Re: Fix bank selection logic in SLRU