Re: PGDOCS - Logical replication GUCs - added some xrefs - Mailing list pgsql-hackers

From samay sharma
Subject Re: PGDOCS - Logical replication GUCs - added some xrefs
Msg-id CAJxrbyxnZxk7+_v8MsphNBK=v-yb3Mo3BeH_TXQKo5H=s4oHwQ@mail.gmail.com
In response to PGDOCS - Logical replication GUCs - added some xrefs  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: PGDOCS - Logical replication GUCs - added some xrefs  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
Hi,

On Mon, Oct 24, 2022 at 12:45 AM Peter Smith <smithpb2250@gmail.com> wrote:
> Hi hackers.
>
> There is a docs Logical Replication section "31.10 Configuration
> Settings" [1] which describes some logical replication GUCs, and
> details on how they interact with each other and how to take that into
> account when setting their values.
>
> There is another docs Server Configuration section "20.6 Replication"
> [2] which lists the replication-related GUC parameters, and what they
> are for.
>
> Currently AFAIK those two pages are unconnected, but I felt it might
> be helpful if some of the parameters in the list [2] had xref links to
> the additional logical replication configuration information [1]. PSA
> a patch to do that.

+1 on the patch. Some feedback on v5 below.

> +    <para>
> +     For <firstterm>logical replication</firstterm> configuration settings refer
> +     also to <xref linkend="logical-replication-config"/>.
> +    </para>
> +

I feel the top paragraph needs to explain the terminology for logical replication, as it does for physical replication, in addition to linking to the logical replication config page. I'm recommending this because we use terms like subscriber in the parameter descriptions without introducing them first.

As an example, something like below might work.

These settings control the behavior of the built-in streaming replication feature (see Section 27.2.5) and logical replication (link).

For physical replication, servers will be either a primary or a standby server. Primaries can send data, while standbys are always receivers of replicated data. When cascading replication (see Section 27.2.7) is used, standby servers can also be senders, as well as receivers. Parameters are mainly for sending and standby servers, though some parameters have meaning only on the primary server. Settings may vary across the cluster without problems if that is required.

For logical replication, servers will either be publishers (also called senders in the sections below) or subscribers. Publishers are ...., Subscribers are...

> +       <para>
> +         See <xref linkend="logical-replication-config"/> for more details
> +         about setting <varname>max_replication_slots</varname> for logical
> +         replication.
> +        </para>


The link doesn't add any new information about max_replication_slots other than "to reserve some for table sync", and the linked page has a good amount of unrelated info. I think it would be more useful to add a line here asking users to reserve some slots for table sync instead of linking to the entire logical replication config section.

 
> -   Logical replication requires several configuration options to be set.
> +   Logical replication requires several configuration parameters to be set.

May not be needed? The docs have references to both options and parameters but I don't feel strongly about it. Feel free to use what you prefer.

I think we should add a line to the intro here saying that most parameters are relevant on only one of the subscriber or the publisher. Maybe a better-written version of "While max_replication_slots means different things on the publisher and subscriber, all other parameters are relevant only on either the publisher or the subscriber."

> +  <sect2 id="logical-replication-config-notes">
> +   <title>Notes</title>

I don't think we need this sub-section. If I understand correctly, these parameters are effective only on the subscriber side. So, any reason to not include them in that section?

> +
> +   <para>
> +    Logical replication workers are also affected by
> +    <link linkend="guc-wal-receiver-timeout"><varname>wal_receiver_timeout</varname></link>,
> +    <link linkend="guc-wal-receiver-status-interval"><varname>wal_receiver_status_interval</varname></link> and
> +    <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_receiver_retry_interval</varname></link>.
> +   </para>
> +

I like moving this; it makes more sense here. Should we remove it from config.sgml? It seems a bit out of place there, as we generally talk only about individual parameters there, and this line is general logical replication subscriber advice, which is better suited to logical-replication.sgml.

> +   <para>
> +    Configuration parameter
> +    <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
> +    may need to be adjusted to accommodate for replication workers, at least (
> +    <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
> +    + <literal>1</literal>). Some extensions and parallel queries also take
> +    worker slots from <varname>max_worker_processes</varname>.
> +   </para>
> +
> +  </sect2>

I think we should move this to the subscriber section as said above. It's useful to know this and people might skip over the notes.
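
To make the arithmetic in that paragraph concrete, here is a minimal Python sketch of the lower bound it describes. The function name and the `other_background_workers` parameter are hypothetical, introduced only for illustration; the extra term stands in for the worker slots that extensions and parallel queries may take, which the patch text mentions but does not quantify.

```python
def min_max_worker_processes(max_logical_replication_workers: int,
                             other_background_workers: int = 0) -> int:
    """Return a lower bound for max_worker_processes on a subscriber.

    Follows the rule quoted in the patch: at least
    max_logical_replication_workers + 1. The second argument is an
    assumption of this sketch: an estimate of slots taken by extensions
    and parallel queries, which share the same pool.
    """
    if max_logical_replication_workers < 0 or other_background_workers < 0:
        raise ValueError("worker counts must be non-negative")
    return max_logical_replication_workers + 1 + other_background_workers


if __name__ == "__main__":
    # With 4 logical replication workers and nothing else, the bound is 5.
    print(min_max_worker_processes(4))
    # Estimating 3 more slots for extensions and parallel queries gives 8.
    print(min_max_worker_processes(4, 3))
```

This is only a floor, not a tuning recommendation; the real requirement depends on what else runs on the server.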


> ~~
>
> Meanwhile, I also suspect that the main blurb top of [1] is not
> entirely correct... it says "These settings control the behaviour of
> the built-in streaming replication feature", although some of the GUCs
> mentioned later in this section are clearly for "logical replication".
>
> Thoughts?

I shared an idea above.

Regards,
Samay
 

> ------
> [1] 31.10 Configuration Settings -
> https://www.postgresql.org/docs/current/logical-replication-config.html
> [2] 20.6 Replication -
> https://www.postgresql.org/docs/current/runtime-config-replication.html
>
> Kind Regards,
> Peter Smith.
> Fujitsu Australia
