Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Perform streaming logical transactions by background workers and parallel apply |
Date | |
Msg-id | CAHut+Pu8LCtzdd_PMbUoueZ9W8D7-Hd6nTH30kFEGzhfTgtFEg@mail.gmail.com Whole thread Raw |
In response to | RE: Perform streaming logical transactions by background workers and parallel apply ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>) |
Responses |
RE: Perform streaming logical transactions by background workers and parallel apply
|
List | pgsql-hackers |
Here are my review comments for patch v87-0002. ====== doc/src/sgml/config.sgml 1. <para> - Allows streaming or serializing changes immediately in logical decoding. The allowed values of <varname>logical_replication_mode</varname> are - <literal>buffered</literal> and <literal>immediate</literal>. When set - to <literal>immediate</literal>, stream each change if + <literal>buffered</literal> and <literal>immediate</literal>. The default + is <literal>buffered</literal>. + </para> I didn't think it was necessary to say “of logical_replication_mode”. IMO that much is already obvious because this is the first sentence of the description for logical_replication_mode. (see also review comment #4) ~~~ 2. + <para> + On the publisher side, it allows streaming or serializing changes + immediately in logical decoding. When set to + <literal>immediate</literal>, stream each change if <literal>streaming</literal> option (see optional parameters set by <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link>) is enabled, otherwise, serialize each change. When set to - <literal>buffered</literal>, which is the default, decoding will stream - or serialize changes when <varname>logical_decoding_work_mem</varname> - is reached. + <literal>buffered</literal>, decoding will stream or serialize changes + when <varname>logical_decoding_work_mem</varname> is reached. </para> 2a. "it allows" --> "logical_replication_mode allows" 2b. "decoding" --> "the decoding" ~~~ 3. + <para> + On the subscriber side, if <literal>streaming</literal> option is set + to <literal>parallel</literal>, this parameter also allows the leader + apply worker to send changes to the shared memory queue or to serialize + changes. When set to <literal>buffered</literal>, the leader sends + changes to parallel apply workers via shared memory queue. When set to + <literal>immediate</literal>, the leader serializes all changes to + files and notifies the parallel apply workers to read and apply them at + the end of the transaction. + </para> "this parameter also allows" --> "logical_replication_mode also allows" ~~~ 4. <para> This parameter is intended to be used to test logical decoding and replication of large transactions for which otherwise we need to generate the changes till <varname>logical_decoding_work_mem</varname> - is reached. + is reached. Moreover, this can also be used to test the transmission of + changes between the leader and parallel apply workers. </para> "Moreover, this can also" --> "It can also" I am wondering would this sentence be better put at the top of the GUC description. So then the first paragraph becomes like this: SUGGESTION (I've also added another sentence "The effect of...") The allowed values are buffered and immediate. The default is buffered. This parameter is intended to be used to test logical decoding and replication of large transactions for which otherwise we need to generate the changes till logical_decoding_work_mem is reached. It can also be used to test the transmission of changes between the leader and parallel apply workers. The effect of logical_replication_mode is different for the publisher and subscriber: On the publisher side... On the subscriber side... ====== .../replication/logical/applyparallelworker.c 5. + /* + * In immeidate mode, directly return false so that we can switch to + * PARTIAL_SERIALIZE mode and serialize remaining changes to files. + */ + if (logical_replication_mode == LOGICAL_REP_MODE_IMMEDIATE) + return false; Typo "immediate" Also, I felt "directly" is not needed. "return false" and "directly return false" is the same. SUGGESTION Using ‘immediate’ mode returns false to cause a switch to PARTIAL_SERIALIZE mode so that the remaining changes will be serialized. ====== src/backend/utils/misc/guc_tables.c 6. { {"logical_replication_mode", PGC_USERSET, DEVELOPER_OPTIONS, - gettext_noop("Allows streaming or serializing each change in logical decoding."), - NULL, + gettext_noop("Controls the behavior of logical replication publisher and subscriber"), + gettext_noop("If set to immediate, on the publisher side, it " + "allows streaming or serializing each change in " + "logical decoding. On the subscriber side, in " + "parallel streaming mode, it allows the leader apply " + "worker to serialize changes to files and notifies " + "the parallel apply workers to read and apply them at " + "the end of the transaction."), GUC_NOT_IN_SAMPLE }, 6a. short description User PoV behaviour should be the same. Instead, maybe say "controls the internal behavior" or something like that? ~ 6b. long description IMO the long description shouldn’t mention ‘immediate’ mode first as it does. BEFORE If set to immediate, on the publisher side, ... AFTER On the publisher side, ... ------ Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: