Thread: postgres_fdw "parallel_commit" docs
Hi,

While researching PG15 features, I was trying to read through the docs[1] for the "parallel_commit" (04e706d4) feature in postgres_fdw to better understand what it does. I found myself becoming lost with the references to (sub)transaction and a few other items that, while accurate, may be overly specific in this context.

Attached is a patch to try to simplify the language for the description of the "parallel_commit" option. A few notes:

* I stated that this feature applies to both transactions and subtransactions.
* I tried to condense some of the language around remote/local transactions. If this makes the statement inaccurate, let's revise.
* I removed the "Be careful with this option" and instead clarified an explanation of the case that could cause performance impacts.

This feature seems like it will be impactful for distributed workloads using "postgres_fdw" so I want to ensure that we both accurately and clearly capture what it can do.

Thanks!

Jonathan

[1] https://www.postgresql.org/docs/devel/postgres-fdw.html#id-1.11.7.47.11.7

On Mon, May 09, 2022 at 11:37:35AM -0400, Jonathan S. Katz wrote:
> @@ -473,27 +473,25 @@ OPTIONS (ADD password_required 'false');
>  <term><literal>parallel_commit</literal> (<type>boolean</type>)</term>
>  <listitem>
>  <para>
> - This option controls whether <filename>postgres_fdw</filename> commits
> - remote (sub)transactions opened on a foreign server in a local
> - (sub)transaction in parallel when the local (sub)transaction commits.
> - This option can only be specified for foreign servers, not per-table.
> - The default is <literal>false</literal>.
> + This option controls whether <filename>postgres_fdw</filename> commits in
> + parallel remote transactions opened on a foreign server in a local
> + transaction when the local transaction is committed. This setting
> + applies to remote and local substransactions. This option can only be

typo: substransactions

> - If multiple foreign servers with this option enabled are involved in
> - a local (sub)transaction, multiple remote (sub)transactions opened on
> - those foreign servers in the local (sub)transaction are committed in
> - parallel across those foreign servers when the local (sub)transaction
> - commits.
> + If multiple foreign servers with this option enabled have a local
> + transaction, multiple remote transactions on those foreign servers are
> + committed in parallel across those foreign servers when the local
> + transaction is committed.
>  </para>

I think "have a transaction" doesn't sound good, and the old language "involved in" was better.

Hi Jonathan,

On Tue, May 10, 2022 at 12:37 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> While researching PG15 features, I was trying to read through the
> docs[1] for the "parallel_commit" (04e706d4) feature in postgres_fdw to
> better understand what it does. I found myself becoming lost with the
> references to (sub)transaction and a few other items that, while
> accurate, may be overly specific in this context.

I have to admit that that is making the docs confusing.

> Attached is a patch to try to simplify the language for the description
> of the "parallel_commit" option. A few notes:

Thanks for the patch!

> * I stated that this feature applies to both transactions and
> subtransactions.
> * I tried to condense some of the language around remote/local
> transactions. If this makes the statement inaccurate, let's revise.

One thing I noticed is this bit:

- When multiple remote (sub)transactions are involved in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- Performance can be improved with the following option:
+ When multiple remote transactions or subtransactions are involved in a
+ local transaction (or subtransaction) on a foreign server,
+ <filename>postgres_fdw</filename> by default commits those remote
+ transactions serially when the local transaction commits. Performance can be
+ improved with the following option:

I think this might still be a bit confusing. How about rewriting it to something like this?

    As described in F.38.4. Transaction Management, in postgres_fdw
    transactions are managed by creating corresponding remote
    transactions, and subtransactions are managed by creating
    corresponding remote subtransactions. When multiple remote
    transactions are involved in the current local transaction,
    postgres_fdw by default commits those remote transactions serially
    when the local transaction is committed. When multiple remote
    subtransactions are involved in the current local subtransaction, it
    by default commits those remote subtransactions serially when the
    local subtransaction is committed. Performance can be improved with
    the following option:

It might be a bit redundant to explain the transaction/subtransaction cases differently, but I think it makes it clear and maybe easy-to-understand that how they are handled by postgres_fdw by default.

> * I removed the "Be careful with this option" and instead clarified an
> explanation of the case that could cause performance impacts.

I like this change.

> This feature seems like it will be impactful for distributed workloads
> using "postgres_fdw" so I want to ensure that we both accurately and
> clearly capture what it can do.

Thanks!

Best regards,
Etsuro Fujita

Hi Justin,

On Tue, May 10, 2022 at 12:58 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Mon, May 09, 2022 at 11:37:35AM -0400, Jonathan S. Katz wrote:
> > - If multiple foreign servers with this option enabled are involved in
> > - a local (sub)transaction, multiple remote (sub)transactions opened on
> > - those foreign servers in the local (sub)transaction are committed in
> > - parallel across those foreign servers when the local (sub)transaction
> > - commits.
> > + If multiple foreign servers with this option enabled have a local
> > + transaction, multiple remote transactions on those foreign servers are
> > + committed in parallel across those foreign servers when the local
> > + transaction is committed.
> >  </para>
>
> I think "have a transaction" doesn't sound good, and the old language "involved
> in" was better.

I think so too. Thanks!

Best regards,
Etsuro Fujita

On Wed, May 11, 2022 at 7:25 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> One thing I noticed is this bit:
>
> - When multiple remote (sub)transactions are involved in a local
> - (sub)transaction, by default <filename>postgres_fdw</filename> commits
> - those remote (sub)transactions one by one when the local (sub)transaction
> - commits.
> - Performance can be improved with the following option:
> + When multiple remote transactions or subtransactions are involved in a
> + local transaction (or subtransaction) on a foreign server,
> + <filename>postgres_fdw</filename> by default commits those remote
> + transactions serially when the local transaction commits. Performance can be
> + improved with the following option:
>
> I think this might still be a bit confusing. How about rewriting it
> to something like this?
>
> As described in F.38.4. Transaction Management, in postgres_fdw
> transactions are managed by creating corresponding remote
> transactions, and subtransactions are managed by creating
> corresponding remote subtransactions. When multiple remote
> transactions are involved in the current local transaction,
> postgres_fdw by default commits those remote transactions serially
> when the local transaction is committed. When multiple remote
> subtransactions are involved in the current local subtransaction, it
> by default commits those remote subtransactions serially when the
> local subtransaction is committed. Performance can be improved with
> the following option:
>
> It might be a bit redundant to explain the transaction/subtransaction
> cases differently, but I think it makes it clear and maybe
> easy-to-understand that how they are handled by postgres_fdw by
> default.

I modified the patch that way.

On Wed, May 11, 2022 at 7:29 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> On Tue, May 10, 2022 at 12:58 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Mon, May 09, 2022 at 11:37:35AM -0400, Jonathan S. Katz wrote:
> > > - If multiple foreign servers with this option enabled are involved in
> > > - a local (sub)transaction, multiple remote (sub)transactions opened on
> > > - those foreign servers in the local (sub)transaction are committed in
> > > - parallel across those foreign servers when the local (sub)transaction
> > > - commits.
> > > + If multiple foreign servers with this option enabled have a local
> > > + transaction, multiple remote transactions on those foreign servers are
> > > + committed in parallel across those foreign servers when the local
> > > + transaction is committed.
> > >  </para>
> >
> > I think "have a transaction" doesn't sound good, and the old language "involved
> > in" was better.
>
> I think so too.

I modified the patch to use the old language. Also, I fixed a typo reported by Justin.

Attached is an updated patch. I'll commit the patch if no objections.

Best regards,
Etsuro Fujita

Hi Etsuro,

On 5/12/22 7:26 AM, Etsuro Fujita wrote:
> I modified the patch to use the old language. Also, I fixed a typo
> reported by Justin.
>
> Attached is an updated patch. I'll commit the patch if no objections.

Thanks for reviewing and revising! I think this is much easier to read.

I made a few minor copy edits. Please see attached.

Thanks,

Jonathan

Hi Jonathan,

On Thu, May 12, 2022 at 10:32 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> On 5/12/22 7:26 AM, Etsuro Fujita wrote:
> > Attached is an updated patch. I'll commit the patch if no objections.
>
> I think this is much easier to read.

Cool!

> I made a few minor copy edits. Please see attached.

LGTM, so I pushed the patch. Thanks for the patch and taking the time to improve this!

Best regards,
Etsuro Fujita
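
For readers following the thread, here is a minimal sketch of how the option under discussion might be enabled. The server names (shard1, shard2), host names, and database name are hypothetical and only for illustration; the only parts taken from the thread are that "parallel_commit" is a boolean postgres_fdw option, that it is set on foreign servers rather than per table, and that it defaults to false.

```sql
-- Hypothetical foreign servers; names and connection options are illustrative only.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Enable the option when the foreign server is created.
CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard1.example.com', dbname 'app', parallel_commit 'true');

-- A server created without the option keeps the default (false).
CREATE SERVER shard2 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard2.example.com', dbname 'app');

-- The option can also be added to an existing server (server level only, not per table).
ALTER SERVER shard2 OPTIONS (ADD parallel_commit 'true');
```

With both servers enabled, a local transaction that opened remote transactions on shard1 and shard2 would, per the documentation text quoted in the thread, have those remote transactions committed in parallel when the local transaction commits. With the option left at its default, they would be committed one by one.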