RE: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From wangw.fnst@fujitsu.com
Subject RE: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id OS3PR01MB6275DBDC55E9A09572C97FAC9E499@OS3PR01MB6275.jpnprd01.prod.outlook.com
Whole thread Raw
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
On Tues, Sep 13, 2022 at 20:02 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
> Dear Hou-san,
> 
> > I will dig your patch more, but I send partially to keep the activity of the thread.
> 
> More minor comments about v28.

Thanks for your comments.

> ===
> About 0002
> 
> For 015_stream.pl
> 
> 14. check_parallel_log
> 
> ```
> +# Check the log that the streamed transaction was completed successfully
> +# reported by parallel apply worker.
> +sub check_parallel_log
> +{
> +       my ($node_subscriber, $offset, $is_parallel)= @_;
> +       my $parallel_message = 'finished processing the transaction finish
> command';
> +
> +       if ($is_parallel)
> +       {
> +               $node_subscriber->wait_for_log(qr/$parallel_message/, $offset);
> +       }
> +}
> ```
> 
> I think check_parallel_log() should be called only when streaming = 'parallel' and
> if-statement is not needed

I wanted to make the function test_streaming look simpler, so I put the
checking of the streaming option inside the function check_parallel_log.

> For 016_stream_subxact.pl
> 
> 15. test_streaming
> 
> ```
> +       INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(    3,
> 500) s(i);
> ```
> 
> "    3" should be "3".

Improved.

> About 0003
> 
> For applyparallelworker.c
> 
> 16. parallel_apply_relation_check()
> 
> ```
> +       if (rel->parallel_apply_safe == PARALLEL_APPLY_SAFETY_UNKNOWN)
> +               logicalrep_rel_mark_parallel_apply(rel);
> ```
> 
> I was not clear when logicalrep_rel_mark_parallel_apply() is called here.
> IIUC parallel_apply_relation_check() is called when parallel apply worker
> handles changes,
> but before that relation is opened via logicalrep_rel_open() and
> parallel_apply_safe is set here.
> If it guards some protocol violation, we may use Assert().

Compared with the flag "localrelvalid", we also need to additionally reset the
flag "safety" when function and type are changed (see function
logicalrep_relmap_init). So I think for these two cases, we just need to reset
the flag "safety" to avoid rebuilding too much cache (see function
logicalrep_relmap_reset_parallel_cb).

> For create_subscription.sgml
> 
> 17.
> The restriction about foreign key does not seem to be documented.

I removed the check for the foreign key.

Since foreign key does not take effect in the subscriber's apply worker by
default, it seems that foreign key does not hit this ERROR frequently. 
If we set foreign key related trigger to "REPLICA", then, I think this flag
will be set to "unsafety" when checking non-immutable function uesd by trigger.

BTW, I only document this reason in the commit message and keep the foreign key
related tests.

> ===
> About 0004
> 
> For 015_stream.pl
> 
> 18. check_parallel_log
> 
> I heard that the removal has been reverted, but in the patch
> check_parallel_log() is removed again... :-(

Yes, I removed it.
I think this will make the test unstable. Because after applying patch
0004, we could not sure whether the transaction is completed in a parallel
apply worker. If any unexpected error occurs, the test will fail because the
log cannot be found, even if the transaction completed successfully. 

> ===
> About throughout
> 
> I checked the test coverage via `make coverage`. About appluparallelworker.c
> and worker.c, both function coverage is 100%, and
> line coverages are 86.2 % and 94.5 %. Generally it's good.
> But I read the report and following parts seems not tested.
> 
> In parallel_apply_start_worker():
> 
> ```
>         if (tmp_winfo->error_mq_handle == NULL)
>         {
>             /*
>              * Release the worker information and try next one if
> the parallel
>              * apply worker exited cleanly.
>              */
>             ParallelApplyWorkersList =
> foreach_delete_current(ParallelApplyWorkersList, lc);
>             shm_mq_detach(tmp_winfo->mq_handle);
>             dsm_detach(tmp_winfo->dsm_seg);
>             pfree(tmp_winfo);
>         }
> ```
> 
> In HandleParallelApplyMessage():
> 
> ```
>         case 'X':                /* Terminate, indicating
> clean exit */
>             {
>                 shm_mq_detach(winfo->error_mq_handle);
>                 winfo->error_mq_handle = NULL;
>                 break;
>             }
> ```
> 
> Does it mean that we do not test the termination of parallel apply worker? If so I
> think it should be tested.

Since this is an unexpected situation that cannot be reproduced 100%, we did
not add tests related to this part of the code to improve coverage.

The new patches were attached in [1].

[1] -
https://www.postgresql.org/message-id/OS3PR01MB6275F145878B4A44586C46CE9E499%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei


pgsql-hackers by date:

Previous
From: "wangw.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Next
From: "shiy.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply