RE: Parallel Apply - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: Parallel Apply
Date
Msg-id OS9PR01MB12149FDCBBEDB05F9A617F0ABF5252@OS9PR01MB12149.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Parallel Apply  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Dear Amit,

> + if (c == PqReplMsg_WALData)
> + {
> + /*
> + * Ignore statistics fields that have been updated by the
> + * leader apply worker.
> + *
> + * XXX We can avoid sending the statistics fields from the
> + * leader apply worker but for that, it needs to rebuild the
> + * entire message by removing these fields which could be more
> + * work than simply ignoring these fields in the parallel
> + * apply worker.
> + */
> + s.cursor += SIZE_STATS_MESSAGE;
> 
> - apply_dispatch(&s);
> + apply_dispatch(&s);
> + }
> + else if (c == PARALLEL_APPLY_INTERNAL_MESSAGE)
> + {
> + apply_dispatch(&s);
> + }
> 
> Isn't it better to invent a new apply_internal_message() or something
> like it to handle internal messages? I think if we do that then the
> previous point of not using LOGICAL_REP for
> LOGICAL_REP_MSG_INTERNAL_DEPENDENCY and other messages would
> make more
> sense.

Firstly, I had a concern that apply_error_callback() might become unnecessary
complex. Now apply_error_callback_arg.command is not updated in the new function:
it means no additional messages would be put by the reporting function.
I think it's enough for now. The function is intended to report the origin,
remote XID and LSN, but internal messages do not have.

> 3.
> @@ -921,6 +1036,9 @@ LogicalParallelApplyLoop(shm_mq_handle *mqh)
> 
>   if (rc & WL_LATCH_SET)
>   ResetLatch(MyLatch);
> +
> + if (!IsTransactionState())
> + pgstat_report_stat(true);
> 
> Why is this change required?

This is needed to pass tap tests. We're still analyzing the point, please wait
for some time.

> 5.
>  void
>  pa_start_subtrans(TransactionId current_xid, TransactionId top_xid)
>  {
> + if (!TransactionIdIsValid(top_xid))
> + return;
> 
> Why is this change required for this patch? It will be better to add a
> comment so that it is easier to understand.

It's needed to avoid unnecessary checking in case of non-streaming transactions;
the function checks whether there are sub-transactions or not, but non-streaming
case they won't be replicated. Comments were added.

Other comments were addressed accordingly, please see attached patch set.
Thank you Hou Zhijie to revise some parts.

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment

pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: [PATCH] GROUP BY ALL
Next
From: David Geier
Date:
Subject: Re: Reduce build times of pg_trgm GIN indexes