Thanks Ashutosh for your comments.
On Wed, Sep 16, 2020 at 6:36 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hi Vignesh,
>
> I've spent some time today looking at your new set of patches and I've
> some thoughts and queries which I would like to put here:
>
> Why are these not part of the shared cstate structure?
>
> SerializeString(pcxt, PARALLEL_COPY_KEY_NULL_PRINT, cstate->null_print);
> SerializeString(pcxt, PARALLEL_COPY_KEY_DELIM, cstate->delim);
> SerializeString(pcxt, PARALLEL_COPY_KEY_QUOTE, cstate->quote);
> SerializeString(pcxt, PARALLEL_COPY_KEY_ESCAPE, cstate->escape);
>
I have used shared_cstate mainly to share the integer & bool data
types from the leader to worker process. The above data types are of
char* data type, I will not be able to use it like how I could do it
for integer type. So I preferred to send these as separate keys to the
worker. Thoughts?
> I think in the refactoring patch we could replace all the cstate
> variables that would be shared between the leader and workers with a
> common structure which would be used even for a serial copy. Thoughts?
>
Currently we are using shared_cstate only to share integer & bool data
types from leader to worker. Once worker retrieves the shared data for
integer & bool data types, worker will copy it to cstate. I preferred
this way because only for integer & bool we retrieve to shared_cstate
& copy it to cstate and for rest of the members any way we are
directly copying back to cstate. Thoughts?
> Have you tested your patch when encoding conversion is needed? If so,
> could you please point out the email that has the test results.
>
We have not yet done encoding testing, we will do and post the results
separately in the coming days.
> Apart from above, I've noticed some cosmetic errors which I am sharing here:
>
> +#define IsParallelCopy() (cstate->is_parallel)
> +#define IsLeader() (cstate->pcdata->is_leader)
>
> This doesn't look to be properly aligned.
>
Fixed.
> + shared_info_ptr = (ParallelCopyShmInfo *)
> shm_toc_allocate(pcxt->toc, sizeof(ParallelCopyShmInfo));
> + PopulateParallelCopyShmInfo(shared_info_ptr, full_transaction_id);
>
> ..
>
> + /* Store shared build state, for which we reserved space. */
> + shared_cstate = (SerializedParallelCopyState
> *)shm_toc_allocate(pcxt->toc, est_cstateshared);
>
> In the first case, while typecasting you've added a space between the
> typename and the function but that is missing in the second case. I
> think it would be good if you could make it consistent.
>
Fixed
> Same comment applies here as well:
>
> + pg_atomic_uint32 line_state; /* line state */
> + uint64 cur_lineno; /* line number for error messages */
> +}ParallelCopyLineBoundary;
>
> ...
>
> + CommandId mycid; /* command id */
> + ParallelCopyLineBoundaries line_boundaries; /* line array */
> +} ParallelCopyShmInfo;
>
> There is no space between the closing brace and the structure name in
> the first case but it is in the second one. So, again this doesn't
> look consistent.
>
Fixed
> I could also find this type of inconsistency in comments. See below:
>
> +/* It can hold upto 10000 record information for worker to process. RINGSIZE
> + * should be a multiple of WORKER_CHUNK_COUNT, as wrap around cases
> is currently
> + * not handled while selecting the WORKER_CHUNK_COUNT by the worker. */
> +#define RINGSIZE (10 * 1000)
>
> ...
>
> +/*
> + * Each worker will be allocated WORKER_CHUNK_COUNT of records from DSM data
> + * block to process to avoid lock contention. Read RINGSIZE comments before
> + * changing this value.
> + */
> +#define WORKER_CHUNK_COUNT 50
>
> You may see these kinds of errors at other places as well if you scan
> through your patch.
Fixed.
Please find the attached v5 patch which has the fixes for the same.
Thoughts?
Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com