Re: Parallel copy - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Parallel copy |
Date | |
Msg-id | CAA4eK1L-Xgw1zZEbGePmhBBWmEmLFL6rCaiOMDPnq2GNMVz-sg@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel copy (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Parallel copy
Re: Parallel copy |
List | pgsql-hackers |
On Mon, Sep 28, 2020 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Few additional comments: > ====================== Some more comments: v5-0002-Framework-for-leader-worker-in-parallel-copy =========================================== 1. These values + * help in handover of multiple records with significant size of data to be + * processed by each of the workers to make sure there is no context switch & the + * work is fairly distributed among the workers. How about writing it as: "These values help in the handover of multiple records with the significant size of data to be processed by each of the workers. This also ensures there is no context switch and the work is fairly distributed among the workers." 2. Can we keep WORKER_CHUNK_COUNT, MAX_BLOCKS_COUNT, and RINGSIZE as power-of-two? Say WORKER_CHUNK_COUNT as 64, MAX_BLOCK_COUNT as 1024, and accordingly choose RINGSIZE. At many places, we do that way. I think it can sometimes help in faster processing due to cache size requirements and in this case, I don't see a reason why we can't choose these values to be power-of-two. If you agree with this change then also do some performance testing after this change? 3. + bool curr_blk_completed; + char data[DATA_BLOCK_SIZE]; /* data read from file */ + uint8 skip_bytes; +} ParallelCopyDataBlock; Is there a reason to keep skip_bytes after data? Normally the variable size data is at the end of the structure. Also, there is no comment explaining the purpose of skip_bytes. 4. + * Copy data block information. + * ParallelCopyDataBlock's will be created in DSM. Data read from file will be + * copied in these DSM data blocks. The leader process identifies the records + * and the record information will be shared to the workers. The workers will + * insert the records into the table. There can be one or more number of records + * in each of the data block based on the record size. + */ +typedef struct ParallelCopyDataBlock Keep one empty line after the description line like below. I also suggested to do a minor tweak in the above sentence which is as follows: * Copy data block information. * * These data blocks are created in DSM. Data read ... Try to follow a similar format in other comments as well. 5. I think it is better to move parallelism related code to a new file (we can name it as copyParallel.c or something like that). 6. copy.c(1648,25): warning C4133: 'function': incompatible types - from 'ParallelCopyLineState *' to 'uint32 *' Getting above compilation warning on Windows. v5-0003-Allow-copy-from-command-to-process-data-from-file ================================================== 1. @@ -4294,7 +5047,7 @@ BeginCopyFrom(ParseState *pstate, * only in text mode. */ initStringInfo(&cstate->attribute_buf); - cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1); + cstate->raw_buf = (IsParallelCopy()) ? NULL : (char *) palloc(RAW_BUF_SIZE + 1); Is there anyway IsParallelCopy can be true by this time? AFAICS, we do anything about parallelism after this. If you want to save this allocation then we need to move this after we determine that parallelism can be used or not and accordingly the below code in the patch needs to be changed. * ParallelCopyFrom - parallel copy leader's functionality. * * Leader executes the before statement for before statement trigger, if before @@ -1110,8 +1547,302 @@ ParallelCopyFrom(CopyState cstate) ParallelCopyShmInfo *pcshared_info = cstate->pcdata->pcshared_info; ereport(DEBUG1, (errmsg("Running parallel copy leader"))); + /* raw_buf is not used in parallel copy, instead data blocks are used.*/ + pfree(cstate->raw_buf); + cstate->raw_buf = NULL; Is there anything else also the allocation of which depends on parallelism? 2. +static pg_attribute_always_inline bool +IsParallelCopyAllowed(CopyState cstate) +{ + /* Parallel copy not allowed for frontend (2.0 protocol) & binary option. */ + if ((cstate->copy_dest == COPY_OLD_FE) || cstate->binary) + return false; + + /* Check if copy is into foreign table or temporary table. */ + if (cstate->rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE || + RelationUsesLocalBuffers(cstate->rel)) + return false; + + /* Check if trigger function is parallel safe. */ + if (cstate->rel->trigdesc != NULL && + !IsTriggerFunctionParallelSafe(cstate->rel->trigdesc)) + return false; + + /* + * Check if there is after statement or instead of trigger or transition + * table triggers. + */ + if (cstate->rel->trigdesc != NULL && + (cstate->rel->trigdesc->trig_insert_after_statement || + cstate->rel->trigdesc->trig_insert_instead_row || + cstate->rel->trigdesc->trig_insert_new_table)) + return false; + + /* Check if the volatile expressions are parallel safe, if present any. */ + if (!CheckExprParallelSafety(cstate)) + return false; + + /* Check if the insertion mode is single. */ + if (FindInsertMethod(cstate) == CIM_SINGLE) + return false; + + return true; +} In the comments, we should write why parallelism is not allowed for a particular case. The cases where parallel-unsafe clause is involved are okay but it is not clear from comments why it is not allowed in other cases. 3. + ParallelCopyShmInfo *pcshared_info = cstate->pcdata->pcshared_info; + ParallelCopyLineBoundary *lineInfo; + uint32 line_first_block = pcshared_info->cur_block_pos; + line_pos = UpdateBlockInLineInfo(cstate, + line_first_block, + cstate->raw_buf_index, -1, + LINE_LEADER_POPULATING); + lineInfo = &pcshared_info->line_boundaries.ring[line_pos]; + elog(DEBUG1, "[Leader] Adding - block:%d, offset:%d, line position:%d", + line_first_block, lineInfo->start_offset, line_pos); Can we take all the code here inside function UpdateBlockInLineInfo? I see that it is called from one other place but I guess most of the surrounding code there can also be moved inside the function. Can we change the name of the function to UpdateSharedLineInfo or something like that and remove inline marking from this? I am not sure we want to inline such big functions. If it make difference in performance then we can probably consider it. 4. EndLineParallelCopy() { .. + /* Update line size. */ + pg_atomic_write_u32(&lineInfo->line_size, line_size); + pg_atomic_write_u32(&lineInfo->line_state, LINE_LEADER_POPULATED); + elog(DEBUG1, "[Leader] After adding - line position:%d, line_size:%d", + line_pos, line_size); .. } Can we instead call UpdateSharedLineInfo (new function name for UpdateBlockInLineInfo) to do this and maybe see it only updates the required info? The idea is to centralize the code for updating SharedLineInfo. 5. +static uint32 +GetLinePosition(CopyState cstate) +{ + ParallelCopyData *pcdata = cstate->pcdata; + ParallelCopyShmInfo *pcshared_info = pcdata->pcshared_info; + uint32 previous_pos = pcdata->worker_processed_pos; + uint32 write_pos = (previous_pos == -1) ? 0 : (previous_pos + 1) % RINGSIZE; It seems to me that each worker has to hop through all the processed chunks before getting the chunk which it can process. This will work but I think it is better if we have some shared counter which can tell us the next chunk to be processed and avoid all the unnecessary work of hopping to find the exact position. v5-0004-Documentation-for-parallel-copy ----------------------------------------- 1. Can you add one or two examples towards the end of the page where we have examples for other Copy options? Please run pgindent on all patches as that will make the code look better. From the testing perspective, 1. Test by having something force_parallel_mode = regress which means that all existing Copy tests in the regression will be executed via new worker code. You can have this as a test-only patch for now and make sure all existing tests passed with this. 2. Do we have tests for toast tables? I think if you implement the previous point some existing tests might cover it but I feel we should have at least one or two tests for the same. 3. Have we checked the code coverage of the newly added code with existing tests? -- With Regards, Amit Kapila.
pgsql-hackers by date: