RE: Determine parallel-safety of partition relations for Inserts - Mailing list pgsql-hackers

From tsunakawa.takay@fujitsu.com
Subject RE: Determine parallel-safety of partition relations for Inserts
Msg-id OSBPR01MB29826CFA98FD1AD5A361B2E3FEA40@OSBPR01MB2982.jpnprd01.prod.outlook.com
In response to Re: Determine parallel-safety of partition relations for Inserts  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Determine parallel-safety of partition relations for Inserts  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
From: Amit Kapila <amit.kapila16@gmail.com>
> We already allow users to specify the degree of parallelism for all
> the parallel operations via guc's max_parallel_maintenance_workers,
> max_parallel_workers_per_gather, then we have a reloption
> parallel_workers and vacuum command has the parallel option where
> users can specify the number of workers that can be used for
> parallelism. The parallelism considers these as hints but decides
> parallelism based on some other parameters like if there are that many
> workers available, etc. Why the users would expect differently for
> parallel DML?
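
For reference, the existing behaviour described in the quoted text (the settings act as hints, and the number of workers actually used depends on what is available) can be sketched roughly as below.  This is a simplified model only, not PostgreSQL code: the function and parameter names are hypothetical, and the planner's size-based estimate is reduced to a single input.

```python
from typing import Optional


def planned_workers(reloption_parallel_workers: Optional[int],
                    size_based_estimate: int,
                    max_parallel_workers_per_gather: int) -> int:
    """Planner-side hint (simplified model).

    If the parallel_workers reloption is set, it replaces the size-based
    estimate; either way the result is capped by the
    max_parallel_workers_per_gather GUC.
    """
    if reloption_parallel_workers is not None:
        requested = reloption_parallel_workers
    else:
        requested = size_based_estimate
    return max(0, min(requested, max_parallel_workers_per_gather))


def launched_workers(planned: int, free_worker_slots: int) -> int:
    """Executor-side (simplified model): only free worker slots can be used."""
    return max(0, min(planned, free_worker_slots))


if __name__ == "__main__":
    planned = planned_workers(8, 2, 4)
    print("planned:", planned)
    print("launched:", launched_workers(planned, 1))
```

The point of the sketch is only that the user-specified value is an upper bound at each stage, which is the expectation the quoted text sets for parallel DML as well.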

I agree that the user would want to specify the degree of parallelism of DML, too.  My simple (probably silly) questions were, in INSERT SELECT:

* If the target table has 10 partitions and the source table has 100 partitions, how would the user want to specify the parameters?

* If the source and target tables have the same number of partitions, and the user specifies different values for parallel_workers and parallel_dml_workers, how many parallel workers would run?

* What would the query plan be like?  Something like below?  Can we easily support this sort of nested thing?

Gather
  Workers Planned: <parallel_dml_workers>
  Insert
    Gather
      Workers Planned: <parallel_workers>
      Parallel Seq Scan


> Which memory specific to partitions are you referring to here and does
> that apply to the patch being discussed?

Relation cache and catalog cache, which are not specific to partitions.  This patch's current parallel safety check opens and closes all descendant partitions of the target table.  That leaves those cache entries in CacheMemoryContext after the SQL statement ends.  But as I said, we can consider that it's not a serious problem in this case because parallel DML would be executed in a limited number of concurrent sessions.  I just touched on the memory consumption issue for completeness in comparison with (3).


Regards
Takayuki Tsunakawa

