Re: Parallel Query - Mailing list pgsql-general

From Rafia Sabih
Subject Re: Parallel Query
Date
Msg-id CAOGQiiNVXF-n800pAMKMi=UEn99pXBxHv8TmrMSUVvr70Pzjnw@mail.gmail.com
Whole thread Raw
In response to Parallel Query  (Krithika Venkatesh <krithikavenkatesh31@gmail.com>)
List pgsql-general


On Thu, Jan 11, 2018 at 12:24 PM, Krithika Venkatesh <krithikavenkatesh31@gmail.com> wrote:
Hi All,

I was reading about parallel queries in postgresql 10.

I came across the property max_parallel_workers_per_gather. It sets the maximum number of workers that can be started by a single Gather or Gather Merge node. 

I am not able to understand gather and gather merge node. 

With parallel query in picture, there are two new types of nodes introduced in postgresql 10  viz, Gather node and other is Gather-Merge node.
Now, if seq-scan has to be performed in parallel, then gather node comes into picture, which you may think of as a master node for the parallel workers which are performing sequential scan. e.g.
                           QUERY PLAN                            
------------------------------------------------------------------
 Gather  (cost=0.00..7.60 rows=246 width=8)
   Workers Planned: 2
   ->  Parallel Seq Scan on t  (cost=0.00..7.60 rows=102 width=8)
         Filter: (i < 50)
(4 rows)

Here, there are two workers which are in parallel scanning the table using seq-scan and gather is the process which collects the result of this data and passes it to the node above it, if any.

Now, just like in seq-scan the order of tuples is not maintained so is the case in gather node. However, for the cases wherein you need to maintain the order, gather-merge node comes into picture. For example when index scan is used in parallel then gather-merge will be above it to collect the results of parallel scans respecting the order of indexes.

What is the difference between max_parallel_workers_per_gather and max_parallel_processes. What is meant by gather node.

As you mentioned above, max_parallel_workers_per_gather is the total number of parallel workers allowed for a node. The parameter max_parallel_processes sets the upper limit of total parallel workers running. As a base rule, the value of max_parallel_workers_per_gather should be less than or equal to max_parallel_processes, otherwise the value of max_parallel_processes will be the maximum allowable workers for a parallel node. 

You may have a look at the documentation of these parameters for more details --  https://www.postgresql.org/docs/10/static/when-can-parallel-query-be-used.html

--
Regards,
Rafia Sabih

pgsql-general by date:

Previous
From: Durumdara
Date:
Subject: Re: String comparison problem in select - too many results
Next
From: pinker
Date:
Subject: RE: How Many Partitions are Good Performing