Home > mailing lists

Re: Parallel Query - Mailing list pgsql-general

From	Rafia Sabih
Subject	Re: Parallel Query
Date	January 11, 2018 12:05:46
Msg-id	CAOGQiiNVXF-n800pAMKMi=UEn99pXBxHv8TmrMSUVvr70Pzjnw@mail.gmail.com Whole thread
In response to	Parallel Query (Krithika Venkatesh <krithikavenkatesh31@gmail.com>)
List	pgsql-general

Tree view

On Thu, Jan 11, 2018 at 12:24 PM, Krithika Venkatesh <krithikavenkatesh31@gmail.com> wrote:

Hi All,

I was reading about parallel queries in postgresql 10.

I came across the property max_parallel_workers_per_gather. It sets the maximum number of workers that can be started by a single Gather or Gather Merge node.

I am not able to understand gather and gather merge node.

With parallel query in picture, there are two new types of nodes introduced in postgresql 10 viz, Gather node and other is Gather-Merge node.

Now, if seq-scan has to be performed in parallel, then gather node comes into picture, which you may think of as a master node for the parallel workers which are performing sequential scan. e.g.

QUERY PLAN

------------------------------------------------------------------

Gather (cost=0.00..7.60 rows=246 width=8)

Workers Planned: 2

-> Parallel Seq Scan on t (cost=0.00..7.60 rows=102 width=8)

Filter: (i < 50)

(4 rows)

Here, there are two workers which are in parallel scanning the table using seq-scan and gather is the process which collects the result of this data and passes it to the node above it, if any.

Now, just like in seq-scan the order of tuples is not maintained so is the case in gather node. However, for the cases wherein you need to maintain the order, gather-merge node comes into picture. For example when index scan is used in parallel then gather-merge will be above it to collect the results of parallel scans respecting the order of indexes.

What is the difference between max_parallel_workers_per_gather and max_parallel_processes. What is meant by gather node.

As you mentioned above, max_parallel_workers_per_gather is the total number of parallel workers allowed for a node. The parameter max_parallel_processes sets the upper limit of total parallel workers running. As a base rule, the value of max_parallel_workers_per_gather should be less than or equal to max_parallel_processes, otherwise the value of max_parallel_processes will be the maximum allowable workers for a parallel node.

You may have a look at the documentation of these parameters for more details -- https://www.postgresql.org/docs/10/static/when-can-parallel-query-be-used.html

Regards,

Rafia Sabih

EnterpriseDB: http://www.enterprisedb.com/

pgsql-general by date:

From: Durumdara
Date: 11 January 2018, 11:57:07
Subject: Re: String comparison problem in select - too many results

From: pinker
Date: 11 January 2018, 12:15:07
Subject: RE: How Many Partitions are Good Performing

Re: Parallel Query - Mailing list pgsql-general

Previous

Next