Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation) - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)
Msg-id CAJrrPGcUCOtb21j7uou2Ng0ikrrRbKz44OVCkz_MgWddDbncXg@mail.gmail.com
In response to Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers


On Mon, May 8, 2017 at 11:39 AM, David Rowley <david.rowley@2ndquadrant.com> wrote:

We really need a machine with good IO concurrency, and not too much
RAM to test these things out. It could well be that for a suitably
large enough table we'd want to scan a whole 1GB extent per worker.

I did post a patch to have heap_parallelscan_nextpage() use atomics
instead of locking over in [1], but I think doing atomics there does
not rule out also adding batching later. In fact, I think it
structures things so batching would be easier than it is today.
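
To illustrate why an atomic counter leaves room for batching, here is a
minimal sketch using C11 atomics. It is not the code from the patch and
not PostgreSQL's heap_parallelscan_nextpage(); the names scan_state,
claim_blocks and INVALID_BLOCK are hypothetical. With batch set to 1 it
hands out one page per call, and a larger batch only changes the
increment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define INVALID_BLOCK UINT64_MAX

/* Hypothetical shared scan state; not PostgreSQL's actual struct. */
typedef struct {
    uint64_t nblocks;        /* total blocks in the table */
    uint64_t batch;          /* blocks handed out per claim; 1 = page at a time */
    _Atomic uint64_t next;   /* first block not yet claimed by any worker */
} scan_state;

/*
 * Claim the next run of blocks without taking a lock.
 * Returns the first block of the run and stores its length in *count,
 * or returns INVALID_BLOCK with *count = 0 once the table is exhausted.
 * The counter keeps growing after exhaustion; with a 64-bit counter this
 * sketch assumes it never wraps in practice.
 */
static uint64_t claim_blocks(scan_state *s, uint64_t *count)
{
    assert(s->batch > 0);

    /* fetch_add returns the old value, so each caller gets a distinct start. */
    uint64_t start = atomic_fetch_add(&s->next, s->batch);

    if (start >= s->nblocks) {
        *count = 0;
        return INVALID_BLOCK;
    }

    /* The final run may be shorter than a full batch. */
    uint64_t remaining = s->nblocks - start;
    *count = remaining < s->batch ? remaining : s->batch;
    return start;
}
```

Each worker would call claim_blocks() in a loop until it gets
INVALID_BLOCK. The batch size is the only knob here, which is the sense
in which batching can be layered on later.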

As part of our internal PostgreSQL project, we developed a parallel seq
scan that works in batch mode only. The problem we faced with batch mode
was distributing the data pages so that all the parallel workers finish at
almost the same time. Otherwise, one worker may still be processing the
last batch after all the others have finished their work. In such cases,
we cannot achieve good performance.

Whereas in the current approach, the most work the last worker can have
left when the others finish is scanning the last single page of the table.

If we go with batching of 1GB per worker, there is a chance that the
data satisfying the query condition falls into only one extent, and in
such cases batching may not yield good results either.
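
The imbalance described above can be sketched with a toy model. This is
only an illustration under stated assumptions, not a measurement: every
block costs a fixed number of time units, blocks in one "hot" range
(standing in for the extent that holds the matching rows) cost more, and
whichever worker is free first claims the next batch. The names
table_model, block_cost and finish_gap are hypothetical.

```c
#include <assert.h>

#define MAX_WORKERS 64

/* Hypothetical cost model of a table; all costs are abstract time units. */
typedef struct {
    long nblocks;              /* total blocks in the table */
    long hot_start, hot_end;   /* blocks in [hot_start, hot_end) are expensive */
    long base_cost, hot_cost;  /* cost per ordinary block and per hot block */
} table_model;

static long block_cost(const table_model *t, long b)
{
    return (b >= t->hot_start && b < t->hot_end) ? t->hot_cost : t->base_cost;
}

/*
 * Simulate workers that claim `batch` consecutive blocks at a time.
 * Returns the gap between the last and the first worker to finish,
 * i.e. how long the fastest worker sits idle at the end of the scan.
 */
static long finish_gap(const table_model *t, int nworkers, long batch)
{
    long clock[MAX_WORKERS] = {0};   /* time at which each worker is next free */

    assert(nworkers > 0 && nworkers <= MAX_WORKERS && batch > 0);

    for (long next = 0; next < t->nblocks; next += batch) {
        /* The worker that becomes free earliest claims the next batch. */
        int w = 0;
        for (int i = 1; i < nworkers; i++)
            if (clock[i] < clock[w])
                w = i;

        long end = next + batch < t->nblocks ? next + batch : t->nblocks;
        for (long b = next; b < end; b++)
            clock[w] += block_cost(t, b);
    }

    long lo = clock[0], hi = clock[0];
    for (int i = 1; i < nworkers; i++) {
        if (clock[i] < lo) lo = clock[i];
        if (clock[i] > hi) hi = clock[i];
    }
    return hi - lo;
}
```

In this model, a uniform 1000-block table with 4 workers gives a gap of 0
with a batch of 1 and a gap of 200 with a batch of 300. If the first 100
blocks cost 10 units each, a batch of 100 puts that whole range on one
worker and the gap grows to 700, while a batch of 1 still gives a gap
of 0.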

Regards,
Hari Babu
Fujitsu Australia
