Re: using custom scan nodes to prototype parallel sequential scan - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: using custom scan nodes to prototype parallel sequential scan
Date
Msg-id CA+U5nMLnfPZ6QfTgpjzNDfMJjP5-f_KyTg-tRLZSh7WFzYO28Q@mail.gmail.com
In response to using custom scan nodes to prototype parallel sequential scan  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: using custom scan nodes to prototype parallel sequential scan  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Re: using custom scan nodes to prototype parallel sequential scan  (Robert Haas <robertmhaas@gmail.com>)
Re: using custom scan nodes to prototype parallel sequential scan  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On 10 November 2014 15:57, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Oct 15, 2014 at 2:55 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Something usable, with severe restrictions, is actually better than we
>> have now. I understand the journey this work represents, so don't be
>> embarrassed by submitting things with heuristics and good-enoughs in
>> it. Our mentor, Mr.Lane, achieved much by spreading work over many
>> releases, leaving others to join in the task.
>
> It occurs to me that, now that the custom-scan stuff is committed, it
> wouldn't be that hard to use that, plus the other infrastructure we
> already have, to write a prototype of parallel sequential scan.

+1

> Given
> where we are with the infrastructure, there would be a number of
> unhandled problems, such as deadlock detection (needs group locking or
> similar), assessment of quals as to parallel-safety (needs
> proisparallel or similar), general waterproofing to make sure that
> pushing down a qual we shouldn't doesn't do anything really dastardly
> like crash the server (another written but yet-to-be-published patch
> adds a bunch of relevant guards), and snapshot sharing (likewise).
> But if you don't do anything weird, it should basically work.

If we build something fairly restricted but production-ready, we can
still do something useful for users.

* only functions marked as "CONTAINS NO SQL"
* parallel_workers = 2 or fixed
* only one Plan type, designed to maximise benefit and minimize optimizer change

Why?

* The above is all that is needed to test any infrastructure changes
we accept. We need both infrastructure and tests. We could do this as
a custom scan plugin, but why not make it work in core? That doesn't
prevent a plugin version as well, if anyone wants one.

* only functions marked as "CONTAINS NO SQL"
We don't really know what proisparallel is, but we do know what
CONTAINS NO SQL means and can easily check for it.
Plus I already have a patch for this, slightly bitrotted.
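
For illustration only (the exact syntax belongs to that uncommitted
patch, and the function itself is made up), an eligible function might
be declared along these lines:

  CREATE FUNCTION miles_to_km(miles numeric) RETURNS numeric
  LANGUAGE plpgsql IMMUTABLE CONTAINS NO SQL AS $$
  BEGIN
    -- pure computation, no SQL statements, so safe to run in a worker
    RETURN miles * 1.609344;
  END;
  $$;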

* parallel_workers = 2 (or at least not make it user settable)
By fixing the number of workers at 2 we avoid any problems caused by
having N variable, such as how to vary N fairly amongst users and
other such considerations. We get the main benefit of parallelism,
without causing other issues across the server.

* Fixed Plan: aggregate-scan
To make everything simpler, allow only plans of a single type:

  SELECT something, list of aggregates
  FROM foo
  WHERE filters
  GROUP BY something

We know that passing large amounts of data from worker to master
process will be slow, so focusing only on seq scan is not sensible; we
should focus on plans that significantly reduce the number of rows
passed upwards. We could do that just for very selective WHERE clauses,
but that is not an important class of query. As soon as we include
aggregates, we reduce data passing significantly AND we hit a very
important subset of queries:

This plan type is widely used in reporting queries, so will hit the
mainline of BI applications and many Mat View creations.
This will allow SELECT count(*) FROM foo to go faster also.
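
For example (table and column names invented purely for illustration),
the target is a query shaped like:

  SELECT region, count(*), sum(amount)
  FROM sales
  WHERE sale_date >= '2014-01-01'
  GROUP BY region;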

The execution plan for that query type looks like this...

  Hash Aggregate
    Gather From Workers
      {Worker Nodes  workers = 2
         HashAggregate
           PartialScan}

This is simple enough that we need only add a mechanism for the Gather
operation, and simple enough not to require extensive optimizer changes
- just changes to set_plain_rel_pathlist() to consider parallel plans.

Plan costs are easy to calculate for the above:

  cpu_worker_startup_cost = 100 -- easy to calibrate by observation,
                                   but a reasonably large default is OK
  cpu_ipc_tuple_cost = 0.1      -- assume x10 the normal cpu_tuple_cost

Partial scan costs are the same as SeqScan, just with fewer blocks.
All other costs are the same.
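
As a rough worked illustration (every number here is invented): with 2
workers scanning a 10,000-block, 1,000,000-row table, each worker's
partial scan would cost about

  (10000/2) * seq_page_cost(1.0) + (1000000/2) * cpu_tuple_cost(0.01) = 10000

plus 2 * cpu_worker_startup_cost(100) = 200 to launch the workers, plus
(groups returned * 2) * cpu_ipc_tuple_cost(0.1) for the rows passed back
to the master, which stays small whenever the aggregation is selective.
Qual and aggregate evaluation costs are ignored here for simplicity.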

We can submit the main patch by Dec 15, fix all the problems by Feb 15.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


