Re: [HACKERS] FDW and parallel execution - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: [HACKERS] FDW and parallel execution
Date
Msg-id 20170413.164949.16305379.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: [HACKERS] FDW and parallel execution  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
Sorry for the too-brief reply.

At Tue, 11 Apr 2017 20:08:46 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in
<94c8692a-f299-b72b-6227-270b8a9ed7ad@postgrespro.ru>
>
> On 04.04.2017 13:29, Kyotaro HORIGUCHI wrote:
> > Hi,
> >
> > At Sun, 02 Apr 2017 16:30:24 +0300, Konstantin Knizhnik
> > <k.knizhnik@postgrespro.ru> wrote in <58E0FCF0.2070603@postgrespro.ru>
> >> My FDW provides implementation for IsForeignScanParallelSafe which
> >> returns true.
> >> I wonder what can prevent optimizer from using parallel plan in this
> >> case?
> > Parallel execution requires partial paths. It's the work for
> > GetForeignPaths of your FDW.
>
> Thank you very much for explanation.
> But unfortunately I still do not completely understand what kind of
> queries allow parallel execution with FDW.

At Tue, 11 Apr 2017 19:20:04 +0200, PostgreSQL - Hans-Jürgen Schönig <postgres@cybertec.at> wrote in
<0c9c101d-0fbb-1e19-f04c-7a6ec577d960@cybertec.at>
> did you check out antonin houska's patches?
> we basically got code, which can do that.

Parallel aggregation is already available. Antonin's patch is
partition-wise aggregation, which I suppose speeds up the case where
the partition key is the aggregation key. Parallel aggregation seems
to be considered whenever an appropriate partial path is available.
(I haven't tried anything, though.)

set_plain_rel_pathlist() does this work for plain relations, so
what we should do in GetForeignPaths would be as follows.

- Check rel->consider_parallel (this shouldn't be required, since
  the FDW already knows that) and rel->lateral_relids.

- If parallel execution is OK, create a path with
  create_foreignscan_path() in the ordinary way, then change the
  parallel-related members as necessary.

- Like create_plain_partial_paths(), check certain conditions and
  finally add_partial_path() the created partial foreignscan path.

I haven't really done this, so I might be wrong.
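
To make the order of those steps concrete, here is a small standalone
model in C. It is only a sketch under assumptions: MockRel, MockPath and
both mock_* functions are hypothetical stand-ins, not PostgreSQL's
structs or API. The worker-count rule (take the smaller of a wanted
count and a cap) is a placeholder for whatever "certain conditions" the
FDW actually checks, and setting parallel_aware to true assumes the
workers coordinate through shared memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the RelOptInfo fields the steps refer to. */
typedef struct MockRel {
    bool consider_parallel;  /* models rel->consider_parallel */
    bool has_lateral_refs;   /* models a non-empty rel->lateral_relids */
    int  n_partial_paths;    /* models the length of rel->partial_pathlist */
} MockRel;

/* Hypothetical stand-in for the parallel-related members of a path. */
typedef struct MockPath {
    bool parallel_aware;
    bool parallel_safe;
    int  parallel_workers;
} MockPath;

/* Models creating a foreign scan path "in the ordinary way": not partial. */
static MockPath mock_create_foreignscan_path(const MockRel *rel)
{
    MockPath p = { false, rel->consider_parallel, 0 };
    return p;
}

/*
 * Models the three steps for a GetForeignPaths callback.
 * Returns true and fills *out if a partial path was registered.
 */
static bool mock_add_partial_foreign_path(MockRel *rel, int wanted_workers,
                                          int max_workers, MockPath *out)
{
    assert(rel != NULL && out != NULL);

    /* Step 1: no partial path if parallelism is ruled out or the rel has lateral references. */
    if (!rel->consider_parallel || rel->has_lateral_refs)
        return false;

    /* Step 2: build the ordinary path first. */
    MockPath p = mock_create_foreignscan_path(rel);

    /* Step 3: placeholder condition; give up if no worker would be used. */
    int workers = wanted_workers < max_workers ? wanted_workers : max_workers;
    if (workers <= 0)
        return false;

    /* Adjust the parallel-related members, then register the partial path. */
    p.parallel_aware = true;      /* assumption: workers share scan state */
    p.parallel_workers = workers;
    rel->n_partial_paths++;       /* models add_partial_path() */
    *out = p;
    return true;
}
```

In a real FDW the same sequence would use the planner's own structures
and functions named above, whose exact signatures depend on the server
version.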

> Section "FDW Routines for Parallel Execution" of FDW specification
> says:
> > A ForeignScan node can, optionally, support parallel execution. A
> > parallel ForeignScan will be executed in multiple processes and should
> > return each row only once across all cooperating processes. To do
> > this, processes can coordinate through fixed size chunks of dynamic
> > shared memory. This shared memory is not guaranteed to be mapped at
> > the same address in every process, so pointers may not be used. The
> > following callbacks are all optional in general, but required if
> > parallel execution is to be supported.
>
> I provided IsForeignScanParallelSafe, EstimateDSMForeignScan,
> InitializeDSMForeignScan and InitializeWorkerForeignScan in my FDW.
> IsForeignScanParallelSafe returns true.
> Also in GetForeignPaths function I created path with
> baserel->consider_parallel == true.
> Is it enough or I should do something else?

Creating partial paths, I think. create_grouping_paths() requires
a partial_pathlist in input_rel.

That section explains the FDW routines provided specifically for
parallel execution, but it doesn't seem to describe how to get a
parallel execution running as a whole.

> But unfortunately I failed to find any query: sequential scan, grand
> aggregation, aggregation with group by, joins... when parallel
> execution plan is used for this FDW.
> Also there are no examples of using this functions in Postgres
> distributive and I failed to find any such examples in Internet.

Maybe you're the pioneer in this area.

> Can somebody please clarify my situation with parallel execution and
> FDW and may be point at some examples?
> Thank in advance.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center



