Re: [HACKERS] FDW and parallel execution - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: [HACKERS] FDW and parallel execution |
Date | |
Msg-id | 20170413.164949.16305379.horiguchi.kyotaro@lab.ntt.co.jp |
In response to | Re: [HACKERS] FDW and parallel execution (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
List | pgsql-hackers |
Sorry for the too-brief reply.

At Tue, 11 Apr 2017 20:08:46 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in <94c8692a-f299-b72b-6227-270b8a9ed7ad@postgrespro.ru>
>
> On 04.04.2017 13:29, Kyotaro HORIGUCHI wrote:
> > Hi,
> >
> > At Sun, 02 Apr 2017 16:30:24 +0300, Konstantin Knizhnik
> > <k.knizhnik@postgrespro.ru> wrote in <58E0FCF0.2070603@postgrespro.ru>
> >> My FDW provides implementation for IsForeignScanParallelSafe which
> >> returns true.
> >> I wonder what can prevent optimizer from using parallel plan in this
> >> case?
> > Parallel execution requires partial paths. It's the work for
> > GetForeignPaths of your FDW.
>
> Thank you very much for explanation.
> But unfortunately I still do not completely understand what kind of
> queries allow parallel execution with FDW.

At Tue, 11 Apr 2017 19:20:04 +0200, PostgreSQL - Hans-Jürgen Schönig <postgres@cybertec.at> wrote in <0c9c101d-0fbb-1e19-f04c-7a6ec577d960@cybertec.at>
> did you check out antonin houska's patches?
> we basically got code, which can do that.

Parallel aggregation is already available. Antonin's patch is
partition-wise aggregation, which boosts the case where the partition
key is the aggregation key, I suppose. Parallel aggregation seems to be
considered whenever an appropriate partial path is available. (I
haven't tried anything, though.)

set_plain_rel_pathlist() does the work for plain relations, so what we
should do in GetForeignPaths would be as follows.

- Check rel->consider_parallel (this shouldn't be required, since the
  FDW already knows that) and rel->lateral_relids.

- If parallel is OK, create a path with create_foreignscan_path in the
  ordinary way, then change some parallel-related members as
  necessary.

- Like create_plain_partial_paths(), check certain conditions and
  finally add_partial_path() the created partial foreignscan path.

I haven't really done this, so I might be wrong.
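As a rough illustration of the three steps above, here is a toy model in plain C. It is not PostgreSQL code: ToyRel, ToyPath, toy_choose_workers() and toy_get_foreign_paths() are hypothetical stand-ins for RelOptInfo, ForeignPath, the planner's worker-count heuristic and the FDW's GetForeignPaths callback. The one-worker-per-1000-pages rule, the rows-divided-by-workers estimate and setting parallel_aware on the partial path are assumptions made only for this sketch. In a real FDW the two array appends would presumably be add_path() and add_partial_path() calls on paths built with create_foreignscan_path().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PATHS 8

/* Simplified stand-in for a planner path; NOT the PostgreSQL struct. */
typedef struct ToyPath {
    double rows;
    double total_cost;
    bool   parallel_aware;
    bool   parallel_safe;
    int    parallel_workers;
} ToyPath;

/* Simplified stand-in for RelOptInfo; NOT the PostgreSQL struct. */
typedef struct ToyRel {
    bool    consider_parallel;
    bool    has_lateral_refs;   /* models a non-empty rel->lateral_relids */
    double  pages;
    ToyPath pathlist[MAX_PATHS];
    int     npaths;
    ToyPath partial_pathlist[MAX_PATHS];
    int     npartial;
} ToyRel;

/* Hypothetical heuristic: one worker per 1000 pages, capped at max_workers. */
int toy_choose_workers(double pages, int max_workers)
{
    int workers = (int) (pages / 1000.0);

    if (workers > max_workers)
        workers = max_workers;
    return workers;
}

/*
 * Models the outline for GetForeignPaths: always add the ordinary path,
 * then add a partial path only if the relation allows parallelism.
 * Returns true if a partial path was added.
 */
bool toy_get_foreign_paths(ToyRel *rel, double rows, double total_cost,
                           int max_workers)
{
    assert(rel != NULL);
    assert(rel->npaths < MAX_PATHS && rel->npartial < MAX_PATHS);

    /* Ordinary (non-partial) path, created "in the ordinary way". */
    ToyPath path = { rows, total_cost, false, rel->consider_parallel, 0 };
    rel->pathlist[rel->npaths++] = path;

    /* Step 1: check consider_parallel and lateral references. */
    if (!rel->consider_parallel || rel->has_lateral_refs)
        return false;

    /* Step 3's "certain conditions": give up if no worker would be used. */
    int workers = toy_choose_workers(rel->pages, max_workers);
    if (workers <= 0)
        return false;

    /* Step 2: copy the path and adjust the parallel-related members. */
    ToyPath partial = path;
    partial.parallel_aware = true;
    partial.parallel_workers = workers;
    partial.rows = rows / workers;      /* crude per-worker row estimate */

    /* Step 3: register it as a partial path. */
    rel->partial_pathlist[rel->npartial++] = partial;
    return true;
}
```

In this model, a relation with consider_parallel set and 5000 pages gets one ordinary path and one partial path when max_workers is 2, while a relation with lateral references gets only the ordinary path.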
> Section "FDW Routines for Parallel Execution" of FDW specification
> says:
> > A ForeignScan node can, optionally, support parallel execution. A
> > parallel ForeignScan will be executed in multiple processes and should
> > return each row only once across all cooperating processes. To do
> > this, processes can coordinate through fixed size chunks of dynamic
> > shared memory. This shared memory is not guaranteed to be mapped at
> > the same address in every process, so pointers may not be used. The
> > following callbacks are all optional in general, but required if
> > parallel execution is to be supported.
>
> I provided IsForeignScanParallelSafe, EstimateDSMForeignScan,
> InitializeDSMForeignSca and InitializeWorkerForeignScan in my FDW.
> IsForeignScanParallelSafe returns true.
> Also in GetForeignPaths function I created path with
> baserel->consider_parallel == true.
> Is it enough or I should do something else?

Creating partial paths, I think. create_grouping_paths() requires
partial_pathlist in input_rel. That section explains the FDW routines
provided specifically for parallel execution, but it doesn't seem to
mention "how to run a parallel execution" as a whole.

> But unfortunately I failed to find any query: sequential scan, grand
> aggregation, aggregation with group by, joins... when parallel
> execution plan is used for this FDW.
> Also there are no examples of using this functions in Postgres
> distributive and I failed to find any such examples in Internet.

Maybe you're the pioneer in this area.

> Can somebody please clarify my situation with parallel execution and
> FDW and may be point at some examples?
> Thank in advance.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center