Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2] - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2] |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F8012B8A9D@BPXM15GP.gisp.nec.co.jp Whole thread Raw |
In response to | Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2] (Kohei KaiGai <kaigai@kaigai.gr.jp>) |
Responses |
Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan[take-2]
|
List | pgsql-hackers |
The attached patch is rebased version of pass-down LIMIT clause patch, which was forgotten to register on the last CF. It allows to inform required number of rows to the sub-plans not only ones we have individually handled at pass_down_bound(). Its primary target is control of number of remote tuple transfer over the network connection by postgres_fdw. According to the past discussion, we add a new field @ps_numTuples on the PlanState to represent the required number of tuples. Limit node assign a particular number on the field of sub-plan, then this sub-plan can know its upper node does not require entire tuples, and adjust its execution storategy. Like MergeAppend, the sub-plan can also pass down the bounds to its sub-plans again, if it makes sense and works correctly. This feature is potentially a basis of GPU-based sorting on top of CustomScan, because it has advantage for a workload to pick up the top-N tuples if its data-size is enough small to load onto GPU-RAM. Thanks, ---- PG-Strom Project / NEC OSS Promotion Center KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kohei KaiGai > Sent: Tuesday, January 03, 2017 12:07 PM > To: Kaigai Kouhei(海外 浩平) <kaigai@ak.jp.nec.com> > Cc: Jeevan Chalke <jeevan.chalke@enterprisedb.com>; Robert Haas > <robertmhaas@gmail.com>; pgsql-hackers@postgresql.org; Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp>; Andres Freund <andres@anarazel.de> > Subject: Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan > [take-2] > > Oops, I oversight this patch was marked as "returned with feedback", not > "moved to the next CF". > > Its status has not been changed since the last update. (Code was revised > according to the last comment by Jeevan, but CF-Nov was time up at that > time.) > > How do I handle the patch? > > 2016-12-05 16:49 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>: > > Hello, > > > > Sorry for my late response. > > The attached patch reflects your comments. > > > >> Here are few comments on latest patch: > >> > >> > >> 1. > >> make/make check is fine, however I am getting regression failure in > >> postgres_fdw contrib module (attached regression.diff). > >> Please investigate and fix. > >> > > It was an incorrect interaction when postgres_fdw tries to push down > > sorting to the remote side. We cannot attach LIMIT clause on the plain > > scan path across SORT, however, the previous version estimated the > > cost for the plain scan with LIMIT clause even if local sorting is needed. > > If remote scan may return just 10 rows, estimated cost of the local > > sort is very lightweight, thus, this unreasonable path was chosen. > > (On the other hands, its query execution results were correct because > > ps_numTuples is not delivered across Sort node, so ForeignScan > > eventually scanned all the remote tuples. It made correct results but > > not optimal from the viewpoint of performance.) > > > > The v3 patch estimates the cost with remote LIMIT clause only if > > supplied pathkey strictly matches with the final output order of the > > query, thus, no local sorting is expected. > > > > Some of the regression test cases still have different plans but due > > to the new optimization by remote LIMIT clause. > > Without remote LIMIT clause, some of regression test cases preferred > > remote-JOIN + local-SORT then local-LIMIT. > > Once we have remote-LIMIT option, it allows to discount the cost for > > remote-SORT by choice of top-k heap sorting. > > It changed the optimizer's decision on some test cases. > > > > Potential one big change is the test case below. > > > > -- CROSS JOIN, not pushed down > > EXPLAIN (VERBOSE, COSTS OFF) > > SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, > > t2.c1 OFFSET 100 LIMIT 10; > > > > It assumed CROSS JOIN was not pushed down due to the cost for network > > traffic, however, remote LIMIT reduced the estimated number of tuples > > to be moved. So, all of the CROSS JOIN + ORDER BY + LIMIT became to > > run on the remote side. > > > >> 2. > >> + * > >> + * MEMO: root->limit_tuples is not attached when query > >> contains > >> + * grouping-clause or aggregate functions. So, we don's > adjust > >> + * rows even if LIMIT <const> is supplied. > >> > >> Can you please explain why you are not doing this for grouping-clause > >> or aggregate functions. > >> > > See grouping_planner() at optimizer/plan/planner.c It puts an invalid > > value on the root->limit_tuples if query has GROUP BY clause, so we > > cannot know number of tuples to be returned even if we have upper > > limit actually. > > > > /* > > * Figure out whether there's a hard limit on the number of rows > that > > * query_planner's result subplan needs to return. Even if we > know a > > * hard limit overall, it doesn't apply if the query has any > > * grouping/aggregation operations, or SRFs in the tlist. > > */ > > if (parse->groupClause || > > parse->groupingSets || > > parse->distinctClause || > > parse->hasAggs || > > parse->hasWindowFuncs || > > parse->hasTargetSRFs || > > root->hasHavingQual) > > root->limit_tuples = -1.0; > > else > > root->limit_tuples = limit_tuples; > > > >> 3. > >> Typo: > >> > >> don's => don't > >> > > Fixed, > > > > best regards, > > ---- > > PG-Strom Project / NEC OSS Promotion Center KaiGai Kohei > > <kaigai@ak.jp.nec.com> > > > > > > > > -- > > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To > > make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-hackers > > > > > > -- > KaiGai Kohei <kaigai@kaigai.gr.jp> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: