Home > mailing lists

Re: Getting sorted data from foreign server - Mailing list pgsql-hackers

From	Ashutosh Bapat
Subject	Re: Getting sorted data from foreign server
Date	October 21, 2015 09:23:54
Msg-id	CAFjFpRcdGCuby=g+GCgr8GZPnhhnkesuqqmkkeHEC9LCj_o5Ww@mail.gmail.com Whole thread
In response to	Re: Getting sorted data from foreign server (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Getting sorted data from foreign server Re: Getting sorted data from foreign server
List	pgsql-hackers

Tree view

On Fri, Oct 16, 2015 at 11:33 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Oct 15, 2015 at 6:28 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Attached is the patch which takes care of above comments.

I spent some time on this patch today. But it's still not right.

I've attached a new version which fixes a serious problem with your
last version - having postgresGetForeignPaths do the costing of the
sorted path itself instead of delegating that to
estimate_path_cost_size is wrong. In your version, 10% increment gets
applied to the network transmission costs as well as the cost of
generating the tupes - but only when use_remote_estimate == false. I
fixed this and did some cosmetic cleanup.

That's better.

But you'll notice if you try this some of postgres_fdw's regression
tests fail. This is rather mysterious:

***************
*** 697,715 ****
Sort
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
Sort Key: t1.c1
! -> Nested Loop Semi Join
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
! Join Filter: (t1.c3 = t2.c3)
-> Foreign Scan on public.ft1 t1
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
! -> Materialize
Output: t2.c3
! -> Foreign Scan on public.ft2 t2
Output: t2.c3
! Filter: (date(t2.c4) = '01-17-1970'::date)
! Remote SQL: SELECT c3, c4 FROM "S 1"."T 1"
WHERE (("C 1" > 10))
! (15 rows)

EXECUTE st2(10, 20);
c1 | c2 | c3 | c4 | c5
| c6 | c7 | c8
--- 697,718 ----
Sort
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
Sort Key: t1.c1
! -> Hash Join
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
! Hash Cond: (t1.c3 = t2.c3)
-> Foreign Scan on public.ft1 t1
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
! -> Hash
Output: t2.c3
! -> HashAggregate
Output: t2.c3
! Group Key: t2.c3
! -> Foreign Scan on public.ft2 t2
! Output: t2.c3
! Filter: (date(t2.c4) = '01-17-1970'::date)
! Remote SQL: SELECT c3, c4 FROM "S 1"."T
1" WHERE (("C 1" > 10))
! (18 rows)

What I think is happening here is that the planner notices that
instead of doing a parameterized nestloop, it could pull down the data
already sorted from the remote side, cheaply unique-ify it by using
the ordering provided by the remote side, and then do a standard hash
join. That might well be a sensible approach, but the ORDER BY that
would make it correct doesn't show up in the Remote SQL. I don't know
why that's happening, but it's not good.

The unique-ifying is happening through HashAggregate, so we do not need ORDER BY clause in RemoteSQL. So the plan is correct.

Careful examination of paths created revealed that the change in plan is result of our fuzzy path selection logic. Let me explain it using the cost values I got on my machine. Please note that the costs are described as a tuple (startup cost, total cost, number of rows, present of pathkeys).

With your patch, base relation paths have following costs:

ft1 path without pathkeys - (100, 123.9, 20, no)

ft1 path with pathkeys (100, 126.25, 20, yes)

ft2 path without pathkeys (100, 143.645, 4, no)

ft2 path without pathkeys with params (100.01, 125.365, 1, no)

Notice the sorted path is only marginally costly (2.5%) compared to the non-sorted path for ft1. During the path creation process several nested loop paths are created, but except for two, they are too costly. The two nested loop paths of interest have costs (200, 268.755, 10, no) and (200, 271.105, 10, yes). The path with pathkeys kicks out the one without pathkeys because the later's costs are "fuzzily" equal and pathkeys give extra advantage. The "fuzzy" equality is effect of the marginally extra sorting cost. The nested loop path with pathkeys remains in the path list. Several hash join paths and merge join paths are created but are costlier than one particular hash path which has costs (243.677, 267.6624, 10, no). This hash path is not beaten by the nested loop path since because of lower total cost (which is beyond the fuzzy margins) and gets ultimately chosen by planner to perform ft1 join ft2.

Interestingly, if the nested loop path with pathkeys had not kicked out that without pathkeys, the nested loop path would have emerged as winner owing to lower startup cost as the total costs of the plans are within fuzzy margin.

With my patch base relation paths have following costs:

ft1 path without pathkeys - (100, 123.9, 20, no)

ft1 path with pathkeys - (110, 136.29, 20, yes)

ft2 path without pathkeys - (100, 143, 4, no)

ft2 path with pathkeys - (100, 125.365, 1, no)

The interesting nested loop paths with and without pathkeys have costs (200, 268.755, 10, no) and (210, 281.145, 10, yes). The path with pathkeys does not kick out the path without pathkeys. The nested loop path without pathkeys in turn beats every other path, merge join or hash join, on the basis of lower startup cost and "fuzzily" equal total cost. The same emerges as the winner owing to lower startup and total costs compared to the path without pathkeys.

In both the cases (with or without patch) since the number of resultant rows is very small, the planner thinks that sorting them after joining is better than getting them sorted while joining.

Increasing the sorting cost factor (when use_remote_estimates = false) from 1.1 to 1.2 makes the difference disappear.

Since the startup costs for postgres_fdw are large portion of total cost, extra 10% of rest of the cost is comparable to 1% fuzzy limit. IMO, we shouldn't bother too much about it as the path costs are not much different.

In one of the comments, there is a typo s/sidea/side/. Rest of the patch looks fine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

From: Victor Wagner
Date: 21 October 2015, 09:07:41
Subject: Patch (2): Implement failover on libpq connect level.

From: Etsuro Fujita
Date: 21 October 2015, 10:56:47
Subject: Re: Foreign join pushdown vs EvalPlanQual

Re: Getting sorted data from foreign server - Mailing list pgsql-hackers

Previous

Next