Re: Getting sorted data from foreign server - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Getting sorted data from foreign server
Date
Msg-id CAFjFpRcdGCuby=g+GCgr8GZPnhhnkesuqqmkkeHEC9LCj_o5Ww@mail.gmail.com
Whole thread Raw
In response to Re: Getting sorted data from foreign server  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Getting sorted data from foreign server  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Re: Getting sorted data from foreign server  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers


On Fri, Oct 16, 2015 at 11:33 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Oct 15, 2015 at 6:28 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Attached is the patch which takes care of above comments.

I spent some time on this patch today.  But it's still not right.

I've attached a new version which fixes a serious problem with your
last version - having postgresGetForeignPaths do the costing of the
sorted path itself instead of delegating that to
estimate_path_cost_size is wrong.  In your version, 10% increment gets
applied to the network transmission costs as well as the cost of
generating the tupes - but only when use_remote_estimate == false.  I
fixed this and did some cosmetic cleanup.

That's better.
 

But you'll notice if you try this some of postgres_fdw's regression
tests fail.  This is rather mysterious:

***************
*** 697,715 ****
   Sort
     Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
     Sort Key: t1.c1
!    ->  Nested Loop Semi Join
           Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
!          Join Filter: (t1.c3 = t2.c3)
           ->  Foreign Scan on public.ft1 t1
                 Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
!          ->  Materialize
                 Output: t2.c3
!                ->  Foreign Scan on public.ft2 t2
                       Output: t2.c3
!                      Filter: (date(t2.c4) = '01-17-1970'::date)
!                      Remote SQL: SELECT c3, c4 FROM "S 1"."T 1"
WHERE (("C 1" > 10))
! (15 rows)

  EXECUTE st2(10, 20);
   c1 | c2 |  c3   |              c4              |            c5
      | c6 |     c7     | c8
--- 697,718 ----
   Sort
     Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
     Sort Key: t1.c1
!    ->  Hash Join
           Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
!          Hash Cond: (t1.c3 = t2.c3)
           ->  Foreign Scan on public.ft1 t1
                 Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
                 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
!          ->  Hash
                 Output: t2.c3
!                ->  HashAggregate
                       Output: t2.c3
!                      Group Key: t2.c3
!                      ->  Foreign Scan on public.ft2 t2
!                            Output: t2.c3
!                            Filter: (date(t2.c4) = '01-17-1970'::date)
!                            Remote SQL: SELECT c3, c4 FROM "S 1"."T
1" WHERE (("C 1" > 10))
! (18 rows)

What I think is happening here is that the planner notices that
instead of doing a parameterized nestloop, it could pull down the data
already sorted from the remote side, cheaply unique-ify it by using
the ordering provided by the remote side, and then do a standard hash
join. That might well be a sensible approach, but the ORDER BY that
would make it correct doesn't show up in the Remote SQL.  I don't know
why that's happening, but it's not good.


The unique-ifying is happening through HashAggregate, so we do not need ORDER BY clause in RemoteSQL. So the plan is correct.

Careful examination of paths created revealed that the change in plan is result of our fuzzy path selection logic. Let me explain it using the cost values I got on my machine. Please note that the costs are described as a tuple (startup cost, total cost, number of rows, present of pathkeys).

With your patch, base relation paths have following costs:

ft1 path without pathkeys - (100, 123.9, 20, no)
ft1 path with pathkeys (100, 126.25, 20, yes)
ft2 path without pathkeys (100, 143.645, 4, no)
ft2 path without pathkeys with params (100.01, 125.365, 1, no)

Notice the sorted path is only marginally costly (2.5%) compared to the non-sorted path for ft1. During the path creation process several nested loop paths are created, but except for two, they are too costly. The two nested loop paths of interest have costs (200, 268.755, 10, no) and (200, 271.105, 10, yes). The path with pathkeys kicks out the one without pathkeys because the later's costs are "fuzzily" equal and pathkeys give extra advantage. The "fuzzy" equality is effect of the marginally extra sorting cost. The nested loop path with pathkeys remains in the path list. Several hash join paths and merge join paths are created but are costlier than one particular hash path which has costs (243.677, 267.6624, 10, no). This hash path is not beaten by the nested loop path since because of lower total cost (which is beyond the fuzzy margins) and gets ultimately chosen by planner to perform ft1 join ft2.

Interestingly, if the nested loop path with pathkeys had not kicked out that without pathkeys, the nested loop path would have emerged as winner owing to lower startup cost as the total costs of the plans are within fuzzy margin.

With my patch base relation paths have following costs:
ft1 path without pathkeys - (100, 123.9, 20, no)
ft1 path with pathkeys - (110, 136.29, 20, yes)
ft2 path without pathkeys - (100, 143, 4, no)
ft2 path with pathkeys - (100, 125.365, 1, no)

The interesting nested loop paths with and without pathkeys have costs (200, 268.755, 10, no) and (210, 281.145, 10, yes). The path with pathkeys does not kick out the path without pathkeys. The nested loop path without pathkeys in turn beats every other path, merge join or hash join, on the basis of lower startup cost and "fuzzily" equal total cost. The same emerges as the winner owing to lower startup and total costs compared to the path without pathkeys.

In both the cases (with or without patch) since the number of resultant rows is very small, the planner thinks that sorting them after joining is better than getting them sorted while joining.

Increasing the sorting cost factor (when use_remote_estimates = false) from 1.1 to 1.2 makes the difference disappear.

Since the startup costs for postgres_fdw are large portion of total cost, extra 10% of rest of the cost is comparable to 1% fuzzy limit. IMO, we shouldn't bother too much about it as the path costs are not much different.

In one of the comments, there is a typo s/sidea/side/. Rest of the patch looks fine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

Previous
From: Victor Wagner
Date:
Subject: Patch (2): Implement failover on libpq connect level.
Next
From: Etsuro Fujita
Date:
Subject: Re: Foreign join pushdown vs EvalPlanQual