Re: Getting sorted data from foreign server - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Getting sorted data from foreign server
Date
Msg-id CAFjFpRc9GLDi9==2dPfkec11Ed9_DV3yvZDfY8ZACv5CbYk4aA@mail.gmail.com
Whole thread Raw
In response to Re: Getting sorted data from foreign server  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Getting sorted data from foreign server  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Oct 15, 2015 at 2:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Oct 14, 2015 at 7:30 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The patch uses a factor of 1.1 (10% increase) to multiple the startup and
> total costs in fpinfo for unsorted data.
>
> This change has caused the plans for few queries in the test postgres_fdw to
> change. There is ORDER BY and LIMIT clause in the queries in postgres_fdw
> testcase to keep test outputs sane and consistent. These ORDER BY clauses
> are not being pushed down the foreign servers. I tried using values upto 2
> for this but still the foreign paths with pathkeys won over those without
> pathkeys.

That's great.  Seeing unnecessary sorts disappear from the regression
tests as a result of this patch is, IMHO, pretty cool.  But these
compiler warnings might also be related:

Thanks for catching these warnings. My compiler (gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3) didn't catch those.  Sorry for those mistakes. Worst, values in fpinfo were being modified.
 

postgres_fdw.c:605:7: error: variable 'rows' is used uninitialized whenever 'if'
      condition is false [-Werror,-Wsometimes-uninitialized]
                if (fpinfo->use_remote_estimate)
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
postgres_fdw.c:628:13: note: uninitialized use occurs here
  ...rows,

Done.
 
     ^~~~
postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
                if (fpinfo->use_remote_estimate)
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

That isn't true. Left the code as is.
postgres_fdw.c:596:15: note: initialize the variable 'rows' to silence this
      warning
                double          rows;
                                    ^
                                     = 0.0

That suggestion isn't applicable either. Left the code as is.
 
postgres_fdw.c:605:7: error: variable 'startup_cost' is used uninitialized
      whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
                if (fpinfo->use_remote_estimate)
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
postgres_fdw.c:629:13: note: uninitialized use occurs here
  ...startup_cost,
     ^~~~~~~~~~~~

Done.
 
postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
                if (fpinfo->use_remote_estimate)
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

That's not true. Left the code as is.
 
postgres_fdw.c:598:21: note: initialize the variable 'startup_cost' to silence
      this warning
                Cost            startup_cost;
                                            ^
                                             = 0.0

That suggestion isn't applicable either. Left the code as is.
 
postgres_fdw.c:605:7: error: variable 'total_cost' is used uninitialized
      whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
                if (fpinfo->use_remote_estimate)
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
postgres_fdw.c:630:13: note: uninitialized use occurs here
  ...total_cost,
     ^~~~~~~~~~

Done.
 
postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
                if (fpinfo->use_remote_estimate)
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
postgres_fdw.c:599:19: note: initialize the variable 'total_cost' to silence
      this warning
                Cost            total_cost;
                                          ^
                                           = 0.0


Those suggestions aren't applicable.
 
At least some of those look like actual mistakes.


All of those were mistakes. Can you please check if you see still see those warnings?
 
I would suggest that instead of the regression test you added, you
instead change one of the existing tests to order by a non-pushable
expression that produces the same final output ordering.  The revised
EXPLAIN output shows that the existing tests are already testing that
the sort is getting pushed down; we don't need another test for the
same thing.

Right, removed the testcase.
 
Instead, let's adjust one of the tests so that we can
also show that we don't push down ORDER BY clauses when it would be
wrong to do so.  For example, suppose the user specifies a collation
in the ORDER BY clause.

Done. Modified single table with alias test to use tableoid in ORDER BY. tableoid being a system column is unsafe to be pushed down and it being constant doesn't change the order of result. Using collation might get into problems of supported/default collation, which affect the ability to push expressions down.
 

appendOrderByClause could use a header comment, and its definition
line is more than 80 characters, so it should be wrapped.

Done.
 

DEFAULT_FDW_SORT_MULTIPLIER should have a one-line comment, like /* If
no remote estimates, assume a sort costs 10% extra */.  The comment
inside postgresGetForeignPaths shouldn't refer specifically to the 10%
value, in case we change it.  So maybe: "Assume sorting costs
something, so that we won't ask for sorted results when they aren't
useful, but not too much, so that remote sorts will look attractive
when the ordering is useful to us."  Note that "attractive" is missing
a "t" in the patch.

Done.
 

Do you need to ignore ec_has_volatile equivalence classes in
find_em_expr_for_rel?  I bet yes.


is_foreign_expr takes care of volatile expressions, but using ec_has_volatile saves some cycles in is_foreign_expr. I did not check it inside find_em_expr_for_rel() since that would not match the label of this function i.e. find equivalence member for given relation. The function is being called from appendOrderByClause, where checking volatility again is not required. The function as it is might be useful somewhere else in future. I have added a separate testcase for making sure that we do not push volatile expressions.
 
Attached is the patch which takes care of above comments.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachment

pgsql-hackers by date:

Previous
From: "Shulgin, Oleksandr"
Date:
Subject: Re: How to import PostgreSQL 9.2.4 dump to PostgreSQL 9.4.5?
Next
From: Amit Kapila
Date:
Subject: Re: Parallel Seq Scan