Re: Problems with plan estimates in postgres_fdw - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Problems with plan estimates in postgres_fdw
Date
Msg-id 5C7FC458.4020400@lab.ntt.co.jp
Whole thread Raw
In response to Re: Problems with plan estimates in postgres_fdw  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Problems with plan estimates in postgres_fdw
List pgsql-hackers
(2019/02/25 18:40), Etsuro Fujita wrote:
> (2019/02/23 0:21), Antonin Houska wrote:
>> Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:

>> What about an ORDER BY expression that contains multiple Var nodes? For
>> example
>>
>> SELECT * FROM foo ORDER BY x + y;

> Actually, add_paths_with_pathkeys_for_rel() generates such pre-sorted
> paths for the base relation, as shown in the below example using HEAD
> without the patchset proposed in this thread:
>
> postgres=# explain verbose select a+b from ft1 order by a+b;
> QUERY PLAN
> --------------------------------------------------------------------------
> Foreign Scan on public.ft1 (cost=100.00..200.32 rows=2560 width=4)
> Output: (a + b)
> Remote SQL: SELECT a, b FROM public.t1 ORDER BY (a + b) ASC NULLS LAST
> (3 rows)
>
> I think it is OK for that function to generate such paths, as tlists for
> such paths would be adjusted in apply_scanjoin_target_to_paths(), by
> doing create_projection_path() to them.
>
> Conversely, it appears that add_foreign_ordered_paths() added by the
> patchset would generate such pre-sorted paths *redundantly* when the
> input_rel is the final scan/join relation. Will look into that.

As mentioned above, I noticed that we generated a properly-sorted path 
redundantly in add_foreign_ordered_paths(), for the case where 1) the 
input_rel is the final scan/join relation and 2) the input_rel already 
has a properly-sorted path in its pathlist that was created by 
add_paths_with_pathkeys_for_rel().  So, I modified 
add_foreign_ordered_paths() to skip creating a path in that case.

(2019/02/25 19:31), Etsuro Fujita wrote:
 > + /*
 > + * If this is an UPPERREL_ORDERED step performed on the final
 > + * scan/join relation, the costs obtained from the cache wouldn't yet
 > + * contain the eval costs for the final scan/join target, which would
 > + * have been updated by apply_scanjoin_target_to_paths(); add the eval
 > + * costs now.
 > + */
 > + if (fpextra && !IS_UPPER_REL(foreignrel))
 > + {
 > + /* The costs should have been obtained from the cache. */
 > + Assert(fpinfo->rel_startup_cost >= 0 &&
 > + fpinfo->rel_total_cost >= 0);
 > +
 > + startup_cost += foreignrel->reltarget->cost.startup;
 > + run_cost += foreignrel->reltarget->cost.per_tuple * rows;
 > + }

 > but as I said in the nearby thread, this part might be completely
 > redundant. Will re-consider about this.

I thought that if it was true that in add_foreign_ordered_paths() we 
didn't need to consider pushing down the final sort to the remote in the 
case where the input_rel to that function is the final scan/join 
relation, the above code could be entirely removed from 
estimate_path_cost_size(), but I noticed that that is wrong; as we do 
not always have a properly-sorted path in the input_rel's pathlist 
already.  So, I think we need to keep the above code so that we we can 
consider the final sort pushdown for the final scan/join relation in 
add_foreign_ordered_paths().  Sorry for the confusion.  I moved the 
above code to the place we get cached costs, which I hope makes 
estimate_path_cost_size() a bit more readable.

Other changes:

* I modified add_foreign_final_paths() to skip creating a path if 
possible, in a similar way to add_foreign_ordered_paths().
* I fixed the issue pointed out by Jeff [1].
* I added more comments.

Best regards,
Etsuro Fujita

[1] 
https://www.postgresql.org/message-id/CAMkU%3D1z1Ez7fb_P_0Bc1040npE5fCOnu0M1DFyOzCp%3De%3DrBJCw%40mail.gmail.com

Attachment

pgsql-hackers by date:

Previous
From: Erik Rijkers
Date:
Subject: Re: patch tester symbols
Next
From: Etsuro Fujita
Date:
Subject: Re: Problems with plan estimates in postgres_fdw