Re: Problems with plan estimates in postgres_fdw - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Problems with plan estimates in postgres_fdw
Date
Msg-id 5C92283A.40208@lab.ntt.co.jp
Whole thread Raw
In response to Re: Problems with plan estimates in postgres_fdw  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Problems with plan estimates in postgres_fdw
List pgsql-hackers
(2019/03/07 19:27), Etsuro Fujita wrote:
> (2019/03/06 22:00), Etsuro Fujita wrote:
>> (2019/02/25 18:40), Etsuro Fujita wrote:
>>> (2019/02/23 0:21), Antonin Houska wrote:
>>>> Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:
>>
>>>> What about an ORDER BY expression that contains multiple Var nodes? For
>>>> example
>>>>
>>>> SELECT * FROM foo ORDER BY x + y;
>>
>>> Actually, add_paths_with_pathkeys_for_rel() generates such pre-sorted
>>> paths for the base relation, as shown in the below example using HEAD
>>> without the patchset proposed in this thread:
>>>
>>> postgres=# explain verbose select a+b from ft1 order by a+b;
>>> QUERY PLAN
>>> --------------------------------------------------------------------------
>>>
>>>
>>> Foreign Scan on public.ft1 (cost=100.00..200.32 rows=2560 width=4)
>>> Output: (a + b)
>>> Remote SQL: SELECT a, b FROM public.t1 ORDER BY (a + b) ASC NULLS LAST
>>> (3 rows)
>>>
>>> I think it is OK for that function to generate such paths, as tlists for
>>> such paths would be adjusted in apply_scanjoin_target_to_paths(), by
>>> doing create_projection_path() to them.
>>>
>>> Conversely, it appears that add_foreign_ordered_paths() added by the
>>> patchset would generate such pre-sorted paths *redundantly* when the
>>> input_rel is the final scan/join relation. Will look into that.
>>
>> As mentioned above, I noticed that we generated a properly-sorted path
>> redundantly in add_foreign_ordered_paths(), for the case where 1) the
>> input_rel is the final scan/join relation and 2) the input_rel already
>> has a properly-sorted path in its pathlist that was created by
>> add_paths_with_pathkeys_for_rel(). So, I modified
>> add_foreign_ordered_paths() to skip creating a path in that case.

I had a rethink about this and noticed that the previous version was not 
enough; since query_pathkeys is set to root->sort_pathkeys in that case 
(ie, the case where the input_rel is a base or join relation), we would 
already have considered pushing down the final sort when creating 
pre-sorted foreign paths for the input_rel in postgresGetForeignPaths() 
or postgresGetForeignJoinPaths(), so *no work* in that case.  I modified 
the patch as such.

>> (2019/02/25 19:31), Etsuro Fujita wrote:
>> > + /*
>> > + * If this is an UPPERREL_ORDERED step performed on the final
>> > + * scan/join relation, the costs obtained from the cache wouldn't yet
>> > + * contain the eval costs for the final scan/join target, which would
>> > + * have been updated by apply_scanjoin_target_to_paths(); add the eval
>> > + * costs now.
>> > + */
>> > + if (fpextra && !IS_UPPER_REL(foreignrel))
>> > + {
>> > + /* The costs should have been obtained from the cache. */
>> > + Assert(fpinfo->rel_startup_cost >= 0 &&
>> > + fpinfo->rel_total_cost >= 0);
>> > +
>> > + startup_cost += foreignrel->reltarget->cost.startup;
>> > + run_cost += foreignrel->reltarget->cost.per_tuple * rows;
>> > + }
>>
>> > but as I said in the nearby thread, this part might be completely
>> > redundant. Will re-consider about this.
>>
>> I thought that if it was true that in add_foreign_ordered_paths() we
>> didn't need to consider pushing down the final sort to the remote in the
>> case where the input_rel to that function is the final scan/join
>> relation, the above code could be entirely removed from
>> estimate_path_cost_size(), but I noticed that that is wrong;

I was wrong here; we don't need to consider pushing down the final sort 
in that case, as mentioned above, so we could remove that code.  Sorry 
for the confusion.

>> I moved the above
>> code to the place we get cached costs, which I hope makes
>> estimate_path_cost_size() a bit more readable.

I moved that code from the UPPERREL_ORDERED patch to the UPPERREL_FINAL 
patch, because we still need that code for estimating the costs of a 
foreign path created in add_foreign_final_paths() defined in the latter 
patch.

I polished the patches; revised code, added some more regression tests, 
and tweaked comments further.

Attached is an updated version of the patch set.

Best regards,
Etsuro Fujita

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: BUG #15572: Misleading message reported by "Drop functionoperation" on DB with functions having same name
Next
From: David Rowley
Date:
Subject: Re: Should we add GUCs to allow partition pruning to be disabled?