Re: Problems with plan estimates in postgres_fdw - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Problems with plan estimates in postgres_fdw
Date
Msg-id 7273.1551438041@localhost
Whole thread Raw
In response to Re: Problems with plan estimates in postgres_fdw  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Problems with plan estimates in postgres_fdw
List pgsql-hackers
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

> (2019/02/22 22:54), Antonin Houska wrote:
> > Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp>  wrote:
> >> On the other hand, the code bit added by
> >> 0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch handles the
> >> case where a post-scan/join processing step other than grouping/aggregation
> >> (ie, the final sort or LIMIT/OFFSET) is performed directly on a base or join
> >> relation, as in the LIMIT example.  So, the two changes are handling different
> >> cases, hence both changes would be required.
> >
> > Do you mean this part?
>
> No, I don't.  What I meant was this part:
>
> +       /*
> +        * If this is an UPPERREL_ORDERED step performed on the final
> +        * scan/join relation, the costs obtained from the cache wouldn't yet
> +        * contain the eval costs for the final scan/join target, which would
> +        * have been updated by apply_scanjoin_target_to_paths(); add the eval
> +        * costs now.
> +        */
> +       if (fpextra && !IS_UPPER_REL(foreignrel))
> +       {
> +           /* The costs should have been obtained from the cache. */
> +           Assert(fpinfo->rel_startup_cost >= 0 &&
> +                  fpinfo->rel_total_cost >= 0);
> +
> +           startup_cost += foreignrel->reltarget->cost.startup;
> +           run_cost += foreignrel->reltarget->cost.per_tuple * rows;
> +       }

> Yeah, but I think the code bit cited above is needed.  Let me explain using
> yet another example with grouping and ordering:
>
>     SELECT a+b, random() FROM foreign_table GROUP BY a+b ORDER BY a+b;
>
> For this query, the sort_input_target would be {a+b}, (to which the
> foreignrel->reltarget would have been set when called from
> estimate_path_cost_size() called from postgresGetForeignUpperPaths() with the
> UPPERREL_ORDERED stage), but the final target would be {a+b, random()}, so the
> sort_input_target isn't the same as the final target in that case; the costs
> needs to be adjusted so that the costs include the ones of generating the
> final target.  That's the reason why I added the code bit you cited.

I used gdb to help me understand, however the condition

    if (fpextra && !IS_UPPER_REL(foreignrel))

never evaluated to true with the query above. fpextra was only !=NULL when
estimate_path_cost_size() was called from add_foreign_ordered_paths(), but at
the same time foreignrel->reloptkind was RELOPT_UPPER_REL.

It's still unclear to me why add_foreign_ordered_paths() passes the input
relation (input_rel) to estimate_path_cost_size(). If it passed the output rel
(i.e. ordered_rel in this case) like add_foreign_grouping_paths() does, then
foreignrel->reltarget should contain the random() function. (What I see now is
that create_ordered_paths() leaves ordered_rel->reltarget empty, but that's
another problem to be fixed.)

Do you still see a reason to call estimate_path_cost_size() this way?

> > Note about the !activeWindows test: It occurs me now that no paths should be
> > added to UPPERREL_ORDERED relation if the query needs WindowAgg node, because
> > postgres_fdw currently cannot handle UPPERREL_WINDOW relation. Or rather in
> > general, the FDW should not skip any stage just because it doesn't know how to
> > build the paths.
>
> That's right.  In postgres_fdw, by the following code in
> postgresGetForeignUpperPaths(), we will give up on doing the UPPERREL_ORDERED
> step remotely, if the query has the UPPERREL_WINDOW (and/or UPPERREL_DISTINCT)
> steps, because in that case we would have input_rel->fdw_private=NULL for a
> call to postgresGetForeignUpperPaths() with the UPPERREL_ORDERED stage:
>
>     /*
>      * If input rel is not safe to pushdown, then simply return as we cannot
>      * perform any post-join operations on the foreign server.
>      */
>     if (!input_rel->fdw_private ||
>         !((PgFdwRelationInfo *) input_rel->fdw_private)->pushdown_safe)
>         return;

I understand now: if FDW did not handle the previous stage, then
input_rel->fdw_private is NULL.

--
Antonin Houska
https://www.cybertec-postgresql.com


pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Question about commit 11cf92f6e2e13c0a6e3f98be3e629e6bd90b74d5
Next
From: Peter Eisentraut
Date:
Subject: Re: Add exclusive backup deprecation notes to documentation