Thread: Cost estimation in foreign data wrappers

Cost estimation in foreign data wrappers

From: Hadi Moshayedi
Date: 21 February 2014, 14:12:19

Hello,

There is a callback function in FDWs that should also set estimates for the startup and total costs of each path. Assume an FDW adds only one path (e.g. in file_fdw). I am trying to understand what can go wrong if we do a bad job of estimating these costs.

Since we have only one scan path here, it doesn't make a difference in choosing the best scan path.

From reading the code and running some experiments, I think a bad estimate can matter in two ways: (1) underestimating a nested loop's cost, and (2) not materializing the inner table in a nested loop.

* Are there any other cases where this can be significant?

* Assume we are not sure about the exact cost, but we know that it is in the [lower_bound, upper_bound] range, where upper_bound can be 10x lower_bound. Then which value is better to choose: the lower bound, the upper bound, or the average?

Thanks,

-- Hadi
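
One possible way to frame the lower_bound/upper_bound question, offered as an assumption rather than anything stated in this thread: if the aim is to minimize the worst-case ratio between the estimate and the true cost, the candidates can be compared directly. The sketch below is plain arithmetic in Python, not PostgreSQL code; the function names and the sample bounds (100 and 1000) are made up for illustration.

```python
import math

def worst_case_ratio(estimate, lower, upper):
    # Largest factor by which the estimate can be off (in either direction)
    # if the true cost lies anywhere in [lower, upper].
    return max(estimate / lower, upper / estimate)

def candidates(lower, upper):
    # Hypothetical choices an FDW author might consider.
    return {
        "lower_bound": lower,
        "upper_bound": upper,
        "arithmetic_mean": (lower + upper) / 2,
        "geometric_mean": math.sqrt(lower * upper),
    }

# Assumed sample range where upper_bound is 10x lower_bound.
lower, upper = 100.0, 1000.0
for name, estimate in candidates(lower, upper).items():
    print(name, round(worst_case_ratio(estimate, lower, upper), 2))
```

Under this criterion either bound can be off by the full 10x, the arithmetic mean by up to 5.5x, and the geometric mean by about 3.16x in either direction. Whether the worst-case ratio is the right criterion depends on which kind of planning mistake is more expensive for the workload.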

Re: Cost estimation in foreign data wrappers

From

Tom Lane

Date:

21 February 2014, 14:55:21

Hadi Moshayedi <hadi@moshayedi.net> writes:
> There is a callback function in fdw's which should also set estimates for
> startup and total costs for each path. Assume a fdw adds only one path
> (e.g. in file_fdw). I am trying to understand what can go wrong if we do a
> bad job in estimating these costs.

> Since we have only one scan path here, it doesn't make a difference in
> choosing the best scan path.

Right.  But if there's more than one table in the query, it might make a
difference in terms of what join plan gets chosen.  I'd say that getting
an accurate rowcount estimate is usually far more important, though.
        regards, tom lane
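
To illustrate the two effects raised in this thread (an underestimated nested loop and a skipped materialization of the inner side), here is a toy cost model in Python. It is not PostgreSQL's actual costing code: the formulas, the constant CPU_PER_TUPLE, the function names, and the sample numbers are all simplified assumptions made up for illustration.

```python
# Hypothetical cost of handling one in-memory tuple (assumed value).
CPU_PER_TUPLE = 0.01

def plain_nested_loop(outer_rows, inner_scan_cost):
    # The foreign (inner) scan is repeated once per outer row.
    return outer_rows * inner_scan_cost

def materialized_nested_loop(outer_rows, inner_rows, inner_scan_cost):
    # The inner side is scanned once and stored, then re-read from memory
    # for each remaining outer row.
    first_pass = inner_scan_cost + inner_rows * CPU_PER_TUPLE
    rescans = (outer_rows - 1) * inner_rows * CPU_PER_TUPLE
    return first_pass + rescans

def choose(outer_rows, inner_rows, estimated_scan_cost):
    # Pick whichever variant looks cheaper under the ESTIMATED scan cost.
    plain = plain_nested_loop(outer_rows, estimated_scan_cost)
    mat = materialized_nested_loop(outer_rows, inner_rows, estimated_scan_cost)
    return "plain" if plain < mat else "materialized"

outer_rows, inner_rows = 100, 1000
true_cost = 500      # assumed real cost of one foreign scan
bad_estimate = 5     # assumed gross underestimate reported by the FDW

print("chosen with bad estimate: ", choose(outer_rows, inner_rows, bad_estimate))
print("chosen with true cost:    ", choose(outer_rows, inner_rows, true_cost))
print("true cost of plain:       ", plain_nested_loop(outer_rows, true_cost))
print("true cost of materialized:",
      materialized_nested_loop(outer_rows, inner_rows, true_cost))
```

With these assumed numbers, the underestimate makes repeated rescans look cheaper than materializing, so the model picks the plain nested loop even though its true cost is far higher. Note that inner_rows and outer_rows enter every formula, which is consistent with the remark above that the row count estimate usually matters more than the scan cost itself.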