Thread: Foreign table scan estimates

Foreign table scan estimates

From: "Albe Laurenz"

While playing around with ANALYZE on foreign tables, I noticed
that the row count estimate for foreign scans is still
initialized to 1000 even if there are statistics for the
foreign table.  I think that this should be improved.

The attached patch illustrates my suggestion.

BTW, is there any other place where foreign table statistics
should or do enter the planning process?

Yours,
Laurenz Albe

Re: Foreign table scan estimates

From: Tom Lane

"Albe Laurenz" <laurenz.albe@wien.gv.at> writes:
> While playing around with ANALYZE on foreign tables, I noticed
> that the row count estimate for foreign scans is still
> initialized to 1000 even if there are statistics for the
> foreign table.  I think that this should be improved.

> The attached patch illustrates my suggestion.

I don't think this is appropriate; it will just waste cycles because
the FDW will have to repeat the calculations after obtaining a real
estimate of the foreign table size.  If we trusted pg_class.reltuples
to be up to date, there might be some value in this.  But we don't
trust that for regular tables (cf. plancat.c), and I don't see why
we would do so for foreign tables.

I think on the whole it's better to abdicate responsibility here and
require the FDW to do something in its GetForeignRelSize function.
It's not like we'd be saving the FDW a lot of code in the (unlikely)
case that this is exactly what it would do anyway.

A different line of thought would be to refactor the definition of
GetForeignRelSize so that it's supposed to set rel->tuples and then
after that we do the selectivity calculation to set rel->rows.
But that doesn't seem attractive to me either; it saves a few lines
for trivial FDWs but makes life impossible for complex ones.  The
FDW might well have a better idea than the core code does about how
to calculate selectivity for remote tables.
        regards, tom lane
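
To make the trade-off above concrete, here is a minimal, self-contained sketch of what an FDW might do inside its GetForeignRelSize callback. It is not PostgreSQL source code: MockRel is a hypothetical stand-in for RelOptInfo that models only rel->tuples and rel->rows, and the names scale_tuples and mock_get_foreign_rel_size, along with all their parameters, are assumptions made for illustration. The scaling step mirrors the idea referenced with "cf. plancat.c": stored statistics are treated as a tuple density and applied to a current size estimate rather than being trusted as-is.

```c
#include <assert.h>

/* Hypothetical stand-in for the planner's RelOptInfo; only the two
 * fields discussed in this thread are modelled. */
typedef struct MockRel
{
    double tuples;   /* estimated total tuples in the relation */
    double rows;     /* estimated rows after applying restrictions */
} MockRel;

/* Round a non-negative value to the nearest whole number without libm. */
double
round_nonneg(double x)
{
    return (double) (long long) (x + 0.5);
}

/*
 * Scale stored statistics to the current relation size: tuple density
 * (reltuples / relpages) times the current page count.
 * Returns -1 when there are no usable statistics.
 */
double
scale_tuples(double reltuples, double relpages, double curpages)
{
    if (relpages <= 0.0 || reltuples < 0.0)
        return -1.0;
    return round_nonneg((reltuples / relpages) * curpages);
}

/*
 * Hypothetical FDW-side size estimate. The selectivity is supplied by
 * the caller because, as argued above, the FDW may know better than
 * the core code how selective a remote restriction is.
 */
void
mock_get_foreign_rel_size(MockRel *rel, double reltuples, double relpages,
                          double curpages, double selectivity)
{
    double tuples;

    assert(selectivity >= 0.0 && selectivity <= 1.0);

    tuples = scale_tuples(reltuples, relpages, curpages);
    if (tuples < 0.0)
        return;              /* no statistics: keep the existing default */

    rel->tuples = tuples;
    rel->rows = round_nonneg(tuples * selectivity);
    if (rel->rows < 1.0)
        rel->rows = 1.0;     /* never report an empty result as certain */
}
```

With the row estimate pre-initialized to 1000, stored statistics of 1000 tuples in 10 pages, a current size of 20 pages and a selectivity of 0.25, this sketch sets tuples to 2000 and rows to 500. Without usable statistics it leaves the default of 1000 untouched.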


Re: Foreign table scan estimates

From: "Albe Laurenz"

Tom Lane wrote:
>> While playing around with ANALYZE on foreign tables, I noticed
>> that the row count estimate for foreign scans is still
>> initialized to 1000 even if there are statistics for the
>> foreign table.  I think that this should be improved.

>> The attached patch illustrates my suggestion.

> I don't think this is appropriate; it will just waste cycles because
> the FDW will have to repeat the calculations after obtaining a real
> estimate of the foreign table size.  If we trusted pg_class.reltuples
> to be up to date, there might be some value in this.  But we don't
> trust that for regular tables (cf. plancat.c), and I don't see why
> we would do so for foreign tables.
>
> I think on the whole it's better to abdicate responsibility here and
> require the FDW to do something in its GetForeignRelSize function.
> It's not like we'd be saving the FDW a lot of code in the (unlikely)
> case that this is exactly what it would do anyway.

I agree.

Yours,
Laurenz Albe