Thread: Foreign table scan estimates
While playing around with ANALYZE on foreign tables, I noticed that the row count estimate for foreign scans is still initialized to 1000 even if there are statistics for the foreign table. I think that this should be improved. The attached patch illustrates my suggestion.

BTW, is there any other place where foreign table statistics should or do enter the planning process?

Yours,
Laurenz Albe
"Albe Laurenz" <laurenz.albe@wien.gv.at> writes: > While playing around with ANALYZE on foreign tables, I noticed > that the row count estimate for foreign scans is still > initialized to 1000 even if there are statistics for the > foreign table. I think that this should be improved. > The attached patch illustrates my suggestion. I don't think this is appropriate; it will just waste cycles because the FDW will have to repeat the calculations after obtaining a real estimate of the foreign table size. If we trusted pg_class.reltuples to be up to date, there might be some value in this. But we don't trust that for regular tables (cf. plancat.c), and I don't see why we would do so for foreign tables. I think on the whole it's better to abdicate responsibility here and require the FDW to do something in its GetForeignRelSize function. It's not like we'd be saving the FDW a lot of code in the (unlikely) case that this is exactly what it would do anyway. A different line of thought would be to refactor the definition of GetForeignRelSize so that it's supposed to set rel->tuples and then after that we do the selectivity calculation to set rel->rows. But that doesn't seem attractive to me either; it saves a few lines for trivial FDWs but makes life impossible for complex ones. The FDW might well have a better idea than the core code does about how to calculate selectivity for remote tables. regards, tom lane
Tom Lane wrote:
>> While playing around with ANALYZE on foreign tables, I noticed
>> that the row count estimate for foreign scans is still
>> initialized to 1000 even if there are statistics for the
>> foreign table. I think that this should be improved.
>> The attached patch illustrates my suggestion.

> I don't think this is appropriate; it will just waste cycles because
> the FDW will have to repeat the calculations after obtaining a real
> estimate of the foreign table size. If we trusted pg_class.reltuples
> to be up to date, there might be some value in this. But we don't
> trust that for regular tables (cf. plancat.c), and I don't see why
> we would do so for foreign tables.
>
> I think on the whole it's better to abdicate responsibility here and
> require the FDW to do something in its GetForeignRelSize function.
> It's not like we'd be saving the FDW a lot of code in the (unlikely)
> case that this is exactly what it would do anyway.

I agree.

Yours,
Laurenz Albe