On Sun, 29 Oct 2023 at 12:45, Bruce Momjian <bruce@momjian.us> wrote:
> Has anything been done about this issue?
Nothing has been done. I was hoping to get the attention of a few
people who have dealt more with postgres_fdw in the past.
I've attached a patch with adjusts DEFAULT_FDW_TUPLE_COST and sets it
to 0.2. I set it to this because my experiment in [1] showed that it
was about 21x lower than the actual costs (on my machine with a
loopback fdw connecting to the same instance and database using my
example query). Given that we have parallel_tuple_cost set to 0.1 by
default, the network cost of a tuple from an FDW of 0.2 seems
reasonable to me. Slightly higher is probably also reasonable, but
given the seeming lack of complaints, I think I'd rather err on the
low side.
Changing it to 0.2, I see 4 plans change in postgres_fdw's regression
tests. All of these changes are due to STD_FUZZ_FACTOR causing some
other plan to win in add_path().
For example the query EXPLAIN (VERBOSE, ANALYZE) SELECT a, sum(b),
min(b), count(*) FROM pagg_tab GROUP BY a HAVING avg(b) < 22 ORDER BY
1; the plan switches from a HashAggregate to a GroupAggregate. This is
because after increasing the DEFAULT_FDW_TUPLE_COST to 0.2 the sorted
append child (fuzzily) costs the same as the unsorted seq scan path
and the sorted path wins in add_path due to having better pathkeys.
The seq scan path is then thrown away and we end up doing the Group
Aggregate using the sorted append children.
If I change STD_FUZZ_FACTOR to something like 1.0000001 then the plans
no longer change when I do:
alter server loopback options (add fdw_tuple_cost '0.01');
<run the query>
alter server loopback options (drop fdw_tuple_cost);
<run the query>
Ordinarily, I'd not care too much about that, but I did test the
performance of one of the plans and the new plan came out slower than
the old one.
I'm not exactly sure how best to proceed on this in the absence of any feedback.
David
[1] https://postgr.es/m/CAApHDvopVjjfh5c1Ed2HRvDdfom2dEpMwwiu5-f1AnmYprJngA@mail.gmail.com