Why is DEFAULT_FDW_TUPLE_COST so insanely low? - Mailing list pgsql-hackers

From David Rowley
Subject Why is DEFAULT_FDW_TUPLE_COST so insanely low?
Date
Msg-id CAApHDvopVjjfh5c1Ed2HRvDdfom2dEpMwwiu5-f1AnmYprJngA@mail.gmail.com
Whole thread Raw
Responses Re: Why is DEFAULT_FDW_TUPLE_COST so insanely low?
List pgsql-hackers
Over on [1] I was complaining that I thought DEFAULT_FDW_TUPLE_COST,
which is defined as 0.01 was unrealistically low.

For comparison, cpu_tuple_cost, something we probably expect to be in
a CPU cache is also 0.01.  We've defined DEFAULT_PARALLEL_TUPLE_COST
to be 0.1, which is 10x cpu_tuple_cost.  That's coming from a shared
memory segment.  So why do we think DEFAULT_FDW_TUPLE_COST should be
the same as cpu_tuple_cost when that's probably pulling a tuple from
some remote server over some (possibly slow) network?

I did a little experiment in the attached .sql file and did some maths
to try to figure out what it's really likely to be costing us. I tried
this with and without the attached hack to have the planner not
consider remote grouping just to see how much slower pulling a million
tuples through the FDW would cost.

I setup a loopback server on localhost (which has about the lowest
possible network latency) and found the patched query to the foreign
server took:

Execution Time: 530.000 ms

This is pulling all million tuples over and doing the aggregate locally.

Unpatched, the query took:

Execution Time: 35.334 ms

so about 15x faster.

If I take the seqscan cost for querying the local table, which is
14425.00 multiply that by 15 (the extra time it took to pull the 1
million tuples) then divide by 1 million to get the extra cost per
tuple, then that comes to about 0.216.  So that says
DEFAULT_FDW_TUPLE_COST is about 21x lower than it should be.

I tried cranking DEFAULT_FDW_TUPLE_COST up to 0.5 to see what plans
would change in the postgres_fdw regression tests and quite a number
changed. Many seem to be pushing the sorts down to the remote server
where they were being done locally before. A few others just seem
weird. For example, the first one seems to be blindly adding a remote
sort when it does no good. I think it would take quite a bit of study
with a debugger to figure out what's going on with many of these.

Does anyone have any ideas why DEFAULT_FDW_TUPLE_COST was set so low?

Does anyone object to it being set to something more realistic?

David

[1] https://www.postgresql.org/message-id/CAApHDvpXiXLxg4TsA8P_4etnuGQqAAbHWEOM4hGe=DCaXmi_jA@mail.gmail.com

Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Race between KeepFileRestoredFromArchive() and restartpoint
Next
From: David Rowley
Date:
Subject: Re: Add proper planner support for ORDER BY / DISTINCT aggregates