Re: Ideas about a better API for postgres_fdw remote estimates - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Ideas about a better API for postgres_fdw remote estimates
Date
Msg-id 1374896.1594047946@sss.pgh.pa.us
Whole thread Raw
In response to Re: Ideas about a better API for postgres_fdw remote estimates  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Ideas about a better API for postgres_fdw remote estimates
List pgsql-hackers
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Per postgres_fdw's get_remote_estimate(), the only data we use right now
>> is the startup_cost, total_cost, rows and width estimates from the
>> top-level Plan node.  That's available immediately from the Plan tree,
>> meaning that basically *nothing* of the substantial display effort
>> expended by explain.c and ruleutils.c is of any value.  So the level-zero

> The 'display effort' you're referring to, when using JSON format with
> explain, is basically to format the results into JSON and return them-
> which is what you're suggesting this mode would do anyway, no..?

Not hardly.  Spend some time studying ruleutils.c sometime ---
reverse-compiling a plan is *expensive*.  For instance, we have to
look up the names of all the operators used in the query quals,
decide what needs quoting, decide what needs parenthesization, etc.
There's also a fun little bit that assigns unique aliases to each
table appearing in the query, which from memory is at least O(N^2)
and maybe worse.  (Admittedly, shipped queries are usually not so
complicated that N would be large.)  And by the way, we're also
starting up the executor, even if you didn't say ANALYZE.

A little bit of fooling with "perf" suggests that when explaining
a pretty simple bitmapscan query --- I used
    EXPLAIN SELECT * FROM tenk1 WHERE unique1 > 9995
which ought to be somewhat representative of what postgres_fdw needs
--- only about half of the runtime is spent within pg_plan_query, and
the other half is spent on explain.c + ruleutils.c formatting work.
So while getting rid of that overhead wouldn't be an earthshattering
improvement, I think it'd be worthwhile.

>> Further down the road, we might want to rethink the whole idea of
>> completely constructing a concrete Plan.  We could get the data we need
>> at the list-of-Paths stage.  Even more interesting, we could (with very
>> little more work) return data about multiple Paths, so that the client
>> could find out, for example, the costs of sorted and unsorted output
>> without paying two network round trips to discover that.

> I have to admit that I'm not really sure how we could make it work, but
> having a way to get multiple paths returned by EXPLAIN would certainly
> be interesting to a lot of users.  Certainly it's easier to see how we
> could get at that info in a postgres_fdw-specific function, and be able
> to understand how to deal with it there and what could be done, but once
> it's there I wonder if other tools might see that and possibly even
> build on it because it'd be the only way to get that kind of info, which
> certainly wouldn't be ideal.

Yeah, thinking about it as a function that inspects partial planner
results, it might be useful for other purposes besides postgres_fdw.
As I said before, I don't think this necessarily has to be bundled as
part of postgres_fdw.  That still doesn't make it part of EXPLAIN.

> That postgres_fdw is an extension is almost as much of a wart as
> anything being discussed here and suggesting that things added to
> postgres_fdw aren't 'core features' seems akin to ignoring the forest
> for the trees-

I think we just had this discussion in another thread.  The fact that
postgres_fdw is an extension is a feature, not a bug, because (a) it
means that somebody could implement their own version if they wanted
it to act differently; and (b) it keeps us honest about whether the
APIs needed by an FDW are accessible from outside core.  I think moving
postgres_fdw into core would be a large step backwards.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Proposal: Automatic partition creation
Next
From: Pavel Stehule
Date:
Subject: bad status of FETCH PERCENT in commitfest application