Re: patch: SQL/MED(FDW) DDL - Mailing list pgsql-hackers

From Robert Haas
Subject Re: patch: SQL/MED(FDW) DDL
Date
Msg-id AANLkTikRB7HvcOBGVpLYaB4S3bTjMri4iySPxWALnTwa@mail.gmail.com
In response to Re: patch: SQL/MED(FDW) DDL  (Shigeru HANADA <hanada@metrosystems.co.jp>)
List pgsql-hackers
On Tue, Oct 5, 2010 at 8:10 AM, Shigeru HANADA
<hanada@metrosystems.co.jp> wrote:
> On Mon, 4 Oct 2010 19:31:52 -0400
> Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, Sep 30, 2010 at 3:48 AM, Shigeru HANADA
>> <hanada@metrosystems.co.jp> wrote:
>> > How about having cost hints in generic option of the foreign table or
>> > its columns?  Generic options are storage for wrappers, not for
>> > PostgreSQL core modules.  Wrappers can use their own format to
>> > represent various information, and use the hints to estimate costs of
>> > a path.
>>
>> I do think we're going to need some kind of local caching of relevant
>> information from the foreign side.  Really, I doubt that fdwoptions
>> are the place for that, though: that's data for the user to input, not
>> a place for the wrapper to scribble on internally.  The trick is that
>> there's a lot of stuff you might want to cache, and we don't really
>> know anything about what the format of it is - for example, you might
>> have foreign-side statistics that need to get cached locally, but they
>> needn't be in the same format we use for pg_statistic.  Perhaps part
>> of setting up an FDW should be creating tables with prespecified
>> definitions and passing the table names to CREATE FOREIGN DATA WRAPPER
>> as options.  Maybe we could even arrange to set up the dependencies
>> appropriately...
> Agreed.  I withdraw the idea of storing foreign-side statistics in
> generic options.
>
> Can we treat statistics of a foreign table separately?
>
> 1. Same as local tables (maybe required)
>   (pg_statistic.*, pg_class.reltuples/relpages)
> They will be used by the planner/optimizer to estimate basic costs based
> on tuple selectivity, result row count and so on.  Such statistics
> could be generated by the ANALYZE module if the FDW can supply all tuples
> from the foreign side.  The FDW should be able to correct the basic costs
> via another API, because foreign queries might have some overheads,
> such as connection and transfer.
> ISTM that it is very difficult for a non-PG FDW to generate PG-style
> statistics correctly.
>
> 2. Dependent on the FDW (optional)
>   (in various, arbitrary formats)
> They will be used by the FDW to optimize the query to be executed on
> the foreign side in its own way.  As you say, new table(s) to store such
> statistics can be created during creation of a new FOREIGN DATA WRAPPER
> or installation of a new fdw_handler module.  Maybe ANALYZE should call
> another API that collects this kind of statistics.
>
> I think that (1) is necessary in the first step, but (2) is not.

I disagree.  I wouldn't bother doing either one of them unless you can do both.
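
To make the cost-correction idea in point (1) of the quoted message concrete,
here is a minimal sketch in Python (not PostgreSQL code). It assumes a planner
that derives a base cost from local, pg_class-style numbers and a wrapper hook
that then adds connection and transfer overhead. All names (LocalStats,
ForeignOverhead, base_scan_cost, fdw_adjust_cost) and the cost constants are
hypothetical and do not correspond to any API in the patch.

```python
from dataclasses import dataclass


@dataclass
class LocalStats:
    """Kind (1): PostgreSQL-style numbers cached locally
    (compare pg_class.reltuples / pg_class.relpages)."""
    reltuples: float
    relpages: int


@dataclass
class ForeignOverhead:
    """Hypothetical per-wrapper overheads; fields and units are assumptions."""
    connection_cost: float
    transfer_cost_per_row: float


def base_scan_cost(stats, selectivity, page_cost=1.0, cpu_tuple_cost=0.01):
    """Estimate result rows and a basic scan cost from local statistics only.

    The formula is a simplified sequential-scan-style estimate chosen for
    illustration; it is not the planner's actual cost model.
    """
    rows = stats.reltuples * selectivity
    cost = stats.relpages * page_cost + stats.reltuples * cpu_tuple_cost
    return rows, cost


def fdw_adjust_cost(rows, cost, overhead):
    """Let the wrapper correct the basic cost for connection and transfer."""
    return cost + overhead.connection_cost + rows * overhead.transfer_cost_per_row


if __name__ == "__main__":
    stats = LocalStats(reltuples=10000, relpages=100)
    rows, cost = base_scan_cost(stats, selectivity=0.1)
    total = fdw_adjust_cost(rows, cost, ForeignOverhead(50.0, 0.02))
    print(rows, cost, total)
```

The split mirrors the two-step estimate described in the quoted text: the
first function needs only PG-style statistics, while the second is the place
where a wrapper could apply whatever it knows about the foreign side.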

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

