Re: patch: SQL/MED(FDW) DDL - Mailing list pgsql-hackers

From Shigeru HANADA
Subject Re: patch: SQL/MED(FDW) DDL
Date
Msg-id 20101005211031.A022.6989961C@metrosystems.co.jp
In response to Re: patch: SQL/MED(FDW) DDL  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, 4 Oct 2010 19:31:52 -0400
Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Sep 30, 2010 at 3:48 AM, Shigeru HANADA
> <hanada@metrosystems.co.jp> wrote:
> > How about having cost hints in generic option of the foreign table or
> > its columns?  Generic options are storage for wrappers, not for
> > PostgreSQL core modules.  Wrappers can use their own format to
> > represent various information, and use the hints to estimate costs of
> > a path.
> 
> I do think we're going to need some kind of local caching of relevant
> information from the foreign side.  Really, I doubt that fdwoptions
> are the place for that, though: that's data for the user to input, not
> a place for the wrapper to scribble on internally.  The trick is that
> there's a lot of stuff you might want to cache, and we don't really
> know anything about what the format of it is - for example, you might
> have foreign-side statistics that need to get cached locally, but they
> needn't be in the same format we use for pg_statistic.  Perhaps part
> of setting up an FDW should be creating tables with prespecified
> definitions and passing the table names to CREATE FOREIGN DATA WRAPPER
> as options.  Maybe we could even arrange to set up the dependencies
> appropriately...
Agreed.  I withdraw the idea of storing foreign-side statistics in
generic options.

Can we treat the statistics of a foreign table as two separate kinds?

1. Same as local tables (maybe required)  (pg_statistic.*, pg_class.reltuples/relpages)
The planner/optimizer will use them to estimate basic costs based
on tuple selectivity, result row count and so on.  Such statistics
could be generated by the ANALYZE module if the FDW can supply all
tuples from the foreign side.  The FDW should be able to correct the
basic costs via another API, because foreign queries might have some
overhead, such as connection and transfer.
ISTM that it is very difficult for a non-PG FDW to generate PG-style
statistics correctly.
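
To illustrate the cost correction mentioned in (1), here is a rough
sketch.  All names in it (ForeignCostHints, PathCost, fdw_adjust_cost)
are hypothetical and are not part of PostgreSQL or of the patch; it
only assumes that the planner produces a basic cost from PG-style
statistics and that the FDW adds connection and per-row transfer
overhead on top of it.

```c
#include <assert.h>

/* Hypothetical: overhead figures a wrapper might know about its source. */
typedef struct ForeignCostHints
{
    double connect_cost;          /* one-time cost of opening a connection */
    double transfer_cost_per_row; /* cost of shipping one row to the local side */
} ForeignCostHints;

/* Hypothetical: the basic estimate derived from PG-style statistics. */
typedef struct PathCost
{
    double startup_cost;
    double total_cost;
    double rows;
} PathCost;

/*
 * Hypothetical correction API: takes the basic estimate and returns a
 * copy with the foreign-query overhead added.  The row estimate itself
 * is left untouched.
 */
PathCost
fdw_adjust_cost(PathCost base, const ForeignCostHints *hints)
{
    PathCost adjusted = base;

    assert(hints != 0);
    assert(base.rows >= 0.0);

    /* The connection must be established before the first row arrives. */
    adjusted.startup_cost += hints->connect_cost;

    /* Total cost grows by the connection cost plus transfer of all rows. */
    adjusted.total_cost += hints->connect_cost
        + hints->transfer_cost_per_row * base.rows;

    return adjusted;
}
```

With this shape the core never needs to know what the overhead is made
of; it only asks the wrapper to adjust an estimate it already has.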

2. FDW-dependent (optional)  (in various, arbitrary formats)
The FDW will use them to optimize, in its own way, the query to be
executed on the foreign side.  As you say, new table(s) to store such
statistics can be created during creation of a new FOREIGN DATA WRAPPER
or installation of a new fdw_handler module.  Maybe ANALYZE should call
another API which collects this kind of statistics.
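
As a sketch of how ANALYZE might call such an optional API for (2):
the routine table and function names below are hypothetical, invented
only for illustration.  The point is that the callback may be absent,
in which case ANALYZE simply skips the wrapper-specific step.

```c
#include <stddef.h>

/* Hypothetical routine table supplied by a wrapper. */
typedef struct FdwStatsRoutine
{
    /*
     * Optional: collect wrapper-specific statistics in the wrapper's own
     * format.  NULL if the wrapper keeps no such statistics.
     */
    void (*analyze_foreign_stats) (const char *table_name);
} FdwStatsRoutine;

/* Counts calls to the example callback, for demonstration only. */
static int example_calls = 0;

/* Example callback: a real wrapper would store statistics here. */
void
example_collect_stats(const char *table_name)
{
    (void) table_name;
    example_calls++;
}

/*
 * Hypothetical ANALYZE driver for a foreign table.  Returns 1 if the
 * wrapper-specific callback was invoked, 0 if it was skipped.
 */
int
analyze_foreign_table(const FdwStatsRoutine *routine, const char *table_name)
{
    /* Collection of PG-style statistics, as in (1), would happen here. */

    /* Wrapper-specific statistics, as in (2), only if the FDW asks for it. */
    if (routine != NULL && routine->analyze_foreign_stats != NULL)
    {
        routine->analyze_foreign_stats(table_name);
        return 1;
    }
    return 0;
}
```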

I think that (1) is necessary in the first step, but (2) is not.

Regards,
--
Shigeru Hanada


