Re: SQL/MED - file_fdw - Mailing list pgsql-hackers

From Itagaki Takahiro
Subject Re: SQL/MED - file_fdw
Date
Msg-id AANLkTi=7gOn0jpqhF=bNL6NoVSa7V4f-W6dZHtXhGTu3@mail.gmail.com
In response to Re: SQL/MED - file_fdw  (Shigeru HANADA <hanada@metrosystems.co.jp>)
List pgsql-hackers
On Fri, Jan 14, 2011 at 14:20, Shigeru HANADA <hanada@metrosystems.co.jp> wrote:
> After copying statistics of pgbench_xxx tables into csv_xxx tables,
> planner generates same plans as for local tables, but costs of
> ForeignScan nodes are a little lower than those of SeqScan nodes.
> Forced Nested Loop uses Materialize node as expected.

Interesting. It means we need per-column statistics for foreign
tables in addition to cost values.

> ISTM that a new interface which is called from ANALYZE would help to
> update statistics of foreign tables.  If we could leave the sampling
> algorithm to FDWs, acquire_sample_rows() might fit for that purpose.

We will discuss how to collect statistics from foreign tables
in the next development cycle. I think we have two choices here:
 #1. Retrieve sample rows from remote foreign tables and
     store stats in the local pg_statistic.
 #2. Use remote statistics for each foreign table directly.

acquire_sample_rows() would be a method for #1. For #2, we already
provide hooks to generate virtual statistics, get_relation_stats_hook()
and its family. We could treat statistics for foreign tables in a
similar way to those hooks.

file_fdw would prefer #1 because there is no external storage that
holds statistics for CSV files, but pgsql_fdw might prefer #2 because
the remote server already has stats for the underlying table.
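
To make #1 concrete for a CSV-backed table, here is a minimal sketch in
Python rather than backend C. It is not the actual acquire_sample_rows()
code: it uses plain reservoir sampling (Algorithm R), and the function
names sample_rows() and column_stats() are made up for illustration.
Treating an empty field as NULL is also a simplifying assumption.

```python
import csv
import io
import random


def sample_rows(lines, target, rng):
    """Reservoir sampling (Algorithm R): keep `target` rows, chosen uniformly.

    Hypothetical helper, not a PostgreSQL API.
    """
    reservoir = []
    seen = 0
    for row in csv.reader(lines):
        seen += 1
        if len(reservoir) < target:
            # Fill the reservoir with the first `target` rows.
            reservoir.append(row)
        else:
            # Keep the new row with probability target / seen.
            j = rng.randrange(seen)
            if j < target:
                reservoir[j] = row
    return reservoir, seen


def column_stats(rows, ncols):
    """Per-column null fraction and distinct count within the sample."""
    stats = []
    for col in range(ncols):
        values = [r[col] for r in rows]
        # Assumption: an empty field stands for NULL.
        nulls = sum(1 for v in values if v == "")
        distinct = len({v for v in values if v != ""})
        null_frac = nulls / len(values) if values else 0.0
        stats.append({"null_frac": null_frac, "n_distinct": distinct})
    return stats


data = io.StringIO("1,a\n2,\n3,a\n4,b\n")
rows, total = sample_rows(data, target=10, rng=random.Random(0))
stats = column_stats(rows, ncols=2)
```

The per-column numbers computed from the sample are what would end up
in the local pg_statistic under #1. Under #2 nothing like this runs
locally, since the remote server is asked for its stats instead.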

--
Itagaki Takahiro

