On Mon, Feb 7, 2011 at 16:01, Shigeru HANADA <hanada@metrosystems.co.jp> wrote:
> This patch is based on latest FDW API patches which are posted in
> another thread "SQL/MED FDW API", and copy_export-20110104.patch which
> was posted by Itagaki-san.
I have questions about estimate_costs().
* What value does baserel->tuples have? Foreign tables are never analyzed for now, so is that number reliable?
* Your previous measurement showed a much higher startup_cost. When you removed ReScan, the query took a long time, but the planner didn't choose a materialized plan. That might come from the lower startup cost.
* Why do you use lstat() in it? Even if the file is a symlink, the succeeding COPY will read the linked file. So, I think it should be stat() rather than lstat().
+estimate_costs(const char *filename, RelOptInfo *baserel,
+               double *startup_cost, double *total_cost)
+{
...
+	/* get size of the file */
+	if (lstat(filename, &stat) == -1)
+	{
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not stat file \"%s\": %m", filename)));
+	}
+
+	/*
+	 * The way to estimate costs is almost the same as cost_seqscan(), but
+	 * there are some differences:
+	 * - DISK costs are estimated from file size.
+	 * - CPU costs are 10x of seq scan, for overhead of parsing records.
+	 */
+	pages = stat.st_size / BLCKSZ + (stat.st_size % BLCKSZ > 0 ? 1 : 0);
+	run_cost += seq_page_cost * pages;
+
+	*startup_cost += baserel->baserestrictcost.startup;
+	cpu_per_tuple = cpu_tuple_cost + baserel->baserestrictcost.per_tuple;
+	run_cost += cpu_per_tuple * 10 * baserel->tuples;
+	*total_cost = *startup_cost + run_cost;
+
+	return stat.st_size;
+}
--
Itagaki Takahiro