Re: WIP: Collecting statistics on CSV file data - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: WIP: Collecting statistics on CSV file data |
Date | |
Msg-id | 4EE74C5F.3000309@lab.ntt.co.jp Whole thread Raw |
In response to | Re: WIP: Collecting statistics on CSV file data (Shigeru Hanada <shigeru.hanada@gmail.com>) |
Responses |
Re: WIP: Collecting statistics on CSV file data
Re: WIP: Collecting statistics on CSV file data |
List | pgsql-hackers |
(2011/12/12 19:33), Shigeru Hanada wrote: > (2011/12/09 21:16), Etsuro Fujita wrote: >> I updated the patch. Please find attached a patch. > > I've examined v5 patch, and got reasonable EXPLAIN results which reflect > collected statistics! As increasing STATISTICS option, estimated rows > become better. Please see attached stats_*.txt for what I > tested. > > stats_none.txt : before ANALYZE > stats_100.txt : SET STATISTICS = 100 for all columns, and ANALYZE > stats_10000.txt : SET STATISTICS = 10000 for all columns, and ANALYZE > > I think that this patch become ready for committer after some > minor corrections: > > Couldn't set n_distinct > ======================= > I couldn't set n_distinct to columns of foreign tables. With some > research, I noticed that ATSimplePermissions should accept > ATT_FOREIGN_TABLE too for that case. In addition, regression tests for > ALTER FOREIGN TABLE should be added to detect this kind of problem. > > Showing stats target > ==================== > We can see stats target of ordinary tables with \d+, but it is not > available for foreign tables. I think "Stats target" column should be > added even though output of \d+ for foreign tables become wider. One > possible idea is to remove useless "Storage" column instead, but views > have that column even though their source could come from foreign tables. > > Please see attached patch for these two items. > > Comments of FdwRoutine > ====================== > Mention about optional handler is obsolete. We should clearly say > AnalyzeForeignTable is optional (can be set to NULL) and rest are > required. IMO separating them with comment would help FDW authors to > understand requirements, e.g.: > > typedef struct FdwRoutine > { > NodeTag type; > > /* > * These Handlers are required to execute simple scan on a foreign > * table. If any of them was set to NULL, scan on a foreign table > * managed by such FDW would fail. > */ > PlanForeignScan_function PlanForeignScan; > ExplainForeignScan_function ExplainForeignScan; > BeginForeignScan_function BeginForeignScan; > IterateForeignScan_function IterateForeignScan; > ReScanForeignScan_function ReScanForeignScan; > EndForeignScan_function EndForeignScan; > > /* > * Handlers below are optional. You can set any of them to > * NULL to tell PostgreSQL that the FDW doesn't have the > * capability. > */ > AnalyzeForeignTable_function AnalyzeForeignTable; > } FdwRoutine; > > Code formatting > =============== > Some code lines go past 80 columns. > > Message style > ============= > The terms 'cannot support option "n_distinct"...' used in > ATPrepSetOptions seems little unusual in PostgreSQL. Should we say > 'cannot set "n_distinct_inherited" for foreign tables' for that case? > > Typo > ==== > Typo "spcify" is in document of analyze. Thank you for your effectiveness experiments and proposals for improvements. I updated the patch according to your proposals. Attached is the updated version of the patch. Best regards, Etsuro Fujita > Regards, > > > >
Attachment
pgsql-hackers by date: