Thread: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
<div class="WordSection1"><p class="MsoNormal">Hey all,<p class="MsoNormal"> <p class="MsoNormal">I’m building a FDW arounda column-store backend (similar to CStore but for genomic data!).<p class="MsoNormal"> <p class="MsoNormal">I havetables in the billions of rows, and have a common query pattern of asking for the table size (i.e. SELECT COUNT(*) FROMbig_fdw_table; ).<p class="MsoNormal"> <p class="MsoNormal">This is a read-optimized system in which I know in constanttime the exact dimensions of the table.<p class="MsoNormal"> <p class="MsoNormal">Is there any way to convince PostgresFDW to leverage the analyze row counts or even the “double* totalRowCount” returned from the AcquireSampleRows callbackfrom my AnalyzeForeignTable function so that it does not do a full-table scan for a COUNT(*) etc?<p class="MsoNormal"> <pclass="MsoNormal">My current fallback is to export a specialized function that returns the table rowcount for a given FDW table, but that then leaks into the user-application driving these queries.<p class="MsoNormal"> <pclass="MsoNormal">Thanks in advance!<p class="MsoNormal">Gabe<p class="MsoNormal"> <p class="MsoNormal"style="background:white"><b><span style="font-size:10.0pt;color:#262626">Gabe Rudy</span></b><span style="font-size:10.0pt;color:#595959">| VP Product & Engineering | Golden Helix, Inc.</span><span style="color:black"></span><pclass="MsoNormal"> </div>
Re: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
"Gabe F. Rudy" <rudy@goldenhelix.com> writes: > Is there any way to convince Postgres FDW to leverage the analyze row counts or even the "double* totalRowCount" returnedfrom the AcquireSampleRows callback from my AnalyzeForeignTable function so that it does not do a full-table scanfor a COUNT(*) etc? No. In PG's view, ANALYZE-based row counts are imprecise by definition. regards, tom lane
Re: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
Hey all,
I’m building a FDW around a column-store backend (similar to CStore but for genomic data!).
I have tables in the billions of rows, and have a common query pattern of asking for the table size (i.e. SELECT COUNT(*) FROM big_fdw_table; ).
This is a read-optimized system in which I know in constant time the exact dimensions of the table.
Is there any way to convince Postgres FDW to leverage the analyze row counts or even the “double* totalRowCount” returned from the AcquireSampleRows callback from my AnalyzeForeignTable function so that it does not do a full-table scan for a COUNT(*) etc?
My current fallback is to export a specialized function that returns the table row count for a given FDW table, but that then leaks into the user-application driving these queries.
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
Ok, I get that. Really what I am *rooting* for is Aggregate (and Sort By) Push-Down to FDW plugins. I can already internalize conditional filters for most cases, and doing a count on the filtered results would be considerablyfaster in my FDW back-end before all the records and Datums have to be constructed for postgres to do the counting. Similarly, I'm very excited about the potential for FDW to advertise a-priori sort states, so things like external merge-sortscan pass-through the request for sorted data for fields in which sorting is a no-op in my backend. Importantly my IDs are sorted by definition since they are essentially array indexes into the column-store, so joining onthem with merge-sort should be blazing fast, but currently time is wasted sorting these pre-sorted fields. Just my 2c, and I'll be tracking the 9.6 progress that includes some of these proposals. Gabe -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Thursday, February 25, 2016 11:21 PM To: Gabe F. Rudy <rudy@goldenhelix.com> Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] FDW handling count(*) through AnalyzeForeignTable or other constant time push-down "Gabe F. Rudy" <rudy@goldenhelix.com> writes: > Is there any way to convince Postgres FDW to leverage the analyze row counts or even the "double* totalRowCount" returnedfrom the AcquireSampleRows callback from my AnalyzeForeignTable function so that it does not do a full-table scanfor a COUNT(*) etc? No. In PG's view, ANALYZE-based row counts are imprecise by definition. regards, tom lane
Re: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
Hi Gabe, Did you get Aggregate Pushdown FDW plugin? Would be really helpful if you can share some insight on your investigation. Regards Anant -- View this message in context: http://postgresql.nabble.com/FDW-handling-count-through-AnalyzeForeignTable-or-other-constant-time-push-down-tp5889291p5912699.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.