Re: [POC] FETCH limited by bytes. - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: [POC] FETCH limited by bytes.
Date
Msg-id CADkLM=fqQYDHsHeR3pid=P1VpPBVmPH9=vZ+pQb_7KZmWLhztA@mail.gmail.com
Whole thread Raw
In response to Re: [POC] FETCH limited by bytes.  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
Responses Re: [POC] FETCH limited by bytes.  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
List pgsql-hackers
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Dec 26, 2015 at 6:16 AM, Vladimir Sitnikov
<spandir="ltr"><<a href="mailto:sitnikov.vladimir@gmail.com"
target="_blank">sitnikov.vladimir@gmail.com</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0 0
0.8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">>Have you got numbers showing any actual
performancewin for postgres_fdw?<br /><br /></span>For JDBC purposes, it would be nice to have an ability of asking<br
/>backend "to stop fetching if produced more than X MiB of response<br /> data".<br /> For small table (4 int4 fields),
havingdecent fetchSize (~1000) makes<br /> result processing 7 times faster than with fetchSize of 50 rows (14 ms<br />
->2 ms for 2000 rows).<br /> Here are the measurements: [1] and [2].<br /><br /> Note: it is not required to
preciselyfollow given "max fetch bytes"<br /> limit. It would be enough just to stop after certain amount of data<br />
wassent.<br /> The whole thing of using limited fetch size is to avoid running out of<br /> memory at client side.<br
/>I do not think developers care how many rows is fetched at once. It<br /> they do, they should rather use "limit X"
SQLsyntax.<br /><br /> Do you have a suggestion for such a use case?<br /><br /> For fixed-size data types, JDBC driver
canestimate "max sane fetch<br /> size", however:<br /> 1) In order to know data types, a roundtrip is required. This
means<br/> the first fetch must be conservative, thus small queries would be<br /> penalized.<br /> 2) For variable
lengthtypes there is no way to estimate "sane number<br /> of rows", except of using "average row size of already
receiveddata".<br /> This is not reliable, especially if the first rows have nulls, and<br /> subsequent ones contain
non-emptystrings.<br /><br /> [1]: <a href="https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473"
rel="noreferrer"target="_blank">https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473</a><br /> [2]: <a
href="https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387"rel="noreferrer"
target="_blank">https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387</a><br/><span class="HOEnZb"><font
color="#888888"><br/> Vladimir<br /></font></span><div class="HOEnZb"><div class="h5"><br /><br /> --<br /> Sent via
pgsql-hackersmailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br /> To
makechanges to your subscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-hackers" rel="noreferrer"
target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></div></div></blockquote></div><br
/></div><divclass="gmail_extra">I believe that Kyotaro proposed something like that, wherein the FDW would be more
adaptivebased on the amount of memory available, and fetch a number of rows that, by its estimation, would fit in the
memoryavailable. I don't know the progress of that patch.<br /></div><div class="gmail_extra"><br /></div><div
class="gmail_extra">Thispatch is a far less complicated solution and puts the burden on the DBA to figure out
approximatelyhow many rows would fit in memory based on the average row size, and set the per-table option accordingly.
Ifit is later determined that the rows are now too heavy to fit into the space allotted, the fetch size can be altered
forthat table as needed.</div></div> 

pgsql-hackers by date:

Previous
From: Joe Conway
Date:
Subject: oldest/newestCommitTs output by pg_controldata
Next
From: Jeff Janes
Date:
Subject: Re: WIP: Covering + unique indexes.