Re: [POC] FETCH limited by bytes. - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: [POC] FETCH limited by bytes.
Date
Msg-id CADkLM=fqQYDHsHeR3pid=P1VpPBVmPH9=vZ+pQb_7KZmWLhztA@mail.gmail.com
In response to Re: [POC] FETCH limited by bytes.  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
Responses Re: [POC] FETCH limited by bytes.
List pgsql-hackers
On Sat, Dec 26, 2015 at 6:16 AM, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
> Have you got numbers showing any actual performance win for postgres_fdw?

For JDBC purposes, it would be nice to be able to ask the
backend to stop fetching once it has produced more than X MiB of
response data.
For a small table (4 int4 fields), a decent fetchSize (~1000) makes
result processing 7 times faster than a fetchSize of 50 rows (14 ms
-> 2 ms for 2000 rows).
Here are the measurements: [1] and [2].
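[Editorial note: the 7x gap reported above is consistent with the number of
network round trips a cursor-based fetch needs. A minimal sketch of that
arithmetic, with illustrative numbers taken from the figures quoted here:]

```java
// Illustration only: how fetchSize translates into round trips when
// draining a cursor. Smaller fetch sizes mean more trips to the server.
public class FetchRoundTrips {
    // Round trips needed to drain `rows` rows at a given fetch size
    // (ceiling division).
    static int roundTrips(int rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize;
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(2000, 50));   // 40 round trips at fetchSize 50
        System.out.println(roundTrips(2000, 1000)); // 2 round trips at fetchSize 1000
    }
}
```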

Note: it is not required to follow the given "max fetch bytes"
limit precisely. It would be enough to stop after a certain amount of
data has been sent.
The whole point of using a limited fetch size is to avoid running out
of memory on the client side.
I do not think developers care how many rows are fetched at once. If
they do, they should rather use "LIMIT x" SQL syntax.

Do you have a suggestion for such a use case?

For fixed-size data types, the JDBC driver can estimate a "max sane
fetch size", however:
1) Knowing the data types requires a round trip, so the first fetch
must be conservative, which penalizes small queries.
2) For variable-length types there is no way to estimate a "sane
number of rows" except by using the average row size of
already-received data. This is not reliable, especially if the first
rows contain nulls and subsequent ones contain non-empty strings.
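[Editorial note: a sketch of the client-side heuristic described in point 2,
under assumed names (memoryBudgetBytes, nextFetchSize are illustrative, not
pgjdbc API). It shows how a nulls-first result set skews the estimate:]

```java
// Hypothetical sketch: derive the next fetchSize from the average size
// of rows already received, clamped to sane bounds.
public class AdaptiveFetchSize {
    static int nextFetchSize(long memoryBudgetBytes, long bytesSeen, long rowsSeen) {
        if (rowsSeen == 0) {
            return 50; // first fetch must be conservative: row size is unknown
        }
        long avgRowBytes = Math.max(1, bytesSeen / rowsSeen);
        long rows = memoryBudgetBytes / avgRowBytes;
        return (int) Math.min(Math.max(rows, 1), 100_000); // clamp
    }

    public static void main(String[] args) {
        long budget = 1 << 20; // 1 MiB client-side budget
        // If the first 1000 rows were mostly nulls (~8 bytes each), the
        // estimate explodes; later ~4 KiB rows would then blow the budget.
        System.out.println(nextFetchSize(budget, 8L * 1000, 1000));    // huge, clamped to 100000
        System.out.println(nextFetchSize(budget, 4096L * 1000, 1000)); // 256
    }
}
```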

[1]: https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473
[2]: https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387

Vladimir


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

I believe Kyotaro proposed something like that: an FDW that adapts to the amount of memory available and fetches the number of rows that, by its estimation, would fit in that memory. I don't know the current status of that patch.

This patch is a far less complicated solution. It puts the burden on the DBA to estimate how many rows would fit in memory, based on the average row size, and to set the per-table option accordingly. If it is later determined that the rows have grown too heavy to fit in the space allotted, the fetch size for that table can be adjusted as needed.
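The DBA's sizing arithmetic above is straightforward. A minimal sketch,
assuming the patch exposes fetch_size as a per-table FDW option as described
(the table name, budget, and exact option syntax are illustrative):

```java
// Hypothetical sizing helper for a DBA: memory budget divided by the
// observed average row width gives a per-table fetch size.
public class FetchSizeSizing {
    static long fetchSizeFor(long memoryBudgetBytes, long avgRowBytes) {
        return Math.max(1, memoryBudgetBytes / avgRowBytes);
    }

    public static void main(String[] args) {
        long fetchSize = fetchSizeFor(4L << 20, 512); // 4 MiB budget, ~512-byte rows
        // Assumed per-table option syntax, mirroring the description above.
        System.out.printf(
            "ALTER FOREIGN TABLE remote_tab OPTIONS (SET fetch_size '%d');%n",
            fetchSize); // fetch_size 8192
    }
}
```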
