On Wed, 2005-11-23 at 16:21 +0900, Atsushi Ogawa wrote:
> In space management for PGresult of libpq, the block size of PGresult
> is always PGRESULT_DATA_BLOCKSIZE (2048 bytes). Therefore, when a large
> query result is received, malloc is called many times.
>
> My proposal is to enlarge the block size each time a block is
> allocated. The size of the first block is PGRESULT_DATA_BLOCKSIZE, and
> the size of each following block is doubled until it reaches
> PGRESULT_MAX_DATA_BLOCKSIZE.
>
> PGRESULT_MAX_DATA_BLOCKSIZE is a new constant. I think that 2 MB is
> good enough for this constant.
>
> The test result is the following:
>
> Test SQL:
> select * from accounts; (It is pgbench's table. scale factor is 10.)
>
> The number of malloc calls at pqResultAlloc:
> 8.1.0 : 80542
> patched: 86
>
> Execution time:
> 8.1.0 : 6.80 sec
> patched: 6.73 sec
>
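
For reference, the growth policy described above can be sketched roughly
as follows. This is only an illustration of the doubling-with-a-cap idea,
not the patch itself: next_block_size() and blocks_needed() are
hypothetical helpers made up for this sketch, the 2 MB cap is the value
proposed above, and the model ignores per-block headers and alignment
padding, so it will not reproduce the measured malloc counts exactly.

```c
#include <stddef.h>

#define PGRESULT_DATA_BLOCKSIZE      2048               /* current fixed block size */
#define PGRESULT_MAX_DATA_BLOCKSIZE  (2 * 1024 * 1024)  /* proposed cap (assumed 2 MB) */

/* Hypothetical helper: size of the block to allocate after one of size cur.
 * Doubles the size until the cap is reached, then stays at the cap. */
static size_t
next_block_size(size_t cur)
{
    if (cur >= PGRESULT_MAX_DATA_BLOCKSIZE / 2)
        return PGRESULT_MAX_DATA_BLOCKSIZE;
    return cur * 2;
}

/* Hypothetical helper: number of blocks (i.e. malloc calls) needed to hold
 * total bytes of result data. grow = 0 models the fixed-size scheme,
 * grow != 0 models the doubling scheme. */
static unsigned long
blocks_needed(size_t total, int grow)
{
    size_t        block = PGRESULT_DATA_BLOCKSIZE;
    size_t        held = 0;
    unsigned long n = 0;

    while (held < total)
    {
        held += block;
        n++;
        if (grow)
            block = next_block_size(block);
    }
    return n;
}
```

For an arbitrary 160 MB of result data, this model gives 81920 blocks
with the fixed size and 90 with doubling, which is the same order of
magnitude as the malloc counts reported above.
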
What this highlights for me is that the real issue (IMHO) is our strange
approach to allocating result memory, not an optimization problem.

We really ought to be streaming the result back to the user, not
downloading it all into a massive client-side chunk of memory. It ought
to be possible to do this with very little memory, and it would probably
have the side effect of reducing time-to-first-row. Then we wouldn't have
a memory allocation issue at all.

Consider what will happen if you do "select * from too_big_table". We'll
just run for ages, then blow memory and fail. (That's what it used to
do; does it still? he asks lazily.)

Best Regards, Simon Riggs