Re: Speed dblink using alternate libpq tuple storage - Mailing list pgsql-hackers

From	Marko Kreen
Subject	Re: Speed dblink using alternate libpq tuple storage
Date	January 20, 2012 10:50:07
Msg-id	20120120144945.GA4863@gmail.com
In response to	Re: Speed dblink using alternate libpq tuple storage (Kyotaro HORIGUCHI <horiguchi.kyotaro@oss.ntt.co.jp>)
Responses	Re: Speed dblink using alternate libpq tuple storage Re: Speed dblink using alternate libpq tuple storage
List	pgsql-hackers

On Tue, Jan 17, 2012 at 05:53:33PM +0900, Kyotaro HORIGUCHI wrote:
> Hello,  This is revised and rebased version of the patch.
> 
> a. Old term `Add Tuple Function' is changed to 'Store
>    Handler'. The reason why not `storage' is simply length of the
>    symbols.
> 
> b. I couldn't find the place to settle PGgetAsCString() in. It is
>    removed and storeHandler()@dblink.c touches PGresAttValue
>    directly in this new patch. Definition of PGresAttValue stays
>    in libpq-fe.h and provided with comment.
> 
> c. Refine error handling of dblink.c. I think it preserves the
>    previous behavior for column number mismatch and type
>    conversion exception.
> 
> d. Document is revised.

First, my priority is on-the-fly result processing,
not allocation optimization.  And this patch seems to make
it possible: I can process results row-by-row, without the
need to buffer all of them in PGresult.  Which is great!

But the current API seems clumsy; I guess it's because the
patch grew from trying to replace the low-level allocator.

I would like to propose a better one-shot API with:
   void *(*RowStoreHandler)(PGresult *res, PGresAttValue *columns);

where the PGresAttValue * is allocated once, inside PGresult.
And the pointers inside point directly into the network buffer.
Of course this requires replacing the current per-column malloc+copy
pattern with a per-row parse+handle pattern, but I think the resulting
API will be better:

1) Pass-through processing does not need to care about unnecessary
   per-row allocations.

2) Handlers that want a copy of the row (like regular libpq)
   can optimize allocations by having a "global" view of the
   row (e.g. one allocation for row header + data).
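
To illustrate point 2, here is a minimal sketch of a handler-side row copy that uses a single allocation for header, column array and data. It is not code from the patch: ColValue is a hypothetical stand-in for PGresAttValue (a length plus a pointer into the receive buffer, with len -1 meaning NULL), and RowCopy and copy_row() are names invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PGresAttValue: a length and a pointer
 * into the network buffer.  len == -1 marks a NULL column. */
typedef struct ColValue {
    int         len;
    const char *value;
} ColValue;

/* Hypothetical copied row: header, column array and column data
 * all live in one malloc'ed block. */
typedef struct RowCopy {
    int      ncols;
    ColValue cols[];    /* data bytes follow the last column entry */
} RowCopy;

/* Copy a whole row with one allocation.  Because all columns are
 * visible at once, the total size can be computed up front. */
static RowCopy *copy_row(const ColValue *cols, int ncols)
{
    size_t   data_size = 0;
    RowCopy *row;
    char    *dst;
    int      i;

    /* Each non-NULL value gets its bytes plus a terminating zero. */
    for (i = 0; i < ncols; i++)
        if (cols[i].len >= 0)
            data_size += (size_t) cols[i].len + 1;

    row = malloc(sizeof(RowCopy) + (size_t) ncols * sizeof(ColValue) + data_size);
    if (row == NULL)
        return NULL;

    row->ncols = ncols;
    dst = (char *) &row->cols[ncols];   /* data area starts after the array */

    for (i = 0; i < ncols; i++) {
        row->cols[i].len = cols[i].len;
        if (cols[i].len < 0) {
            row->cols[i].value = NULL;
            continue;
        }
        if (cols[i].len > 0)
            memcpy(dst, cols[i].value, (size_t) cols[i].len);
        dst[cols[i].len] = '\0';
        row->cols[i].value = dst;
        dst += cols[i].len + 1;
    }
    return row;
}
```

The caller releases the whole row with a single free(). A pass-through handler would skip copy_row() entirely and read the values in place.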

This also optimizes call patterns - first libpq parses the packet,
then the row handler processes the row, no unnecessary back-and-forth.
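
A rough sketch of that call pattern, under stated assumptions: the body layout follows the protocol's DataRow message (Int16 column count, then per column an Int32 length, -1 for NULL, followed by that many bytes). ColValue again stands in for PGresAttValue, and RowHandler, handle_data_row() and count_bytes() are hypothetical names, not libpq API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PGresAttValue; len == -1 marks NULL. */
typedef struct ColValue {
    int         len;
    const char *value;  /* points into the message body, not copied */
} ColValue;

/* Hypothetical handler type: called once per fully parsed row. */
typedef int (*RowHandler)(void *arg, const ColValue *cols, int ncols);

static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Step 1: parse the whole DataRow body into the reusable cols array.
 * Step 2: call the handler exactly once.
 * Returns the handler's result, or -1 if the body is malformed. */
static int handle_data_row(const unsigned char *body, size_t body_len,
                           ColValue *cols, int maxcols,
                           RowHandler handler, void *arg)
{
    size_t pos = 2;
    int    ncols, i;

    if (body_len < 2)
        return -1;
    ncols = (body[0] << 8) | body[1];
    if (ncols > maxcols)
        return -1;

    for (i = 0; i < ncols; i++) {
        int32_t len;

        if (body_len - pos < 4)
            return -1;
        len = (int32_t) read_be32(body + pos);
        pos += 4;

        if (len == -1) {            /* NULL column: no data bytes follow */
            cols[i].len = -1;
            cols[i].value = NULL;
            continue;
        }
        if (len < 0 || body_len - pos < (size_t) len)
            return -1;
        cols[i].len = len;
        cols[i].value = (const char *) (body + pos);
        pos += (size_t) len;
    }
    return handler(arg, cols, ncols);
}

/* Example pass-through handler: sums value lengths, allocates nothing. */
static int count_bytes(void *arg, const ColValue *cols, int ncols)
{
    long *total = arg;
    int   i;

    for (i = 0; i < ncols; i++)
        if (cols[i].len > 0)
            *total += cols[i].len;
    return 0;
}
```

The cols array is owned by the caller and reused for every row, so the per-row path allocates only if the handler chooses to.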

Summary - the current API makes various assumptions about how the
row is processed; let's remove those.

-- 
marko
