Re: [HACKERS] Faster methods for getting SPI results - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: [HACKERS] Faster methods for getting SPI results
Date
Msg-id dc3daba8-1066-0a7c-641a-0fc4afb1c1ad@BlueTreble.com
Whole thread Raw
In response to [HACKERS] Faster methods for getting SPI results  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: [HACKERS] Faster methods for getting SPI results  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-hackers
On 12/20/16 10:14 PM, Jim Nasby wrote:
> It would be a lot more efficient if we could just grab datums from the
> executor and make a single copy into plpython (or R), letting the PL
> deal with all the memory management overhead.
>
> I briefly looked at using SPI cursors to do just that, but that looks
> even worse: every fetch is executed in a subtransaction, and every fetch
> creates an entire tuplestore even if it's just going to return a single
> value. (But hey, we never claimed cursors were fast...)
>
> Is there any way to avoid all of this? I'm guessing one issue might be
> that we don't want to call an external interpreter while potentially
> holding page pins, but even then couldn't we just copy a single tuple at
> a time and save a huge amount of palloc overhead?

AFAICT that's exactly how DestRemote works: it grabs a raw slot from the 
executor, makes sure it's fully expanded, and sends it on it's way via 
pq_send*(). So presumably the same could be done for SPI, by creating a 
new CommandDest (ISTM none of the existing ones would do what we want).

I'm not sure what the API for this should look like. One possibility is 
to have SPI_execute and friends accept a flag that indicates not to 
build a tupletable. I don't think a query needs to be read-only to allow 
for no tuplestore, so overloading read_only seems like a bad idea.

Another option is to treat this as a "lightweight cursor" that only 
allows forward fetches. One nice thing about that option is it leaves 
open the possibility of using a small tuplestore for each "fetch", 
without all the overhead that a full blown cursor has. This assumes 
there are some use cases where you want to operate on relatively small 
sets of tuples at a time, but you don't need to materialize the whole 
thing in one shot.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [HACKERS] pgstattuple documentation clarification
Next
From: Jim Nasby
Date:
Subject: Re: [HACKERS] Why does plpython delay composite type resolution?