Re: json api WIP patch - Mailing list pgsql-hackers

From: james
Subject: Re: json api WIP patch
Msg-id: 50EC7C70.5010104@mansionfamily.plus.com
In response to: Re: json api WIP patch (Andrew Dunstan <andrew@dunslane.net>)
List: pgsql-hackers
>> I had been wondering how to do such an insertion efficiently in the
>> context of SPI, but it seems that there is no SPI_copy equivalent
>> that would allow a query parse and plan to be avoided.
 
>
> Your query above would need to be planned too, although the plan will be trivial.

Ah yes, I meant that I had not found a way to avoid it (for multi-row 
inserts etc) from a stored proc context where I have SPI functions 
available.
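For what it's worth, the closest I have found is to pay the parse and
plan cost once by saving an SPI plan and re-executing it with new
parameters. A minimal sketch - the table "foo" and the function name
are invented, and it assumes it runs inside an SPI_connect()ed
function:

    #include "postgres.h"
    #include "executor/spi.h"
    #include "catalog/pg_type.h"
    #include "utils/builtins.h"

    /* Saved plan: parse/plan happens once, later executions are cheap. */
    static SPIPlanPtr saved_plan = NULL;

    static void
    insert_json_rows(const char *json)
    {
        Oid   argtypes[1] = { JSONOID };
        Datum values[1];

        if (saved_plan == NULL)
        {
            SPIPlanPtr plan = SPI_prepare(
                "insert into foo "
                "select * from json_populate_recordset(null::foo, $1)",
                1, argtypes);

            if (plan == NULL)
                elog(ERROR, "SPI_prepare failed: %s",
                     SPI_result_code_string(SPI_result));
            SPI_keepplan(plan);     /* keep the plan beyond SPI_finish() */
            saved_plan = plan;
        }

        /* json shares text's varlena layout, so this makes a valid datum */
        values[0] = CStringGetTextDatum(json);
        SPI_execute_plan(saved_plan, values, NULL, false, 0);
    }

That still pays the json parsing cost on every call, of course - only
the SQL parse/plan is amortised.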

> You should not try to use it as a general bulk load facility. And it
> will not be as fast as COPY for several reasons, including that the
> Json parsing routines are necessarily much heavier than the COPY
> parse routines, which have in any case been optimized over quite a
> long period. Also, a single json datum is limited to no more than
> 1Gb. If you have such a datum, parsing it involves having it in
> memory and then taking a copy (I wonder if we could avoid that step -
> will take a look). Then each object is decomposed into a hash table
> of key value pairs, which is then used to construct the record datum.
> Each field name in the result record is used to look up the value in
> the hash table - this happens once in the case of populate_record()
> and once per object in the array in the case of populate_recordset().
> In the latter case the resulting records are put into a tuplestore
> structure (which spills to disk if necessary) which is then returned
> to the caller when all the objects in the json array are processed.
> COPY doesn't have these sorts of issues. It knows without having to
> look things up where each datum is in each record, and it stashes the
> result straight into the target table. It can read and insert huge
> numbers of rows without significant memory implications.

Yes - but I don't think I can use COPY from a stored proc context, can
I?  If I could use binary COPY from a stored proc that has received a
binary param and unpacked it to the data, it would be handy.

If SPI provided a way to perform a copy to a temp table and then some 
callback on an iterator that yields rows to it, that would do the trick 
I guess.
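Purely to sketch what I mean - nothing like this exists in SPI today,
so both names below are hypothetical - the interface might look
something like:

    /*
     * Hypothetical only: SPI would drive the COPY machinery and pull
     * rows from a callback, so a stored proc could stream already
     * unpacked rows into a (temp) table without building SQL text or
     * going through the planner per row.
     */
    typedef bool (*SPICopyRowCB) (Datum *values, bool *nulls, void *arg);

    extern int SPI_copy_from_callback(const char *tablename,
                                      SPICopyRowCB next_row,
                                      void *arg);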

> Perhaps if you give us a higher level view of what you're trying to achieve we can help you better.

I had been trying to identify a way to work with record sets where the 
records might be used for insert, or for updates or deletion statements, 
preferably without forming a large custom SQL statement that must then 
be parsed and planned (and which would be a PITA if I wanted to use the 
SQL-C preprocessor or some language bindings that like to prepare a 
statement and execute with params).

The data I work with has a master-detail structure and insertion 
performance matters, so I'm trying to limit manipulations to one 
statement per table per logical operation even where there are multiple 
detail rows.
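The nearest thing I have today is a prepared statement over array
parameters, so the detail rows go in as a single execution however
many there are. A rough sketch, with invented table and column names:

    #include "postgres.h"
    #include "executor/spi.h"
    #include "catalog/pg_type.h"

    /* One saved plan inserts N detail rows per execution. */
    static SPIPlanPtr detail_plan = NULL;

    static void
    insert_details(Datum master_id, Datum qtys, Datum notes)
    {
        Oid   argtypes[3] = { INT8OID, INT4ARRAYOID, TEXTARRAYOID };
        Datum values[3];

        if (detail_plan == NULL)
        {
            detail_plan = SPI_prepare(
                "insert into detail (master_id, qty, note) "
                "select $1, $2[i], $3[i] "
                "from generate_subscripts($2, 1) g(i)",
                3, argtypes);
            SPI_keepplan(detail_plan);
        }

        values[0] = master_id;   /* int8 datum */
        values[1] = qtys;        /* int4[] built with construct_array() */
        values[2] = notes;       /* text[] likewise */
        SPI_execute_plan(detail_plan, values, NULL, false, 0);
    }

Building the arrays with construct_array() on the C side keeps it to
one trip through the executor per detail table.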

Sometimes the network latency can be a pain too, and that also
suggests an RPC that unpacks and inserts the data locally.

Cheers
James



