The largest arrays I expect at the moment are more or less sparse
vectors of around 4.8k elements, and I have noticed that the
input/output (in the C/C++ extension) does not scale well with the
number of elements in the array.
Using a function that sums all elements in the array, these are the
timings for ~150k arrays at various sizes (the query also orders the
results descending and limits them to 10 rows):
 128:    61ms
 256:    80ms
 512:   681ms
1024:  1065ms
2048:  7682ms
4096: 21332ms
That's why I suspected that the construction of the PostgreSQL array
was not optimal.
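
For reference, here is a minimal sketch of the kind of sum function I
am benchmarking (not the exact code; the name int4array_sum is
illustrative, and it reads the input via deconstruct_array):

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"
#include "catalog/pg_type.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(int4array_sum);

Datum
int4array_sum(PG_FUNCTION_ARGS)
{
    ArrayType *arr = PG_GETARG_ARRAYTYPE_P(0);
    Datum     *elems;
    bool      *nulls;
    int        nelems, i;
    int64      sum = 0;

    /* flatten the array into a Datum list, then iterate */
    deconstruct_array(arr, INT4OID, sizeof(int4), true, 'i',
                      &elems, &nulls, &nelems);

    for (i = 0; i < nelems; i++)
        if (!nulls[i])
            sum += DatumGetInt32(elems[i]);

    PG_RETURN_INT64(sum);
}

registered with something along these lines:

CREATE FUNCTION int4array_sum(int4[]) RETURNS bigint
    AS 'MODULE_PATHNAME' LANGUAGE C STRICT;
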
On Wed, Apr 27, 2011 at 14:49, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Apr 27, 2011 at 6:02 AM, Adrian Schreyer <ams214@cam.ac.uk> wrote:
>> At the moment I am using the following code to construct a PostgreSQL
>> array from a C array in my C extension, but I am not sure this is
>> really the best solution:
>>
>> const int *data = array.data(); // C array
>> Datum *d = (Datum *) palloc(sizeof(Datum) * size);
>>
>> for (int i = 0; i < size; i++) d[i] = Int32GetDatum(data[i]);
>>
>> ArrayType *a = construct_array(d, size, INT4OID, sizeof(int4), true, 'i');
>>
>> Is this okay, or is there a better solution (an existing function in
>> the PostgreSQL source, for example)?
>
> That's pretty much the best way AFAIK. Int32GetDatum doesn't do
> anything fancy -- it's just a 32-bit mask/assignment. Constructing
> the array all at once is going to be a lot better than incrementally
> creating it. Do you expect the arrays to be large, say bigger than
> 10k elements?
>
> merlin
>
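
For comparison, the incremental construction Merlin mentions would look
roughly like the sketch below (build_array_incrementally is an
illustrative name; it uses accumArrayResult/makeArrayResult and assumes
size > 0):

#include "postgres.h"
#include "utils/array.h"
#include "catalog/pg_type.h"

static ArrayType *
build_array_incrementally(const int *data, int size)
{
    ArrayBuildState *astate = NULL;
    int i;

    /* one accumArrayResult call per element, growing the state as needed */
    for (i = 0; i < size; i++)
        astate = accumArrayResult(astate, Int32GetDatum(data[i]),
                                  false, INT4OID, CurrentMemoryContext);

    /* collapse the accumulated state into a one-dimensional array */
    return DatumGetArrayTypeP(makeArrayResult(astate, CurrentMemoryContext));
}

construct_array over a pre-filled Datum buffer avoids the per-element
call overhead, which is presumably why the batch approach wins.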