Home > mailing lists

Re: Postgres array parser - Mailing list pgsql-performance

From	Aleksej Trofimov
Subject	Re: Postgres array parser
Date	December 14, 2011 05:59:53
Msg-id	4EE87386.1010401@ruptela.lt Whole thread Raw
In response to	Re: Postgres array parser ("Marc Mamin" <M.Mamin@intershop.de>)
Responses	Re: Postgres array parser
List	pgsql-performance

Tree view

Yes, it would be great, but I haven't found such a function, which
splits 2 dimensional array into rows =) Maybe we'll modify existing
function, but unfortunately we have tried hstore type and function in
postgres and we see a significant performance improvements. So we only
need to convert existing data into hstore and I think this is a good
solution.

On 12/14/2011 11:21 AM, Marc Mamin wrote:
> Hello,
>
> For such cases (see below), it would be nice to have an unnest function that only affect the first array dimension.
>
> Something like
>
> unnest(ARRAY[[1,2],[2,3]], SLICE=1)
> =>
> unnest
> ------
> [1,2]
> [2,3]
>
>
> With this function, I imagine that following sql function
> might beat the plpgsql FOREACH version.
>
>
> CREATE OR REPLACE FUNCTION input_value_un (in_inputs numeric[], in_input_nr numeric)
>     RETURNS numeric AS
> $BODY$
>
>    SELECT u[1][2]
>    FROM unnest($1, SLICE =1) u
>    WHERE u[1][1]=in_input_nr
>    LIMIT 1;
>
> $BODY$
>     LANGUAGE sql IMMUTABLE;
>
>
>
> best regards,
>
> Marc Mamin
>
>
>> -----Original Message-----
>> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-
>> owner@postgresql.org] On Behalf Of Pavel Stehule
>> Sent: Dienstag, 13. Dezember 2011 15:43
>> To: Aleksej Trofimov
>> Cc: pgsql-performance@postgresql.org
>> Subject: Re: [PERFORM] Postgres array parser
>>
>> Hello
>>
>> 2011/12/13 Aleksej Trofimov<aleksej.trofimov@ruptela.lt>:
>>> We have tried foreach syntax, but we have noticed performance
>> degradation:
>>> Function with for: 203ms
>>> Function with foreach: ~250ms:
>>>
>>> there is functions code:
>>> CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[],
>> in_input_nr
>>> numeric)
>>>   RETURNS numeric AS
>>> $BODY$
>>> declare i numeric[];
>>> BEGIN
>>>         FOREACH i SLICE 1 IN ARRAY in_inputs
>>>             LOOP
>>>                  if i[1] = in_input_nr then
>>>                     return i[2];
>>>                  end if;
>>>             END LOOP;
>>>
>>>     return null;
>>> END;
>>> $BODY$
>>>   LANGUAGE plpgsql VOLATILE
>>>   COST 100;
>>>
>>> CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[],
>> in_input_nr
>>> numeric)
>>>   RETURNS numeric AS
>>> $BODY$
>>> declare
>>>   size int;
>>> BEGIN
>>>   size = array_upper(in_inputs, 1);
>>>     IF size IS NOT NULL THEN
>>>
>>>         FOR i IN 1 .. size LOOP
>>>             if in_inputs[i][1] = in_input_nr then
>>>                 return in_inputs[i][2];
>>>             end if;
>>>         END LOOP;
>>>     END IF;
>>>
>>>     return null;
>>> END;
>>> $BODY$
>>>   LANGUAGE plpgsql VOLATILE
>>>   COST 100;
>>>
>>>
>>> On 12/13/2011 04:02 PM, Pavel Stehule wrote:
>>>> Hello
>>>>
>>>> do you know FOREACH IN ARRAY statement in 9.1
>>>>
>>>> this significantly accelerate iteration over array
>>>>
>>>>
>>>> http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach-
>> in-array/
>>>>
>>>>
>>>> 2011/12/13 Aleksej Trofimov<aleksej.trofimov@ruptela.lt>:
>>>>> Hello, I wanted to ask according such a problem which we had faced
>> with.
>>>>> We are widely using postgres arrays like key->value array by doing
>> like
>>>>> this:
>>>>>
>>>>> {{1,5},{2,6},{3,7}}
>>>>>
>>>>> where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions
>> we are
>>>>> using self written array_input(array::numeric[], key::numeric)
>> function
>>>>> which makes a loop on whole array and searches for key like
>>>>> FOR i IN 1 .. size LOOP
>>>>>             if array[i][1] = key then
>>>>>                 return array[i][2];
>>>>>             end if;
>>>>> END LOOP;
>>>>>
>>>>> But this was a good solution until our arrays and database had
>> grown. So
>>>>> now
>>>>> FOR loop takes a lot of time to find value of an array.
>>>>>
>>>>> And my question is, how this problem of performance could be
>> solved? We
>>>>> had
>>>>> tried pgperl for string parsing, but it takes much more time than
>> our
>>>>> current solution. Also we are thinking about self-written C++
>> function,
>>>>> may
>>>>> be someone had implemented this algorithm before?
>>>>>
>>>> you can use indexes or you can use hstore
>>>>
>>>> Regards
>>>>
>>>> Pavel Stehule
>>>>
>>>>> --
>>>>> Best regards
>>>>>
>>>>> Aleksej Trofimov
>>>>>
>>>>>
>>>>> --
>>>>> Sent via pgsql-performance mailing list
>>>>> (pgsql-performance@postgresql.org)
>>>>> To make changes to your subscription:
>>>>> http://www.postgresql.org/mailpref/pgsql-performance
>>>
>> It is strange - on my comp FOREACH is about 2x faster
>>
>> postgres=# select input_value(array(select
>> generate_series(1,1000000)::numeric), 100000);
>>   input_value
>> -------------
>>
>> (1 row)
>>
>> Time: 495.426 ms
>>
>> postgres=# select input_value_fe(array(select
>> generate_series(1,1000000)::numeric), 100000);
>>   input_value_fe
>> ----------------
>>
>> (1 row)
>>
>> Time: 248.980 ms
>>
>> Regards
>>
>> Pavel
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-
>> performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance


--
Best regards

Aleksej Trofimov

pgsql-performance by date:

From: "Marc Mamin"
Date: 14 December 2011, 05:22:23
Subject: Re: Postgres array parser

From: "Marc Mamin"
Date: 14 December 2011, 06:27:41
Subject: Re: Postgres array parser

Re: Postgres array parser - Mailing list pgsql-performance

Previous

Next