Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL
Date
Msg-id CAFj8pRB8=cGxi1XAtjiTD0Sn3ZqYkTomEd96OdzGPUVs7tX9Cg@mail.gmail.com
Whole thread Raw
In response to Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL  (Andres Freund <andres@2ndquadrant.com>)
Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Hello

I fixed patch - there was a missing cast to domain when it was used (and all regress tests are ok now).

a some performance tests

set array_fast_update to off;
postgres=# select fill_2d_array(300,300);
 fill_2d_array
───────────────
         90000
(1 row)

Time: 33570.087 ms
postgres=# set array_fast_update to on;
SET
Time: 0.279 ms
postgres=# select fill_2d_array(300,300);
 fill_2d_array
───────────────
         90000
(1 row)

Time: 505.202 ms

CREATE OR REPLACE FUNCTION public.quicksort(l integer, r integer, a integer[])
 RETURNS integer[]
 LANGUAGE plpgsql
 IMMUTABLE STRICT
AS $function$
DECLARE akt int[] = a;
  i integer := l; j integer := r; x integer = akt[(l+r) / 2];
  w integer;
BEGIN
  LOOP
    WHILE akt[i] < x LOOP i := i + 1; END LOOP;
    WHILE x < akt[j] loop j := j - 1; END LOOP;
    IF i <= j THEN
      w := akt[i];
      akt[i] := akt[j]; akt[j] := w;
      i := i + 1; j := j - 1;
    END IF;
    EXIT WHEN i > j;
  END LOOP;
  IF l < j THEN akt := quicksort(l,j,akt); END IF;
  IF i < r then akt := quicksort(i,r,akt); END IF;
  RETURN akt;
END;
$function$


postgres=# set array_fast_update to off;
SET
Time: 0.282 ms
postgres=# SELECT array_upper(quicksort(1,21000,array_agg(a)),1) FROM test;
 array_upper
─────────────
       21000
(1 row)

Time: 5086.781 ms
postgres=# set array_fast_update to on;
SET
Time: 0.702 ms
postgres=# SELECT array_upper(quicksort(1,21000,array_agg(a)),1) FROM test;
 array_upper
─────────────
       21000
(1 row)

Time: 1751.920 ms

So in first test - fast update is 66x faster, second test - 3x faster

I found so for small arrays (about 1000 fields) a difference is not significant.


This code can be cleaned (a domains can be better optimized generally - a IO cast can be expensive for large arrays and domain_check can be used there instead), but it is good enough for discussion if we would this optimization or not.

Regards

Pavel





2013/10/3 Pavel Stehule <pavel.stehule@gmail.com>
Hello

a very ugly test shows a possibility about  100% speedup on reported example (on small arrays, a patch is buggy and doesn't work for larger arrays).

I updated a code to be read only

CREATE OR REPLACE FUNCTION public.fill_2d_array(rows integer, cols integer)
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
DECLARE
    img double precision[][];
    i integer; j integer;
    cont integer; r double precision;
BEGIN
    img  := ARRAY( SELECT 0 FROM generate_series(1, rows * cols) ) ;
    cont:= 0;
    For i IN 1..rows LOOP
            For j IN 1..cols LOOP r := img[i * cols + j];
                    r := (i * cols + j)::double precision;
                    cont := cont + 1; --raise notice '%', img;
            END LOOP;
    END LOOP;
    return cont;
END;
$function$

It exec all expressions

-- original
postgres=# select fill_2d_array(200,200);
 fill_2d_array
---------------
         40000
(1 row)

Time: 12726.117 ms

-- read only version
postgres=# select fill_2d_array(200,200); fill_2d_array
---------------
         40000
(1 row)

Time: 245.894 ms

so there is about 50x slowdown 


2013/10/3 Pavel Stehule <pavel.stehule@gmail.com>



2013/10/3 Tom Lane <tgl@sss.pgh.pa.us>
Pavel Stehule <pavel.stehule@gmail.com> writes:
> If you can do a update of some array in plpgsql now, then you have to work
> with local copy only. It is a necessary precondition, and I am think it is
> valid.

If the proposal only relates to assignments to elements of plpgsql local
variables, it's probably safe, but it's also probably not of much value.
plpgsql has enough overhead that I'm doubting you'd get much real-world
speedup.  I'm also not very excited about putting even more low-level
knowledge about array representation into plpgsql.

I looked to code, and I am thinking so this can be done inside array related routines. We just have to signalize request for inplace update (if we have a local copy).

I have not idea, how significant speedup can be (if any), but current behave is not friendly (and for multidimensional arrays there are no workaround), so it is interesting way - and long time I though about some similar optimization.

Regards

Pavel
 

                        regards, tom lane



Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: logical changeset generation v6.2
Next
From: Andres Freund
Date:
Subject: Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL