Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL
Date
Msg-id 5293C109.90206@vmware.com
Whole thread Raw
In response to Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
(sorry for possible duplicates, used wrong account on first try)

On 07.10.2013 17:00, Pavel Stehule wrote:
> Hello
>
> I fixed patch - there was a missing cast to domain when it was used (and
> all regress tests are ok now).

This doesn't work correctly for varlen datatypes. I modified your
quicksort example to work with text[], and got this:

postgres=#  SELECT
array_upper(quicksort(1,20000,array_agg((g::text))),1) FROM
generate_series(1,20000) g;
ERROR:  invalid input syntax for integer: ""
CONTEXT:  PL/pgSQL function quicksort(integer,integer,text[]) line 10 at
assignment
PL/pgSQL function quicksort(integer,integer,text[]) line 16 at assignment

The handling of array domains is also wrong. You replace the new value
to the array and only check the domain constraint after that. If the
constraint is violated, it's too late to roll that back. For example:

postgres=# create domain posarray as int[] check (value[1] > 0);
CREATE DOMAIN
postgres=# do $$
declare
    iarr posarray;
begin
    begin
      iarr[1] := 1;
      iarr[1] := -1;
    exception when others then raise notice 'got error: %', sqlerrm; end;
    raise notice 'value: %', iarr[1];
end;
$$;
NOTICE:  got error: value for domain posarray violates check constraint
"posarray_check"
NOTICE:  value: -1
DO

Without the patch, it prints 1, but with the patch, -1.


In general, I'm not convinced this patch is worth the trouble. The
speedup isn't all that great; manipulating large arrays in PL/pgSQL is
still so slow that if you care about performance you'll want to rewrite
your function in some other language or use temporary tables. And you
only get a gain with arrays of fixed-length element type with no NULLs.

So I think we should drop this patch in its current form. If we want to
make array manipulation in PL/pgSQL, I think we'll have to do something
similar to how we handle "row" variables, or something else entirely.
But that's a much bigger patch, and I'm not sure it's worth the trouble
either.

Looking at perf profile and the functions involved, though, I see some
low-hanging fruit: in array_set, the new array is allocated with
palloc0'd, but we only need to zero out a few alignment bytes where the
new value is copied. Replacing it with palloc() will save some cycles,
per the attached patch. Nowhere near as much as your patch, but this is
pretty uncontroversial.

- Heikki

Attachment

pgsql-hackers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: why semicolon after begin is not allowed in postgresql?
Next
From: Alexander Korotkov
Date:
Subject: Re: Cube extension split algorithm fix