Re: plpgsql function is so slow - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: plpgsql function is so slow
Msg-id b42b73150909250648k1d4a1984ga365992b56426bf@mail.gmail.com
In response to Re: plpgsql function is so slow  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
On Fri, Sep 25, 2009 at 1:05 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
>>>>>> "Euler" == Euler Taveira de Oliveira <euler@timbira.com> writes:
>
>  Euler> Ops... forgot to remove it from other test. It seems much
>  Euler> better but far from the ideal. :( I've never taken a look at
>  Euler> the pl/pgsql code but it could be nice if there would be two
>  Euler> path codes: access-data and non-access-data paths.  I have no
>  Euler> idea if it will be possible (is path type too complex to
>  Euler> detect?)  but it will certainly improve the non-access-data
>  Euler> functions.
>
> Like Tom said, this benchmark is silly. Some comparisons (note that in
> all these cases I've replaced the power(10,8) with a constant, because
> you weren't comparing like with like there):
>
> plpgsql     13.3 sec
> tcl85       29.9 sec
> perl5.8      7.7 sec
> python2.6   11.5 sec
> C            0.242 sec
>
> What this suggests to me is that plpgsql isn't so far off the norm for
> interpreted scripting languages; sure it's slower than perl, but then
> most things are; comparing it with C code is just silly.
>
> There is, though, one genuine case that's come up a few times in IRC
> regarding slowness of procedural code in pg, and that's any time
> someone tries to implement some array-based algorithm in plpgsql. The
> fact that a[i] is O(i) not O(1) (unless the array type is fixed length)
> comes as a nasty shock since iterating over an array becomes O(n^2).
>
> This is obviously a consequence of the array storage format; is there
> any potential for changing that to some format which has, say, an array
> of element offsets at the start, rather than relying on stepping over
> length fields?

A couple of points:
*) Surely it's better to encourage 'unnest'-style approaches for
array iteration.
*) If an array has fixed-length elements and no null elements (a
fairly common case), maybe it's worthwhile not generating/storing the
lengths vector?
*) Wouldn't it be possible to always store offsets rather than
lengths, since you can calculate each length from the next offset?

merlin

