Thread: efficient math vector operations on arrays

efficient math vector operations on arrays

From
Marcus Engene
Date:
Hi,

Are there highly efficient C extensions out there for math operations on
arrays? Dot product and whatnot.

Example use case: sort items by Euclidean distance.

Kind regards,
Marcus



Re: efficient math vector operations on arrays

From
Pavel Stehule
Date:
Hi

2015-12-24 8:05 GMT+01:00 Marcus Engene <mengpg2@engene.se>:
> Hi,
>
> Are there highly efficient C extensions out there for math operations on arrays? Dot product and whatnot.

What do you mean by "highly efficient"?

The PostgreSQL executor is an interpreter, so in almost all cases special optimizations don't make much sense. If you save a few microseconds, that gain will be lost in the executor overhead.

> Example usecase: sort an item by euclid distance.

Regards

Pavel


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: efficient math vector operations on arrays

From
Marcus Engene
Date:
On 24/12/15 07:13, Pavel Stehule wrote:
> Hi
>
> 2015-12-24 8:05 GMT+01:00 Marcus Engene <mengpg2@engene.se>:
>> Hi,
>>
>> Are there highly efficient C extensions out there for math operations on arrays? Dot product and whatnot.
>
> what you mean "highly efficient" ?

Implemented as a C module, so I won't have to use unnest or PL/pgSQL.

Kind regards,
Marcus

Re: efficient math vector operations on arrays

From
Pavel Stehule
Date:


2015-12-24 8:34 GMT+01:00 Marcus Engene <mengpg2@engene.se>:
> On 24/12/15 07:13, Pavel Stehule wrote:
>> Hi
>>
>> 2015-12-24 8:05 GMT+01:00 Marcus Engene <mengpg2@engene.se>:
>>> Hi,
>>>
>>> Are there highly efficient C extensions out there for math operations on arrays? Dot product and whatnot.
>>
>> what you mean "highly efficient" ?
>
> Implemented as a C module so I wont have to unnest or plpgsql.

OK,

I don't know of any extension that calculates Euclidean distance, but it should be trivial to write in C, as long as you don't need to support generic types and generic operations.

Pavel
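
As a rough illustration of the "trivial in C" point, here is a minimal sketch of the numeric core only, assuming the array elements have already been extracted into plain double buffers of equal length. The function names are hypothetical, and the PostgreSQL glue (unpacking array arguments, NULL handling, registering the function) is deliberately left out. The sketch returns the squared distance; that is an assumption made for the sorting use case, since ordering by squared distance gives the same order as ordering by distance.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of two equal-length vectors (hypothetical helper). */
double vec_dot(const double *a, const double *b, size_t n)
{
    assert(a != NULL && b != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Squared Euclidean distance between two equal-length vectors.
 * Take the square root of the result if the true distance is needed;
 * for ORDER BY purposes the squared value sorts identically. */
double vec_sq_distance(const double *a, const double *b, size_t n)
{
    assert(a != NULL && b != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}
```

Fixing the element type to double is what keeps this short; supporting generic types and operations, as Pavel notes, is where the real work would be.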




Re: efficient math vector operations on arrays

From
Jim Nasby
Date:
On 12/24/15 1:56 AM, Pavel Stehule wrote:
> I don't know any extension that calculate euclid distance, but it should
> be trivial in C - if you don't need to use generic types and generic
> operations.

Before messing around with that, I'd recommend trying either pl/r or
pl/pythonu.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: efficient math vector operations on arrays

From
Jony Cohen
Date:
Hi, I don't know if it's exactly what you're looking for, but the MADlib package has utility functions for matrix and vector operations.
see: http://doc.madlib.net/latest/group__grp__array.html

Regards,
 - Jony

On Fri, Dec 25, 2015 at 9:58 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 12/24/15 1:56 AM, Pavel Stehule wrote:
>> I don't know any extension that calculate euclid distance, but it should
>> be trivial in C - if you don't need to use generic types and generic
>> operations.
>
> Before messing around with that, I'd recommend trying either pl/r or pl/pythonu.

Re: efficient math vector operations on arrays

From
Jim Nasby
Date:
On 12/27/15 2:00 AM, Jony Cohen wrote:
> Hi, Don't know if it's exactly what you're looking for but the MADLib
> package has utility function for matrix and vector operations.
> see: http://doc.madlib.net/latest/group__grp__array.html

Apply an operator to all elements of an array or a pair of arrays:
http://theplateisbad.blogspot.com/2015/12/the-arraymath-extension-vs-plpgsql.html,
https://github.com/pramsey/pgsql-arraymath.

See also
http://theplateisbad.blogspot.com/2015/12/more-fortran-90-like-vector-operations.html.

BTW, if you want to simply apply a function to all elements in an array
there is an internal C function array_map that can do it. There's no SQL
interface to it, but it shouldn't be hard to add one.


Re: efficient math vector operations on arrays

From
Tom Lane
Date:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> BTW, if you want to simply apply a function to all elements in an array
> there is an internal C function array_map that can do it. There's no SQL
> interface to it, but it shouldn't be hard to add one.

That wouldn't be useful for the example given originally, since it
iterates over just one array not two arrays in parallel.  But you could
imagine writing something similar that would iterate over two arrays and
call a two-argument function.

Whether it's worth a SQL interface is debatable though.  Whatever
efficiency you might gain from using this would probably be eaten by the
overhead of calling a SQL or PL function for each pair of array elements.
You'd probably end up in the same ballpark performance-wise as the UNNEST
solution given earlier.

            regards, tom lane
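
To make the two-array variant concrete, here is a minimal sketch over plain double buffers. The name array_map2 and the callback type are hypothetical and are not PostgreSQL's internal array_map, which works on array values rather than raw C arrays. The point it illustrates is the one made above: the loop itself is cheap, and the cost is dominated by the call made for every pair of elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: combines one element from each input. */
typedef double (*binary_fn)(double, double);

/* Walk two equal-length arrays in parallel and store
 * fn(a[i], b[i]) in out[i]. */
void array_map2(binary_fn fn, const double *a, const double *b,
                double *out, size_t n)
{
    assert(fn != NULL && a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = fn(a[i], b[i]);  /* one indirect call per element pair */
}

/* Example callback: multiply the two elements. */
double mul_elem(double x, double y)
{
    return x * y;
}
```

Here the callback is a cheap C function pointer. If each call instead had to go through a SQL or PL function, that per-pair call would be far more expensive than the loop around it.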


Re: efficient math vector operations on arrays

From
Jim Nasby
Date:
On 12/29/15 6:50 PM, Tom Lane wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> BTW, if you want to simply apply a function to all elements in an array
>> there is an internal C function array_map that can do it. There's no SQL
>> interface to it, but it shouldn't be hard to add one.
>
> That wouldn't be useful for the example given originally, since it
> iterates over just one array not two arrays in parallel.  But you could
> imagine writing something similar that would iterate over two arrays and
> call a two-argument function.

Actually, I suspect you could pretty easily do array_map(regprocedure,
VARIADIC anyarray).

> Whether it's worth a SQL interface is debatable though.  Whatever
> efficiency you might gain from using this would probably be eaten by the
> overhead of calling a SQL or PL function for each pair of array elements.
> You'd probably end up in the same ballpark performance-wise as the UNNEST
> solution given earlier.

Take a look at [1]; using a rough equivalent to array_map is 6% faster
than unnest().

The array op array version is 30% faster than plpgsql, which based on
the code at [2] I assume is doing

  explain analyze
  select array(
    select a*b
    from unnest(
      array(select random() from generate_series(1,1000000)),
      array(select random() from generate_series(1,1000000))
    ) u(a,b)
  );

The syntactic sugar of r := array_map('function(a, b)', in1, in2) (let
alone r := in1 * in2;) is appealing too.
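
For comparison, a dedicated "array op array" operation such as r := in1 * in2 can be sketched without any per-element function call. This is only an illustration over plain double buffers; the name vec_mul and the error convention are assumptions, not the arraymath extension's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise product r[i] = in1[i] * in2[i].
 * Returns 0 on success, or -1 if the input lengths differ
 * (in which case r is left untouched). */
int vec_mul(const double *in1, size_t n1,
            const double *in2, size_t n2, double *r)
{
    assert(in1 != NULL && in2 != NULL && r != NULL);
    if (n1 != n2)
        return -1;
    for (size_t i = 0; i < n1; i++)
        r[i] = in1[i] * in2[i];  /* operator applied inline, no callback */
    return 0;
}
```

Because the operator is fixed at compile time, the multiplication happens inline in the loop, which is one plausible reason a specialized operator could outperform a generic map.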

[1]
http://theplateisbad.blogspot.com/2015/12/the-arraymath-extension-vs-plpgsql.html
[2]
http://theplateisbad.blogspot.com/2015/12/more-fortran-90-like-vector-operations.html