Re: arrays of floating point numbers / linear algebra operations into the DB - Mailing list pgsql-general

From Enrico Sirola
Subject Re: arrays of floating point numbers / linear algebra operations into the DB
Date
Msg-id 47A5CF8E.7060304@gmail.com
Whole thread Raw
In response to Re: arrays of floating point numbers / linear algebra operations into the DB  ("Webb Sprague" <webb.sprague@gmail.com>)
Responses Re: arrays of floating point numbers / linear algebra operations into the DB  (Enrico Sirola <enrico.sirola@gmail.com>)
List pgsql-general
Hi Webb,

Webb Sprague ha scritto:
>> I'm quite proud, this is my first C extension function ;-)
>> I'd gladly post the code if it's ok for the list users. It's more or
>> less 100 lines of code. This approach seems promising...
>
> I would definitely like to see it.

here it goes:

-------------------------->linalg.h<----------------------------------

#ifndef linalg_h
#define linalg_h

#include "postgres.h"
#include "utils/array.h"

Datum scale(PG_FUNCTION_ARGS);

#endif /* linalg_h */

-------------------------->linalg.c<----------------------------------


#include "linalg.h"
#include "fmgr.h"
#include <cblas.h>

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1( scale);


Datum
scale(PG_FUNCTION_ARGS)
{
   float8    x = PG_GETARG_FLOAT8(0);
   ArrayType *v1 = PG_GETARG_ARRAYTYPE_P(1);
   int           *dims1,
     *lbs1,
     ndims1,
     nitems1,
     ndatabytes1;
   int *arrlbound1, *arrlbound2;
   char *arrdatap1, *arrdatap2;

   ArrayType *rv;

   /* get argument array details */
   lbs1 = ARR_LBOUND(v1);
   dims1 = ARR_DIMS(v1);
   ndims1 = v1->ndim;
   nitems1 = ArrayGetNItems(ndims1, dims1);
   ndatabytes1 = ARR_SIZE(v1) - ARR_DATA_OFFSET(v1);

   if ( ndims1 != 1 )
     ereport(ERROR,
             (errcode(ERRCODE_DATATYPE_MISMATCH),
              errmsg("Multi dimensional array given"),
              errdetail("Array have %d dimensions", ndims1)));

   if (ARR_HASNULL(v1))
     ereport(ERROR,
             (errcode(ERRCODE_DATATYPE_MISMATCH),
              errmsg("Null in array"),
              errdetail("array should not contain null elements")));

   /* allcate new vector */

   rv = (ArrayType *) palloc(ndatabytes1);

   SET_VARSIZE(rv, ndatabytes1);
   rv->ndim = v1->ndim;
   rv->dataoffset = v1->dataoffset; // no nulls (0)
   rv->elemtype = v1->elemtype;
   memcpy(ARR_DIMS(rv), ARR_DIMS(v1), sizeof(int));

   arrlbound2 = ARR_LBOUND(rv);
   arrlbound1 = ARR_LBOUND(v1);

   memcpy(arrlbound2, arrlbound1, sizeof(int));

   arrdatap1 = ARR_DATA_PTR(v1);
   arrdatap2 = ARR_DATA_PTR(rv);

   memcpy(arrdatap2, arrdatap1, nitems1 * sizeof(float8));

   /* scale vector a la blas */
   cblas_dscal(nitems1, x, (float8*) arrdatap2, 1);

   PG_RETURN_ARRAYTYPE_P(rv);
}

--------------------------->linalg.sql<---------------------------------

/* -*-sql-*- */
create or replace function scale(float8, float8[])
returns float8[]
as '$libdir/linalg', 'scale'
language 'C' immutable strict;

create aggregate array_accum (
     sfunc = array_append,
     basetype = anyelement,
     stype = anyarray,
     initcond = '{}'
);

create operator * (leftarg=float8, rightarg=float8[], procedure=scale);

-------------------------------->end<------------------------------------

> GSL licensing is GNU ish, so may be that is a deal breaker, too.

well, I don't know. This is just a proof of concept. Anyway, yes, there
could be problems with GPL.

On the above code: coupling cblas functions with PG float8[] seems easy,
you just have to know which black-magic-macros to use in order to access
the data structure. It took me a while to figure out how it works (I'm
not actually sure I understood everything, but at least the above code
seems to work). The problem for the code above is that it doesn't work
for vectors longer than 1000 elements or so (try it with 2000 and it
doesn't work). I guess I should manage the "toasting" machinery in some
ways - any suggestion is appreciated
Bye,
e.

pgsql-general by date:

Previous
From: vladimir konrad
Date:
Subject: Re: [OT] "advanced" database design (long)
Next
From: "Guido Barosio"
Date:
Subject: Re: [pgsql-advocacy] PostgreSQL Certification