Home > mailing lists

Re: arrays of floating point numbers / linear algebra operations into the DB - Mailing list pgsql-general

From	Enrico Sirola
Subject	Re: arrays of floating point numbers / linear algebra operations into the DB
Date	February 3, 2008 13:28:44
Msg-id	47A5CF8E.7060304@gmail.com Whole thread Raw
In response to	Re: arrays of floating point numbers / linear algebra operations into the DB ("Webb Sprague" <webb.sprague@gmail.com>)
Responses	Re: arrays of floating point numbers / linear algebra operations into the DB (Enrico Sirola <enrico.sirola@gmail.com>)
List	pgsql-general

Tree view

Hi Webb,

Webb Sprague ha scritto:
>> I'm quite proud, this is my first C extension function ;-)
>> I'd gladly post the code if it's ok for the list users. It's more or
>> less 100 lines of code. This approach seems promising...
>
> I would definitely like to see it.

here it goes:

-------------------------->linalg.h<----------------------------------

#ifndef linalg_h
#define linalg_h

#include "postgres.h"
#include "utils/array.h"

Datum scale(PG_FUNCTION_ARGS);

#endif /* linalg_h */

-------------------------->linalg.c<----------------------------------


#include "linalg.h"
#include "fmgr.h"
#include <cblas.h>

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1( scale);


Datum
scale(PG_FUNCTION_ARGS)
{
   float8    x = PG_GETARG_FLOAT8(0);
   ArrayType *v1 = PG_GETARG_ARRAYTYPE_P(1);
   int           *dims1,
     *lbs1,
     ndims1,
     nitems1,
     ndatabytes1;
   int *arrlbound1, *arrlbound2;
   char *arrdatap1, *arrdatap2;

   ArrayType *rv;

   /* get argument array details */
   lbs1 = ARR_LBOUND(v1);
   dims1 = ARR_DIMS(v1);
   ndims1 = v1->ndim;
   nitems1 = ArrayGetNItems(ndims1, dims1);
   ndatabytes1 = ARR_SIZE(v1) - ARR_DATA_OFFSET(v1);

   if ( ndims1 != 1 )
     ereport(ERROR,
             (errcode(ERRCODE_DATATYPE_MISMATCH),
              errmsg("Multi dimensional array given"),
              errdetail("Array have %d dimensions", ndims1)));

   if (ARR_HASNULL(v1))
     ereport(ERROR,
             (errcode(ERRCODE_DATATYPE_MISMATCH),
              errmsg("Null in array"),
              errdetail("array should not contain null elements")));

   /* allcate new vector */

   rv = (ArrayType *) palloc(ndatabytes1);

   SET_VARSIZE(rv, ndatabytes1);
   rv->ndim = v1->ndim;
   rv->dataoffset = v1->dataoffset; // no nulls (0)
   rv->elemtype = v1->elemtype;
   memcpy(ARR_DIMS(rv), ARR_DIMS(v1), sizeof(int));

   arrlbound2 = ARR_LBOUND(rv);
   arrlbound1 = ARR_LBOUND(v1);

   memcpy(arrlbound2, arrlbound1, sizeof(int));

   arrdatap1 = ARR_DATA_PTR(v1);
   arrdatap2 = ARR_DATA_PTR(rv);

   memcpy(arrdatap2, arrdatap1, nitems1 * sizeof(float8));

   /* scale vector a la blas */
   cblas_dscal(nitems1, x, (float8*) arrdatap2, 1);

   PG_RETURN_ARRAYTYPE_P(rv);
}

--------------------------->linalg.sql<---------------------------------

/* -*-sql-*- */
create or replace function scale(float8, float8[])
returns float8[]
as '$libdir/linalg', 'scale'
language 'C' immutable strict;

create aggregate array_accum (
     sfunc = array_append,
     basetype = anyelement,
     stype = anyarray,
     initcond = '{}'
);

create operator * (leftarg=float8, rightarg=float8[], procedure=scale);

-------------------------------->end<------------------------------------

> GSL licensing is GNU ish, so may be that is a deal breaker, too.

well, I don't know. This is just a proof of concept. Anyway, yes, there
could be problems with GPL.

On the above code: coupling cblas functions with PG float8[] seems easy,
you just have to know which black-magic-macros to use in order to access
the data structure. It took me a while to figure out how it works (I'm
not actually sure I understood everything, but at least the above code
seems to work). The problem for the code above is that it doesn't work
for vectors longer than 1000 elements or so (try it with 2000 and it
doesn't work). I guess I should manage the "toasting" machinery in some
ways - any suggestion is appreciated
Bye,
e.

pgsql-general by date:

From: vladimir konrad
Date: 03 February 2008, 10:36:00
Subject: Re: [OT] "advanced" database design (long)

From: "Guido Barosio"
Date: 03 February 2008, 14:02:52
Subject: Re: [pgsql-advocacy] PostgreSQL Certification

Re: arrays of floating point numbers / linear algebra operations into the DB - Mailing list pgsql-general

Previous

Next