Re: arrays of floating point numbers / linear algebra operations into the DB - Mailing list pgsql-general
From | Enrico Sirola |
---|---|
Subject | Re: arrays of floating point numbers / linear algebra operations into the DB |
Date | |
Msg-id | 47A5CF8E.7060304@gmail.com Whole thread Raw |
In response to | Re: arrays of floating point numbers / linear algebra operations into the DB ("Webb Sprague" <webb.sprague@gmail.com>) |
Responses |
Re: arrays of floating point numbers / linear algebra operations
into the DB
(Enrico Sirola <enrico.sirola@gmail.com>)
|
List | pgsql-general |
Hi Webb, Webb Sprague ha scritto: >> I'm quite proud, this is my first C extension function ;-) >> I'd gladly post the code if it's ok for the list users. It's more or >> less 100 lines of code. This approach seems promising... > > I would definitely like to see it. here it goes: -------------------------->linalg.h<---------------------------------- #ifndef linalg_h #define linalg_h #include "postgres.h" #include "utils/array.h" Datum scale(PG_FUNCTION_ARGS); #endif /* linalg_h */ -------------------------->linalg.c<---------------------------------- #include "linalg.h" #include "fmgr.h" #include <cblas.h> PG_MODULE_MAGIC; PG_FUNCTION_INFO_V1( scale); Datum scale(PG_FUNCTION_ARGS) { float8 x = PG_GETARG_FLOAT8(0); ArrayType *v1 = PG_GETARG_ARRAYTYPE_P(1); int *dims1, *lbs1, ndims1, nitems1, ndatabytes1; int *arrlbound1, *arrlbound2; char *arrdatap1, *arrdatap2; ArrayType *rv; /* get argument array details */ lbs1 = ARR_LBOUND(v1); dims1 = ARR_DIMS(v1); ndims1 = v1->ndim; nitems1 = ArrayGetNItems(ndims1, dims1); ndatabytes1 = ARR_SIZE(v1) - ARR_DATA_OFFSET(v1); if ( ndims1 != 1 ) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("Multi dimensional array given"), errdetail("Array have %d dimensions", ndims1))); if (ARR_HASNULL(v1)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("Null in array"), errdetail("array should not contain null elements"))); /* allcate new vector */ rv = (ArrayType *) palloc(ndatabytes1); SET_VARSIZE(rv, ndatabytes1); rv->ndim = v1->ndim; rv->dataoffset = v1->dataoffset; // no nulls (0) rv->elemtype = v1->elemtype; memcpy(ARR_DIMS(rv), ARR_DIMS(v1), sizeof(int)); arrlbound2 = ARR_LBOUND(rv); arrlbound1 = ARR_LBOUND(v1); memcpy(arrlbound2, arrlbound1, sizeof(int)); arrdatap1 = ARR_DATA_PTR(v1); arrdatap2 = ARR_DATA_PTR(rv); memcpy(arrdatap2, arrdatap1, nitems1 * sizeof(float8)); /* scale vector a la blas */ cblas_dscal(nitems1, x, (float8*) arrdatap2, 1); PG_RETURN_ARRAYTYPE_P(rv); } --------------------------->linalg.sql<--------------------------------- /* -*-sql-*- */ create or replace function scale(float8, float8[]) returns float8[] as '$libdir/linalg', 'scale' language 'C' immutable strict; create aggregate array_accum ( sfunc = array_append, basetype = anyelement, stype = anyarray, initcond = '{}' ); create operator * (leftarg=float8, rightarg=float8[], procedure=scale); -------------------------------->end<------------------------------------ > GSL licensing is GNU ish, so may be that is a deal breaker, too. well, I don't know. This is just a proof of concept. Anyway, yes, there could be problems with GPL. On the above code: coupling cblas functions with PG float8[] seems easy, you just have to know which black-magic-macros to use in order to access the data structure. It took me a while to figure out how it works (I'm not actually sure I understood everything, but at least the above code seems to work). The problem for the code above is that it doesn't work for vectors longer than 1000 elements or so (try it with 2000 and it doesn't work). I guess I should manage the "toasting" machinery in some ways - any suggestion is appreciated Bye, e.
pgsql-general by date: