Re: Aggregate C function accumulating a text array - Mailing list pgsql-general

From Joe Conway
Subject Re: Aggregate C function accumulating a text array
Date
Msg-id 40C10C90.1060301@joeconway.com
In response to Aggregate C function accumulating a text array  (Joel Dudley <joel@nanovoid.com>)
List pgsql-general
Joel Dudley wrote:
>   I am about to write a set of C functions to be used in an aggregate
> function in which the final function performs a calculation on an array
> of accumulated text data types stored in a text[] array. I need to use
> the text type because this function will be used on DNA sequences which
> can be very large. My questions are the following. What is the most
> efficient way to accumulate a text array while being efficient with
> memory? I see construct_array() used in accumulation functions but I am
> worried that I might end up making a copy of a potentially very large
> text array each time my accumulation function is called.

True, but the intermediate results should be released after each row, I
think. You might try it with some real data before assuming a
performance problem.

If it is a problem, take a look at how contrib/intagg works. It
basically just passes a pointer from call to call. You could do
something similar for the text data type.
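
As a rough illustration of the "pass a pointer from call to call" idea, here
is a minimal sketch in plain C. It deliberately avoids the PostgreSQL headers:
SeqAccum, seq_accum_add() and seq_accum_free() are hypothetical names, and
malloc/realloc stand in for whatever allocation a real transition function
would use in a memory context that outlives a single call. It is not how
contrib/intagg itself is written. The point is only that each call copies the
new value and, occasionally, the array of pointers, never the sequences
already accumulated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transition state: a growable array of string pointers. */
typedef struct SeqAccum {
    size_t  count;     /* number of sequences stored so far */
    size_t  capacity;  /* allocated slots in items */
    char  **items;     /* one private copy per input value */
} SeqAccum;

/* Release the state and every stored sequence. */
void seq_accum_free(SeqAccum *state)
{
    if (state == NULL)
        return;
    for (size_t i = 0; i < state->count; i++)
        free(state->items[i]);
    free(state->items);
    free(state);
}

/* Transition step: create the state on the first call, then append.
 * Only the new value is copied; the pointer array doubles when full,
 * so sequences already stored are never duplicated.
 * Returns NULL on allocation failure; a state passed in by the caller
 * is left intact in that case. */
SeqAccum *seq_accum_add(SeqAccum *state, const char *seq)
{
    int    created = 0;
    size_t len;
    char  *copy;

    assert(seq != NULL);

    if (state == NULL) {
        state = calloc(1, sizeof *state);
        if (state == NULL)
            return NULL;
        created = 1;
    }

    if (state->count == state->capacity) {
        size_t new_cap = state->capacity ? state->capacity * 2 : 8;
        char **grown = realloc(state->items, new_cap * sizeof *grown);
        if (grown == NULL)
            goto fail;
        state->items = grown;
        state->capacity = new_cap;
    }

    len = strlen(seq) + 1;
    copy = malloc(len);
    if (copy == NULL)
        goto fail;
    memcpy(copy, seq, len);
    state->items[state->count++] = copy;
    return state;

fail:
    if (created)
        seq_accum_free(state);
    return NULL;
}
```

The caller starts with a NULL state and feeds the returned pointer back into
the next call, which is the same shape as an aggregate's transition value.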

> The general flow is
>
> User defined aggregate function
>     SELECT pb_distance_k2p(sequence) WHERE family_id = 10;
>
> uses accumulation function
>
> distance_accum(PG_FUNCTION_ARGS);
>
> and uses a final function
>
> calculate_distance_k2p(PG_FUNCTION_ARGS)
>
> which needs to deconstruct_array() to get the text array and loop
> through the array to do some pairwise comparisons of the text and return
> a multidimensional array

Makes sense to me. BTW, take a look at PL/R
http://www.joeconway.com/plr/

It would allow you to write your final function in R, which has many
extensions related to bioinformatics -- see:
http://www.bioconductor.org/
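
If the final function stays in C, the pairwise loop described in the quoted
text can be sketched roughly as follows. This is plain C with no PostgreSQL
headers, so it assumes the strings have already been extracted from the
array. pairwise_matrix() and mismatch_fraction() are hypothetical names, and
the mismatch fraction is only a placeholder metric, not the distance the real
function would compute.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder metric (an assumption for illustration): fraction of
 * differing positions over the shared prefix of the two sequences. */
double mismatch_fraction(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    size_t n = la < lb ? la : lb;
    size_t diff = 0;

    if (n == 0)
        return 0.0;
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            diff++;
    return (double) diff / (double) n;
}

/* Build an n*n row-major distance matrix; the caller frees it.
 * Each unordered pair is compared once and mirrored across the
 * diagonal, which stays at zero. Returns NULL if allocation fails. */
double *pairwise_matrix(const char *const *seqs, size_t n)
{
    double *m;

    assert(seqs != NULL || n == 0);

    m = calloc(n ? n * n : 1, sizeof *m);
    if (m == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            double d = mismatch_fraction(seqs[i], seqs[j]);
            m[i * n + j] = d;
            m[j * n + i] = d;
        }
    }
    return m;
}
```

Comparing each pair once halves the work, which matters when the sequences
are long and the metric is symmetric.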

HTH,

Joe
