Workaround for custom aggregate which would need "internal" as statetype - Mailing list pgsql-general

From Florian G. Pflug
Subject Workaround for custom aggregate which would need "internal" as statetype
Date
Msg-id 443AC506.5060401@phlo.org
Whole thread Raw
Responses Re: Workaround for custom aggregate which would need "internal" as statetype  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi

I'm trying to write an aggrecate collect_distinct(int8) which puts all
distinct values into an array. My first try was defining an aggregate
"collect" using array_append, and doing "select collect(distinct <field>) ..",
but this is quite slow - probably because distinct sorts the values, instead
of using a hash to filter out duplicates.

Using perl, and a perl-hash was even slower, so I wrote my to c-functions
(actualy c++), which use a STL hash_set to filter out duplicates.

I initially defined my state-transaction function as
"collect_distinct(internal, int8) returns internal". The parameter marked internal was
a pointer to a STL hash_set. But using this to define an aggregate failed, because
internal seems to be forbidden as a state-type.

I now resorted to a crude hack to get this running - I changed "internal" to
"int4", and just cast my pointer to a int4 before returning it, and after receiving
it as an argument. This at least enabled me to do some benchmarking, and performance-wise
things look good...

Before using this on a production system, I need to get rid of that hack, but I don't
see how this could be done ATM... Maybe someone here could give me a hint how this could
work...

greetings, Florian Pflug

pgsql-general by date:

Previous
From: "Dave Page"
Date:
Subject: Re: Debian package for freeradius_postgresql module
Next
From: Bruce Momjian
Date:
Subject: Re: pg 8.1.2 performance issue