cbbrowne@cbbrowne.com wrote:
> What might be "nifty" would be to have some mappings that did Clever
> Transformations of Queries Into Views, particularly if that allowed
> harnessing the DBMS to do some of the statistical analysis behind your
> back...
I'm not quite sure what you mean here, but it does support pulling data into
the R interpreter as a "data.frame" via SPI, and returning R
matricies/vectors/data.frames as either Postgres arrays or as rows and columns
of a table function. Here's two contrived, but illustrative, examples:
create or replace function test_dtup() returns record as
'data.frame(letters[1:10],1:10)' language 'plr';
select * from test_dtup() as t(f1 text, f2 int); f1 | f2
----+---- a | 1 b | 2 c | 3 d | 4 e | 5 f | 6 g | 7 h | 8 i | 9 j | 10
(10 rows)
create or replace function test_spi_tup(text) returns record as
'pg.spi.exec(arg1)' language 'plr';
select * from test_spi_tup('select oid, typname from pg_type where typname =
''oid'' or typname = ''text''') as t(typeid oid, typename name); typeid | typename
--------+---------- 25 | text 26 | oid
(2 rows)
You could easily perform a parameterized query via SPI, retrieve the results
into an R data.frame, do some statistical manipulations, and then return the
results as a table function. The table function itself could be wrapped in a
view to hide the whole thing from the end-user.
You can also create custom aggregates. There has been at least one thread not
too long ago regarding an aggregate to calculate median, for instance. Here it
is in plr:
create table foo(f1 text, f2 float8);
insert into foo values('cat1',1.21);
insert into foo values('cat1',1.24);
insert into foo values('cat1',1.18);
insert into foo values('cat1',1.26);
insert into foo values('cat1',1.15);
insert into foo values('cat2',1.15);
insert into foo values('cat2',1.26);
insert into foo values('cat2',1.32);
insert into foo values('cat2',1.30);
create or replace function r_median(_float8) returns float as 'median(arg1)'
language 'plr';
CREATE AGGREGATE median (sfunc = array_accum, basetype = float8, stype =
_float8, finalfunc = r_median);
select f1, median(f2) from foo group by f1 order by f1; f1 | median
------+-------- cat1 | 1.21 cat2 | 1.28
(2 rows)
It's not as fast as the native PostgreSQL functions if you just need average
or standard deviation, but it's alot easier and faster than writing your own
for something more out-of-the-ordinary.
Joe