Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Date
Msg-id 1259068857.30357.104.camel@hvost1700
Whole thread Raw
In response to Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION  (Daniel Farina <drfarina@gmail.com>)
List pgsql-hackers
On Tue, 2009-11-24 at 03:48 -0800, Daniel Farina wrote:
> On Tue, Nov 24, 2009 at 3:25 AM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> > On Tue, 2009-11-24 at 02:56 -0800, Daniel Farina wrote:
> >> On Tue, Nov 24, 2009 at 2:51 AM, Daniel Farina <drfarina@gmail.com> wrote:
> >> > On Tue, Nov 24, 2009 at 2:50 AM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> >> >> Can't you use existing aggregate function design ?
> >> >>
> >> >> CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
> >> >>    SFUNC = sfunc,
> >> >>    STYPE = state_data_type
> >> >>    [ , FINALFUNC = ffunc ]
> >> >>    [ , INITCOND = initial_condition ]
> >> >>    [ , SORTOP = sort_operator ]
> >> >> )
> >> >
> >> > Actually, yes.  I just thought that this was an idea so crazy that no
> >> > one would like it.
> >
> > seems kind of natural choice for me - in essence this is an aggregate
> > function, aggregating over rows/tuples supplied to it.
> 
> Okay, well, maybe that wasn't such a crazy idea after all...
> 
> >> Oh, and the other elephant in the room: error handling.  How to handle
> >> error conditions...try/catch/finally type stuff.
> >
> > Same as current aggregates - either ignore the error, logi it and
> > continue, or bail out
> >[snip]
> > Neither do ordinary funtions, we have no "ON ERROR DO ..." clause  for
> > function definitions
> 
> It is assumed most functions do not have side effects outside the
> database, so this is gotten rather for free.  The driving use case for
> this *is* side effects on other systems.  I'm not sure if it's as easy
> to use this justification here...normally rollbacks just take care of
> all the error handling a function would want.  Here I'm not so sure
> that is as common a case.

A cleaner solution for undoing external effects would be ON ROLLBACK
trigger, or maybe even extension to BEGIN 

BEGIN WORK ON ROLLBACK RUN externalCleanupFunction();

ROLLBACK trigger could also be done as SET parameter inside a session,
so it wont bloat/pollute system tables if changed often;

> >
> >> I did consider using
> >> aggregates though, but somehow it felt to me like "I need at least a
> >> three-tuple, why not fish around for any random bundling of three
> >> functions..."
> >
> > Why do you need three ?
> 
> I'm counting the aggregate prototype itself to refer to the bundle,
> which I suppose would be more normally considered a two-tuple of
> functions.  This is a self-referential tuple, I suppose...
> 
> >> After all, I would not want to actually call the nodeAgg stuff to
> >> apply the function anyway...so it'd basically be abused as a
> >> three-tuple of functions.
> >
> > Actually it would be best if it could use straight generic funtions, so
> > you could do something like
> >
> > COPY stdin TO filterfunc(int) TO avg(int);
> 
> Generic functions?  Do you mean just scalar functions? 

Type. Actually I meant our existing aggregate functions.

>  That'd be
> neat, but as I said previously, composition could just be wrapped into
> a function of the user's choice.  Also, what about use of
> multi-function-apply?
> 
> COPY stdin TO replicant1(datum) AND replicant2(datum);

seems like a rare case, but you could use a wrapper func

CREATE FUNCTION replicants_1_and_2(datum) AS  replicant1(datum)  replicant2(datum)

> You could imagine all sorts of new 2PC evil. 

2PC is evil enyway, at least when performance is concerned ;)

>  But again, one could
> just write a little function to absorb the rows and dole them out
> without bloating COPY syntax...
> 
> I am in no way suggesting that syntax seriously or unseriously.
> 
> > pass the file name in as an argument to SFUNC, open it on first call,
> > ignore later (if it stays the same ;)
> 
> So either you are going to pass it with every row and ignore it,

That would be my preferred way, yes

> or create a new initial aggregate state for each COPY TO FUNCTION

third, more hackish way would to set it as INITCOND = '/file/name' :)

> ...how are you going to get it passed to SFUNC?

keep the file handle in the aggregate node - it is for keeping state,
and file handle sure is part of state.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability   Services, Consulting and Training




pgsql-hackers by date:

Previous
From: Daniel Farina
Date:
Subject: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Next
From: Pavel Stehule
Date:
Subject: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION