Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION - Mailing list pgsql-hackers
From: Pavel Stehule
Subject: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Date:
Msg-id: 162867790911240539v3aa7e091g583aa6e77e6dcfe7@mail.gmail.com
In response to: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION (Daniel Farina <drfarina@gmail.com>)
Responses: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
List: pgsql-hackers
2009/11/24 Daniel Farina <drfarina@gmail.com>:
> On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2009/11/24 Daniel Farina <drfarina@gmail.com>:
>>> On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> Hello
>>>>
>>>> I think this patch may be a good idea, but I am missing a better
>>>> function specification. Specification by name isn't enough - we can
>>>> have overloaded functions. This syntax also doesn't allow an explicit
>>>> cast. From my personal view the syntax is ugly, and with a type
>>>> specification we wouldn't need the keyword FUNCTION.
>>>
>>> As long as things continue to support the INTERNAL-type behavior for
>>> extremely low overhead bulk transfers I am open to suggestions about
>>> how to enrich things...but how would I do so under this proposal?
>>
>> Using an INTERNAL type is wrong. It breaks the design of these
>> functions for ordinary PLs. I don't see any reason why it's necessary.
>>
>>> I am especially fishing for suggestions in the direction of managing
>>> state for the function between rows though...I don't like how the
>>> current design seems to scream "use a global variable."
>>>
>>>> We have a fast COPY statement - OK; we have a fast function - OK. But
>>>> inside the function we have to call a "slow" SQL query. What is the
>>>> advantage?
>>>
>>> The implementation here uses the type 'internal' for performance. It
>>> doesn't even recompute the fcinfo because of the very particular
>>> circumstances of how the function is called. It doesn't do a memory
>>> copy of the argument buffer either, to the best of my knowledge. In
>>> the dblink patches you basically stream directly from the disk, format
>>> the COPY bytes, and shove it into a waiting COPY on another postgres
>>> node...there's almost no additional work in-between. All utilized
>>> time would be some combination of the normal COPY byte stream
>>> generation and libpq.
>>
>> I understand, and I dislike it. This design isn't general - it is far
>> from using a function. It doesn't use the complete FUNCAPI interface.
>> I think you need different semantics: you are not using a function,
>> you are using something like a "stream object". This stream object
>> can have an input function and an output function, and the parameters
>> should be internal (I don't think internal carries any significant
>> performance benefit here) or standard. The syntax should be similar
>> to CREATE AGGREGATE.
>
> I think you might be right about this. At the time I was too shy to
> add a DDL command for this hack, though. But what I did want is a
> form of currying, and that's not easily accomplished in SQL without
> extension...

COPY is a PostgreSQL extension. If there are other related extensions -
why not? PostgreSQL has lots of database objects beyond the SQL
standard - see the full-text search implementation. I am not sure that
STREAM is a good keyword, though; it could collide with STREAM from
streaming databases.

>> Then the syntax should be:
>>
>> COPY table TO streamname(parameters)
>>
>> COPY table TO filestream('/tmp/foo.dta') ...
>> COPY table TO dblinkstream(connectionstring) ...
>
> I like this one quite a bit...it's a bit like an aggregate, except the
> initial condition can be set in a rather function-callish way.
>
> But that does seem to require making a DDL command, which leaves a
> nice green field. In particular, we could then make as many hooks,
> flags, and options as we wanted, but sometimes there is a paradox of
> choice...I just did not want to presume Postgres would be friendly to
> a new DDL command when writing this the first time.

Sure - nobody likes too many changes in gram.y. But a well-designed
general feature with the related SQL enhancements is more acceptable
than a quick, simple hack. Don't be in a hurry. This idea is good, but
it needs:

a) a well-designed C API (a sketch follows this list), something like:

       initialise_function(fcinfo)     -- standard fcinfo
       consumer_process_tuple(fcinfo)  -- gets a standard row:
                                          Datum dvalues[] + row description
       producer_process_tuple(fcinfo)  -- returns a standard row:
                                          Datum dvalues[] + row description
                                          (see the SRF API)
       terminate_function(fcinfo)

   I am sure this could be similar to the AGGREGATE API, plus some
   samples in contrib.

b) a well-designed PL/Perlu and PL/Pythonu interface, plus some samples
   in the documentation.
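A minimal C sketch of what the entry points in (a) could look like,
modeled on the aggregate/SRF conventions. The StreamRoutine struct and
all of its members are hypothetical - they follow the list above, not
any existing PostgreSQL interface:

    #include "postgres.h"
    #include "fmgr.h"
    #include "access/tupdesc.h"

    /*
     * Hypothetical "stream object" entry points.  COPY would call
     * initialise once with the parameters from COPY ... TO stream(...),
     * then one process_tuple callback per row, then terminate.
     */
    typedef struct StreamRoutine
    {
        /* open connections/files using the stream's parameters */
        void    (*initialise) (FunctionCallInfo fcinfo);

        /* COPY table TO stream(...): consume one decomposed row */
        void    (*consumer_process_tuple) (FunctionCallInfo fcinfo,
                                           TupleDesc tupdesc,
                                           Datum *dvalues,
                                           bool *dnulls);

        /*
         * COPY table FROM stream(...): produce one row per call,
         * returning false when the stream is exhausted.
         */
        bool    (*producer_process_tuple) (FunctionCallInfo fcinfo,
                                           TupleDesc tupdesc,
                                           Datum *dvalues,
                                           bool *dnulls);

        /* flush buffers and release resources */
        void    (*terminate) (FunctionCallInfo fcinfo);
    } StreamRoutine;

With something like this, COPY table TO dblinkstream(connectionstring)
would only have to look up the stream by name and drive its callbacks,
much as the executor drives an aggregate's transition and final
functions.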
Regards

Pavel Stehule
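On Daniel's question above about managing state between rows without
"a global variable": whatever syntax wins, the fmgr interface already
reserves a per-call-site slot for this. A minimal sketch - the struct
and function names are invented, but fn_extra and fn_mcxt are the real
FmgrInfo fields:

    #include "postgres.h"
    #include "fmgr.h"

    /* Hypothetical per-COPY state for a consumer function. */
    typedef struct CopyStreamState
    {
        int64       rows_seen;          /* rows consumed so far */
        /* connection handles, output buffers, ... */
    } CopyStreamState;

    PG_FUNCTION_INFO_V1(dblinkstream_consume);

    Datum
    dblinkstream_consume(PG_FUNCTION_ARGS)
    {
        CopyStreamState *state = (CopyStreamState *) fcinfo->flinfo->fn_extra;

        if (state == NULL)
        {
            /*
             * First call of this COPY: allocate the state in fn_mcxt,
             * a memory context that survives across calls, and cache
             * it in fn_extra so later rows find it again - no global
             * variable needed.
             */
            state = (CopyStreamState *)
                MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
                                       sizeof(CopyStreamState));
            fcinfo->flinfo->fn_extra = state;
        }

        state->rows_seen++;
        /* ... hand the row off to libpq / the remote COPY here ... */

        PG_RETURN_VOID();
    }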