Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION - Mailing list pgsql-hackers

From Daniel Farina
Subject Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Date
Msg-id 7b97c5a40912050032g5651c257q49b9cdca6da82e2d@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Nov 30, 2009 at 12:14 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Jeff Davis wrote:
>
> COPY target FROM FUNCTION foo() WITH RECORDS;
>
>
> In what format would the records be?

As a not-quite aside, I think I have a better syntax suggestion.  I
have recently been digging around in the grammar with the changes made
in the following commit:

commit a6833fbb85cb5212a9d8085849e7281807f732a6
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Mon Sep 21 20:10:21 2009 +0000
   Define a new, more extensible syntax for COPY options.
   This is intentionally similar to the recently revised syntax for EXPLAIN   options, ie, (name value, ...).  The old
syntaxis still supported for   backwards compatibility, but we intend that any options added in future   will be
providedonly in the new syntax.
 
   Robert Haas, Emmanuel Cecchet

As it turns out, the following syntax may work pretty well:
 COPY y TO FUNCTION (setup_function the_setup_function('some arg', 3, 7, 42))

Basically the func_expr reduction fits very neatly into the
copy_generic_opt_elem reduction:
   copy_generic_opt_elem:                           ColLabel copy_generic_opt_arg                                   {
                                       $$ = (Node *) makeDefElem($1, $2);                                   }
               | ColLabel func_expr                                   {                                           $$ =
(Node*) $2;                                   }                   ;
 

Now we can use more or less any reasonable number of symbol names and
function calls we desire.  This makes life considerably easier, I
think...

We can also try to refactor COPY's internals to take advantage of
these features (and potentially reduce the number of mechanisms.  For
example, the legacy "COPY ... TO '/place' WITH CSV" perhaps can be
more verbosely/generically expressed as:
 COPY ... TO FUNCTION (setup_function to_file('/place'),                       record_converter csv_formatter,
            stream_function fwrite                       end_function fclose);
 

We can also add auxiliary symbols for error handling behavior.  For
example, were the COPY to fail for some reason maybe it would make
sense "on_error" to call "unlink" to clean up the partially finished
file.

I also have what I think is a somewhat interesting hack.  Consider
some of the functions up there without arguments (presumably they'd be
called with a somewhat fixed contract the mechanics of COPY itself):
how does one disambiguate them?  Ideally, one could sometimes use
literal arguments (when the return value of that function is desired
to be threaded through the other specified functions) and other times
it'd be nice to disambiguate functions via type names.  That would
look something like the following:
 COPY ... TO FUNCTION (setup_function to_file('/place'),                       record_converter csv_formatter(record),
                    stream_function fwrite(bytea),                       end_function fclose(text));
 

I think this is possible to implement without much ambiguity, drawing
on the observation that the COPY statement does not have -- and
probably will never have -- references via Var(iable) node, unlike
normal SQL statements such as SELECT, INSERT, et al.  That means we
might be able disambiguate using the following rules when scanning the
funcexpr's arguments during the semantic analysis phase to figure out
what to do:
 * Only literal list items found: it's a function call with the types   of those literals.  Ex:
my_setup_function('foo'::text,3)
 
 * Only non-literal list items found: it's type specifiers.  Ex:   csv_formatter(record).
 * Both literal and non-literal values found: report an error.

This only works because no cases where a non-literal quantity could be
confused with a type name come to mind.  If one could name a type "3"
and being forced to double-quote "3" to get your type disambiguated
was just too ugly, then we are at an impasse.  But otherwise I think
this may work quite well.

Common constellations of functions could perhaps be bound together
into a DDL to reduce the amount of symbol soup going on here, but that
seems like a pretty clean transition strategy at some later time.
Most of the functionality could still be captured with this simple
approach for now...

Also note that factoring out high-performance implementations of
things like csv_formatter (and friends: pg_binary_formatter) will
probably take some time, but ultimately I think all the existing
functionality could be realized as a layer of syntactic sugar over
this mechanism.

fdr


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: First feature patch for plperl - draft [PATCH]
Next
From: Hitoshi Harada
Date:
Subject: Re: add more frame types in window functions (ROWS)