Re: COPY as a set returning function - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: COPY as a set returning function
Date
Msg-id CADkLM=ck_M2q=hYEDtau6DVc-kPGp+-8fddJ4oUwoU_ujTGrrg@mail.gmail.com
Whole thread Raw
In response to Re: COPY as a set returning function  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: COPY as a set returning function  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Re: [HACKERS] COPY as a set returning function  (David Fetter <david@fetter.org>)
List pgsql-hackers
Attached is a patch that implements copy_srf().

The function signature reflects cstate more than it reflects the COPY options (filename+is_program instead of  FILENAME or PROGRAM, etc)

CREATE OR REPLACE FUNCTION copy_srf(
   filename    text DEFAULT null,
   is_program  boolean DEFAULT false,
   format      text DEFAULT 'text',  /* accepts text, csv, binary */
   delimiter   text DEFAULT null,
   null_string text DEFAULT E'\\N',
   header      boolean DEFAULT false,
   quote       text DEFAULT null,
   escape      text DEFAULT null,
   encoding    text DEFAULT null)
RETURNS SETOF RECORD

The major utility for this (at least for me) will be in ETLs that currently make a lot of use of temp tables, and wish to either reduce I/O or avoid pg_attribute bloat.

I have not yet implemented STDIN mode, but it's a start.

$ psql test
psql (10devel)
Type "help" for help.

# select * from    copy_srf('echo 1,2; echo 4,5',true,'csv') as t(x text, y text);
 x | y
---+---
 1 | 2
 4 | 5
(2 rows)

Time: 1.456 ms
# select * from    copy_srf('/tmp/small_file.txt'::text,false,'text') as t(x text, y text);
  x  |  y
-----+-----
 1   | 2
 a   | b
 cde | fgh
(3 rows)


On Mon, Oct 17, 2016 at 2:33 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Fri, Sep 30, 2016 at 9:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Craig Ringer <craig.ringer@2ndquadrant.com> writes:
>> On 1 Oct. 2016 05:20, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> I think the last of those suggestions has come up before.  It has the
>>> large advantage that you don't have to remember a different syntax for
>>> copy-as-a-function.
>
>> That sounds fantastic. It'd help this copy variant retain festure parity
>> with normal copy. And it'd bring us closer to being able to FETCH in non
>> queries.
>
> On second thought, though, this couldn't exactly duplicate the existing
> COPY syntax, because COPY relies heavily on the rowtype of the named
> target table to tell it what it's copying.  You'd need some new syntax
> to provide the list of column names and types, which puts a bit of
> a hole in the "syntax we already know" argument.  A SRF-returning-record
> would have a leg up on that, because we do have existing syntax for
> defining the concrete rowtype that any particular call returns.

One big disadvantage of SRF-returning-record syntax is that functions
are basically unwrappable with generic wrappers sans major gymnastics
such as dynamically generating the query and executing it.  This is a
major disadvantage relative to the null::type hack we use in the
populate_record style functions and perhaps ought to make this
(SRF-returning-record syntax) style of use discouraged for useful
library functions.  If there were a way to handle wrapping I'd
withdraw this minor objection -- this has come up in dblink
discussions a few times).

merlin

Attachment

pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Increase pltcl test coverage
Next
From: Corey Huinker
Date:
Subject: Re: sequential scan result order vs performance