Re: COPY as a set returning function - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: COPY as a set returning function
Date
Msg-id CADkLM=d6PO+batzzrAnpK_wgcF6PLg1Mn1jYLENc5tDxfp9dXQ@mail.gmail.com
Whole thread
In response to Re: COPY as a set returning function  (Craig Ringer <craig.ringer@2ndquadrant.com>)
List pgsql-hackers
On Sun, Oct 16, 2016 at 9:01 AM, Craig Ringer <craig.ringer@2ndquadrant.com> wrote:

On 15 Oct. 2016 04:56, "Corey Huinker" <corey.huinker@gmail.com> wrote:

> I would like to make COPY itself a SRF. That's a bit beyond my capabilities, so if that is the route we want to go, I will need help.
>
> The syntax would probably look like this (new bits in bold):
>
>> WITH my_copy  AS (
>>     COPY FROM 'example.csv' TO RESULT SET(c1 text, c2 integer, dummy1 text, dummy2 text, c5 date) WITH (FORMAT CSV)
>>     RETURNING c1, c2, c3
>> )

Strong -1 from me on this approach. Our CTE implementation materializes everything so this is no better than COPYing to a temp table.

Not unless you plan to fix that (and figure out the backward compatibility issues since the bug is documented as a feature) or implement RETURNING in subqueries... I'd go for the function.


Well, it saves burning the oid and the pg_attribute rows. A few long running transactions can cause pg_attribute to bloat to 400GB on one of our systems - hence my wanting something like this function.

If it does stay a function, we only need to implement 8 of the 12 options as parameters (FREEZE and FORCE* options don't apply). My guess is that future options added to COPY will be more about handling output or optimizing table inserts, neither of which mean more options for this proposed function.

Would the best approach be to build in a core srf-returning function that might be deprecated once COPY is set-returning AND CTEs don't have to materialize, or to refactor what's in copy.c such that a contrib module can easily plug into it, and have copy_srf live there?



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: [COMMITTERS] pgsql: Replace PostmasterRandom() with a stronger way of generating ran
Next
From: Kenaniah Cerny
Date:
Subject: Idempotency for all DDL statements