Re: Ragged CSV import - Mailing list pgsql-hackers

From Dimitri Fontaine
Subject Re: Ragged CSV import
Date
Msg-id m2eiqd28f9.fsf@hi-media.com
Whole thread Raw
In response to Re: Ragged CSV import  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Ragged CSV import
List pgsql-hackers
Hi,

Andrew Dunstan <andrew@dunslane.net> writes:
> I do like the idea of COPY returning a SETOF text[], but I am not at all
> clear on the mechanics of feeding STDIN to an SRF. ISTM that something like
> a RETURNING clause on COPY and the ability to use it in FROM clause or
> something similar might work better. 

I envisonned COPY "just" returning all what it reads (minus extra discarded
column as soon as your proposal gets implemented), in the FROM clause,
and the user sorting out what he wants in the SELECT clause.

> I understand the difficulties, but
> maybe we could place some restrictions on where it could be used so as to
> obviate at least some of those.

Maybe instead of opening FROM for COPY, having it accepted in WITH would
be better, the same way (from the user point of view) that DML returning
are worked on.

>    INSERT into foo (x,y,z) select t[3],[t2],[t57] from (COPY RETURNING
>    t FROM stdin CSV);

Would become:
WITH csv AS (  COPY t FROM stdin CSV)INSERT INTO foo(x, y, z) SELECT t[3], t[2], mytimestamptz([5], [6], [7])  FROM
csv;

Now the text[] has a strange feeling, without it it'd be:
WITH csv AS (  COPY t(a, b, c, d, e, f, g)   FROM stdin    CSV IGNORING EXTRA COLUMNS         -- random nice syntax
 MISSING COLUMNS DEFAULTS NULL  -- that needs some reality check)INSERT INTO foo(x, y, z) SELECT c, b, mytimestamptz(e,
f,g)  FROM csv;
 

The function mytimestamptz(date text, time text, timezone text) will
accept input that PostgreSQL input types would have errored out on... so
you can process in one go strange formats from other products.

> The only thing that's been seriously on the table that isn't accounted for
> by something like this is the suggestion of making the header line have some
> semantic significance, and I'm far from sure that's a good idea.

Yeah, and it seems only useful when you don't have any way to play with
what COPY returns before it goes to a table (magic column reordering or
ignoring).

Regards,
-- 
dim


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: logging hook for database audit
Next
From: Heikki Linnakangas
Date:
Subject: Re: [COMMITTERS] pgsql: On Windows, when a file is deleted and another process still has