Re: Ragged CSV import - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Ragged CSV import
Msg-id 4AA9A752.1070006@dunslane.net
In response to Re: Ragged CSV import  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Ragged CSV import
List pgsql-hackers

Stephen Frost wrote:
> * Andrew Dunstan (andrew@dunslane.net) wrote:
>   
>> Consider the suggestion withdrawn.
>>     
>
> Let's not throw it out completely.  The proposal to have COPY return a
> text[] in some fashion is interesting enough that others (ah, such as
> myself..) might be willing to put effort into it.  Andrew, could you put
> your thoughts plus some example files onto a wiki page, at least?  Then
> Robert, Tom, myself, etc, could update that to nail down the
> specification and then it's just an implementation detail, as it were.

I don't mind discussing the idea a bit.

I don't have any samples readily to hand, but really anything that's not 
strictly rectangular meets my original case, like
   a,b,c
   1,2,3
   4,5,6,7
   8,9
   10,11,12
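
To make "not strictly rectangular" concrete, here is a minimal Python sketch that reads the sample above and returns each line as a variable-length list of strings, roughly what a row of text[] would carry. The names `SAMPLE` and `read_ragged` are hypothetical and used only for illustration; nothing here reflects how COPY itself parses input.

```python
import csv
import io

# The ragged sample from the message: rows with 3, 3, 4, 2 and 3 fields.
SAMPLE = "a,b,c\n1,2,3\n4,5,6,7\n8,9\n10,11,12\n"


def read_ragged(text):
    """Return every CSV row as a list of strings, whatever its length.

    Hypothetical helper: it makes no attempt to pad or truncate rows,
    which is the point of treating each row as an array of text.
    """
    return [row for row in csv.reader(io.StringIO(text))]


rows = read_ragged(SAMPLE)
widths = [len(row) for row in rows]  # one field count per input line
```

Because every row keeps its own length, deciding what to do with short or long rows is left to whoever consumes them.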

I do like the idea of COPY returning a SETOF text[], but I am not at all 
clear on the mechanics of feeding STDIN to an SRF. ISTM that something 
like a RETURNING clause on COPY and the ability to use it in a FROM clause 
or something similar might work better. I understand the difficulties, 
but maybe we could place some restrictions on where it could be used so 
as to obviate at least some of those.

One of the things I like about a SETOF text[] is that it lets you 
reorder the columns, or cherry pick which ones you want. In fact, it 
might be argued that the hacky FORCE NOT NULL, which has always 
pained me somewhat, even if it was my idea ;-) might no longer be needed.

I'd love to be able to do something like
   INSERT INTO foo (x,y,z) SELECT t[3], t[2], t[57] FROM (COPY RETURNING t FROM stdin CSV);
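
The COPY RETURNING syntax above is only a wish, so as a rough sketch of the intended semantics, here is the same reorder-and-cherry-pick idea in Python. It assumes 1-based positions and yields None for a position a short row lacks, mirroring the way a PostgreSQL array subscript outside the array's bounds returns NULL. The names `pick`, `select_columns` and `SAMPLE` are hypothetical.

```python
import csv
import io


def pick(row, *positions):
    """Select fields by 1-based position, in the order given.

    A position the row does not have yields None, analogous to an
    out-of-range array subscript returning NULL in PostgreSQL.
    """
    return tuple(row[p - 1] if 1 <= p <= len(row) else None for p in positions)


def select_columns(text, *positions):
    """Parse CSV text and apply pick() to every row (illustrative only)."""
    return [pick(row, *positions) for row in csv.reader(io.StringIO(text))]


SAMPLE = "a,b,c\n1,2,3\n4,5,6,7\n8,9\n"

# Comparable in spirit to: SELECT t[3], t[1], t[4] FROM (...)
picked = select_columns(SAMPLE, 3, 1, 4)
```

With this shape, a missing trailing field simply becomes a null in the selected column, and the caller decides whether that is acceptable for the target table.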

The only thing that's been seriously on the table that isn't accounted 
for by something like this is the suggestion of making the header line 
have some semantic significance, and I'm far from sure that's a good idea.

If time were not short on getting features presented I might attempt to 
do it, but I have one very large monkey (and a few small ones) on my 
back that I am determined to get rid of by the November CF, and there is 
not a hope in the world I could get two large features done, even if we 
had the details of this all sorted out and agreed on. That's why I said 
"Consider the suggestion withdrawn".

cheers

andrew



