Re: COPY enhancements - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: COPY enhancements
Msg-id 4AAE9B91.80805@dunslane.net
In response to Re: COPY enhancements  (Emmanuel Cecchet <manu@asterdata.com>)
List pgsql-hackers

Emmanuel Cecchet wrote:
> Greg Smith wrote:
>> On Fri, 11 Sep 2009, Emmanuel Cecchet wrote:
>>
>>> I guess the problem with extra or missing columns is to make sure 
>>> that you know exactly which data belongs to which column so that you 
>>> don't put data in the wrong columns which is likely to happen if 
>>> this is fully automated.
>>
>> Allowing the extra column case is easy:  everywhere in copy.c you find 
>> the error message "extra data after last expected column", just 
>> ignore the overflow fields rather than rejecting the line just based 
>> on that.  And the default information I mentioned you might want to 
>> substitute for missing columns is already being collected by the code 
>> block with the comment "Get default info if needed".
> If I understand correctly, you expect the garbage to be after the last 
> column. But what if the extra or missing column is somewhere up front 
> or in the middle? Sometimes a type conflict will help you detect the 
> problem; sometimes you will just insert garbage. This might call for 
> another mechanism that logs the lines that are automatically 
> 'adjusted', so that any mistake made during this automated process 
> can be rolled back.
>
>


Garbage off to the right is exactly the case that we have. Judging from 
what I'm hearing, a number of other people have it too.

Nobody suggests that a facility to ignore extra columns will handle 
every case. It will handle what increasingly appears to be a common case.
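
To make the trailing-garbage case concrete, here is a rough Python sketch of the behaviour under discussion. It is not the copy.c code and not a proposed patch; `fit_row` and `load_rows` are made-up names, and the comma-separated input format is an assumption. Extra fields to the right of the last expected column are dropped, missing trailing fields are filled from per-column defaults, and every adjusted line is recorded so it can be reviewed afterwards.

```python
import csv
import io


def fit_row(fields, n_columns, defaults=None):
    """Trim or pad one parsed row to exactly n_columns fields.

    Returns (row, dropped), where dropped is the number of trailing
    fields that were ignored. `defaults`, if given, is assumed to hold
    one value per column (length n_columns).
    """
    if len(fields) > n_columns:
        # Extra data after the last expected column: ignore the overflow.
        return fields[:n_columns], len(fields) - n_columns
    if len(fields) < n_columns:
        if defaults is None:
            raise ValueError("missing data for column %d" % (len(fields) + 1))
        # Fill only the missing trailing columns from the defaults.
        return fields + defaults[len(fields):n_columns], 0
    return fields, 0


def load_rows(text, n_columns, defaults=None):
    """Parse comma-separated text, returning (rows, adjusted).

    `adjusted` lists (line_number, dropped_count) for every line that
    had trailing fields removed.
    """
    rows = []
    adjusted = []
    for lineno, fields in enumerate(csv.reader(io.StringIO(text)), start=1):
        row, dropped = fit_row(fields, n_columns, defaults)
        rows.append(row)
        if dropped:
            adjusted.append((lineno, dropped))
    return rows, adjusted
```

A sketch like this can only handle garbage on the right. A field that is extra or missing in the middle of a line would shift every later value into the wrong column without being noticed, which is the limitation raised in the quoted text.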

cheers

andrew

