Thread: allow COPY routines to read arbitrary numbers of fields

allow COPY routines to read arbitrary numbers of fields

From
Andrew Dunstan
Date:
Attached is a patch that allows CopyReadAttibutesText() and
CopyReadAttributesCSV() to read arbitrary numbers of attributes.
Underflowing attributes are recorded as null, and space is made for
overflowing attributes on a line.

This patch doesn't result in any user-visible behavior. The current
calling code will fail if the number of attributes read is not what is
expected, as happens now. But it will allow the API to be used (when
exposed) by a foreign data wrapper that can accept arbitrary numbers of
attributes. My aim here is to get to something like:

    CREATE FOREIGN TABLE my_csv (
         t text[]
    )
    SERVER file_server
    OPTIONS (format 'csv', filename '/path/to/my/data.csv', textarray
    'true',
              header 'true', delimiter ';', quote '@', escape '"', null '');

    SELECT t[3] as f1, t[1] as f2, t[9999] as probably_null
    FROM my_csv;

It would probably be nice to apply this before we start exposing the
COPY API to FDW routines, as discussed earlier today.

cheers

andrew




Attachment

Re: allow COPY routines to read arbitrary numbers of fields

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Attached is a patch that allows CopyReadAttibutesText() and 
> CopyReadAttributesCSV() to read arbitrary numbers of attributes. 
> Underflowing attributes are recorded as null, and space is made for 
> overflowing attributes on a line.

Why are you still passing nfields as a separate parameter instead of
relying on the value you added to the struct?  That can't do anything
except cause confusion, especially once the two values diverge due to a
previous array-expansion.  Also, why did you change the setup code to
not compute nfields in binary mode?  That seems at best an unnecessary
change, and at worst a breakage of the binary path --- did you test it?

Also please be a little more careful with the formatting.  This for
instance is pretty sloppy:

!  * strings.  cstate->raw_fields[k] is set to point to the k'th attribute 
!  * string, * or NULL when the input matches the null marker string.  

and there seem to be some gratuitous whitespace changes as well.
        regards, tom lane


Re: allow COPY routines to read arbitrary numbers of fields

From
Andrew Dunstan
Date:

On 12/06/2010 12:11 PM, Tom Lane wrote:
> Andrew Dunstan<andrew@dunslane.net>  writes:
>> Attached is a patch that allows CopyReadAttibutesText() and
>> CopyReadAttributesCSV() to read arbitrary numbers of attributes.
>> Underflowing attributes are recorded as null, and space is made for
>> overflowing attributes on a line.
> Why are you still passing nfields as a separate parameter instead of
> relying on the value you added to the struct?  That can't do anything
> except cause confusion, especially once the two values diverge due to a
> previous array-expansion.

Good point. will fix.

>    Also, why did you change the setup code to
> not compute nfields in binary mode?  That seems at best an unnecessary
> change, and at worst a breakage of the binary path --- did you test it?

AFAICT it's not used in binary mode at all. But I will double check.

> Also please be a little more careful with the formatting.

Ok, Will fix also. Thanks for he comments.

cheers

andre


Re: allow COPY routines to read arbitrary numbers of fields

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 12/06/2010 12:11 PM, Tom Lane wrote:
>> Also, why did you change the setup code to
>> not compute nfields in binary mode?  That seems at best an unnecessary
>> change, and at worst a breakage of the binary path --- did you test it?

> AFAICT it's not used in binary mode at all. But I will double check.

Well, even if it is not used at the moment, it seems potentially of use
in that path.  So I'd vote for continuing to set it correctly, rather
than making it deliberately incorrect as this patch is going out of its
way to do.
        regards, tom lane


Re: allow COPY routines to read arbitrary numbers of fields

From
Andrew Dunstan
Date:

On 12/06/2010 01:23 PM, Tom Lane wrote:
> Andrew Dunstan<andrew@dunslane.net>  writes:
>> On 12/06/2010 12:11 PM, Tom Lane wrote:
>>> Also, why did you change the setup code to
>>> not compute nfields in binary mode?  That seems at best an unnecessary
>>> change, and at worst a breakage of the binary path --- did you test it?
>> AFAICT it's not used in binary mode at all. But I will double check.
> Well, even if it is not used at the moment, it seems potentially of use
> in that path.  So I'd vote for continuing to set it correctly, rather
> than making it deliberately incorrect as this patch is going out of its
> way to do.
>
>             

Ok.

cheers

andrew