Re: exposing COPY API - Mailing list pgsql-hackers

From Itagaki Takahiro
Subject Re: exposing COPY API
Date
Msg-id AANLkTikDBYqsge=Ce2V7+fe31Mm38TAwazrh8WXTsrt9@mail.gmail.com
In response to Re: exposing COPY API  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: exposing COPY API  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Here is a demonstration of support for jagged input files. It is a patch
on top of the latest patch. The newly added API is:

  bool NextLineCopyFrom(
        [IN] CopyState cstate,
        [OUT] char ***fields, [OUT] int *nfields, [OUT] Oid *tupleOid)

It just returns the separated fields of the next line. Fortunately, it needs
no extra code because it is simply extracted from NextCopyFrom().
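
To make the shape of such an interface concrete, here is a minimal
standalone sketch, not the patch itself. RawReader and both function names
are hypothetical stand-ins (RawReader plays the role of CopyState), the
input is an in-memory buffer, and no quoting, escaping, or OID handling is
attempted. It only shows the "return the raw fields and their count for
the next line" contract.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 64

/* Hypothetical stand-in for CopyState: a cursor over an in-memory buffer. */
typedef struct RawReader
{
    char   *cursor;                 /* next unread line, or NULL at end */
    char   *fields[MAX_FIELDS];     /* field pointers for the current line */
} RawReader;

void
raw_reader_init(RawReader *rd, char *buf)
{
    rd->cursor = (buf != NULL && *buf != '\0') ? buf : NULL;
}

/*
 * Split the next line on ',' in place and hand back the field pointers.
 * Returns false at end of input. The number of fields may differ from
 * line to line, which is exactly what a jagged file looks like.
 * Assumption: plain comma-separated text, no CSV quoting rules.
 */
bool
raw_reader_next_line(RawReader *rd, char ***fields, int *nfields)
{
    char   *line;
    char   *eol;
    int     n = 0;

    assert(fields != NULL && nfields != NULL);

    if (rd->cursor == NULL)
        return false;

    /* Terminate the current line and advance the cursor past it. */
    line = rd->cursor;
    eol = strchr(line, '\n');
    if (eol != NULL)
    {
        *eol = '\0';
        rd->cursor = (eol[1] != '\0') ? eol + 1 : NULL;
    }
    else
        rd->cursor = NULL;

    /* The first field starts at the line start; each ',' starts another. */
    rd->fields[n++] = line;
    for (char *p = line; *p != '\0'; p++)
    {
        if (*p == ',' && n < MAX_FIELDS)
        {
            *p = '\0';
            rd->fields[n++] = p + 1;
        }
    }

    *fields = rd->fields;
    *nfields = n;
    return true;
}
```

The caller sees whatever number of fields each line happens to contain;
deciding what to do with too many or too few is left to the caller.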

I'm willing to include the change in the COPY APIs,
but we still have a few issues. See below.

On Fri, Feb 4, 2011 at 16:53, Andrew Dunstan <andrew@dunslane.net> wrote:
> The problem with COPY FROM is that nobody's come up with a good syntax for
> allowing it as a FROM target. Doing what I want via FDW neatly gets us
> around that problem. But I'm quite OK with doing the hard work inside the
> COPY code - that's what my working prototype does in fact.

I think it is not only a syntax issue. I found that it is hard to
support the FORCE_NOT_NULL option for extra fields. See the FIXME in the
patch. It is a fundamental problem in supporting jagged fields.

> One thing I'd like is to have file_fdw do something we can't do another
> way. Currently it doesn't, so it's nice but uninteresting.

BTW, how do you determine which field is shifted in your broken CSV file?
For example, consider the case where you find "AB,CD,EF" for a 2-column table.
I could provide a raw CSV reader for jagged files, but you still have to
cook the returned fields into a proper tuple...
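
As an illustration of that cooking step, here is a minimal standalone
sketch. cook_fields and JaggedPolicy are hypothetical names, not part of
any patch. It assumes the simplest possible rule: fields map to columns by
position, missing trailing columns become NULL, and extra fields are
either dropped or cause the line to be rejected. It cannot tell which
field was actually shifted; that decision still has to come from the
caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policies for lines with more fields than columns. */
typedef enum JaggedPolicy
{
    JAGGED_DROP_EXTRA,      /* keep the first natts fields, ignore the rest */
    JAGGED_REJECT_EXTRA     /* treat the whole line as an error */
} JaggedPolicy;

/*
 * Map raw fields onto natts column values by position.
 * Missing trailing fields are set to NULL.
 * Returns the number of extra fields ignored, or -1 if the line is
 * rejected under JAGGED_REJECT_EXTRA. values must have room for natts
 * entries.
 */
int
cook_fields(char **fields, int nfields, int natts,
            JaggedPolicy policy, const char **values)
{
    int     i;

    assert(nfields >= 0 && natts >= 0);

    if (nfields > natts && policy == JAGGED_REJECT_EXTRA)
        return -1;

    for (i = 0; i < natts; i++)
        values[i] = (i < nfields) ? fields[i] : NULL;

    return (nfields > natts) ? nfields - natts : 0;
}
```

With "AB,CD,EF" and a 2-column table, the drop policy yields ("AB", "CD")
and silently loses "EF". Whether that is right depends on which field was
the stray one, which positional mapping alone cannot know.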

--
Itagaki Takahiro
