Thread: psql copy errors

psql copy errors

From
David Bear
Date:
I'm finding the \copy is very brittle. It seems to stop for everyone
little reason. Is there a way to tell it to be more forgiving -- for
example, to ignore extra data fields that might exists on a line?

Or, to have it just skip that offending record but continue on to the
next.

I've got a tab delimited file, but if \copy sees any extra tabs in the
file it just stops at that record. I want to be able to control what
pg does when it hits an exception.

I'm curious what others do for bulk data migration. Since copy seems
so brittle, there must be a better way...

--
David Bear
phone:     480-965-8257
fax:     480-965-9189
College of Public Programs/ASU
Wilson Hall 232
Tempe, AZ 85287-0803
 "Beware the IP portfolio, everyone will be suspect of trespassing"

Re: psql copy errors

From
Michael Fuhr
Date:
On Thu, Jun 23, 2005 at 12:27:44PM -0700, David Bear wrote:
>
> I'm finding the \copy is very brittle. It seems to stop for everyone
> little reason. Is there a way to tell it to be more forgiving -- for
> example, to ignore extra data fields that might exists on a line?
>
> Or, to have it just skip that offending record but continue on to the
> next.

Calling this behavior "brittle" is debatable; some people would
call it "reliable": either the whole input validates -- matches
what's expected -- or the command rejects it.  This might sound
unreasonable if a partial load is acceptable, but it's desirable
behavior if a partial load is unacceptable.  Some would argue that
the correct solution is to fix the data, not to make the command
that loads the data more forgiving.

The developers' TODO list has a few outstanding items to enhance
COPY, but I don't know if anybody's working on them:

http://www.postgresql.org/docs/faqs.TODO.html

* Allow COPY to report error lines and continue
  This requires the use of a savepoint before each COPY line is processed,
  with ROLLBACK on COPY failure.

* Have COPY return the number of rows loaded/unloaded (?)

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: psql copy errors

From
Vladimir Yevdokimov
Date:
On June 23, 2005 03:27 pm, David Bear wrote:
> I'm finding the \copy is very brittle. It seems to stop for everyone
> little reason. Is there a way to tell it to be more forgiving -- for
> example, to ignore extra data fields that might exists on a line?
>
> Or, to have it just skip that offending record but continue on to the
> next.
>
> I've got a tab delimited file, but if \copy sees any extra tabs in the
> file it just stops at that record. I want to be able to control what
> pg does when it hits an exception.
>
> I'm curious what others do for bulk data migration. Since copy seems
> so brittle, there must be a better way...
>

You may use '-d' option of pg_dump in which case it dumps data into INSERT statements.
In this case when you load the damped data it will process tabs properly and will fail any invalid records but finish
theprocess itself. 
If you redirect output into a separate file you can analyze later how many records failed.
May be that's what you need in your case.
The only problem with this method I know is that it takes longer to load the data as it does full validation for each
record.
--
Vladimir Yevdokimov <vladimir@givex.com>

Re: psql copy errors

From
"李江华"
Date:
David Bear:

        Yes. I agree with you.
        \copy is really too brittle.
        I wonder why \copy is not like oracle's sqlldr?
        I think sqlldr is more powerful. When using sqlldr,we can specify the maximum error records we allow,and we can
alsospecify the number we should commit in every transaction.
 

        I think PostgreSQL should consider this.

       Another aspect is also important. Oracle has better partition table facilities,it's especially suitable for
largetables,as well as index partition concepts.But PostgreSQL has no such concepts.These are really important for
largedatabase.
 
     
        I think if PostgreSQL can implement those above in the future.It will be more powerful and more suitable for
largebusiness application.
 

        

 

======= 2005-06-24 03:27:44 you wrote:=======

>I'm finding the \copy is very brittle. It seems to stop for everyone
>little reason. Is there a way to tell it to be more forgiving -- for
>example, to ignore extra data fields that might exists on a line?
>
>Or, to have it just skip that offending record but continue on to the
>next.
>
>I've got a tab delimited file, but if \copy sees any extra tabs in the
>file it just stops at that record. I want to be able to control what
>pg does when it hits an exception.
>
>I'm curious what others do for bulk data migration. Since copy seems
>so brittle, there must be a better way...
>
>-- 
>David Bear
>phone:     480-965-8257
>fax:     480-965-9189
>College of Public Programs/ASU
>Wilson Hall 232
>Tempe, AZ 85287-0803
> "Beware the IP portfolio, everyone will be suspect of trespassing"
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>

= = = = = = = = = = = = = = = = = = = =
            

       Best regards!
  
        
                               李江华
                               Seamus Dean
                               Alibaba.com
                               TEL:0571-85022088-2287
                        ljh1469@alibaba-inc.com
                        2005-06-24

Re: psql copy errors

From
Scott Marlowe
Date:
On Fri, 2005-06-24 at 00:52, 李江华 wrote:
> David Bear:
>
>         Yes. I agree with you.
>         \copy is really too brittle.
>         I wonder why \copy is not like oracle's sqlldr?
>         I think sqlldr is more powerful. When using sqlldr,we can specify the maximum error records we allow,and we
canalso specify the number we should commit in every transaction. 

While I wouldn't mind have the versatility of sqlldr, I would HATE to
have to deal with such a clunky interface.  The only features of sqlldr
I actually like are the ones that allow rejected records to go into a
separate file.

But like most of oracle's tools, it's awkward to use, and requires a lot
of up front work even if you only want a simple load.  Sure, add some
features to \copy, but don't emulate the WAY those features are
implemented oracle.

>        Another aspect is also important. Oracle has better partition table facilities,it's especially suitable for
largetables,as well as index partition concepts.But PostgreSQL has no such concepts.These are really important for
largedatabase. 

No one would complain if someone stepped up and programmed a decent
table partitioning patch.  I have a feeling it's one of the things on
the TODO list that will happen in the next version or two.


Re: psql copy errors

From
Alvaro Herrera
Date:
On Fri, Jun 24, 2005 at 09:43:43AM -0500, Scott Marlowe wrote:
> On Fri, 2005-06-24 at 00:52, ????????? wrote:
> > David Bear:
> >
> >         Yes. I agree with you.
> >         \copy is really too brittle.
> >         I wonder why \copy is not like oracle's sqlldr?
> >         I think sqlldr is more powerful. When using sqlldr,we can specify the maximum error records we allow,and we
canalso specify the number we should commit in every transaction. 
>
> While I wouldn't mind have the versatility of sqlldr, I would HATE to
> have to deal with such a clunky interface.  The only features of sqlldr
> I actually like are the ones that allow rejected records to go into a
> separate file.

There's a PGLoader project in pgfoundry that may suit some people.

> >        Another aspect is also important. Oracle has better partition
> >        table facilities,it's especially suitable for large tables,as
> >        well as index partition concepts.But PostgreSQL has no such
> >        concepts.These are really important for large database.
>
> No one would complain if someone stepped up and programmed a decent
> table partitioning patch.  I have a feeling it's one of the things on
> the TODO list that will happen in the next version or two.

I think Simon Riggs and crew are working on this.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
Jude: I wish humans laid eggs
Ringlord: Why would you want humans to lay eggs?
Jude: So I can eat them