copy losing information - Mailing list pgsql-general

From Silvela, Jaime (Exchange)
Subject copy losing information
Date
Msg-id 6D6734D7CD866145AE87A2D5D88830A90228290B@whexchmb14.bsna.bsroot.bear.com
Responses Re: copy losing information  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: copy losing information  (Reece Hart <reece@harts.net>)
Re: copy losing information  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general

This is my first post to the list. I did a brief search and didn’t find this issue discussed already, so here goes. Apologies if it has been reported before.

I have a fairly large tab-separated file, around 2 million rows and 4 columns, that I load into a Postgres table using the COPY command.

I’ve started to notice missing data. I’ll truncate the table, load the file, and sometimes find fewer rows in the table than in the file.

This is not reliably reproducible. If I truncate and reload, I may get all the lines, or I may get a different number of missing lines.

I concluded that there was a bug in the COPY command, and wrote a replacement in Ruby, using the pure-Ruby Postgres-pr library.

I ran into the same issue: some lines seem to be dropped, but the program reports no exceptions or SQL errors.

To improve throughput, my Ruby program connects to the server just once and sends the INSERT statements in batches of 2000.
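
The batching described above can be sketched as follows, in Python rather than Ruby. This is only an illustration under stated assumptions: `send_batch` is a hypothetical callback standing in for whatever executes one batch of INSERT statements, and none of these names come from the original program. The sketch makes sure the final, shorter batch is also sent, and it counts the rows handed over so the total can be compared with the file's line count.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence

BATCH_SIZE = 2000  # batch size mentioned in the post

Row = Sequence[str]


def batches(rows: Iterable[Row], size: int = BATCH_SIZE) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most `size` rows, including the last short one."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def send_all(
    rows: Iterable[Row],
    send_batch: Callable[[List[Row]], None],  # hypothetical: runs the INSERTs for one batch
    size: int = BATCH_SIZE,
) -> int:
    """Hand every batch to `send_batch` and return the total number of rows handed over."""
    sent = 0
    for chunk in batches(rows, size):
        send_batch(chunk)
        sent += len(chunk)
    return sent
```

If the value returned by `send_all` matches the number of lines in the file but the table holds fewer rows, the loss would be happening after the client has handed the data over.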

 

I’ve checked that the file doesn’t contain any SQL escape sequences or anything else that would invalidate an INSERT.
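
A check of this kind might look like the following sketch, written in Python. It assumes a tab-separated file with 4 fields per line, as described above; the function name and report keys are made up for illustration. It flags backslashes because COPY's text format treats the backslash as an escape character and a line containing only `\.` as the end-of-data marker. Whether any of this applies to the file in question is not known.

```python
from typing import Dict, Iterable

EXPECTED_FIELDS = 4  # the file described in the post has 4 columns


def scan_lines(lines: Iterable[str], expected_fields: int = EXPECTED_FIELDS) -> Dict[str, int]:
    """Count lines, and lines that COPY's text format might treat specially."""
    report = {"lines": 0, "wrong_field_count": 0, "backslash": 0, "end_marker": 0}
    for raw in lines:
        # Strip LF or CRLF line endings; the files come from Windows machines.
        line = raw.rstrip("\r\n")
        report["lines"] += 1
        if line == "\\.":
            # A line holding only backslash-period is COPY's end-of-data marker.
            report["end_marker"] += 1
        if "\\" in line:
            # Backslash is the escape character in COPY's text format.
            report["backslash"] += 1
        if len(line.split("\t")) != expected_fields:
            report["wrong_field_count"] += 1
    return report
```

The function accepts any iterable of lines, so an open file object can be passed directly. Comparing `report["lines"]` with the result of `SELECT count(*)` on the table after a load would show how many rows went missing in that run.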

 

The server runs PostgreSQL 8.1.3 on Linux 2.6.5 on an Intel platform.

The imports are being run from Windows machines on the same network.

Has anybody seen this before?

Thanks

Jaime
