Re: COPY losing information - Mailing list pgsql-general

From Jaime Silvela
Subject Re: COPY losing information
Date
Msg-id 4609273B.4000403@bear.com
Whole thread Raw
In response to Re: COPY losing information  (Jaime Silvela <JSilvela@Bear.com>)
List pgsql-general
Correction: my sever is running 8.1.3

Jaime Silvela wrote:
> Just bringing back to life a message I sent last July.
>
> The problem I was having was that when importing very large data sets,
> COPY seemed to drop some data. I built a script to use INSERTs, and
> same problem. My server runs 8.1.3 on Linux. Several people
> investigated, Reece Hart was unable to reproduce the problem using my
> same data file, and Tom Lane suggested a buggy client installation
> dropping info, or hardware problems in the server.
>
> I've come back to this problem recently, and have found a couple of
> interesting things.
> I'm using a 418MB data file. wc -l gives me 6,802,367 lines
>
> On the server (Linux running Postgres 2.1.3). 4GB RAM, and the disk
> Postgres lives in is a network drive.
> After copying the data file to the server and running psql locally.
> - Import try 1: count(*) gives 6,801,664
> - Import try 2: count(*) gives 6,802,241
> - Every import gave a different count
> The log file doesn't say anything other than
> LOG: checkpoints are occurring too frequently (27 seconds apart), but
> that's should not be a problem, right?
>
> I also tried it in my local-disk Windows installation,  which runs
> Postgres 8.1.0
> - Import try 1: count(*) gives 6,824,366
> - All imports since then give, correctly, count(*) = 6,802,367
>
> This time, I FTP'd the file to my mac laptop, which is running 8.2.3.
> The import gave me the correct count every time.
> Remembering the suggestion of hardware problems, I made an empty
> install of 8.2.3 on my server. On it, the data file was fully imported
> every single time.
> I decided to copy all my data to 8.2.3 to test dependency of the bug
> on a) size of the database (30GB of data) or b) some problem with data
> or some stored procedure. I've tried the import, and again, it's
> correct every single time.
>
> During the import of the database data to 8.2.3, I got mostly a clean
> set, except for the 3 error messages listed below which suggest a
> buggy kernel.
>
> A bunch of questions:
> 1) Has anybody seen this type of behavior in 8.1.*, is there a known
> bug that might explain the problems in both the Linux server and the
> Windows box?
> 2) Those errors on the database import are troubling, but the affected
> only 3 tables. Wouldn't a buggy kernel give more trouble? Is it
> possible that this was a bug in 8.1.3's pg_dump? Also, since the
> database is so big, the output of pg_dump is split'd and bzip2'd, so
> there's room for error there too.
> 3) On Friday I'm going to upgrade the production database from 8.1.3
> to 8.2.3. Any caveats or words of advice?
>
> Thank you,
> Jaime
>
>
> ERROR:  unexpected data beyond EOF in block 23662 of relation "portfolio"
> HINT:  This has been seen to occur with buggy kernels; consider
> updating your system.
> CONTEXT:  COPY portfolio, line 3426949: "210395 1974263 1       723"
> STATEMENT:  COPY portfolio (deal_id, security_id, amount,
> portfolio_version) FROM stdin;
> ERROR:  unexpected data beyond EOF in block 4028 of relation
> "coverage_test_val"
> HINT:  This has been seen to occur with buggy kernels; consider
> updating your system.
> CONTEXT:  COPY coverage_test_val, line 338035: "340676  588
> 2006-08-25      STEPUP_TRIGGER2 6       0.0     0.0   \
>  7"
> STATEMENT:  COPY coverage_test_val (coverage_test_val_id, deal_id,
> observation_date, coverage_test_name, coverage_test\
> _priority, coverage_test_value, coverage_test_trigger,
> coverage_test_type_id) FROM stdin;
> ERROR:  unexpected data beyond EOF in block 4049 of relation
> "deal_current_val"
> HINT:  This has been seen to occur with buggy kernels; consider
> updating your system.
> CONTEXT:  COPY deal_current_val, line 511050: "612884   7008
> 2005-09-21      23      1.957871"
> STATEMENT:  COPY deal_current_val (deal_current_val_id, deal_id,
> observation_date, type_id, deal_current_val) FROM std\
> in;
>
>


***********************************************************************
Bear Stearns is not responsible for any recommendation, solicitation,
offer or agreement or any information about any transaction, customer
account or account activity contained in this communication.

Bear Stearns does not provide tax, legal or accounting advice.  You
should consult your own tax, legal and accounting advisors before
engaging in any transaction. In order for Bear Stearns to comply with
Internal Revenue Service Circular 230 (if applicable), you are notified
that any discussion of U.S. federal tax issues contained or referred to
herein is not intended or written to be used, and cannot be used, for
the purpose of:  (A) avoiding penalties that may be imposed under the
Internal Revenue Code; nor (B) promoting, marketing or recommending to
another party any transaction or matter addressed herein.
***********************************************************************

pgsql-general by date:

Previous
From: "Filip Rembiałkowski"
Date:
Subject: Re: Every user has own database - how?
Next
From: Marc Evans
Date:
Subject: Is there a shortage of postgresql skilled ops people