Thread: [PERFORMANCE] Error loading 37G CSV file "invalid string enlargement request size 65536"
[PERFORMANCE] Error loading 37G CSV file "invalid string enlargement request size 65536"
From
"Adonias Malosso"
Date:
Hi all,
I'm running a COPY for a 37G CSV file and receiving the following error:
"invalid string enlargement request size 65536"
The file has about 70 million lines with 101 columns, all of them varchar.
When I run the command on the whole file I receive the error after loading
about 29 million lines, so I've split the file into 10-million-line chunks with split:
split --lines=10000000
And running the copy I receive the error on the 5th file:
psql:/srv/www/htdocs/import/script_q2.sql:122: ERROR: invalid string enlargement request size 65536
CONTEXT: COPY temp_q2, line 3509639: ""000000009367276";"4";"DANIEL DO CARMO BARROS";"31-Jan-1986";"M";"1";"10";"3162906";"GILSON TEIXEIRA..."
Any clues?
My PostgreSQL version is 8.2.4; the server is running SUSE Linux with 1.5GB of RAM.
The relevant changes in postgresql.conf are:
shared_buffers = 512MB
temp_buffers = 256MB
checkpoint_segments = 60
I'd also like to know if there's any way to optimize huge data loads in operations like these.
Regards
Adonias Malosso
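For reference, the load described above amounts to something like the following; the file names, the database name, and the ';' delimiter / '"' quote options are guesses based on the error context, not taken from the actual script_q2.sql:

    # split the 37G file into 10-million-line chunks and load each one with \copy
    # (hypothetical file and database names)
    split --lines=10000000 bigfile.csv part_
    for f in part_*; do
        psql -d mydb -c "\copy temp_q2 from '$f' with delimiter ';' csv quote '\"'"
    done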
Re: [PERFORMANCE] Error loading 37G CSV file "invalid string enlargement request size 65536"
From
Tom Lane
Date:
"Adonias Malosso" <malosso@gmail.com> writes: > I�m running a copy for a 37G CSV and receiving the following error: > "invalid string enlargement request size 65536" AFAICS this would only happen if you've got an individual line of COPY data exceeding 1GB. (PG versions later than 8.2 give a slightly more helpful "out of memory" error in such a case.) Most likely, that was not your intention, and the real problem is incorrect quoting/escaping in the CSV file, causing COPY to think that a large number of physical lines should be read as one logical line. regards, tom lane
Re: [PERFORMANCE] Error loading 37G CSV file "invalid string enlargement request size 65536"
From
Shane Ambler
Date:
Adonias Malosso wrote:
> Hi all,
>
> split --lines=10000000
>
> And running the copy I receive the error on the 5th file:
>
> psql:/srv/www/htdocs/import/script_q2.sql:122: ERROR: invalid string
> enlargement request size 65536
> CONTEXT: COPY temp_q2, line 3509639: ""000000009367276";"4";"DANIEL DO
> CARMO BARROS";"31-Jan-1986";"M";"1";"10";"3162906";"GILSON TEIXEIRA..."
>
> Any clues?

Quote problems from earlier than that? One quote missing, or a \ at the end of a field negating the closing quote? I'd keep splitting the file to help isolate it.

What control do you have over the generation of the data? Is this a one-off import or ongoing?

> My PostgreSQL version is 8.2.4; the server is running SUSE Linux with 1.5GB of RAM.
> The relevant changes in postgresql.conf are:
>
> shared_buffers = 512MB
> temp_buffers = 256MB
> checkpoint_segments = 60
>
> I'd also like to know if there's any way to optimize huge data loads in
> operations like these.

Sounds like you are already using COPY. Where from? Is the data file on the server or on a separate client (as in, are you reading from the same disk that you are writing the data to)?

See if http://pgfoundry.org/projects/pgbulkload/ can help.

It depends a lot on what you are doing and what table you are importing into. Indexes will most likely be the biggest slowdown; it is faster to create them after the table is filled. FK constraints can slow things down as well.

Is this a live server that will still be working as you load the data? If the db is not in use, try dropping all indexes (on the relevant table, anyway), loading, then creating the indexes. You can also COPY into a temp table without indexes and then select into the target table.

What FK constraints does this table have? Can they be safely deferred during the import?

--
Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz
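A minimal sketch of the drop-indexes / load / rebuild sequence Shane suggests, with made-up index and column names and the same guessed COPY options; the staging-table variant is noted in the comments:

    # made-up database, index and column names throughout; adjust to the real schema
    psql -d mydb <<'EOF'
    -- 1. drop (or simply don't create yet) the indexes on the target table
    DROP INDEX temp_q2_code_idx;

    -- 2. bulk load; delimiter and quote are guessed from the error context
    \copy temp_q2 from '/data/part_aa' with delimiter ';' csv quote '"'

    -- 3. rebuild the indexes once the data is in, then update planner stats
    CREATE INDEX temp_q2_code_idx ON temp_q2 (code);
    ANALYZE temp_q2;

    -- alternative per Shane: \copy into an unindexed temp/staging table first,
    -- then INSERT INTO the real table with a SELECT once the data looks sane
    EOF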