extreme memory use when loading in a lot of data - Mailing list pgsql-general

From Vivek Khera
Subject extreme memory use when loading in a lot of data
Date
Msg-id x7vfipmwn7.fsf@yertle.int.kciLink.com
Whole thread Raw
Responses Re: extreme memory use when loading in a lot of data
Re: extreme memory use when loading in a lot of data
List pgsql-general
I have some historic data that I want to analyze.  To do this I set up
postgres on a spare box I picked up for cheap, which just lucked into
having tons of RAM (1.5G).   I set up postgres to use 10000 buffers,
and recompiled the kernel to allow 2Gb data size limit per process.

Since this is historical data, I'm actually merging a couple of dumps
that span the time range.  I've dealt with eliminating any conflicting
data (ie, clashing unique keys) but I'm not 100% sure that the foreign
key constraints are all met.  Thus, when loading the data from the
second dump, I am leaving the FK triggers on.

Now, this is where my trouble has begun... On importing row 29,796,801
for the first big table, I get this (after 27 hours!):

pg_restore: ERROR:  out of memory
DETAIL:  Failed on request of size 44.
CONTEXT:  COPY msg_recipients, line 29796801: "75312    3434358 sent"
pg_restore: [archiver (db)] error returned by PQendcopy

(I had originally tried this with a 1Gb data size limit on the
process, and it died at around row 15 million.)

The curious thing is that watching the process size grow on another
window, is that it shrunk considerably a few minutes before it
croaked:


% ps axuw | grep postg
pgsql   26032  6.0  7.9 2186888 123440  ??  D    Thu09AM 122:26.10 postmaster: khera vkmlm [local] COPY (postgres)

Note here that we are using nearly 2gb virtual memory (I'm sure a
bunch of that is stack and the shared memory segment).

a few minutes later I saw this:

% ps axuw | grep postg
pgsql   26032 10.4  5.5 91840 85624  ??  S    Thu09AM 123:17.24 postmaster: khera vkmlm [local] idle (postgres)

note the process size is down to 91M.  A few minutes later I got the
out of memory error.  This is very curious because I don't expect the
process to release the memory back to the OS like that.

There are about 157 million records in this table, three columns wide.

FreeBSD 4.10-PRERELEASE, PostgreSQL 7.4.2

So, is there any way to load this in or do I need to either break it
into chunks (no small feat) or disable triggers during load and hope
and pray there are no FK violations?

Is there some leak or does it just take that much RAM to load in data
from a table?

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

pgsql-general by date:

Previous
From: "Carl E. McMillin"
Date:
Subject: Re: Function with RETURN TYPE RECORD Called From JAVA
Next
From: Vivek Khera
Date:
Subject: Re: Is Linux 2.6.5 kernel good enough for production?