PLPythonU & Out of Memory - Importing Query - Mailing list pgsql-interfaces

From Jon Clements
Subject PLPythonU & Out of Memory - Importing Query
Date
Msg-id 4c05f1b439ce248766e7cd5375a83ddc@readgroup.co.uk
Responses Re: PLPythonU & Out of Memory - Importing Query
Re: PLPythonU & Out of Memory - Importing Query
List pgsql-interfaces
Hi there,

I am currently experimenting with plpythonu on PostgreSQL 8.0 for Win32. It's basically a quick script that imports
data from CSV files, but does some quite complicated data lookups and selections. The area in which I'm somewhat
confounded is memory usage. The process runs successfully, but memory usage climbs relentlessly: it imports
about 200k records before the memory usage of postgres soars to 2 GB and, shortly after that, it grinds
to a halt with an "Out of Memory" error.

I'm not deliberately storing anything in the SD/GD dictionaries, and am not dealing with triggers...

create function blah(text) returns int8 as
$$
# Initialisation of plans
myplan = plpy.prepare('insert into tablename (var1,var2) values($1,$2)', ['text','text'] )
# Setup external CSV data source
# For each record that meets certain criteria, execute insert...
for rec in dsource:
    plpy.execute(myplan, [Value1, Value2])

# Finishing stuff
return some_meaningful_value
$$
LANGUAGE PLPYTHONU;

Given I'm importing about 250 million records and only want to end up with about 4 million:
1) Is this possible using the above?
2) Is this better suited to something else? I've looked at COPY, but that would require the entire table to be uploaded first and then
filtered, and I'd like to avoid that if possible (or, thinking about it, I suppose a trigger could be written that
responds to the COPY?). Also, COPY is only applicable to simple text files, while I want this import
script to be generic, whatever data source it may be importing from (ODBC, DBF, Berkeley DB format, etc.).
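
To make option 2 a bit more concrete, here is a rough sketch of what filtering outside the server might look like, so that only the rows worth keeping are ever sent to COPY. It is only an illustration under assumptions: keep() is a hypothetical stand-in for the real lookup and selection logic, filtered_copy_lines() and escape() are names made up for this example, and the source is assumed to be a two-column CSV stream.

```python
import csv
import io


def escape(field):
    # Escape the characters that are special in COPY's default text format.
    return (field.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def keep(row):
    # Hypothetical filter: a placeholder for the real lookup/selection rules.
    return row[0].startswith("A")


def filtered_copy_lines(source, predicate):
    # Stream rows one at a time, so memory use does not grow with input size.
    for row in csv.reader(source):
        if predicate(row):
            yield "\t".join(escape(field) for field in row) + "\n"


if __name__ == "__main__":
    # Tiny in-memory sample standing in for the real CSV file.
    sample = io.StringIO("Alpha,one\nBeta,two\nAcme,three\n")
    for line in filtered_copy_lines(sample, keep):
        print(line, end="")
```

The lines produced this way could then be fed to something like COPY tablename (var1, var2) FROM STDIN through whichever client driver is available; which driver that is, and whether the lookups can realistically run outside the server, are open questions here.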

Anyhow, thanks in advance for any help.
Any Qs, please gimme a yell.

Regards,

Jon.









