On Wed, Aug 8, 2012 at 2:41 PM, Geert Mak <pobox@verysmall.org> wrote:
> hello everybody,
>
> we are trying to move the data from table1 into table2 using a plpgsql stored procedure which performs a simple data conversion
>
> there are about 50 million rows
>
> the tables are relatively simple, less than a dozen columns, most are integer, a couple are char(32) and one is varchar holding URLs
>
> what happens is that when we execute the stored procedure, the execution eats up the 17 GB of free space on the server and the server crashes
>
> if we apply a limit of 1 million rows, the execution completes successfully in about a minute
>
> we understand, by now, that plpgsql functions are executed in their own transaction, which commits when the function comes to an end
>
> the question is -
>
> are plpgsql functions suitable for such massive operations on data, and can this work without so much disk space being eaten for something which should simply be "read-change-write, read-change-write, read-change-write, …"? i.e. is there any way to force a commit inside, or something like that?
>
> or should we rather implement this operation in some external scripting language (speed is not that important, this is a one-time conversion)
What is the general structure of the procedure? In particular, how
are you browsing and updating the rows? There is (almost) no way to
force a commit inside a function -- there has been some discussion
about adding stored procedures and/or an autonomous transaction
feature that would make this possible.
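
If speed is not critical, the simplest route is the one you mention:
drive the conversion in keyed batches from an external script, so that
each batch commits on its own and no single transaction has to span
all 50 million rows. A rough sketch, assuming table1 has a serial
primary key named id (the database name, column list, batch size, and
conversion expression below are all placeholders):

    #!/bin/sh
    # hypothetical batch driver; each "psql -c" runs in autocommit
    # mode, so every 100k-row batch commits before the next one starts
    for start in $(seq 0 100000 50000000); do
      psql -d mydb -c "INSERT INTO table2 (id, url, hash)
                         SELECT id, lower(url), md5(url)
                         FROM table1
                         WHERE id >= $start
                           AND id <  $((start + 100000));"
    done

Since each batch commits as it goes, disk usage stays roughly constant
instead of accumulating across the whole run.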
I say 'almost' because you can emulate some aspects of autonomous
transactions with dblink, but that may not be a very good fit for your
particular case.
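
For what it's worth, the dblink trick looks roughly like this: each
dblink_exec() call runs its statement over a separate connection, so
it commits in its own transaction, independently of the still-open
transaction of the calling function. A sketch only -- the connection
string, batch size, and column list are invented for illustration:

    -- requires: CREATE EXTENSION dblink;
    CREATE OR REPLACE FUNCTION convert_in_batches() RETURNS void AS $$
    DECLARE
      start_id bigint := 0;
      max_id   bigint;
    BEGIN
      SELECT max(id) INTO max_id FROM table1;
      WHILE start_id <= max_id LOOP
        -- runs on a separate connection, so each batch commits on
        -- its own even though this function's transaction is open
        PERFORM dblink_exec('dbname=mydb',
          format('INSERT INTO table2 (id, url, hash)
                    SELECT id, lower(url), md5(url)
                    FROM table1
                    WHERE id >= %s AND id < %s',
                 start_id, start_id + 100000));
        start_id := start_id + 100000;
      END LOOP;
    END;
    $$ LANGUAGE plpgsql;

Keep in mind the extra connections count against max_connections, and
the remote session cannot see uncommitted work in the calling session,
which is part of why it may not fit your case.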
merlin