Thread: Out of memory on update of a single-column table containing just one row.

Hello Guys,

 

We are trying to migrate from Oracle to Postgres.  One of the major requirements for our database is the ability to generate XML feeds, and some of our XML files are on the order of 500MB+.

 

We are getting "Out of Memory" errors when doing an update on a table. 

 

Here is some detail on the error:

------------------------------------

UPDATE test_text3 SET test = test || test;

 

The table test_text3 contains only one record; the column test holds a string of 382,637,520 characters (roughly 365 MB).
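For reference, a minimal setup matching this description (an assumed reconstruction, not our actual DDL):

CREATE TABLE test_text3 (test text);
INSERT INTO test_text3 SELECT repeat('x', 382637520);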

 

Error Message:

ERROR:  out of memory

DETAIL:  Failed on request of size 765275088.

 

The server has 3GB of RAM (free output, in KB):

             total       used       free     shared    buffers     cached

Mem:       3115804     823524    2292280          0     102488     664224

-/+ buffers/cache:      56812    3058992

Swap:      5177336      33812    5143524

 

I tweaked the server's memory parameters to the following values, but still no luck.

shared_buffers = 768MB
effective_cache_size = 2048MB
checkpoint_segments = 8
checkpoint_completion_target = 0.8
work_mem = 10MB
max_connections = 50
wal_buffers = 128

 

The error is reproducible every time I run that update.  I can provide a detailed stack trace if needed.

 

Any help would be highly appreciated.

 

For those interested in the background: as mentioned above, we are migrating from Oracle to PostgreSQL, and generating XML feeds of 500MB+ is one of our major requirements.

 

Considering future scalability, we are trying to see how much data can be stored in a "text" column and then written to the file system, since we found PostgreSQL's COPY command a very efficient way of writing data to a file.
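For example, the export step we have in mind would look something like this (the path is just an illustration; server-side COPY writes files as the server's OS user):

COPY (SELECT test FROM test_text3) TO '/tmp/feed.xml';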

 

Thanks in advance and best regards,

 

 

 

Zeeshan

 

This e-mail is confidential and should not be used by anyone who is not the original intended recipient. Global DataPoint Limited does not accept liability for any statements made which are clearly the sender's own and not expressly made on behalf of Global DataPoint Limited. No contracts may be concluded on behalf of Global DataPoint Limited by means of e-mail communication. Global DataPoint Limited Registered in England and Wales with registered number 3739752 Registered Office Middlesex House, 34-42 Cleveland Street, London W1T 4LB
On 5 July 2010 11:47,  <Zeeshan.Ghalib@globaldatapoint.com> wrote:
> [snip]

Hi Zeeshan,

Which version of PostgreSQL are you using?  And are there any indexes,
constraints or triggers on the table you're attempting to insert this
data into?

As for the maximum size of a text column, I believe it's 1GB.

You may find this useful too:
http://www.postgresql.org/docs/8.4/static/populate.html

Thom

Hello Thom,

Thanks for your email.  The PostgreSQL version is 8.4.4, running on CentOS release 5.4 (Final).

There are no indexes, constraints or triggers on this table.

The 1GB limit is fine, but it is giving the error at around 700MB.  Also, loading this kind of data will not be a one-time initial import; we will do it whenever we have to generate the XML.

What we are planning is to do this on a regular basis for our XML feed generation: we will put the whole XML into a TEXT column and then use the COPY command to create the file.
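For what it's worth, the same export can also be run client-side from psql (a sketch; the filename is made up):

\copy (SELECT test FROM test_text3) TO 'feed.xml'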

I am hoping that this is just a configuration problem and that once the server is configured properly it will go away.  Am I right in my assumption, or are these kinds of out-of-memory errors common with PostgreSQL?

Thanks,

Zeeshan


Re: Out of memory on update of a single-column table containing just one row.
From: Thomas Markus

Hi,

I tried a simple test:

CREATE TEMP TABLE _t AS SELECT repeat('x', 382637520) AS test;
UPDATE _t SET test = test || test;

PG 8.3 32-bit fails with:
[Error Code: 0, SQL State: 53200]  ERROR: out of memory
   Detail: Failed on request of size 765275088.

PG 8.4.4 64-bit works fine.

So upgrade to 64-bit.
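To check what a server is running, the output of version() includes the architecture it was built for:

SELECT version();
-- "x86_64" (rather than "i686") in the result indicates a 64-bit build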

regards
Thomas

Hello Thomas,

Thanks for your quick response.

So, is this a restriction of 32-bit PostgreSQL, a bug, or a configuration issue?

I will most definitely upgrade to 64-bit, because that's what we want anyway.  However, I was curious what the root cause of this problem is.

I am getting a bit worried about this migration, although our database is not too big (less than 200GB).

Once again, thanks for your help.

Zeeshan



On Mon, Jul 05, 2010 at 01:52:20PM +0000, Zeeshan.Ghalib@globaldatapoint.com wrote:
> So, is this a restriction of 32-bit PostgreSQL, a bug, or a
> configuration issue?

It's a restriction because of the 32-bit address space.  You've basically
got between two and three GB of useful space and everything has to
fit in there.  Hence if you've got a 300MB object in memory (i.e. your
XML) and you try to combine it with a similar-sized object, you'll
need 300MB*4 = 1.2GB of free memory in the process's address space (the
two 300MB inputs plus the 600MB result), with 600MB of that being
consecutive.  It's obviously failing to find that, and hence the query
is failing.  A 64-bit address space is more than a million times larger,
which is why that worked.

Generally with databases you're expected to be working with lots of
small values (most a few bytes in length, with a few multi-KB ones).
Databases are fine with lots of these (I've got databases with hundreds
of millions of rows) but don't work very well when each row is very
big.  The design assumption is that values are "atomic", and large
values normally aren't atomic, so they would be broken down into smaller
pieces when they enter the database.
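For illustration, one way to follow that assumption (a sketch only; the table and names are made up) is to store the feed in ordered chunks and let COPY stitch them back together on the way out:

CREATE TABLE feed_chunks (
    feed_id  int,
    chunk_no int,
    part     text,
    PRIMARY KEY (feed_id, chunk_no)
);

-- rows come out in chunk order; COPY emits one line per row, so split
-- the chunks on line boundaries
COPY (SELECT part FROM feed_chunks WHERE feed_id = 1 ORDER BY chunk_no)
    TO '/tmp/feed.xml';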

Maybe the large object functionality in PG would suit your needs better;
large objects are designed for bigger things like this and don't suffer
the same restrictions (internally they're worked with piecemeal rather
than trying to handle the whole thing in one go).  They can be a bit of
a hassle to work with, so which is "better" is very use-case dependent.
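A minimal sketch of that route (lo_import and lo_export run server-side and need superuser; the OID below is just whatever lo_import returned, not a fixed value):

SELECT lo_import('/tmp/feed.xml');         -- returns the new large object's OID
SELECT lo_export(16385, '/tmp/copy.xml');  -- use the OID returned above
SELECT lo_unlink(16385);                   -- remove it when finished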

--
  Sam  http://samason.me.uk/

p.s. the legalese at the bottom of your emails is probably dissuading
a number of people from replying, you're better off dumping it if you
can--it serves no useful purpose anyway.

OK, that makes perfect sense.  We will upgrade to 64-bit and continue our tests on the new build.

By the way, is it safe to go ahead with CentOS 5.5, or should we stick with the previous version, 5.4?

Thank you so much Sam and Thom for your speedy help :)

Best regards,

Zeeshan




