Thread: VACUUM ANALYZE out of memory
Hi,
I am encountering problems when trying to run VACUUM FULL ANALYZE on a particular table in my database; namely, the process crashes out with the following problem:
INFO: vacuuming "pg_catalog.pg_largeobject"
ERROR: out of memory
DETAIL: Failed on request of size 536870912.
INFO: vacuuming "pg_catalog.pg_largeobject"
ERROR: out of memory
DETAIL: Failed on request of size 32.
Granted, our largeobject table is a bit large:
INFO: analyzing "pg_catalog.pg_largeobject"
INFO: "pg_largeobject": scanned 3000 of 116049431 pages, containing 18883 live rows and 409 dead rows; 3000 rows in sample, 730453802 estimated total rows
...but I trust that VACUUM ANALYZE doesn't try to read the entire table into memory at once. :-) The machine was set up with 1.2 GB shared memory and 1 GB maintenance memory, so I would have expected this to be sufficient for the task (we will eventually set this up on a 64-bit machine with 16 GB memory, but at the moment we are restricted to 32 bit).
This is currently running on PostgreSQL 8.3beta2, but since I haven't seen this problem reported before, I guess it will also be a problem in earlier versions. Have we run into a bug/limitation of the Postgres VACUUM, or is this something we might be able to solve by reconfiguring the server/database, or by downgrading the DBMS version?
I shall be trying to run a simple VACUUM later this evening, in order to see whether that manages to complete. Unfortunately, due to the time it takes to load data, it's not really practicable to shift servers at the moment.
A little background on the application: we are building a raster database to be used for storing weather and water data. The raster data (2D matrices of floating points) are stored using large objects and indexed using a values table (with multiple dimensions: time, parameter, altitudes, etc.). This is a technique I've worked with successfully in the past, though in that case using an Informix DBMS. My current employer is a strong proponent of Open Software, which has led to our implementation of the current system on a PostgreSQL DBMS (we will also be releasing our system as GPL in the near future).
The test instance we are working on now is about 1 TB; we expect to increase that by a factor of at least 5 within the first year of operation, so we'd really like to ensure that we can get VACUUM working (although the data is mostly going to be static on this installation, we will have others that won't be).
Anyone with some insights into VACUUM FULL ANALYZE who can weigh in on what is going wrong?
Regards,
Michael Akinde
----
Database Architect, met.no
Michael Akinde wrote:
> Hi,
> I am encountering problems when trying to run VACUUM FULL ANALYZE on a particular table in my database; namely, the process crashes out with the following problem:
> INFO: vacuuming "pg_catalog.pg_largeobject"
> ERROR: out of memory
> DETAIL: Failed on request of size 536870912.
> INFO: vacuuming "pg_catalog.pg_largeobject"
> ERROR: out of memory
> DETAIL: Failed on request of size 32.
> Granted, our largeobject table is a bit large:
> INFO: analyzing "pg_catalog.pg_largeobject"
> INFO: "pg_largeobject": scanned 3000 of 116049431 pages, containing 18883 live rows and 409 dead rows; 3000 rows in sample, 730453802 estimated total rows
> ...but I trust that VACUUM ANALYZE doesn't try to read the entire table into memory at once. :-) The machine was set up with 1.2 GB shared memory and 1 GB maintenance memory, so I would have expected this to be sufficient for the task (we will eventually set this up on a 64-bit machine with 16 GB memory, but at the moment we are restricted to 32 bit).
> This is currently running on PostgreSQL 8.3beta2, but since I haven't seen this problem reported before, I guess it will also be a problem in earlier versions. Have we run into a bug/limitation of the Postgres VACUUM, or is this something we might be able to solve by reconfiguring the server/database, or by downgrading the DBMS version?
This seems simply a problem of setting maintenance_work_mem too high (ie higher than what your OS can support - maybe an ulimit/process limit is in effect?). Try reducing maintenance_work_mem to say 128MB and retry.
If you promise postgresql that it can get 1GB it will happily try to use it ...
Stefan
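For example, this can be tried for a single session straight from psql before touching postgresql.conf (a sketch only; the 128MB value is just the suggestion above, not a tuned recommendation):
-- SET only affects the current session, so no restart is needed
SHOW maintenance_work_mem;
SET maintenance_work_mem = '128MB';
VACUUM FULL ANALYZE pg_catalog.pg_largeobject;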
On Tue, 2007-12-11 at 10:59 +0100, Michael Akinde wrote:
> I am encountering problems when trying to run VACUUM FULL ANALYZE on a particular table in my database; namely, the process crashes out with the following problem:
Probably just as well, since a VACUUM FULL on an 800GB table is going to take a rather long time, so you are saved from discovering just how excessively long it will run for. But it seems like a bug. This happens consistently, I take it?
Can you run ANALYZE and then VACUUM VERBOSE, both on just pg_largeobject, please? It will be useful to know whether they succeed.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
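In case it is useful, the exact commands would simply be (run from psql as a superuser, against that one table only):
ANALYZE pg_catalog.pg_largeobject;
VACUUM VERBOSE pg_catalog.pg_largeobject;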
Thanks for the rapid responses.
Stefan Kaltenbrunner wrote:
> This seems simply a problem of setting maintenance_work_mem too high (ie higher than what your OS can support - maybe an ulimit/process limit is in effect?). Try reducing maintenance_work_mem to say 128MB and retry.
I set up the system together with one of our Linux sysops, so I think the settings should be OK. kernel.shmmax is set to 1.2 GB, but I'll get him to recheck whether there are any other limits he has forgotten to increase.
> If you promise postgresql that it can get 1GB it will happily try to use it ...
The way the process was running, it seems to have basically just continually allocated memory until (presumably) it broke through the slightly less than 1.2 GB shared memory allocation we had provided for PostgreSQL (at least the postgres process was still running by the time its resident size had reached 1.1 GB).
Incidentally, in the first error of the two I posted, the shared memory setting was significantly lower (24 MB, I believe). I'll try with 128 MB before I leave in the evening, though (assuming the other tests I'm running complete by then).
Simon Riggs wrote:
> Probably just as well, since a VACUUM FULL on an 800GB table is going to take a rather long time, so you are saved from discovering just how excessively long it will run for. But it seems like a bug. This happens consistently, I take it?
I suspect so, though it has only happened a couple of times yet (as it does take a while before it hits that 1.1 GB roof). But part of the reason for running the VACUUM FULL was of course to find out how long it would take. Reliability is always a priority for us, so I like to know what (useful) tools we have available and stress the system as much as possible... :-)
> Can you run ANALYZE and then VACUUM VERBOSE, both on just pg_largeobject, please? It will be useful to know whether they succeed.
I ran just ANALYZE on the entire database yesterday, and that worked without any problems. I am currently running a VACUUM VERBOSE on the database. It isn't done yet, but it is running with a steady (low) resource usage.
Regards,
Michael A.
Michael Akinde wrote:
> Thanks for the rapid responses.
> Stefan Kaltenbrunner wrote:
>> This seems simply a problem of setting maintenance_work_mem too high (ie higher than what your OS can support - maybe an ulimit/process limit is in effect?). Try reducing maintenance_work_mem to say 128MB and retry. If you promise postgresql that it can get 1GB it will happily try to use it ...
> I set up the system together with one of our Linux sysops, so I think the settings should be OK. kernel.shmmax is set to 1.2 GB, but I'll get him to recheck whether there are any other limits he has forgotten to increase.
> The way the process was running, it seems to have basically just continually allocated memory until (presumably) it broke through the slightly less than 1.2 GB shared memory allocation we had provided for PostgreSQL (at least the postgres process was still running by the time its resident size had reached 1.1 GB).
> Incidentally, in the first error of the two I posted, the shared memory setting was significantly lower (24 MB, I believe). I'll try with 128 MB before I leave in the evening, though (assuming the other tests I'm running complete by then).
This is most likely not at all related to your shared memory settings, but to your setting of maintenance_work_mem, which is the amount of memory a single backend(!) can use for maintenance operations (which VACUUM is, for example). Notice that your first error refers to an allocation of about 500MB, which your ulimit/kernel process limit simply might not be able to give a single process.
And for very large tables VACUUM FULL is generally not a good idea at all - either look into regular normal vacuum scheduling or, if you need to recover from a bloated database, use a command that forces a rewrite of the table (like CLUSTER), which will be heaps faster but will also require about twice the amount of disk space.
Stefan
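As a rough sketch of the CLUSTER route for an ordinary user table (the table and index names below are invented, and whether this is even permitted on a system catalog like pg_largeobject is a separate question):
-- rewrites the whole table in index order, dropping dead space;
-- takes an exclusive lock and needs roughly twice the table's disk space while it runs
CLUSTER rasterdata_pkey ON rasterdata;
ANALYZE rasterdata;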
Michael Akinde wrote:
> Thanks for the rapid responses.
> Stefan Kaltenbrunner wrote:
>> This seems simply a problem of setting maintenance_work_mem too high (ie higher than what your OS can support - maybe an ulimit/process limit is in effect?). Try reducing maintenance_work_mem to say 128MB and retry. If you promise postgresql that it can get 1GB it will happily try to use it ...
> I set up the system together with one of our Linux sysops, so I think the settings should be OK. kernel.shmmax is set to 1.2 GB, but I'll get him to recheck whether there are any other limits he has forgotten to increase.
You are confusing shared memory (shared_buffers and kernel.shmmax) with local memory (work_mem and maintenance_work_mem). The error you got is about the latter kind.
--
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"La soledad es compañía"
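To see the two kinds of settings side by side from psql (just a quick sketch; the comments only restate what has been said in this thread):
-- shared memory: allocated once at postmaster start, limited by kernel.shmmax
SHOW shared_buffers;
-- local (per-backend) memory: allocated on top of that, e.g. by VACUUM
SHOW maintenance_work_mem;
SHOW work_mem;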
On Tue, Dec 11, 2007 at 12:30:43PM +0100, Michael Akinde wrote:
> The way the process was running, it seems to have basically just continually allocated memory until (presumably) it broke through the slightly less than 1.2 GB shared memory allocation we had provided for PostgreSQL (at least the postgres process was still running by the time its resident size had reached 1.1 GB).
I think you're slightly confused. The VACUUM isn't going to use much of the shared memory anyway. Shared memory is mostly just disk buffers and is all allocated at startup. The memory being allocated by VACUUM is the maintenance_work_mem *in addition* to any shared memory. Also, depending on what's happening, it may be allocating maintenance_work_mem more than once.
> Incidentally, in the first error of the two I posted, the shared memory setting was significantly lower (24 MB, I believe). I'll try with 128 MB before I leave in the evening, though (assuming the other tests I'm running complete by then).
What you want is a reasonable shared mem, maybe 0.5GB, and a smaller maintenance_work_mem, since the latter is probably what's killing you.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
> -- John F Kennedy
Stefan Kaltenbrunner wrote:
> Michael Akinde wrote:
>> Incidentally, in the first error of the two I posted, the shared memory setting was significantly lower (24 MB, I believe). I'll try with 128 MB before I leave in the evening, though (assuming the other tests I'm running complete by then).
> This is most likely not at all related to your shared memory settings, but to your setting of maintenance_work_mem, which is the amount of memory a single backend(!) can use for maintenance operations (which VACUUM is, for example).
> Notice that your first error refers to an allocation of about 500MB, which your ulimit/kernel process limit simply might not be able to give a single process.
Yes - in the first case, the maintenance_work_mem was at default (so I wasn't surprised to see it fail to allocate half a gigabyte). In the second case, though, maintenance_work_mem was set at 1024 MB (where it then gives the slightly odd error "Failed on request of size 32").
The server has 4 GB RAM available, so even if it was trying to use 1.2 GB shared memory + 1 GB for maintenance_work_mem all at once, it still seems odd that the process would fail. As far as I can tell (running ulimit -a), the limits look pretty OK to me.
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
max nice (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
max rt priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Being unable to run VACUUM FULL isn't a problem for the current configuration of our application (as it will mostly be large amounts of static data), but we're likely to have an application working with the database next year where we'd move around 100 GB through the database on a daily basis. At least based on the documentation of the various commands, I would expect that one would want to perform VACUUM FULL every once in a while.
Again, thanks for the feedback.
Regards,
Michael Akinde
Database Architect, met.no
On Tue, Dec 11, 2007 at 03:18:54PM +0100, Michael Akinde wrote:
> The server has 4 GB RAM available, so even if it was trying to use 1.2 GB shared memory + 1 GB for maintenance_work_mem all at once, it still seems odd that the process would fail. As far as I can tell (running ulimit -a), the limits look pretty OK to me.
IIRC you said you're on a 32-bit architecture? Which means any single process only has 4GB of address space. Take off 1GB for the kernel, 1GB shared memory, 1GB maintenance_work_mem and a collection of libraries, stack space and general memory fragmentation, and I can absolutely believe you've run into the limit of *address* space.
On a 64-bit machine it doesn't matter so much, but on a 32-bit machine using 1GB for shared memory severely cuts the amount of auxiliary memory the server can use. Unless you've shown a measurable difference between 256MB and 1GB shared memory, I'd say you're better off using the smaller amount so you can have a higher maintenance_work_mem. VACUUM doesn't benefit much from lots of shared buffers, but it does benefit from maintenance_work_mem.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
> -- John F Kennedy
Martijn van Oosterhout wrote:
> IIRC you said you're on a 32-bit architecture? Which means any single process only has 4GB of address space. Take off 1GB for the kernel, 1GB shared memory, 1GB maintenance_work_mem and a collection of libraries, stack space and general memory fragmentation, and I can absolutely believe you've run into the limit of *address* space.
Should have been 64-bit, but a foul-up means it is running in 32-bit at the moment.
> On a 64-bit machine it doesn't matter so much, but on a 32-bit machine using 1GB for shared memory severely cuts the amount of auxiliary memory the server can use. Unless you've shown a measurable difference between 256MB and 1GB shared memory, I'd say you're better off using the smaller amount so you can have a higher maintenance_work_mem.
We're still in the process of testing and tuning (which takes its sweet time), so at the moment I cannot tell what benefits we get from the different settings in practice. But I'll try to set shared_buffers down to 128-256 MB and maintenance_work_mem to 512-1024 MB when I next have a time slot where I can run the server into the ground.
However, the problem also occurred with the shared_buffers limit set at 24 MB and maintenance_work_mem at its default setting (16 MB?), so I would be rather surprised if the problem did not repeat itself.
Regards,
Michael Akinde
Database Architect, met.no
[Synopsis: VACUUM FULL ANALYZE goes out of memory on a very large pg_catalog.pg_largeobject table.]
Simon Riggs wrote:
> Can you run ANALYZE and then VACUUM VERBOSE, both on just pg_largeobject, please? It will be useful to know whether they succeed.
ANALYZE:
INFO: analyzing "pg_catalog.pg_largeobject"
INFO: "pg_largeobject": scanned 3000 of 116049431 pages, containing 18883 live rows and 409 dead rows; 3000 rows in sample, 730453802 estimated total rows
VACUUM VERBOSE:
INFO: vacuuming "pg_catalog.pg_largeobject"
INFO: scanned index "pg_largeobject_loid_pn_index" to remove 106756133 row versions
DETAIL: CPU 38.88s/303.43u sec elapsed 2574.24 sec.
INFO: "pg_largeobject": removed 106756133 row versions in 13190323 pages
DETAIL: CPU 259.42s/113.20u sec elapsed 14017.17 sec.
INFO: index "pg_largeobject_loid_pn_index" now contains 706303560 row versions in 2674471 pages
DETAIL: 103960219 index row versions were removed.
356977 index pages have been deleted, 77870 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: "pg_largeobject": found 17489832 removable, 706303560 nonremovable row versions in 116049431 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 36000670 unused item pointers.
64493445 pages contain useful free space.
0 pages are entirely empty.
CPU 1605.42s/1107.48u sec elapsed 133032.02 sec.
WARNING: relation "pg_catalog.pg_largeobject" contains more than "max_fsm_pages" pages with useful free space
HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".
VACUUM
(This took some 36+ hours. It will be interesting to see what happens when we add another 20 years' worth of data to the 13 years already in the DB.)
ANALYZE:
INFO: analyzing "pg_catalog.pg_largeobject"
INFO: "pg_largeobject": scanned 3000 of 116049431 pages, containing 17830 live rows and 0 dead rows; 3000 rows in sample, 689720452 estimated total rows
I will lower the shared_buffers and maintenance_work_mem settings as suggested in earlier posts before leaving for home this evening, and then let it run a VACUUM FULL ANALYZE. I remain dubious, though - as mentioned, the first test I did had quite low settings for these, and we still had the memory crash. No reason not to try it, though.
Over Christmas, we will be moving this over to a 64-bit kernel and 16 GB, so after that we'll be able to test the database with > 1GB maintenance memory as well.
Regards,
Michael A.
Database Architect, met.no
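For what it's worth, the free space map setting can be checked against the numbers above (a sketch; max_fsm_pages itself can only be raised in postgresql.conf and takes effect after a restart, and each tracked page costs a few bytes of shared memory):
-- the VACUUM VERBOSE above reported 64493445 pages with useful free space,
-- so max_fsm_pages would need to be at least that large for the FSM to cover them all
SHOW max_fsm_pages;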
As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers and 512 MB reserved for maintenance_work_mem (on a 32-bit machine with 4 GB RAM). That ought to leave more than enough space for other processes in the system. Again, the system fails on the VACUUM with the following error (identical to the error we had when maintenance_work_mem was very low):
INFO: vacuuming "pg_catalog.pg_largeobject"
ERROR: out of memory
DETAIL: Failed on request of size 536870912
I've now also tested a 64-bit setup with 16 GB RAM, with 2 GB maintenance_work_mem; this time on PostgreSQL 8.2.5.
INFO: vacuuming "pg_catalog.pg_largeobject"
ERROR: invalid memory alloc request size 1073741824
It strikes me as somewhat worrying that VACUUM FULL ANALYZE has so much trouble with a large table. Granted - 730 million rows is a good deal - but it's really not that much for a large database. I'd expect an operation on such a table to take time, of course, but not to consistently crash out of memory.
Any suggestions as to what we can otherwise try to isolate the problem?
Regards,
Michael Akinde
Database Architect, met.no
Michael Akinde wrote:
> [Synopsis: VACUUM FULL ANALYZE goes out of memory on a very large pg_catalog.pg_largeobject table.]
> Simon Riggs wrote:
>> Can you run ANALYZE and then VACUUM VERBOSE, both on just pg_largeobject, please? It will be useful to know whether they succeed.
> ANALYZE:
> INFO: analyzing "pg_catalog.pg_largeobject"
> INFO: "pg_largeobject": scanned 3000 of 116049431 pages, containing 18883 live rows and 409 dead rows; 3000 rows in sample, 730453802 estimated total rows
> VACUUM VERBOSE:
> INFO: vacuuming "pg_catalog.pg_largeobject"
> INFO: scanned index "pg_largeobject_loid_pn_index" to remove 106756133 row versions
> DETAIL: CPU 38.88s/303.43u sec elapsed 2574.24 sec.
> INFO: "pg_largeobject": removed 106756133 row versions in 13190323 pages
> DETAIL: CPU 259.42s/113.20u sec elapsed 14017.17 sec.
> INFO: index "pg_largeobject_loid_pn_index" now contains 706303560 row versions in 2674471 pages
> DETAIL: 103960219 index row versions were removed.
> 356977 index pages have been deleted, 77870 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.02 sec.
> INFO: "pg_largeobject": found 17489832 removable, 706303560 nonremovable row versions in 116049431 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> There were 36000670 unused item pointers.
> 64493445 pages contain useful free space.
> 0 pages are entirely empty.
> CPU 1605.42s/1107.48u sec elapsed 133032.02 sec.
> WARNING: relation "pg_catalog.pg_largeobject" contains more than "max_fsm_pages" pages with useful free space
> HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".
> VACUUM
> (This took some 36+ hours. It will be interesting to see what happens when we add another 20 years' worth of data to the 13 years already in the DB.)
> ANALYZE:
> INFO: analyzing "pg_catalog.pg_largeobject"
> INFO: "pg_largeobject": scanned 3000 of 116049431 pages, containing 17830 live rows and 0 dead rows; 3000 rows in sample, 689720452 estimated total rows
> I will lower the shared_buffers and maintenance_work_mem settings as suggested in earlier posts before leaving for home this evening, and then let it run a VACUUM FULL ANALYZE. I remain dubious, though - as mentioned, the first test I did had quite low settings for these, and we still had the memory crash. No reason not to try it, though.
> Over Christmas, we will be moving this over to a 64-bit kernel and 16 GB, so after that we'll be able to test the database with > 1GB maintenance memory as well.
> Regards,
> Michael A.
> Database Architect, met.no
On Jan 7, 2008 2:40 PM, Michael Akinde <michael.akinde@met.no> wrote:
> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers and 512 MB reserved for maintenance_work_mem (on a 32-bit machine with 4 GB RAM).
My apologies if my question seems redundant and something you have already discussed with list members, but why do you need to do a VACUUM FULL? Have you not vacuumed for a while? Or is there some special requirement which requires very aggressive space reclaim? VACUUM FULL is also known to cause some index bloat at times as well. Most systems I know run regular vacuums and have never required a vacuum full.
--
Usama Munir Dar http://www.linkedin.com/in/usamadar
Consultant Architect
Cell: +92 321 5020666
Skype: usamadar
Hi,
The system we are building is intended to be utilized in a number of different applications, so the testing we are doing is primarily directed at stressing the system by running it through its paces and uncovering any weaknesses. I prefer to find as many problems as possible now, rather than in production. ;-)
For the current application set I'm testing, I expect we won't need to do much VACUUMing, as it will be a fairly static dataset only used for querying (once all the data is loaded). I know that we will be running some databases with some pretty rapid throughput (100 GB/day), but if VACUUM will do (as I expect), then we'll probably just stick to that. I don't have time to do any testing on that until next month, though.
I do find it odd, however, that pgsql recommends using a VACUUM FULL (as a result of running the VACUUM). Especially if, as it seems, VACUUM FULL doesn't work for tables beyond a certain size. Assuming we have not set up something completely wrongly, this seems like a bug.
If this is the wrong mailing list to be posting this, then please let me know.
Regards,
Michael Akinde
Database Architect, Met.no
Usama Dar wrote:
> On Jan 7, 2008 2:40 PM, Michael Akinde <michael.akinde@met.no> wrote:
>> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers and 512 MB reserved for maintenance_work_mem (on a 32-bit machine with 4 GB RAM).
> My apologies if my question seems redundant and something you have already discussed with list members, but why do you need to do a VACUUM FULL? Have you not vacuumed for a while? Or is there some special requirement which requires very aggressive space reclaim? VACUUM FULL is also known to cause some index bloat at times as well. Most systems I know run regular vacuums and have never required a vacuum full.
> --
> Usama Munir Dar http://www.linkedin.com/in/usamadar
> Consultant Architect
> Cell: +92 321 5020666
> Skype: usamadar
On Mon, Jan 07, 2008 at 10:40:23AM +0100, Michael Akinde wrote:
> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers and 512 MB reserved for maintenance_work_mem (on a 32-bit machine with 4 GB RAM). That ought to leave more than enough space for other processes in the system. Again, the system fails on the VACUUM with the following error (identical to the error we had when maintenance_work_mem was very low):
> INFO: vacuuming "pg_catalog.pg_largeobject"
> ERROR: out of memory
> DETAIL: Failed on request of size 536870912
Something is using up the memory on the machine, or (I'll bet this is more likely) your user (postgres? Whatever's running the postmaster) has a ulimit on its ability to allocate memory on the machine.
> It strikes me as somewhat worrying that VACUUM FULL ANALYZE has so much trouble with a large table. Granted - 730 million rows is a good deal -
No, it's not really that big. I've never seen a problem like this. If it were the 8.3 beta, I'd be worried; but I'm inclined to suggest you look at the OS settings first given your setup.
Note that you should almost never use VACUUM FULL unless you've really messed things up. I understand from the thread that you're just testing things out right now. But VACUUM FULL is not something you should _ever_ need in production, if you've set things up correctly.
A
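One way to check the limits of the process actually serving your session, rather than those of a login shell (a sketch; assumes Linux with a kernel new enough to expose /proc/<pid>/limits):
-- from psql: find the PID of this connection's backend,
-- then read /proc/<that pid>/limits in a shell on the server
SELECT pg_backend_pid();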
On Mon, 07 Jan 2008 10:57:53 -0500, Andrew Sullivan wrote:
> Note that you should almost never use VACUUM FULL unless you've really messed things up. I understand from the thread that you're just testing things out right now. But VACUUM FULL is not something you should _ever_ need in production, if you've set things up correctly.
Then why does it exist? Is it a historical leftover? If it is only needed for emergencies, should it not have a different name? Just curious..
Holger
On Mon, 07 Jan 2008 17:33:53 +0100, "Holger Hoffstaette" <holger@wizards.de> wrote:
> On Mon, 07 Jan 2008 10:57:53 -0500, Andrew Sullivan wrote:
>> Note that you should almost never use VACUUM FULL unless you've really messed things up. I understand from the thread that you're just testing things out right now. But VACUUM FULL is not something you should _ever_ need in production, if you've set things up correctly.
> Then why does it exist? Is it a historical leftover? If it is only needed for emergencies, should it not have a different name? Just curious..
There are times when it is required, usually when people don't configure normal vacuum/autovacuum correctly.
Sincerely,
Joshua D. Drake
--
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'
Michael Akinde <michael.akinde@met.no> writes:
> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers and 512 MB reserved for maintenance_work_mem (on a 32-bit machine with 4 GB RAM). That ought to leave more than enough space for other processes in the system. Again, the system fails on the VACUUM with the following error (identical to the error we had when maintenance_work_mem was very low):
> INFO: vacuuming "pg_catalog.pg_largeobject"
> ERROR: out of memory
> DETAIL: Failed on request of size 536870912
Are you sure this is a VACUUM FULL, and not a plain VACUUM? I suspect that it's the latter, and the reason it's failing is that you are running the postmaster under a ulimit that is less than 512MB (or at least not enough more to allow an allocation of that size).
regards, tom lane
Tom Lane wrote:
> Michael Akinde <michael.akinde@met.no> writes:
>> INFO: vacuuming "pg_catalog.pg_largeobject"
>> ERROR: out of memory
>> DETAIL: Failed on request of size 536870912
> Are you sure this is a VACUUM FULL, and not a plain VACUUM?
Very sure. Ran a VACUUM FULL again yesterday (the prior query was a VACUUM FULL ANALYZE) and received essentially the same error, simply with a different failure size.
INFO: vacuuming "pg_catalog.pg_largeobject"
ERROR: invalid memory alloc request size 1073741824
No changes were made to the system since the previous iteration. VACUUM ran OK on the 8.3beta2 instance I tested with before Christmas (the current setup is 8.2.5).
> I suspect that it's the latter, and the reason it's failing is that you are running the postmaster under a ulimit that is less than 512MB (or at least not enough more to allow an allocation of that size).
We went over this somewhat prior to Christmas. Here's how it's currently set up.
$> ulimit -a
core file size (blocks, -c) 100000000
data seg size (kbytes, -d) unlimited
max nice (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
max rt priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Anything we should try to change?
Andrew Sullivan wrote:
> Something is using up the memory on the machine, or (I'll bet this is more
> likely) your user (postgres? Whatever's running the postmaster) has a
> ulimit on its ability to allocate memory on the machine.
If one looks at the system resources while the VACUUM FULL is going on, it's pretty obvious that it's a postgres process going on a memory allocation rampage that eats up all the resources.
> No, it's not really that big. I've never seen a problem like this. If it
> were the 8.3 beta, I'd be worried; but I'm inclined to suggest you look at
> the OS settings first given your set up.
I have the same problem with the 8.3beta, but won't be using it anyway until it's been out for a while.
> Note that you should almost never use VACUUM FULL unless you've really
> messed things up. I understand from the thread that you're just testing
> things out right now. But VACUUM FULL is not something you should _ever_
> need in production, if you've set things up correctly.
That's good to hear. I'm not particularly worried about this with respect to my own system. So far, we have found Postgres amazingly robust in every other issue that we have deliberately (or unwittingly) provoked. More reason to be puzzled about this problem, though.
Holger Hoffstaette wrote:
> Then why does it exist? Is it a historical leftover? If it is
> only needed for emergency, should it not have a different name?
Or in this case: if VACUUM FULL is never required (except in very special circumstances), it might be a good idea not to have VACUUM recommend running it (cf. the VACUUM I ran before New Year on a similarly sized table).
INFO: vacuuming "pg_catalog.pg_largeobject"
INFO: scanned index "pg_largeobject_loid_pn_index" to remove 106756133 row versions
DETAIL: CPU 38.88s/303.43u sec elapsed 2574.24 sec.
INFO: "pg_largeobject": removed 106756133 row versions in 13190323 pages
DETAIL: CPU 259.42s/113.20u sec elapsed 14017.17 sec.
INFO: index "pg_largeobject_loid_pn_index" now contains 706303560 row versions in 2674471 pages
DETAIL: 103960219 index row versions were removed.
356977 index pages have been deleted, 77870 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: "pg_largeobject": found 17489832 removable, 706303560 nonremovable row versions in 116049431 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 36000670 unused item pointers.
64493445 pages contain useful free space.
0 pages are entirely empty.
CPU 1605.42s/1107.48u sec elapsed 133032.02 sec.
WARNING: relation "pg_catalog.pg_largeobject" contains more than "max_fsm_pages" pages with useful free space
HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".
Anyway, thanks for the responses.
I do have the test setup available for hopefully some weeks, so if there is anyone interested in digging further into the matter, we do have the possibility to run further test attempts for a while (it takes about a week to load all the data, so once we take it back down, it may be a while before we set it up again).
Regards,
Michael Akinde
Database Architect, met.no
On Tue, Jan 08, 2008 at 09:50:07AM +0100, Michael Akinde wrote:
> stack size (kbytes, -s) 8192
Perhaps this is the issue? (I don't know.) Also, this _is_ for the postgres user, right? That's the relevant one: the one that's actually running the back-end process.
Also, are you sure there's nothing else in the way? I don't remember what OS you're using. On AIX, for instance, there's some _other_ dopey setting that allows you to control user resource consumption as well, and it means that ulimit's answers are not the full story. (I learned this through painful experience, and confess it's one of the many reasons I think AIX should be pronounced as one word, rather than three letters.)
> Andrew Sullivan wrote:
>> Something is using up the memory on the machine, or (I'll bet this is more likely) your user (postgres? Whatever's running the postmaster) has a ulimit on its ability to allocate memory on the machine.
> If one looks at the system resources while the VACUUM FULL is going on, it's pretty obvious that it's a postgres process going on a memory allocation rampage that eats up all the resources.
Of course VACUUM FULL is eating up as much memory as it can: it's moving a lot of data around. But is it in fact exhausting memory on the machine? There are only two possibilities: either there's something else that is preventing that allocation, or else you've run into a case so unusual that nobody else has ever seen it. The data you're talking about isn't that big: I've run similar-sized databases on my laptop without pain.
> Or in this case: if VACUUM FULL is never required (except in very special circumstances), it might be a good idea not to have VACUUM recommend running it (cf. the VACUUM I ran before New Year on a similarly sized table).
The suggestion you see there, though, is in fact one of the cases where you might in fact want to run it. That is,
> WARNING: relation "pg_catalog.pg_largeobject" contains more than "max_fsm_pages" pages with useful free space
> HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".
what it is saying is that a regular vacuum can no longer recover all the dead pages in the table, and if you want that space back and marked usable on your disk, you have to run VACUUM FULL (or, in fact, CLUSTER, or else dump and reload the table. But one of these).
Note that I said that, if you have things configured _correctly_, you shouldn't have to run VACUUM FULL except in unusual circumstances. That doesn't mean "never". The problem here is an historical one: you have a "hangover" from previous missed maintenance or sub-optimal vacuum scheduling. In those cases, you may want to perform VACUUM FULL, provided you understand the potential side effects (like possibly slower inserts initially, and some possible index bloat).
A
Michael Akinde <michael.akinde@met.no> writes:
> We went over this somewhat prior to Christmas. Here's how it's currently set up.
> $> ulimit -a
> core file size (blocks, -c) 100000000
> ...
What you're showing us is the conditions that prevail in your interactive session. That doesn't necessarily have a lot to do with the ulimits that init-scripts run under ...
regards, tom lane
Tom Lane wrote:
> Michael Akinde <michael.akinde@met.no> writes:
>> $> ulimit -a
>> core file size (blocks, -c) 100000000
>> ...
> What you're showing us is the conditions that prevail in your interactive session. That doesn't necessarily have a lot to do with the ulimits that init-scripts run under ...
Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings?
Regards,
Michael A.
Database Architect, Met.no
On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote:
> Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings?
Are you sure? On one system I used many years ago, /bin/sh wasn't what I thought it was, and so the ulimit that I got when logged in was not what the postmaster was starting under. Took me many days to figure out what was up.
A
Andrew Sullivan <ajs@crankycanuck.ca> writes:
> On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote:
>> Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings?
> Are you sure? On one system I used many years ago, /bin/sh wasn't what I thought it was, and so the ulimit that I got when logged in was not what the postmaster was starting under. Took me many days to figure out what was up.
The only thing I find convincing is to insert "ulimit -a >someplace" into the script that starts the postmaster, adjacent to where it does so, and then reboot. There are too many systems on which daemons are launched under settings different from what interactive shells use (a policy that's often a good one, too).
regards, tom lane
On Tue, Jan 08, 2008 at 12:33:34PM -0500, Tom Lane wrote:
> Andrew Sullivan <ajs@crankycanuck.ca> writes:
>> On one system I used many years ago, /bin/sh wasn't what I thought it was, and so the ulimit that I got when logged in was not what the postmaster was starting under. Took me many days to figure out what was up.
> The only thing I find convincing is to insert "ulimit -a >someplace" into the script that starts the postmaster, adjacent to where it does so, and then reboot. There are too many systems on which daemons are launched under settings different from what interactive shells use (a policy that's often a good one, too).
What about a stored procedure in a language that allows you to do system(3) calls?
Sam
Sam Mason <sam@samason.me.uk> writes:
> On Tue, Jan 08, 2008 at 12:33:34PM -0500, Tom Lane wrote:
>> The only thing I find convincing is to insert "ulimit -a >someplace" into the script that starts the postmaster,
> What about a stored procedure in a language that allows you to do system(3) calls?
Yeah, that would work, if you have any untrusted languages installed.
regards, tom lane
On Tue, Jan 08, 2008 at 05:53:28PM +0000, Sam Mason wrote:
> What about a stored procedure in a language that allows you to do system(3) calls?
PL/bash? (I think there is something like this). But surely the ulimit before start is much easier!
A
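For instance, if an untrusted language such as PL/PerlU happens to be installed, something along these lines ought to report the limits the backend itself runs under (a rough, untested sketch; the function name is made up and creating it requires superuser rights):
-- shells out from inside the backend, so it reports the backend's own limits
CREATE OR REPLACE FUNCTION backend_ulimits() RETURNS text AS $$
return `ulimit -a`; # backticks run via /bin/sh
$$ LANGUAGE plperlu;
SELECT backend_ulimits();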
Thanks for the explanation of the ulimits; I can see how that could turn out to be a problem in some cases.
Following Tom's suggestion, here is the startup script I used:
#!/bin/sh
ulimit -a > $PGHOST/server.ulimit
pg_ctl start -l $PGHOST/server.log
The ulimits seem to be the same, though:
$> cat server.ulimit
core file size (blocks, -c) 100000000
data seg size (kbytes, -d) unlimited
max nice (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
max rt priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Regards,
Michael A.
Tom Lane wrote:
> Andrew Sullivan <ajs@crankycanuck.ca> writes:
>> On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote:
>>> Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings?
>> On one system I used many years ago, /bin/sh wasn't what I thought it was, and so the ulimit that I got when logged in was not what the postmaster was starting under. Took me many days to figure out what was up.
> The only thing I find convincing is to insert "ulimit -a >someplace" into the script that starts the postmaster, adjacent to where it does so, and then reboot. There are too many systems on which daemons are launched under settings different from what interactive shells use (a policy that's often a good one, too).
> regards, tom lane
Andrew Sullivan <ajs@crankycanuck.ca> writes:On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote:Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings?On one system I used many years ago, /bin/sh wasn't what I thought it was, and so the ulimit that I got when logged in was not what the postmaster was starting under. Took me many days to figure out what was up.The only thing I find convincing is to insert "ulimit -a >someplace" into the script that starts the postmaster, adjacent to where it does so, and then reboot. There are too many systems on which daemons are launched under settings different from what interactive shells use (a policy that's often a good one, too). regards, tom lane