Thread: VACUUM ANALYZE out of memory

VACUUM ANALYZE out of memory

From: Michael Akinde
Hi,

I am encountering problems when trying to run VACUUM FULL ANALYZE on a
particular table in my database; namely that the process crashes out
with the following problem:

INFO:  vacuuming "pg_catalog.pg_largeobject"
ERROR:  out of memory
DETAIL:  Failed on request of size 536870912.

INFO:  vacuuming "pg_catalog.pg_largeobject"
ERROR:  out of memory
DETAIL:  Failed on request of size 32.

Granted, our largeobject table is a bit large:

INFO:  analyzing "pg_catalog.pg_largeobject"
INFO:  "pg_largeobject": scanned 3000 of 116049431 pages, containing
18883 live rows and 409 dead rows; 3000 rows in sample, 730453802
estimated total rows

...but I trust that VACUUM ANALYZE doesn't try to read the entire table
into memory at once. :-) The machine was set up with 1.2 GB shared
memory and 1 GB maintenance memory, so I would have expected this to be
sufficient for the task (we will eventually set this up on a 64-bit
machine with 16 GB memory, but at the moment we are restricted to 32 bit).

This is currently running on PostgreSQL 8.3beta2, but since I haven't
seen this problem reported before, I guess this will also be a problem
in earlier versions. Have we run into a bug/limitation of the Postgres
VACUUM, or is this something we might be able to solve by reconfiguring
the server/database or downgrading the DBMS version?

I shall be trying to run a simple VACUUM later this evening, in order to
see whether that manages to complete. Unfortunately, due to the time it
takes to load data, it's not really practicable to shift servers at the
moment.

A little background on the application:
We are building a raster database to be used for storing weather and
water data. The raster data (2D matrices of floating-point values) are stored
using large objects and indexed using a values table (with multiple
dimensions: time, parameter, altitude, etc.). This is a technique I've
worked with successfully in the past, though in that case using an
Informix DBMS. My current employer is a strong proponent of open source
software, which has led to our implementation of the current system on a
PostgreSQL DBMS (we will also be releasing our system as GPL in the near
future).

The test instance we are working on now is about 1 TB; we expect to
increase that by a factor of at least 5 within the first year of
operation, so we'd really like to ensure that we can get VACUUM working
(although the data is mostly going to be static on this installation, we
will have others that won't be).

Anyone with some insights on VACUUM FULL ANALYZE who can weigh in on
what is going wrong?

Regards,

Michael Akinde
----
Database Architect,
met.no



Re: VACUUM ANALYZE out of memory

From: Stefan Kaltenbrunner
Michael Akinde wrote:
> Hi,
> 
> I am encountering problems when trying to run VACUUM FULL ANALYZE on a 
> particular table in my database; namely that the process crashes out 
> with the following problem:
> 
> INFO:  vacuuming "pg_catalog.pg_largeobject"
> ERROR:  out of memory
> DETAIL:  Failed on request of size 536870912.
> 
> INFO:  vacuuming "pg_catalog.pg_largeobject"
> ERROR:  out of memory
> DETAIL:  Failed on request of size 32.
> 
> Granted, our largeobject table is a bit large:
> 
> INFO:  analyzing "pg_catalog.pg_largeobject"
> INFO:  "pg_largeobject": scanned 3000 of 116049431 pages, containing 
> 18883 live rows and 409 dead rows; 3000 rows in sample, 730453802 
> estimated total rows
> 
> ...but I trust that VACUUM ANALYZE doesn't try to read the entire table 
> into memory at once. :-) The machine was set up with 1.2 GB shared 
> memory and 1 GB maintenance memory, so I would have expected this to be 
> sufficient for the task (we will eventually set this up on a 64-bit 
> machine with 16 GB memory, but at the moment we are restricted to 32 bit).
> 
> This is currently running on PostgreSQL 8.3beta2, but since I haven't 
> seen this problem reported before, I guess this will also be a problem 
> in earlier versions. Have we run into a bug/limitation of the Postgres 
> VACUUM, or is this something we might be able to solve by reconfiguring 
> the server/database or downgrading the DBMS version?

this seems simply a problem of setting maintenance_work_mem too high (i.e.,
higher than what your OS can support - maybe an ulimit/process limit is in
effect?). Try reducing maintenance_work_mem to, say, 128MB and retry.
If you promise PostgreSQL that it can get 1GB, it will happily try to use
it ...
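
For example (only a sketch - 128MB is just a starting point, not a tuned
value):

-- in psql, affects only the current session
SET maintenance_work_mem = '128MB';
VACUUM FULL ANALYZE pg_largeobject;   -- or whichever variant you were running

or put maintenance_work_mem = 128MB into postgresql.conf and reload to make
it the server-wide default.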


Stefan


Re: VACUUM ANALYZE out of memory

From: Simon Riggs
On Tue, 2007-12-11 at 10:59 +0100, Michael Akinde wrote:

> I am encountering problems when trying to run VACUUM FULL ANALYZE on a 
> particular table in my database; namely that the process crashes out 
> with the following problem:

Probably just as well, since a VACUUM FULL on an 800GB table is going to
take a rather long time, so you are saved from discovering just how
excessively long it will run for. But it seems like a bug. This happens
consistently, I take it?

Can you run ANALYZE and then VACUUM VERBOSE, both on just
pg_largeobject, please? It will be useful to know whether they succeed.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com



Re: VACUUM ANALYZE out of memory

From: Michael Akinde
Thanks for the rapid responses.

Stefan Kaltenbrunner wrote:
> this seems simply a problem of setting maintenance_work_mem too high (i.e.,
> higher than what your OS can support - maybe an ulimit/process limit is in
> effect?). Try reducing maintenance_work_mem to, say, 128MB and retry.
> If you promise PostgreSQL that it can get 1GB, it will happily try to use it ...

I set up the system together with one of our Linux sysops, so I think the settings should be OK. kernel.shmmax is set to 1.2 GB, but I'll get him to recheck whether there are any other limits he has forgotten to increase.

The way the process was running, it seems to have basically just continually allocated memory until (presumably) it broke through the slightly less than 1.2 GB shared memory allocation we had provided for PostgreSQL (at least, the postgres process was still running by the time its resident size had reached 1.1 GB).

Incidentally, in the first of the two errors I posted, the shared memory setting was significantly lower (24 MB, I believe). I'll try with 128 MB before I leave this evening, though (assuming the other tests I'm running complete by then).

Simon Riggs wrote:
> On Tue, 2007-12-11 at 10:59 +0100, Michael Akinde wrote:
>> I am encountering problems when trying to run VACUUM FULL ANALYZE on a
>> particular table in my database; namely that the process crashes out
>> with the following problem:
>
> Probably just as well, since a VACUUM FULL on an 800GB table is going to
> take a rather long time, so you are saved from discovering just how
> excessively long it will run for. But it seems like a bug. This happens
> consistently, I take it?

I suspect so, though it has only happened a couple of times so far (as it does take a while before it hits that 1.1 GB roof). But part of the reason for running the VACUUM FULL was of course to find out how long it would take. Reliability is always a priority for us, so I like to know what (useful) tools we have available and to stress the system as much as possible...  :-)

> Can you run ANALYZE and then VACUUM VERBOSE, both on just
> pg_largeobject, please? It will be useful to know whether they succeed.

I ran just ANALYZE on the entire database yesterday, and that worked without any problems.

I am currently running a VACUUM VERBOSE on the database. It isn't done yet, but it is running with a steady (low) resource usage.

Regards,

Michael A.


Re: VACUUM ANALYZE out of memory

From: Stefan Kaltenbrunner
Michael Akinde wrote:
> Thanks for the rapid responses.
> 
> Stefan Kaltenbrunner wrote:
>> this seems simply a problem of setting maintenance_work_mem too high 
>> (ie higher than what your OS can support - maybe an 
>> ulimit/processlimit is in effect?) . Try reducing maintenance_work_mem 
>> to say 128MB and retry.
>> If you promise postgresql that it can get 1GB it will happily try to 
>> use it ...
> I set up the system together with one of our Linux sysOps, so I think 
> the settings should be OK. Kernel.shmmax is set to 1.2 GB, but I'll get 
> him to recheck if there could be any other limits he has forgotten to 
> increase.
> 
> The way the process was running, it seems to have basically just 
> continually allocated memory until (presumably) it broke through the  
> slightly less than 1.2 GB shared memory allocation we had provided for 
> PostgreSQL (at least the postgres process was still running by the time 
> resident size had reached 1.1 GB).
> 
> Incidentally, in the first error of the two I posted, the shared memory 
> setting was significantly lower (24 MB, I believe). I'll try with 128 MB 
> before I leave in the evening, though (assuming the other tests I'm 
> running complete by then).

this is most likely not at all related to your shared memory settings
but to your setting of maintenance_work_mem, which is the amount of
memory a single backend(!) can use for maintenance operations (which
VACUUM is, for example).
Notice that your first error refers to an allocation of about 500MB,
which your ulimit/kernel process limit simply might not be able to give
a single process.
And for very large tables VACUUM FULL is generally not a good idea at
all - either look into regular normal vacuum scheduling or, if you need
to recover from a bloated database, use a command that forces a rewrite
of the table (like CLUSTER), which will be heaps faster but also requires
about twice the amount of disk space.
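
For example (a sketch only - the table and index names here are made up, and
CLUSTER takes an exclusive lock and needs roughly the table's size in free
disk space while it runs):

-- 8.2 syntax: CLUSTER indexname ON tablename
CLUSTER rasters_pkey ON rasters;
-- 8.3 syntax for the same operation
CLUSTER rasters USING rasters_pkey;
-- CLUSTER does not update planner statistics, so follow up with
ANALYZE rasters;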


Stefan


Re: VACUUM ANALYZE out of memory

From: Alvaro Herrera
Michael Akinde wrote:
> Thanks for the rapid responses.
>
> Stefan Kaltenbrunner wrote:
>> this seems simply a problem of setting maintenance_work_mem too high (ie 
>> higher than what your OS can support - maybe an ulimit/processlimit is in 
>> effect?) . Try reducing maintenance_work_mem to say 128MB and retry.
>> If you promise postgresql that it can get 1GB it will happily try to use 
>> it ...
> I set up the system together with one of our Linux sysOps, so I think the 
> settings should be OK. Kernel.shmmax is set to 1.2 GB, but I'll get him to 
> recheck if there could be any other limits he has forgotten to increase.

You are confusing shared memory (shared_buffers and kernel.shmmax) with
local memory (work_mem and maintenance_work_mem).  The error you got is
about the latter kind.
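
To see what the backend actually thinks it may use, just ask it (standard
SHOW commands):

SHOW shared_buffers;          -- shared memory, allocated once at server start
SHOW maintenance_work_mem;    -- per-backend memory for VACUUM, CREATE INDEX, etc.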

-- 
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
"La soledad es compañía"


Re: VACUUM ANALYZE out of memory

From: Martijn van Oosterhout
On Tue, Dec 11, 2007 at 12:30:43PM +0100, Michael Akinde wrote:
> The way the process was running, it seems to have basically just
> continually allocated memory until (presumably) it broke through the
> slightly less than 1.2 GB shared memory allocation we had provided for
> PostgreSQL (at least the postgres process was still running by the time
> resident size had reached 1.1 GB).

I think you're slightly confused. The VACUUM isn't going to use much of
the shared memory anyway. Shared memory is mostly just disk buffers and
is all allocated at startup. The memory being allocated by VACUUM is
the maintenance_work_mem *in addition* to any shared memory.

Also, depending on what's happening, it may be allocating
maintenance_work_mem more than once.

> Incidentally, in the first error of the two I posted, the shared memory
> setting was significantly lower (24 MB, I believe). I'll try with 128 MB
> before I leave in the evening, though (assuming the other tests I'm
> running complete by then).

What you want is a reasonable shared_buffers, maybe 0.5GB, and a smaller
maintenance_work_mem, since the latter is probably what's killing you.
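
Something along these lines in postgresql.conf, say (the numbers only
illustrate the balance and are not tuned values):

shared_buffers = 512MB           # shared disk buffers, allocated at startup
maintenance_work_mem = 256MB     # per-backend memory for VACUUM, index builds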

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Re: VACUUM ANALYZE out of memory

From: Michael Akinde
Stefan Kaltenbrunner wrote:
> Michael Akinde wrote:
>> Incidentally, in the first error of the two I posted, the shared
>> memory setting was significantly lower (24 MB, I believe). I'll try
>> with 128 MB before I leave in the evening, though (assuming the other
>> tests I'm running complete by then).
>
> this is most likely not at all related to your shared memory settings
> but to your setting of maintenance_work_mem which is the amount of
> memory a single backend(!) can use for maintainance operations (which
> VACUUM is for example).
> notice that your first error refers to an allocation of about 500MB
> which your ulimit/kernel process limit simply might not be able to
> give a single process.

Yes - in the first case, maintenance_work_mem was at its default (so I
wasn't surprised to see it fail to allocate half a gigabyte). In the
second case, though, maintenance_work_mem was set at 1024 MB (where it
then fails with the slightly odd error "Failed on request of size 32").

The server has 4 GB RAM available, so even if it was trying to use 1.2
GB of shared memory + 1 GB of maintenance_work_mem all at once, it still
seems odd that the process would fail. As far as I can tell (running
ulimit -a), the limits look pretty OK to me.

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) unlimited
max rt priority                 (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Being unable to run VACUUM FULL isn't a problem for the current
configuration of our application (as it will mostly be large amounts of
static data), but we're likely to have an application working with the
database next year where we'd move around 100 GB through the database on
a daily basis. At least based on the documentation of the various
commands, I would expect that one would want to perform VACUUM FULL
every once in a while.

Again, thanks for the feedback.

Regards,

Michael Akinde
Database Architect, met.no



Re: VACUUM ANALYZE out of memory

From: Martijn van Oosterhout
On Tue, Dec 11, 2007 at 03:18:54PM +0100, Michael Akinde wrote:
> The server has 4 GB RAM available, so even if it was trying to use 1.2
> GB shared memory + 1 GB for maintenance_mem all at once, it still seems
> odd that the process would fail. As far as I can tell (running ulimit -a
> ),  the limits look pretty OK to me.

IIRC you said you're on a 32-bit architecture? Which means any single
process only has 4GB of address space. Take off 1GB for the kernel, 1GB of
shared memory, 1GB of maintenance_work_mem and a collection of libraries,
stack space and general memory fragmentation, and I can absolutely
believe you've run into the limit of *address* space.
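
Back-of-the-envelope, assuming the common 3GB/1GB user/kernel split on
32-bit Linux (the exact split depends on the kernel build):

4GB address space - 1GB kernel            ~ 3GB usable by one process
3GB - 1.2GB shared memory mapping         ~ 1.8GB left
1.8GB - 1GB maintenance_work_mem          ~ 0.8GB for code, stack, heap

and a single contiguous 512MB request then has to fit into whatever
unfragmented gap remains.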

On a 64-bit machine it doesn't matter so much, but on a 32-bit machine
using 1GB for shared memory severely cuts the amount of auxiliary
memory the server can use. Unless you've shown a measurable difference
between 256MB and 1GB of shared memory, I'd say you're better off using the
smaller amount so you can have a higher maintenance_work_mem.

VACUUM doesn't benefit much from lots of shared buffers, but it does
benefit from maintenance_work_mem.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Re: VACUUM ANALYZE out of memory

From: Michael Akinde
Martijn van Oosterhout wrote:
> IIRC you said you're on a 32-bit architecture? Which means any single
> process only has 4GB address space. Take off 1GB for the kernel, 1GB
> shared memory, 1 GB maintainence workmem and a collection of libraries,
> stack space and general memory fragmentation and I can absolutly
> beleive you've run into the limit of *address* space.
>
Should have been 64-bit, but a foul-up means it is running in 32-bit at
the moment.
> On a 64-bit machine it doesn't matter so much but on a 32-bit machine
> using 1GB for shared memory severely cuts the amount of auxilliary
> memory the server can use. Unless you've shown a measuable difference
> between 256MB and 1G shared memory, I'd say you're better off using the
> smaller amount so you can have higher maintainence work mem.
>
We're still in the process of testing and tuning (which takes its sweet
time), so at the moment I cannot tell what benefits we see from the
different settings in practice. But I'll try to set shared_buffers down
to 128-256 MB and maintenance_work_mem to 512-1024 MB when I next
have a time slot where I can run the server into the ground.

However, the problem also occurred with the shared_buffers limit set at
24 MB and maintenance_work_mem at its default setting (16 MB?), so I
would be rather surprised if the problem did not repeat itself.

Regards,

Michael Akinde
Database Architect, met.no



Re: VACUUM ANALYZE out of memory

From: Michael Akinde
[Synopsis: VACUUM FULL ANALYZE goes out of memory on a very large
pg_catalog.pg_largeobject table.]

Simon Riggs wrote:
> Can you run ANALYZE and then VACUUM VERBOSE, both on just
> pg_largeobject, please? It will be useful to know whether they succeed
ANALYZE:

INFO:  analyzing "pg_catalog.pg_largeobject"
INFO:  "pg_largeobject": scanned 3000 of 116049431 pages, containing
18883 live rows and 409 dead rows; 3000 rows in sample, 730453802
estimated total rows

VACUUM VERBOSE:

INFO:  vacuuming "pg_catalog.pg_largeobject"
INFO:  scanned index "pg_largeobject_loid_pn_index" to remove 106756133
row versions
DETAIL:  CPU 38.88s/303.43u sec elapsed 2574.24 sec.
INFO:  "pg_largeobject": removed 106756133 row versions in 13190323 pages
DETAIL:  CPU 259.42s/113.20u sec elapsed 14017.17 sec.
INFO:  index "pg_largeobject_loid_pn_index" now contains 706303560 row
versions in 2674471 pages
DETAIL:  103960219 index row versions were removed.
356977 index pages have been deleted, 77870 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  "pg_largeobject": found 17489832 removable, 706303560
nonremovable row versions in 116049431 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 36000670 unused item pointers.
64493445 pages contain useful free space.
0 pages are entirely empty.
CPU 1605.42s/1107.48u sec elapsed 133032.02 sec.
WARNING:  relation "pg_catalog.pg_largeobject" contains more than
"max_fsm_pages" pages with useful free space
HINT:  Consider using VACUUM FULL on this relation or increasing the
configuration parameter "max_fsm_pages".
VACUUM

(This took some 36+ hours. It will be interesting to see what happens
when we add another 20 years' worth of data to the 13 years already in
the DB).

ANALYZE:

INFO:  analyzing "pg_catalog.pg_largeobject"
INFO:  "pg_largeobject": scanned 3000 of 116049431 pages, containing
17830 live rows and 0 dead rows; 3000 rows in sample, 689720452
estimated total rows

I will lower the shared_buffers and maintenance_work_mem settings as suggested
in earlier posts before leaving for home this evening, and then let it
run a VACUUM FULL ANALYZE. I remain dubious, though - as mentioned, the
first test I did had quite low settings for these, and we still had the
memory crash. No reason not to try it, though.
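
If the VACUUM FULL keeps failing, the other route the HINT suggests is just a
postgresql.conf change (a sketch only - the value has to exceed the ~64.5
million pages with useful free space reported above, and each FSM page slot
costs roughly six bytes of shared memory, so it is not free on a 32-bit box):

# postgresql.conf - changing max_fsm_pages requires a server restart
max_fsm_pages = 70000000        # > 64493445 pages with useful free space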

Over Christmas, we will be moving this over to a 64-bit kernel and 16
GB RAM, so after that we'll be able to test the database with > 1GB
of maintenance memory as well.

Regards,

Michael A.
Database Architect, met.no


Re: VACUUM FULL out of memory

From: Michael Akinde
As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers
and 512 MB reserved for maintenance_work_mem (on a 32-bit machine with 4
GB RAM). That ought to leave more than enough space for other processes
in the system. Again, the system fails on the VACUUM with the following
error (identical to the error we had when maintenance_work_mem was very
low):

INFO:  vacuuming "pg_catalog.pg_largeobject"
ERROR:  out of memory
DETAIL:  Failed on request of size 536870912

I've now also tested a 64-bit setup with 16 GB RAM, with 2 GB
maintenance_work_mem; this time on PostgreSQL 8.2.5.

INFO:  vacuuming "pg_catalog.pg_largeobject"
ERROR:  invalid memory alloc request size 1073741824

It strikes me as somewhat worrying that VACUUM FULL ANALYZE has so much
trouble with a large table. Granted - 730 million rows is a good deal -
but it's really not that much for a large database. I'd expect an
operation on such a table to take time, of course, but not to
consistently crash out of memory.

Any suggestions as to what else we can try in order to isolate the problem?

Regards,

Michael Akinde
Database Architect, met.no


Michael Akinde wrote:
> [Synopsis: VACUUM FULL ANALYZE goes out of memory on a very large
> pg_catalog.pg_largeobject table.]
>
> Simon Riggs wrote:
>> Can you run ANALYZE and then VACUUM VERBOSE, both on just
>> pg_largeobject, please? It will be useful to know whether they succeed
> ANALYZE:
>
> INFO:  analyzing "pg_catalog.pg_largeobject"
> INFO:  "pg_largeobject": scanned 3000 of 116049431 pages, containing
> 18883 live rows and 409 dead rows; 3000 rows in sample, 730453802
> estimated total rows
>
> VACUUM VERBOSE:
>
> INFO:  vacuuming "pg_catalog.pg_largeobject"
> INFO:  scanned index "pg_largeobject_loid_pn_index" to remove
> 106756133 row versions
> DETAIL:  CPU 38.88s/303.43u sec elapsed 2574.24 sec.
> INFO:  "pg_largeobject": removed 106756133 row versions in 13190323 pages
> DETAIL:  CPU 259.42s/113.20u sec elapsed 14017.17 sec.
> INFO:  index "pg_largeobject_loid_pn_index" now contains 706303560 row
> versions in 2674471 pages
> DETAIL:  103960219 index row versions were removed.
> 356977 index pages have been deleted, 77870 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.02 sec.
> INFO:  "pg_largeobject": found 17489832 removable, 706303560
> nonremovable row versions in 116049431 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> There were 36000670 unused item pointers.
> 64493445 pages contain useful free space.
> 0 pages are entirely empty.
> CPU 1605.42s/1107.48u sec elapsed 133032.02 sec.
> WARNING:  relation "pg_catalog.pg_largeobject" contains more than
> "max_fsm_pages" pages with useful free space
> HINT:  Consider using VACUUM FULL on this relation or increasing the
> configuration parameter "max_fsm_pages".
> VACUUM
>
> (This took some 36+ Hours. It will be interesting to see what happens
> when we add another 20 years worth of data to the 13 years already in
> the DB).
>
> ANALYZE:
>
> INFO:  analyzing "pg_catalog.pg_largeobject"
> INFO:  "pg_largeobject": scanned 3000 of 116049431 pages, containing
> 17830 live rows and 0 dead rows; 3000 rows in sample, 689720452
> estimated total rows
>
> I will lower the SharedMem and MaintenanceWorkMem settings as
> suggested in earlier posts before leaving for home this evening, and
> then let it run a VACUUM FULL ANALYZE. I remain dubious though - as
> mentioned, the first test I did had quite low settings for this, and
> we still had the memory crash. No reason not to try it though.
>
> Over Christmas, we will be moving this over on a 64-bit kernel and 16
> GB, so after that we'll be able to test on the database with > 1GB
> maintenance memory as well.
>
> Regards,
>
> Michael A.
> Database Architect, met.no



Re: VACUUM FULL out of memory

From: Usama Dar


On Jan 7, 2008 2:40 PM, Michael Akinde <michael.akinde@met.no> wrote:
> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers
> and 512 MB reserved for maintenance_work_mem (on a 32 bit machine with 4
> GB RAM).

My apologies if my question seems redundant and something you have already discussed with list members, but why do you need to do a VACUUM FULL? Have you not vacuumed for a while, or is there some special requirement which requires very aggressive space reclaim? VACUUM FULL is also known to cause some index bloat at times. Most systems I know run regular vacuums and have never required a VACUUM FULL.



--
Usama Munir Dar http://www.linkedin.com/in/usamadar
Consultant Architect
Cell:+92 321 5020666
Skype: usamadar

Re: VACUUM FULL out of memory

From: Michael Akinde
Hi,

The system we are building is intended to be utilized in a number of different applications, so the testing we are doing is primarily directed at stressing the system by running it through its paces and uncovering any weaknesses. I prefer to find as many problems as possible now, rather than in production. ;-)

For the current application set I'm testing, I expect we won't need to do much VACUUMing, as it will be a fairly static dataset only used for querying (once all the data is loaded). I know that we will be running some databases with some pretty rapid throughput (100 GB/day), but if VACUUM will do (as I expect), then we'll probably just stick to that. I don't have time to do any testing on that until next month, though.
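
For that throughput case I imagine we would simply combine autovacuum with a
nightly vacuumdb run from cron - something like this sketch (the database
name "wdb" is only a placeholder):

# crontab entry: plain VACUUM ANALYZE of the whole database every night at 03:00
0 3 * * * vacuumdb --analyze --quiet wdb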

I do find it odd, however, that pgsql recommends using a VACUUM FULL (as a result of running the VACUUM), especially if, as it seems, VACUUM FULL doesn't work for tables beyond a certain size. Assuming we have not set up something completely wrong, this seems like a bug.

If this is the wrong mailing list to be posting this, then please let me know.

Regards,

Michael Akinde
Database Architect, Met.no

Usama Dar wrote:
> On Jan 7, 2008 2:40 PM, Michael Akinde <michael.akinde@met.no> wrote:
>> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers
>> and 512 MB reserved for maintenance_work_mem (on a 32 bit machine with 4
>> GB RAM).
>
> My apologies if my question seems redundant and something you have already discussed with list members, but why do you need to do a VACUUM FULL? Have you not vacuumed for a while, or is there some special requirement which requires very aggressive space reclaim? VACUUM FULL is also known to cause some index bloat at times. Most systems I know run regular vacuums and have never required a VACUUM FULL.


Re: VACUUM FULL out of memory

From: Andrew Sullivan
On Mon, Jan 07, 2008 at 10:40:23AM +0100, Michael Akinde wrote:
> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers 
> and 512 MB reserved for maintenance_work_mem (on a 32 bit machine with 4 
> GB RAM). That ought to leave more than enough space for other processes 
> in the system. Again, the system fails on the VACUUM with the following 
> error (identical to the error we had when maintenance_work_mem was very 
> low.
> 
> INFO:  vacuuming "pg_catalog.pg_largeobject"
> ERROR:  out of memory
> DETAIL:  Failed on request of size 536870912

Something is using up the memory on the machine, or (I'll bet this is more
likely) your user (postgres?  Whatever's running the postmaster) has a
ulimit on its ability to allocate memory on the machine.  
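
A quick first check (assuming the server runs as the "postgres" OS user -
substitute whatever account actually starts the postmaster; note this shows
that account's login limits, which can still differ from what an init script
runs under):

su - postgres -c 'ulimit -a'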

> It strikes me as somewhat worrying that VACUUM FULL ANALYZE has so much 
> trouble with a large table. Granted - 730 million rows is a good deal - 

No, it's not really that big.  I've never seen a problem like this.  If it
were the 8.3 beta, I'd be worried; but I'm inclined to suggest you look at
the OS settings first, given your setup.

Note that you should almost never use VACUUM FULL unless you've really
messed things up.  I understand from the thread that you're just testing
things out right now.  But VACUUM FULL is not something you should _ever_
need in production, if you've set things up correctly.

A




Re: VACUUM FULL out of memory

From: Holger Hoffstaette
On Mon, 07 Jan 2008 10:57:53 -0500, Andrew Sullivan wrote:

> Note that you should almost never use VACUUM FULL unless you've really
> messed things up.  I understand from the thread that you're just testing
> things out right now.  But VACUUM FULL is not something you should _ever_
> need in production, if you've set things up correctly.

Then why does it exist? Is it a historical leftover? If it is
only needed for emergencies, should it not have a different name?
Just curious...

Holger




Re: VACUUM FULL out of memory

From: Joshua D. Drake

On Mon, 07 Jan 2008 17:33:53 +0100
"Holger Hoffstaette" <holger@wizards.de> wrote:

> On Mon, 07 Jan 2008 10:57:53 -0500, Andrew Sullivan wrote:
> 
> > Note that you should almost never use VACUUM FULL unless you've
> > really messed things up.  I understand from the thread that you're
> > just testing things out right now.  But VACUUM FULL is not
> > something you should _ever_ need in production, if you've set
> > things up correctly.
> 
> Then why does it exist? Is it a historical leftover? If it is
> only needed for emergency, should it not have a different name?
> Just curious..

There are times when it is required, usually when people don't
configure normal vacuum/autovacuum correctly.
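
For the archives, the usual fix is simply to make sure autovacuum is enabled
and not starved - a postgresql.conf sketch (parameter names as in 8.2/8.3;
the values are placeholders rather than tuned recommendations):

autovacuum = on
autovacuum_naptime = 1min                  # time between autovacuum runs
autovacuum_vacuum_scale_factor = 0.2       # vacuum after ~20% of a table changes
autovacuum_analyze_scale_factor = 0.1      # analyze after ~10% changes
# on 8.2 the row-level statistics collector must also be enabled:
# stats_start_collector = on
# stats_row_level = on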

Sincerely,

Joshua D. Drake



-- 
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ 
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'



Re: VACUUM FULL out of memory

From: Tom Lane
Michael Akinde <michael.akinde@met.no> writes:
> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers 
> and 512 MB reserved for maintenance_work_mem (on a 32 bit machine with 4 
> GB RAM). That ought to leave more than enough space for other processes 
> in the system. Again, the system fails on the VACUUM with the following 
> error (identical to the error we had when maintenance_work_mem was very 
> low.

> INFO:  vacuuming "pg_catalog.pg_largeobject"
> ERROR:  out of memory
> DETAIL:  Failed on request of size 536870912

Are you sure this is a VACUUM FULL, and not a plain VACUUM?  I suspect
that it's the latter, and the reason it's failing is that you are
running the postmaster under a ulimit that is less than 512MB (or at
least not enough more to allow an allocation of that size).
        regards, tom lane


Re: VACUUM FULL out of memory

From
Michael Akinde
Date:
Tom Lane wrote:
> Michael Akinde <michael.akinde@met.no> writes:
>> INFO:  vacuuming "pg_catalog.pg_largeobject"
>> ERROR:  out of memory
>> DETAIL:  Failed on request of size 536870912
>
> Are you sure this is a VACUUM FULL, and not a plain VACUUM?

Very sure.

I ran a VACUUM FULL again yesterday (the prior query was a VACUUM FULL ANALYZE) and received essentially the same error, simply with a different failure size.

INFO:  vacuuming "pg_catalog.pg_largeobject"
ERROR:  invalid memory alloc request size 1073741824

No changes were made on the system since the previous iteration. VACUUM ran OK on the 8.3beta2 instance I tested with before Christmas (the current setup is 8.2.5).

> I suspect that it's the latter, and the reason it's failing is that you are
> running the postmaster under a ulimit that is less than 512MB (or at
> least not enough more to allow an allocation of that size).

We went over this somewhat prior to Christmas. Here's how it's currently set up.

$> ulimit -a
core file size          (blocks, -c) 100000000
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) unlimited
max rt priority                 (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Anything we should try to change?

Andrew Sullivan wrote:
> Something is using up the memory on the machine, or (I'll bet this is more
> likely) your user (postgres? Whatever's running the postmaster) has a
> ulimit on its ability to allocate memory on the machine.

If one looks at the system resources while the VACUUM FULL is running, it's pretty obvious that it's a postgres process going on a memory allocation rampage that eats up all the resources.

> No, it's not really that big. I've never seen a problem like this. If it
> were the 8.3 beta, I'd be worried; but I'm inclined to suggest you look at
> the OS settings first given your set up.

We have the same problem with the 8.3 beta, but we won't be using it anyway until it's been out for a while.

>  Note that you should almost never use VACUUM FULL unless you've really
> messed things up. I understand from the thread that you're just testing
> things out right now. But VACUUM FULL is not something you should _ever_
> need in production, if you've set things up correctly.

That's good to hear. I'm not particularly worried about this with respect to my own system. So far, we have found Postgres amazingly robust in the face of every other issue that we have deliberately (or unwittingly) provoked. All the more reason to be puzzled about this problem, though.

Holger Hoffstaette wrote:
> Then why does it exist? Is it a historical leftover? If it is
> only needed for emergency, should it not have a different name?

Or in this case: if VACUUM FULL is never required (except in very special circumstances), it might be a good idea not to have VACUUM recommend running it (cf. the VACUUM I ran before New Year on a similar-sized table).

INFO: vacuuming "pg_catalog.pg_largeobject"

INFO: scanned index "pg_largeobject_loid_pn_index" to remove 106756133 row versions
DETAIL: CPU 38.88s/303.43u sec elapsed 2574.24 sec.
INFO: "pg_largeobject": removed 106756133 row versions in 13190323 pages
DETAIL: CPU 259.42s/113.20u sec elapsed 14017.17 sec.

INFO: index "pg_largeobject_loid_pn_index" now contains 706303560 row versions in 2674471 pages
DETAIL: 103960219 index row versions were removed.
356977 index pages have been deleted, 77870 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.

INFO: "pg_largeobject": found 17489832 removable, 706303560 nonremovable row versions in 116049431 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 36000670 unused item pointers.
64493445 pages contain useful free space.
0 pages are entirely empty.
CPU 1605.42s/1107.48u sec elapsed 133032.02 sec.

WARNING: relation "pg_catalog.pg_largeobject" contains more than "max_fsm_pages" pages with useful free space
HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".

Anyway, thanks for the responses.

I do have the test setup available for hopefully some weeks yet, so if anyone is interested in digging further into the matter, we can run further test attempts for a while (it takes about a week to load all the data, so once we take it back down, it may be a while before we set it up again).

Regards,

Michael Akinde
Database Architect, met.no


Re: VACUUM FULL out of memory

From: Andrew Sullivan
On Tue, Jan 08, 2008 at 09:50:07AM +0100, Michael Akinde wrote:
> stack size              (kbytes, -s) 8192

Perhaps this is the issue?  (I don't know.)  Also, this _is_ for the
postgres user, right?  That's the relevant one: the one that's actually
running the backend process.

Also, are you sure there's nothing else in the way?  I don't remember what
OS you're using.  On AIX, for instance, there's some _other_ dopey setting
that allows you to control user resource consumption as well, and it means
that ulimit's answers are not the full story.  (I learned this through
painful experience, and confess it's one of the many reasons I think AIX
should be pronounced as one word, rather than three letters.)

> Andrew Sullivan wrote:
>> Something is using up the memory on the machine, or (I'll bet this is more
>> likely) your user (postgres? Whatever's running the postmaster) has a
>> ulimit on its ability to allocate memory on the machine.
>
> If one looks at the system resources while the VACUUM FULL is running,
> it's pretty obvious that it's a postgres process going on a memory
> allocation rampage that eats up all the resources.

Of course VACUUM FULL is eating up as much memory as it can: it's moving a
lot of data around.  But is it in fact exhausting memory on the machine? 
There are only two possibilities: either there's something else that is
preventing that allocation, or else you've run into a case so unusual that
nobody else has ever seen it.  The data you're talking about isn't that big:
I've run similar-sized databases on my laptop without pain.  

> Or in this case: if VACUUM FULL is never required (except in very 
> special circumstances), it might be a good idea not to have VACUUM 
> recommend running it (cf. the VACUUM I ran before New Year on a similar 
> size table).

The suggestion you see there, though, is in fact one of the cases where you
might want to run it.  That is,

> WARNING: relation "pg_catalog.pg_largeobject" contains more than 
> "max_fsm_pages" pages with useful free space HINT: Consider using VACUUM 
> FULL on this relation or increasing the configuration parameter 
> "max_fsm_pages".

what it is saying is that a regular vacuum can no longer keep track of all
the reclaimable space in the table (the free space map is too small), and if
you want that space back and marked usable on your disk, you have to run
VACUUM FULL (or, in fact, CLUSTER, or else dump and reload the table - but
one of these).  Note that I said that, if you have things configured
_correctly_, you shouldn't have to run VACUUM FULL except in unusual
circumstances.  That doesn't mean "never".  The problem here is an
historical one: you have a "hangover" from previous missed maintenance or
sub-optimal vacuum scheduling.  In those cases, you may want to perform
VACUUM FULL, provided you understand the potential side effects (like
possibly slower inserts initially, and some possible index bloat).

A



Re: VACUUM FULL out of memory

From: Tom Lane
Michael Akinde <michael.akinde@met.no> writes:
> We went over this somewhat prior to Christmas. Here's how its currently 
> set up.

> $> ulimit -a
> core file size          (blocks, -c) 100000000
> ...

What you're showing us is the conditions that prevail in your
interactive session.  That doesn't necessarily have a lot to do with
the ulimits that init-scripts run under ...
        regards, tom lane


Re: VACUUM FULL out of memory

From: Michael Akinde
Tom Lane wrote:
> Michael Akinde <michael.akinde@met.no> writes:
>> $> ulimit -a
>> core file size          (blocks, -c) 100000000
>> ...
>
> What you're showing us is the conditions that prevail in your
> interactive session.  That doesn't necessarily have a lot to do with
> the ulimits that init-scripts run under ...

Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings?

Regards,

Michael A.
Database Architect, Met.no


Re: VACUUM FULL out of memory

From: Andrew Sullivan
On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote:
> Those are the ulimits of the db_admin account (i.e., the user that set
> up and runs the DB processes). Is Postgres limited by other settings?

Are you sure?

On one system I used many years ago, /bin/sh wasn't what I thought it was,
and so the ulimit that I got when logged in was not what the postmaster was
starting under.  Took me many days to figure out what was up.

A



Re: VACUUM FULL out of memory

From: Tom Lane
Andrew Sullivan <ajs@crankycanuck.ca> writes:
> On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote:
>> Those are the ulimits of the db_admin account (i.e., the user that set 
>> up and runs the DB processes). Is Postgres limited by other settings?

> Are you sure?

> On one system I used many years ago, /bin/sh wasn't what I thought it was,
> and so the ulimit that I got when logged in was not what the postmaster was
> starting under.  Took me many days to figure out what was up.

The only thing I find convincing is to insert "ulimit -a >someplace"
into the script that starts the postmaster, adjacent to where it does
so, and then reboot.  There are too many systems on which daemons are
launched under settings different from what interactive shells use
(a policy that's often a good one, too).
        regards, tom lane


Re: VACUUM FULL out of memory

From: Sam Mason
On Tue, Jan 08, 2008 at 12:33:34PM -0500, Tom Lane wrote:
> Andrew Sullivan <ajs@crankycanuck.ca> writes:
> > On one system I used many years ago, /bin/sh wasn't what I thought it was,
> > and so the ulimit that I got when logged in was not what the postmaster was
> > starting under.  Took me many days to figure out what was up.
> 
> The only thing I find convincing is to insert "ulimit -a >someplace"
> into the script that starts the postmaster, adjacent to where it does
> so, and then reboot.  There are too many systems on which daemons are
> launched under settings different from what interactive shells use
> (a policy that's often a good one, too).

What about a stored procedure in a language that allows you to do
system(3) calls?
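
Something like this sketch might do it, assuming the untrusted plperlu
language is installed (the function name is made up; trusted plperl won't
allow the shell-out):

CREATE OR REPLACE FUNCTION show_backend_ulimit() RETURNS text AS $$
    # run ulimit in a shell and return its output from the backend process
    return `ulimit -a 2>&1`;
$$ LANGUAGE plperlu;

SELECT show_backend_ulimit();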

 Sam


Re: VACUUM FULL out of memory

From: Tom Lane
Sam Mason <sam@samason.me.uk> writes:
> On Tue, Jan 08, 2008 at 12:33:34PM -0500, Tom Lane wrote:
>> The only thing I find convincing is to insert "ulimit -a >someplace"
>> into the script that starts the postmaster,

> What about a stored procedure in a language that allows you to do
> system(3) calls?

Yeah, that would work, if you have any untrusted languages installed.
        regards, tom lane


Re: VACUUM FULL out of memory

From: Andrew Sullivan
On Tue, Jan 08, 2008 at 05:53:28PM +0000, Sam Mason wrote:
> What about a stored procedure in a language that allows you to do
> system(3) calls?

PL/bash?  (I think there is something like this.)  But surely checking the
ulimit before startup is much easier!

A



Re: VACUUM FULL out of memory

From: Michael Akinde
Thanks for the explanation of the ulimits; I can see how that could turn out to be a problem in some cases.

Following Tom's suggestion, here is the startup script I used:
#!/bin/sh
ulimit -a > $PGHOST/server.ulimit
pg_ctl start -l $PGHOST/server.log

The ulimits seem to be the same, though:
$> cat server.ulimit
core file size          (blocks, -c) 100000000
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) unlimited
max rt priority                 (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Regards,

Michael A.

Tom Lane wrote:
> Andrew Sullivan <ajs@crankycanuck.ca> writes:
>> On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote:
>>> Those are the ulimits of the db_admin account (i.e., the user that set
>>> up and runs the DB processes). Is Postgres limited by other settings?
>>
>> On one system I used many years ago, /bin/sh wasn't what I thought it was,
>> and so the ulimit that I got when logged in was not what the postmaster was
>> starting under.  Took me many days to figure out what was up.
>
> The only thing I find convincing is to insert "ulimit -a >someplace"
> into the script that starts the postmaster, adjacent to where it does
> so, and then reboot.  There are too many systems on which daemons are
> launched under settings different from what interactive shells use
> (a policy that's often a good one, too).
>
> 		regards, tom lane
