Thread: Size for vacuum_mem

Size for vacuum_mem

From
Francisco Reyes
Date:
Trying to improve my vacuum times. Currently have vacuum_mem set to 128 MB.

What is my best bet to improve vacuum time? Increasing this number?

Have 1 GB of RAM in the machine.
Some of my settings:
wal_buffers = 256          # min 4
shared_buffers = 12288     #96 MB (buffers * 8K)
sort_mem    = 131072   # 128 MB measured in K
vacuum_mem  = 131072   # 128 MB measured in K
wal_files = 10                  # range 0-64
checkpoint_segments = 12        # in logfile segments (16MB each), min 1
checkpoint_timeout = 900  # in seconds, range 30-3600

Currently a vacuum full takes 3+ hours, and very soon the amount of data
will increase. I have a very high turnover of data because this is a copy
of data from a different system. Because of that, every day I delete/merge
entire ranges of dates, deleting/adding millions of records.
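
A minimal sketch of what raising vacuum_mem for one session looks like,
assuming a 7.2-era server where vacuum_mem can be set per session (the
table name is hypothetical):

    -- vacuum_mem is measured in KB, so 196608 = 192 MB, for this session only
    SET vacuum_mem = 196608;
    -- vacuum just the high-churn table rather than the whole database
    VACUUM ANALYZE daily_feed;

A per-session SET avoids dedicating that much memory to every vacuum
server-wide.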


Re: Size for vacuum_mem

From
Neil Conway
Date:
On Wed, 2002-12-04 at 19:17, Francisco Reyes wrote:
> What is my best bet to improve vacuum time?

If you VACUUM more frequently, each individual VACUUM takes less time.

> Currently a vacuum full takes 3+ hours and very soon the amount of data
> will increase.

Do you need to use VACUUM FULL?

Cheers,

Neil
--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC




Re: Size for vacuum_mem

From
"Shridhar Daithankar"
Date:
On 4 Dec 2002 at 19:38, Neil Conway wrote:

> > Currently a vacuum full takes 3+ hours and very soon the amount of data
> > will increase.
> Do you need to use VACUUM FULL?

Let me elaborate on this statement.

1) You need vacuum full only to recover space from deleted tuples. Unless the
database has undergone major deletions somewhere, 'vacuum full' might be a
cannon to kill an ant.

2) You should consider vacuuming table-wise (see the sketch below). Vacuum is
useful only for those tables which change at a faster rate. A lookup or
archive table might not need vacuum. Just vacuum the tables which are heavily
updated/deleted/inserted. And by heavily, I mean heavily in terms of tuples:
inserting a single 200 MB BLOB and vacuuming the table might not yield any
performance improvement.

 HTH

Bye
 Shridhar

--
blithwapping:    Using anything BUT a hammer to hammer a nail into the    wall, such
as shoes, lamp bases, doorstops, etc.        -- "Sniglets", Rich Hall & Friends
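
Table-wise vacuuming is easy to script. A sketch of vacuuming only the
high-churn tables, with hypothetical table names (this could be run from
cron via psql):

    -- Vacuum and re-analyze only the tables with heavy tuple turnover;
    -- lookup and archive tables are deliberately left alone.
    VACUUM ANALYZE daily_feed;
    VACUUM ANALYZE order_detail;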


Re: Size for vacuum_mem

From
"Francisco Reyes"
Date:
On 4 Dec 2002, Neil Conway wrote:

> > Currently a vacuum full takes 3+ hours and very soon the amount of data
> > will increase.
>
> Do you need to use VACUUM FULL?

I have a very large turnover ratio.
For some tables I delete/reload the whole table daily (tables in the 500K
to 1.5 million record range), while for other tables I delete/reload about
1/3 (e.g., from a 7 million record table I delete/copy 1.5 million records).

The reason for the delete/copy is that the data is coming from a system
where there is no way to tell what was added or changed since the last
update. So I use date ranges for some tables and do entire tables for
others.



Re: Size for vacuum_mem

From
"Francisco Reyes"
Date:
On 5 Dec 2002, Neil Conway wrote:

> For these, it would probably be faster to TRUNCATE the table and then
> load the new data, then ANALYZE.

I can't, because those tables would not be usable during the load.
Right now I do the delete/copy from within a transaction (see the sketch
at the end of this message). If the loads are still running when people
come in in the morning, they can still do work.

> >  while other tables I delete/reload about 1/3 (ie
> > 7 Million records table I delete/copy 1.5 Million records).
>
> For these, you can try just using a plain VACUUM and seeing how
> effective that is at reclaiming space.

I am not too concerned with space reclamation. In theory if I don't do
vacuum fulls I may have some dead space, but it would get re-used daily.
My concern is the performance hit I would suffer with the table scans.

> If necessary, increase max_fsm_pages.

What is this setting for? To what number could I increase it?

> You might also want to check and see if your indexes are growing in size
> (btree indexes on incrementally increasing values like timestamps can
> grow, even with VACUUM FULL); use REINDEX if that's the case.

Every once in a while I truncate the tables and re-load the whole set.
Probably about every couple of months.
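
For reference, a sketch of the delete/copy-inside-a-transaction approach
described above (table, column, and file names are hypothetical). Because
of MVCC, readers keep seeing the old rows until the COMMIT:

    BEGIN;
    DELETE FROM daily_feed
        WHERE feed_date BETWEEN '2002-12-01' AND '2002-12-04';
    COPY daily_feed FROM '/data/feed_20021204.copy';
    COMMIT;
    -- a plain (lazy) VACUUM afterwards marks the dead space for re-use
    VACUUM ANALYZE daily_feed;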



Re: Size for vacuum_mem

From
Medi Montaseri
Date:
Can someone please explain the relationship between the operation of a
vacuuming job and the vacuum_mem setting in a non-shared memory
configuration.

I.e., from a capacity planning point of view, how is vacuum_mem's size
related to this operation? The current 8k default seems so low...

Thanks



Re: Size for vacuum_mem

From
Tom Lane
Date:
"Francisco Reyes" <lists@natserv.com> writes:
> On 4 Dec 2002, Neil Conway wrote:
>> Do you need to use VACUUM FULL?

> I have a very large turnover ratio.
> For some tables I delete/reload the whole table daily (tables in the 500K
> to 1.5 million record range), while for other tables I delete/reload about
> 1/3 (e.g., from a 7 million record table I delete/copy 1.5 million records).

For the delete/copy scenario, you could avoid the need for VACUUM by
using TRUNCATE, if you don't mind the fact that TRUNCATE isn't
rollbackable (yet).

            regards, tom lane
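
A sketch of the TRUNCATE path for the full-reload tables (table and file
names are hypothetical; note that TRUNCATE cannot be rolled back here, so
the table is briefly empty while the load runs):

    TRUNCATE TABLE daily_feed;        -- releases the old heap immediately
    COPY daily_feed FROM '/data/feed_20021204.copy';
    ANALYZE daily_feed;               -- refresh planner statistics

Since this path generates no dead tuples, these tables would not need to
be vacuumed at all.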

Order of execution of Constraints, Triggers and Rules

From
Jean-Luc Lachance
Date:
Hi all,

I know triggers are executed in alphabetical order, but which of
constraints, triggers and rules get executed first and which one is
last?

Thx

JLL


Re: Size for vacuum_mem

From
Robert Treat
Date:
On Thu, 2002-12-05 at 12:57, Francisco Reyes wrote:
> > For these, you can try just using a plain VACUUM and seeing how
> > effective that is at reclaiming space.
>
> I am not too concerned with space reclamation. In theory if I don't do
> vacuum fulls I may have some dead space, but it would get re-used daily.
> My concern is the performance hit I would suffer with the table scans.
>

You should see very little performance impact from lazy vacuuming. If
there is a performance hit, you can offset some of it with the quicker
queries you get (if you do VACUUM ANALYZE).  And remember, lazy vacuums are
non-blocking, so users won't see an impact from that standpoint. The trick
is to find a good interval that will keep your tables from growing too big.
I have one table that I vacuum every 10 minutes (the whole content of the
table gets updated within 15 minutes), which keeps the size very manageable
(it's not a huge table, or I would do it more often).  In this scenario,
you can still do vacuum fulls if you feel the need, but they should take
much less time.

Robert Treat



Re: Size for vacuum_mem

From
"David Blood"
Date:
A "lazy vacuum" can hurt If you have lots of i/o.  If we try to run it
during the day it kills us. This is because to vacuum all the tables
postgres has to read them from the disk. While it doesn't not lock rows
it does block other rows from reading/writing to/from the disk.



Re: Size for vacuum_mem

From
"scott.marlowe"
Date:
On Thu, 5 Dec 2002, David Blood wrote:

> A "lazy vacuum" can hurt If you have lots of i/o.  If we try to run it
> during the day it kills us. This is because to vacuum all the tables
> postgres has to read them from the disk. While it doesn't not lock rows
> it does block other rows from reading/writing to/from the disk.

How much shared memory do you have allocated to Postgresql?

I've found that if I have a couple hundred megs of shared buffers on a
machine with 1 gig or more of RAM, lazy vacuums (in 7.2.x and later;
7.1 has massive problems with lazy vacuums acting up) don't seem to affect
performance much at all.

Vacuum on most of my boxen results in no more than a 5% performance loss
for other queries (all types: select, update, delete, insert) but keeps the
database running well, even when running one vacuum right after another.
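
For reference, "a couple hundred megs" of shared buffers translates into
postgresql.conf roughly like this (the exact value is illustrative,
assuming 8 KB pages):

    shared_buffers = 25600          # 25600 buffers * 8 KB = 200 MB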


Re: Order of execution of Constraints, Triggers and Rules

From
Jean-Luc Lachance
Date:
Any takers on this one?

Jean-Luc Lachance wrote:
>
> Hi all,
>
> I know triggers are executed in alphabetical order, but which of
> constraints, triggers and rules get executed first and which one is
> last?
>
> Thx
>
> JLL

Re: Order of execution of Constraints, Triggers and Rules

From
Stephan Szabo
Date:
On Fri, 6 Dec 2002, Jean-Luc Lachance wrote:

> Jean-Luc Lachance wrote:
> >
> > Hi all,
> >
> > I know triggers are executed in alphabetical order, but which of
> > constraints, triggers and rules get executed first and which one is
> > last?

I think it's:

Rules are processed (since they're rewrites).
Before triggers (in order)
Check constraints/not null/unique/primary key
After triggers (before the foreign key triggers)
Foreign keys
After triggers (after the foreign key triggers)



Re: Size for vacuum_mem

From
Tom Lane
Date:
"David Blood" <david@matraex.com> writes:
> A "lazy vacuum" can hurt If you have lots of i/o.  If we try to run it
> during the day it kills us. This is because to vacuum all the tables
> postgres has to read them from the disk. While it doesn't not lock rows
> it does block other rows from reading/writing to/from the disk.

On the other hand, I have watched people lazy-vacuum production
databases in 7.2.* and not seen any visible hit on system load
(as far as top or vmstat could show, anyway).

I think it may be a matter of whether you have disk bandwidth to
spare.  If the disk farm is marginal, the extra demand from a vacuum
may push you over the knee of the performance curve.  But that's just
a guess.  It would be interesting if some folks from the "it doesn't
hurt" and the "it does hurt" camps could compare notes and try to
understand the reason for the difference in their results.

            regards, tom lane

Re: Size for vacuum_mem

From
"Nicolai Tufar"
Date:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
> On the other hand, I have watched people lazy-vacuum production
> databases in 7.2.* and not seen any visible hit on system load
> (as far as top or vmstat could show, anyway).

From my experience, lazy vacuum never strains modern processors. It is
definitely I/O-bound.

regards,
Nic


Re: Size for vacuum_mem

From
Steve Atkins
Date:
On Fri, Dec 06, 2002 at 02:56:47PM -0500, Tom Lane wrote:
> "David Blood" <david@matraex.com> writes:
> > A "lazy vacuum" can hurt If you have lots of i/o.  If we try to run it
> > during the day it kills us. This is because to vacuum all the tables
> > postgres has to read them from the disk. While it doesn't not lock rows
> > it does block other rows from reading/writing to/from the disk.
>
> On the other hand, I have watched people lazy-vacuum production
> databases in 7.2.* and not seen any visible hit on system load
> (as far as top or vmstat could show, anyway).
>
> I think it may be a matter of whether you have disk bandwidth to
> spare.  If the disk farm is marginal, the extra demand from a vacuum
> may push you over the knee of the performance curve.  But that's just
> a guess.  It would be interesting if some folks from the "it doesn't
> hurt" and the "it does hurt" camps could compare notes and try to
> understand the reason for the difference in their results.

I'm firmly in the "devastating to performance" camp.

7.2.3, reasonably well-tuned on a not overspecced, but adequate
Solaris box. (Built with a non-Solaris qsort, though I doubt that's
relevant).

Several large-ish (hundreds of thousands to millions of rows), fairly
heavily updated tables, with some text fields large enough to push
data out to toast.

Vacuumed pretty much continuously while data is being updated, so it
didn't get too far out to lunch.

Then the processes updating the tables were shut down, so the system was
basically idle, and the tables were vacuumed. Simple selects (from
some small tables, via psql) slowed to a crawl - tens of seconds to
get any response. There was a lot of I/O but also high CPU usage -
including a fair fraction of system time.

It felt like the system was I/O-starved, yet it would run a very
intensive DB app quite happily when it wasn't vacuuming.

(I finally rewrote the algorithm to avoid UPDATE, instead storing
 deltas in a daily table, then every night reading all the deltas and
 all the archived data and inserting the merged data into a new archive
 table, then indexing it and renaming it to replace the old archive
 table - roughly as sketched below. Ugly, and it offended my SQL
 sensibilities, but it avoided having to keep that table vacuumed.)

Cheers,
  Steve
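
A sketch of that nightly merge-and-swap, with hypothetical table and
column names (both renames happen inside the transaction, so queries
switch to the new table at COMMIT):

    BEGIN;
    CREATE TABLE archive_new AS
        SELECT id, sum(val) AS val
        FROM (SELECT id, val FROM archive
              UNION ALL
              SELECT id, val FROM daily_deltas) merged
        GROUP BY id;
    CREATE INDEX archive_new_id ON archive_new (id);
    ALTER TABLE archive RENAME TO archive_old;
    ALTER TABLE archive_new RENAME TO archive;
    COMMIT;
    DROP TABLE archive_old;
    -- the delta table is small and daily, so clearing it is cheap
    DELETE FROM daily_deltas;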

Re: Slow Lazy Vacuum (was Size for vacuum_mem)

From
"Peter Darley"
Date:
Friends,
    I just want to throw my support behind what David said.  We have
particularly slow drives but a lot of memory (our data all fits into the
cache), because our fast drive failed and we couldn't replace it (we're
just using a single slow IDE drive now). The system flies except while
doing a vacuum.  Just an example of when the lazy vacuum can
significantly slow things down.
Thanks,
Peter Darley
