Thread: Postgres: VACUUM

Postgres: VACUUM

From
Date:

Any comments on multi-versioning problem:

As far as I understand from PG documentation, *CURRENTLY* VACUUM must be run
regulary, otherwise:

-Q. database will grow as fast as there are many DML going on it, won't it?

-Q. transaction ID may wrap - no doubt here.

-Q. Good news that VACUUM nowdays can run in parallel with other db activity
(non locking mode) and that pg_vacuum automates execution of VACUUM - this is
so, ins't it?

- Q. Bad knews that VACUUM must eventually scan every row(in fact, every row
and index pages?) in the database(?):
    - if this is true(?) then can anyone give an idea on how long it runs
for a paticular size of the database and how much it slowdowns a database?



Thank you in advance,

Laimutis Nedzinskas
Reykjavik, Iceland














Re: Postgres: VACUUM

From
"Joshua D. Drake"
Date:
>- Q. Bad knews that VACUUM must eventually scan every row(in fact, every row
>and index pages?) in the database(?):
>    - if this is true(?) then can anyone give an idea on how long it runs
>for a paticular size of the database and how much it slowdowns a database?
>
>
>
On heavily used databases (over 100,000 transactions an hour), vacuum is
a killer and
needs to be run pretty much concurrently with operation. This is
especially true if you
database is doing large amounts of updates and deletes. Vacuum is also
very hard on
the database hardware, if you do not want to see significant performance
hits -- you
need to have a lot of hard drive IO. Think 4-8-12 hard drives.

If you are only doing say (5000,10000) transactions and hour or
somewhere in between
you may get away with running vacuum in an off peak time.

Sincerely,

Joshua D. Drake





>
>Thank you in advance,
>
>Laimutis Nedzinskas
>Reykjavik, Iceland
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 9: the planner will ignore your desire to choose an index scan if your
>      joining column's datatypes do not match
>
>


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL


Re: Postgres: VACUUM

From
Allen Landsidel
Date:
1. Yes.  Every data-modifying transaction will create a new tuple, growing
the physical db by at least the size of the row.

2. Yes.

3. Yes.  VACUUM (not VACUUM FULL!) can be run in parallel with normal
database activity.

4. The size of the database isn't what counts, it's how quickly you expire
rows (by update/delete).  I run a pair of dbs, with several schemas each,
that expire tuples about as often as you can imagine -- an
insert/update/delete cycle on multiple tables, about half a dozen
connections doing this simultaneously at a rate of several hundred to
several thousand a minute depending on the time of day.

I have a script that creates a lock file, runs a VACUUM VERBOSE ANALYZE,
and removes the lockfile, set to run via cron every 5 minutes.  It
currently takes the less-used db about 10 minutes per VACUUM ANALYZE and
the intensely updated one 30-45 minutes -- thus the lockfile, so cron
doesn't try to run more than one vacuum per db simultaneously.

I do notice a slight performance penalty when certain tables are hit, but
it's nothing severe -- certainly not severe enough to dissuade me to run
the vacuum less often (it's pretty much always running with this scheduling
and the time it takes per run) compared to how badly this database bloats
and becomes unusable if I don't run it so often.

To satisfy the curious, the data directory is currently 18GB on disk.

At 15:50 1/14/2004, lnd@hnit.is wrote:


>Any comments on multi-versioning problem:
>
>As far as I understand from PG documentation, *CURRENTLY* VACUUM must be run
>regulary, otherwise:
>
>-Q. database will grow as fast as there are many DML going on it, won't it?
>
>-Q. transaction ID may wrap - no doubt here.
>
>-Q. Good news that VACUUM nowdays can run in parallel with other db activity
>(non locking mode) and that pg_vacuum automates execution of VACUUM - this is
>so, ins't it?
>
>- Q. Bad knews that VACUUM must eventually scan every row(in fact, every row
>and index pages?) in the database(?):
>         - if this is true(?) then can anyone give an idea on how long it runs
>for a paticular size of the database and how much it slowdowns a database?
>
>
>
>Thank you in advance,
>
>Laimutis Nedzinskas
>Reykjavik, Iceland
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match