Re: 12 hour table vacuums

From: Jean-David Beyer
Subject: Re: 12 hour table vacuums
Date:
Msg-id: 471F4542.2010804@verizon.net
In response to: 12 hour table vacuums (Ron St-Pierre <ron.pgsql@shaw.ca>)
List: pgsql-performance
Ron St-Pierre wrote:
> We vacuum only a few of our tables nightly, this one is the last one
> because it takes longer to run. I'll probably re-index it soon, but I
> would appreciate any advice on how to speed up the vacuum process (and
> the db in general).

I am a novice with PostgreSQL, so I have no answers for you. But for my own
education, I am confused by some parts of your post.
>
> Okay, here's our system:
>   postgres 8.1.4

I have postgresql-8.1.9-1.el5

>   Linux version 2.4.21

I imagine you mean the Linux kernel version; I have 2.6.18-8.1.15.el5PAE.

>   Red Hat Linux 3.2.3

I have no clue what this means. Red Hat Linux 3 must date from the
mid-1990s; RHL 5 came out about 1998, IIRC.

Red Hat Enterprise Linux 3, on the other hand, was not numbered like that,
as I recall. I no longer run that, but my current RHEL5 is named like this:

Red Hat Enterprise Linux Server release 5 (Tikanga)

and for my CentOS 4 system, it is

CentOS release 4.5 (Final)

Did RHEL 3 use a second dot in its release numbers? I do not remember that.

>   8 GB ram
>   Intel(R) Xeon(TM) CPU 3.20GHz
>   Raid 5
>   autovacuum=off

Why would you not have that on?
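
If memory serves, on 8.1 autovacuum also needs the row-level statistics
collector turned on, or it silently does nothing. A minimal postgresql.conf
sketch (the last two values are, I believe, just the 8.1 defaults, shown
for illustration):

# a sketch of enabling autovacuum on 8.1
stats_start_collector = on    # both statistics settings must be on,
stats_row_level = on          # or autovacuum cannot see table activity
autovacuum = on
autovacuum_naptime = 60       # seconds between autovacuum checks
autovacuum_vacuum_scale_factor = 0.4   # rough fraction of a table that must
                                       # change before it gets vacuumed

With a table that changes every day, letting autovacuum nibble at it as it
goes seems preferable to one 12-hour VACUUM, but as I said, I am no expert.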

>   serves as the application server and database server
>   server is co-located in another city, hardware upgrade is not
> currently an option
>
> Here's the table information:
> The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes.

I have designed databases, though only infrequently, since the late 1970s.
In my experience, my tables had relatively few columns, rarely over 10. Are
you sure this table needs so many? Why not, e.g., 13 tables averaging 10
columns each?
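
If it were mine, I would look at grouping related columns into their own
tables joined on a shared key. A hypothetical sketch (all of the names are
invented, just to show the shape):

-- one narrow "core" table plus satellite tables sharing its key
CREATE TABLE security (
    security_id integer PRIMARY KEY,
    symbol      varchar(10) NOT NULL
);

CREATE TABLE security_prices (
    security_id integer PRIMARY KEY REFERENCES security,
    open_price  numeric,
    close_price numeric
    -- ...a dozen related NUMERIC columns, not 130
);

-- queries that need columns from both just join them back together
SELECT s.symbol, p.close_price
  FROM security s
  JOIN security_prices p USING (security_id);

Since PostgreSQL writes a complete new row version on every UPDATE, narrower
tables would also mean each update leaves less dead space behind for VACUUM
to clean up.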

OTOH, 140,000 rows is not all that many. I have a 6,000,000 row table in my
little database on my desktop, and I do not even consider that large.
Imagine the size of a database belonging to the IRS, for example. Surely it
would have at least one row for each taxpayer and each employer (possibly in
two tables, or two databases).

Here are the last few lines of output from a VACUUM VERBOSE; command on that
little database. The 6,000,000-row table is not in the database at the
moment, nor are some of the other tables, but two (for me) relatively large
tables are.

CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  free space map contains 166 pages in 76 relations
DETAIL:  A total of 1280 page slots are in use (including overhead).
1280 page slots are required to track all free space.
Current limits are:  40000 page slots, 1000 relations, using 299 KB.
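
Those last lines are the part worth watching, I believe: as long as the
slots required stay below max_fsm_pages, vacuum can remember all of the free
space it finds. In my case there is plenty of headroom:

1280 page slots required  <  40000 page slots allowed (max_fsm_pages)

If the required figure ever exceeded the limit, vacuum would lose track of
free space and the tables would bloat, which I understand is one way plain
vacuums end up taking many hours.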
VACUUM
stock=> select count(*) from ranks; [table has 10 columns]
 count
--------
 981030
(1 row)

stock=> select count(*) from ibd;   [table has 8 columns]
  count
---------
 1099789
(1 row)

And this is the timing for that psql session; most of the elapsed time was
my slow typing.

real    1m40.206s
user    0m0.027s
sys     0m0.019s

My non-default settings for this are

# - Memory -

shared_buffers = 251000
work_mem  = 32768
max_fsm_pages  = 40000

I have 8 GB of RAM on this machine, and PostgreSQL is its biggest memory
user. I set shared_buffers high to try to get some entire (small) tables
into RAM and to be sure there is room for indices.
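
For scale, shared_buffers in 8.1 is counted in 8 kB pages, so that setting
works out to roughly a quarter of the machine's RAM:

251000 pages * 8 kB/page = 2008000 kB, or about 1.9 GB of the 8 GB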

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 08:40:01 up 1 day, 58 min, 1 user, load average: 4.08, 4.13, 4.17
