Thread: Need to run CLUSTER to keep performance

Need to run CLUSTER to keep performance

From: Rafael Martinez
Hello

This is a question about something we have seen occasionally over the
last few months. It happens with tables that get a large number of
updates/selects compared with the number of inserts/deletes. These
tables are small, and so is their row count.

The 'problem' is that performance decreases during the day, and the only
thing that helps is to run CLUSTER on the table with problems. VACUUM
ANALYZE does not help.

Some information that can help to find out why this happens:

- PostgreSQL version: 8.1.9

------------------------------------------------------------------------------
scanorama=# SELECT pg_size_pretty(pg_relation_size('hosts'));

 pg_size_pretty
----------------
 12 MB
------------------------------------------------------------------------------
scanorama=# SELECT count(*) FROM hosts ;

 count
-------
 16402
------------------------------------------------------------------------------
scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;

 Seq Scan on hosts  (cost=0.00..2771.56 rows=66756 width=314) (actual
time=0.008..2013.415 rows=16402 loops=1)
 Total runtime: 2048.486 ms
------------------------------------------------------------------------------
scanorama=# VACUUM ANALYZE ;
VACUUM
------------------------------------------------------------------------------
scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;

 Seq Scan on hosts  (cost=0.00..2718.57 rows=61357 width=314) (actual
time=0.008..1676.283 rows=16402 loops=1)
 Total runtime: 1700.826 ms
------------------------------------------------------------------------------
scanorama=# CLUSTER hosts_pkey ON hosts ;
CLUSTER
------------------------------------------------------------------------------
scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;

 Seq Scan on hosts  (cost=0.00..680.02 rows=16402 width=314) (actual
time=0.008..31.205 rows=16402 loops=1)
 Total runtime: 53.635 ms
------------------------------------------------------------------------------
scanorama=# SELECT * from pg_stat_all_tables WHERE relname LIKE 'hosts';
 relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del

--------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------
 105805 | public     | hosts   |  1996430 |  32360280252 |  2736391 |
   3301856 |       948 |   1403325 |       737

The information from pg_stat_all_tables is from the last 20 days.
------------------------------------------------------------------------------
INFO:  analyzing "public.hosts"
INFO:  "hosts": scanned 2536 of 2536 pages, containing 16410 live rows
and 57042 dead rows; 16410 rows in sample, 16410 estimated total rows
INFO:  free space map contains 191299 pages in 786 relations
DETAIL:  A total of 174560 page slots are in use (including overhead).
174560 page slots are required to track all free space.
Current limits are:  2000000 page slots, 4000 relations, using 12131 KB.
------------------------------------------------------------------------------

The tables with this 'problem' are not big, so CLUSTER finishes very fast
and it does not have an impact on access because of locking. But we
wonder why this happens.

Do you need more information?

Thanks in advance.
regards
--
 Rafael Martinez, <r.m.guerrero@usit.uio.no>
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/

Re: Need to run CLUSTER to keep performance

From: Heikki Linnakangas
Rafael Martinez wrote:
> This is a question about something we have seen occasionally over the
> last few months. It happens with tables that get a large number of
> updates/selects compared with the number of inserts/deletes. These
> tables are small, and so is their row count.
>
> The 'problem' is that performance decreases during the day, and the only
> thing that helps is to run CLUSTER on the table with problems. VACUUM
> ANALYZE does not help.
>
> Some information that can help to find out why this happens:
>
> - PostgreSQL version: 8.1.9
>
> ------------------------------------------------------------------------------
> scanorama=# SELECT pg_size_pretty(pg_relation_size('hosts'));
>
>  pg_size_pretty
> ----------------
>  12 MB
> ------------------------------------------------------------------------------
> scanorama=# SELECT count(*) FROM hosts ;
>
>  count
> -------
>  16402
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..2771.56 rows=66756 width=314) (actual
> time=0.008..2013.415 rows=16402 loops=1)
>  Total runtime: 2048.486 ms
> ------------------------------------------------------------------------------
> scanorama=# VACUUM ANALYZE ;
> VACUUM
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..2718.57 rows=61357 width=314) (actual
> time=0.008..1676.283 rows=16402 loops=1)
>  Total runtime: 1700.826 ms
> ------------------------------------------------------------------------------
> scanorama=# CLUSTER hosts_pkey ON hosts ;
> CLUSTER
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..680.02 rows=16402 width=314) (actual
> time=0.008..31.205 rows=16402 loops=1)
>  Total runtime: 53.635 ms
> ------------------------------------------------------------------------------
> scanorama=# SELECT * from pg_stat_all_tables WHERE relname LIKE 'hosts';
>  relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
> idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
>
--------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------
>  105805 | public     | hosts   |  1996430 |  32360280252 |  2736391 |
>    3301856 |       948 |   1403325 |       737
>
> The information from pg_stat_all_tables is from the last 20 days.
> ------------------------------------------------------------------------------
> INFO:  analyzing "public.hosts"
> INFO:  "hosts": scanned 2536 of 2536 pages, containing 16410 live rows
> and 57042 dead rows; 16410 rows in sample, 16410 estimated total rows
> INFO:  free space map contains 191299 pages in 786 relations
> DETAIL:  A total of 174560 page slots are in use (including overhead).
> 174560 page slots are required to track all free space.
> Current limits are:  2000000 page slots, 4000 relations, using 12131 KB.
> ------------------------------------------------------------------------------
>
> The tables with this 'problem' are not big, so CLUSTER finishes very fast
> and it does not have an impact on access because of locking. But we
> wonder why this happens.

2 seconds for seq scanning 12 MB worth of data sounds like a lot. Have
you increased shared_buffers from the default? Which operating system
are you using? Shared memory access is known to be slower on Windows.

On a small table like that you could run VACUUM every few minutes
without much impact on performance. That should keep the table size in
check.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Need to run CLUSTER to keep performance

From: Tomáš Vondra
Performance problems with heavily modified tables (UPDATE or DELETE) are
usually caused by not vacuuming. VACUUM can run in two main modes (plain
or full), and the latter works in a much more aggressive way (exclusive
locking, etc.). Try to run VACUUM FULL VERBOSE on the table and see if
it helps.

The usual way to fix this is a proper setting of the pg_autovacuum
daemon - the default settings may work for tables that are not modified
heavily, but they do not work for the heavily modified ones. Do you have
the autovacuum daemon enabled? What are its settings? Try to make it a
little more aggressive (this can be done at the table level), for
example as sketched below.
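
A minimal sketch of per-table settings, assuming PostgreSQL 8.1 where
they live in the pg_autovacuum system catalog (the thresholds below are
illustrative, not recommendations):

   -- vacuum after ~200 + 5% of reltuples dead rows; -1 means use the
   -- global vacuum cost settings
   INSERT INTO pg_autovacuum
          (vacrelid, enabled, vac_base_thresh, vac_scale_factor,
           anl_base_thresh, anl_scale_factor, vac_cost_delay, vac_cost_limit)
   VALUES ('hosts'::regclass, true, 200, 0.05, 100, 0.02, -1, -1);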

The stats from pg_stat_all_tables are nice, but I guess the stats that
matter are located in the pg_class catalog, the most interesting being
the reltuples and relpages columns - run

   SELECT relname, relpages, reltuples FROM pg_class WHERE relname LIKE 'hosts';

and observe the number of pages before and after the vacuum full (or
cluster). I guess the number of pages increases quite fast and the
autovacuum daemon is not able to reclaim the space - and this is
probably why scanning 12 MB of data takes 2 sec, which is way too much
(the table is actually much bigger, as it contains a lot of dead data).

Tomas


> Hello
>
> This is a question about something we have seen occasionally over the
> last few months. It happens with tables that get a large number of
> updates/selects compared with the number of inserts/deletes. These
> tables are small, and so is their row count.
>
> The 'problem' is that performance decreases during the day, and the only
> thing that helps is to run CLUSTER on the table with problems. VACUUM
> ANALYZE does not help.
>
> Some information that can help to find out why this happens:
>
> - PostgreSQL version: 8.1.9
>
> ------------------------------------------------------------------------------
> scanorama=# SELECT pg_size_pretty(pg_relation_size('hosts'));
>
>  pg_size_pretty
> ----------------
>  12 MB
> ------------------------------------------------------------------------------
> scanorama=# SELECT count(*) FROM hosts ;
>
>  count
> -------
>  16402
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..2771.56 rows=66756 width=314) (actual
> time=0.008..2013.415 rows=16402 loops=1)
>  Total runtime: 2048.486 ms
> ------------------------------------------------------------------------------
> scanorama=# VACUUM ANALYZE ;
> VACUUM
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..2718.57 rows=61357 width=314) (actual
> time=0.008..1676.283 rows=16402 loops=1)
>  Total runtime: 1700.826 ms
> ------------------------------------------------------------------------------
> scanorama=# CLUSTER hosts_pkey ON hosts ;
> CLUSTER
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..680.02 rows=16402 width=314) (actual
> time=0.008..31.205 rows=16402 loops=1)
>  Total runtime: 53.635 ms
> ------------------------------------------------------------------------------
> scanorama=# SELECT * from pg_stat_all_tables WHERE relname LIKE 'hosts';
>  relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
> idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
>
--------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------
>  105805 | public     | hosts   |  1996430 |  32360280252 |  2736391 |
>    3301856 |       948 |   1403325 |       737
>
> The information from pg_stat_all_tables is from the last 20 days.
> ------------------------------------------------------------------------------
> INFO:  analyzing "public.hosts"
> INFO:  "hosts": scanned 2536 of 2536 pages, containing 16410 live rows
> and 57042 dead rows; 16410 rows in sample, 16410 estimated total rows
> INFO:  free space map contains 191299 pages in 786 relations
> DETAIL:  A total of 174560 page slots are in use (including overhead).
> 174560 page slots are required to track all free space.
> Current limits are:  2000000 page slots, 4000 relations, using 12131 KB.
> ------------------------------------------------------------------------------
>
> The tables with this 'problem' are not big, so CLUSTER finishes very fast
> and it does not have an impact on access because of locking. But we
> wonder why this happens.
>
> Do you need more information?
>
> Thanks in advance.
> regards


Re: Need to run CLUSTER to keep performance

From: Rafael Martinez
Heikki Linnakangas wrote:
> Rafael Martinez wrote:

>> The tables with this 'problem' are not big, so CLUSTER finishes very fast
>> and it does not have an impact on access because of locking. But we
>> wonder why this happens.
>
> 2 seconds for seq scanning 12 MB worth of data sounds like a lot. Have
> you increased shared_buffers from the default? Which operating system
> are you using? Shared memory access is known to be slower on Windows.
>

This is a server with 8 GB of RAM; we are using 25% of it as
shared_buffers. Linux RHELAS4 with a 2.6.9-55.0.9.ELsmp kernel / x86_64.

> On a small table like that you could run VACUUM every few minutes
> without much impact on performance. That should keep the table size in
> check.
>

Ok, we run VACUUM ANALYZE only once a day, every night. But we would
expect the performance to become ok again after running vacuum, and it
doesn't. Only CLUSTER helps.

I can not see that we need to change the max_fsm_pages parameter;
pg_class and ANALYZE give us this information today (not long ago a
CLUSTER was executed):
------------------------------------------------------------------------------
scanorama=# VACUUM VERBOSE ANALYZE hosts;
INFO:  vacuuming "public.hosts"
INFO:  index "hosts_pkey" now contains 20230 row versions in 117 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "hosts": found 0 removable, 20230 nonremovable row versions in
651 pages
DETAIL:  3790 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  vacuuming "pg_toast.pg_toast_376127"
INFO:  index "pg_toast_376127_index" now contains 131 row versions in 2
pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_376127": found 0 removable, 131 nonremovable row
versions in 33 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.hosts"
INFO:  "hosts": scanned 651 of 651 pages, containing 16440 live rows and
3790 dead rows; 16440 rows in sample, 16440 estimated total rows
VACUUM

scanorama=# SELECT relname, relpages, reltuples from pg_class WHERE
relname LIKE 'hosts';
 relname | relpages | reltuples
---------+----------+-----------
 hosts   |      651 |     20230
------------------------------------------------------------------------------


Any more ideas?
regards,
--
 Rafael Martinez, <r.m.guerrero@usit.uio.no>
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/

Re: Need to run CLUSTER to keep performance

From: Alvaro Herrera
Rafael Martinez wrote:
> Hello
>
> This is a question about something we have seen occasionally over the
> last few months. It happens with tables that get a large number of
> updates/selects compared with the number of inserts/deletes. These
> tables are small, and so is their row count.
>
> The 'problem' is that performance decreases during the day, and the only
> thing that helps is to run CLUSTER on the table with problems. VACUUM
> ANALYZE does not help.

Probably because all the live tuples are clustered at the end of the
table, and the initial pages are polluted with dead tuples.  Try
vacuuming the table much more often, say every few minutes.

Your table is 2536 pages long, but it could probably be in the vicinity
of 700 ...

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Need to run CLUSTER to keep performance

From: Heikki Linnakangas
Rafael Martinez wrote:
> Heikki Linnakangas wrote:
>> On a small table like that you could run VACUUM every few minutes
>> without much impact on performance. That should keep the table size in
>> check.
>
> Ok, we run VACUUM ANALYZE only once a day, every night. But we would
> expect the performance to become ok again after running vacuum, and it
> doesn't. Only CLUSTER helps.

If the table is already bloated, a VACUUM won't usually shrink it. It
only makes the space available for reuse, but a sequential scan still
needs to go through a lot of pages.

CLUSTER on the other hand repacks the tuples and gets rid of all the
unused space on pages. You need to run CLUSTER or VACUUM FULL once to
shrink the relation, but after that frequent-enough VACUUMs should keep
the table size down.
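
A minimal sketch of that sequence on the hosts table, reusing commands
already shown in this thread (how frequent the plain VACUUMs need to be
is an assumption you have to tune):

   -- one-time shrink of an already-bloated table (8.1):
   CLUSTER hosts_pkey ON hosts;   -- rewrites the table in hosts_pkey order
   -- or: VACUUM FULL hosts;      -- compacts in place under an exclusive lock
   -- afterwards, frequent plain VACUUMs keep the size in check:
   VACUUM ANALYZE hosts;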

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Need to run CLUSTER to keep performance

From: Rafael Martinez
Alvaro Herrera wrote:
> Rafael Martinez wrote:

>> The 'problem' is that performance decreases during the day, and the only
>> thing that helps is to run CLUSTER on the table with problems. VACUUM
>> ANALYZE does not help.
>
> Probably because all the live tuples are clustered at the end of the
> table, and the initial pages are polluted with dead tuples.  Try
> vacuuming the table much more often, say every few minutes.
>
> Your table is 2536 pages long, but it could probably be in the vicinity
> of 700 ...
>

We ran VACUUM ANALYZE every 10 minutes for 2-3 days to see if it
helped, but when it didn't we went back to the old configuration (once
a day).

Yes, after a CLUSTER we are using 517 pages. But the table does not grow
much, it is always around 12-20 MB, so it looks like vacuum works without
problems.

regards,
--
 Rafael Martinez, <r.m.guerrero@usit.uio.no>
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/

Re: Need to run CLUSTER to keep performance

From: Bill Moran
In response to Rafael Martinez <r.m.guerrero@usit.uio.no>:

> Heikki Linnakangas wrote:
> > Rafael Martinez wrote:
>
> >> The tables with this 'problem' are not big, so CLUSTER finishes very fast
> >> and it does not have an impact on access because of locking. But we
> >> wonder why this happens.
> >
> > 2 seconds for seq scanning 12 MB worth of data sounds like a lot. Have
> > you increased shared_buffers from the default? Which operating system
> > are you using? Shared memory access is known to be slower on Windows.
> >
>
> This is a server with 8 GB of RAM; we are using 25% of it as
> shared_buffers. Linux RHELAS4 with a 2.6.9-55.0.9.ELsmp kernel / x86_64.
>
> > On a small table like that you could run VACUUM every few minutes
> > without much impact on performance. That should keep the table size in
> > check.
> >
>
> Ok, we run VACUUM ANALYZE only once a day, every night. But we would
> expect the performance to become ok again after running vacuum, and it
> doesn't. Only CLUSTER helps.

If you have a large value for max_fsm_pages but only vacuum once a day,
you could end up with considerable bloat on a small table, but not enough
to exceed max_fsm_pages (thus you wouldn't see any warnings/errors).

I recommend either:
a) autovacuum, with aggressive settings for that table
b) a more aggressive schedule for that particular table, maybe a cron
   job that vacuums that table every 5 minutes (see the sketch below).

You could also do a combination, i.e. enable autovacuum with conservative
settings and set a cron to vacuum the table every 10 minutes.

Vacuuming once a day is usually only enough if you have very minimal
updates.
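
A minimal crontab sketch for option (b), assuming psql is on the PATH,
the local connection needs no password, and the database is named
scanorama as in the examples above:

   # vacuum the hosts table every 5 minutes
   */5 * * * * psql -d scanorama -c 'VACUUM ANALYZE hosts'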

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: Need to run CLUSTER to keep performance

From: Rafael Martinez
Heikki Linnakangas wrote:
>
> If the table is already bloated, a VACUUM won't usually shrink it. It
> only makes the space available for reuse, but a sequential scan still
> needs to go through a lot of pages.
>
> CLUSTER on the other hand repacks the tuples and gets rid of all the
> unused space on pages. You need to run CLUSTER or VACUUM FULL once to
> shrink the relation, but after that frequent-enough VACUUMs should keep
> the table size down.
>

Ok, thanks for the advice. We will try this and will come back with more
information.

--
 Rafael Martinez, <r.m.guerrero@usit.uio.no>
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/

Re: Need to run CLUSTER to keep performance

From: Tom Lane
Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
> Heikki Linnakangas wrote:
>> On a small table like that you could run VACUUM every few minutes
>> without much impact on performance. That should keep the table size in
>> check.

> Ok, we run VACUUM ANALYZE only once a day, every night.

There's your problem.

Reading between the lines I gather that you think an update is "free"
in the sense of not creating a need for vacuum.  It's not --- it's
exactly equivalent to an insert + a delete, and it leaves behind a
dead row that needs to be vacuumed.  If you do a lot of updates, you
need to vacuum.
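
An illustrative sketch of this point, using a hypothetical throwaway
table:

   CREATE TABLE t (id int PRIMARY KEY, v int);
   INSERT INTO t VALUES (1, 0);
   UPDATE t SET v = v + 1;  -- writes a new row version; the old one is dead
   VACUUM VERBOSE t;        -- should report 1 removable row version
                            -- (assuming no concurrent open transactions)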

            regards, tom lane

Re: Need to run CLUSTER to keep performance

From: Chris Browne
r.m.guerrero@usit.uio.no (Rafael Martinez) writes:
> Heikki Linnakangas wrote:
>> On a small table like that you could run VACUUM every few minutes
>> without much impact on performance. That should keep the table size in
>> check.
>>
>
> Ok, we run VACUUM ANALYZE only once a day, every night. But we would
> expect the performance to become ok again after running vacuum, and it
> doesn't. Only CLUSTER helps.

You have characterized the shape of the problem Right There.

If you only VACUUM that table once a day, then it has a whole day to
get cluttered with dead tuples, which increases its size to encompass
651 pages, and NOTHING ever allows it to shrink back to a small size.
Plain VACUUM (or VACUUM ANALYZE) does not attempt to shrink table
sizes.  Only VACUUM FULL and CLUSTER do that.

Here are some options to "parameterize" your choices:

- If you vacuum the table often enough that only 10% of the table
  consists of dead tuples, then you can expect the table to perpetually
  have 10% of dead space.

- If you vacuum the table seldom enough that 90% of the table may be
  expected to consist of dead tuples, then you can expect this table to
  consistently have 90% of its space be "dead."

It sounds like this particular table needs to be vacuumed quite a bit
more frequently than once a day.
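
To put rough numbers on that using the stats from this thread: ~1.4
million updates in 20 days is about 70000 updates/day against ~16400
live rows. Vacuumed once a day, dead rows can outnumber live ones
roughly four to one; vacuumed every ten minutes, the backlog is capped
at about 70000/144, i.e. some 500 dead rows, around 3% of the table
(assuming each VACUUM can actually reclaim them).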

On our systems, we have certain tables where tuples get killed off so
frequently that we find it worthwhile to vacuum those tables once
every two to three minutes.  If we didn't, we'd see application
performance bog down until it forced us to CLUSTER or VACUUM FULL the
table.
--
"cbbrowne","@","acm.org"
http://linuxfinances.info/info/linux.html
"How much more helpful could I be than to provide you with the
appropriate e-mail address? I could engrave it on a clue-by-four and
deliver it to you in Chicago, I suppose." -- Seen on Slashdot...

Re: Need to run CLUSTER to keep performance

From: Rafael Martinez
Tom Lane wrote:
> Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
>> Heikki Linnakangas wrote:
>>> On a small table like that you could run VACUUM every few minutes
>>> without much impact on performance. That should keep the table size in
>>> check.
>
>> Ok, we run VACUUM ANALYZE only once a day, every night.
>
> There's your problem.
>
> Reading between the lines I gather that you think an update is "free"
> in the sense of not creating a need for vacuum.  It's not --- it's
> exactly equivalent to an insert + a delete, and it leaves behind a
> dead row that needs to be vacuumed.  If you do a lot of updates, you
> need to vacuum.
>

Hello again

We have more information about this 'problem'.

Tom, we have many other tables which are much bigger, have a larger
amount of updates/deletes, and are working very well with our current
vacuum configuration. We are aware of how important it is to run vacuum
jobs, and we think we have a good understanding of how/why vacuum works.

We think the problem we sometimes see with these small tables is
something else.

We increased the vacuum analyze jobs, as you all suggested, from one a
day to four every hour (we have not run cluster at all since we started
with this new configuration). We started with this after a fresh
'cluster' of the table. This has been in production since last week, and
the performance of this table has only gotten worse and worse.

After 4 days with the new maintenance jobs, it took more than 4 sec to
run a select on this table. After running a cluster we are down to
around 50 ms again.

I can not believe 4 vacuum jobs every hour are not enough for this table.
Looking at the statistics, it has only ca. 67000 updates/day, ca. 43
deletes/day and ca. 48 inserts/day. This is nothing compared with many of
the systems we are administrating.

What these tables have in common (we have seen this a couple of times
before) is:

- Small table size.
- Small amount of tuples in the table (almost constant).
- Large amount of updates compared to inserts/deletes and compared to
the amount of tuples in the table.

You who know the internals of postgres :), can you think of anything that
could be causing this behavior? Any more suggestions? Do you need more data?

Thanks in advance :)

We are sending all data we had before the last cluster command and after
it.

----------------------------------------------------------------------
**** BEFORE CLUSTER ****
----------------------------------------------------------------------
INFO:  vacuuming "public.hosts"
INFO:  index "hosts_pkey" now contains 99933 row versions in 558 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "hosts": found 0 removable, 99933 nonremovable row versions in
3875 pages
DETAIL:  83623 dead row versions cannot be removed yet.
There were 12079 unused item pointers.
0 pages are entirely empty.
CPU 0.02s/0.03u sec elapsed 0.06 sec.
INFO:  vacuuming "pg_toast.pg_toast_376272"
INFO:  index "pg_toast_376272_index" now contains 133 row versions in 2
pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_376272": found 0 removable, 133 nonremovable row
versions in 65 pages
DETAIL:  2 dead row versions cannot be removed yet.
There were 127 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.hosts"
INFO:  "hosts": scanned 3875 of 3875 pages, containing 16310 live rows
and 83623 dead rows; 16310 rows in sample, 16310 estimated total rows


scanorama=# SELECT age(now(), pg_postmaster_start_time());
           age
-------------------------
 25 days 22:40:01.241036
(1 row)

scanorama=# SELECT pg_size_pretty(pg_relation_size('hosts'));
 pg_size_pretty
----------------
 30 MB
(1 row)

scanorama=# SELECT count(*) from hosts;
 count
-------
 16311
(1 row)

scanorama=# SELECT
relname,relpages,reltuples,reltoastrelid,reltoastidxid from pg_class
where relname = 'hosts';
 relname | relpages | reltuples | reltoastrelid | reltoastidxid
---------+----------+-----------+---------------+---------------
 hosts   |     3875 |    100386 |        376276 |             0
(1 row)

scanorama=# SELECT * from pg_stat_all_tables where schemaname = 'public'
and relname = 'hosts';
 relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del

--------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------
 105805 | public     | hosts   |  2412159 |  39109243131 |  3244406 |
    9870886 |      1208 |   1685525 |      1088
(1 row)

scanorama=# EXPLAIN ANALYZE SELECT * from hosts;
                                                   QUERY PLAN

----------------------------------------------------------------------------------------------------------------
 Seq Scan on hosts  (cost=0.00..4878.86 rows=100386 width=314) (actual
time=0.025..4719.082 rows=16311 loops=1)
 Total runtime: 4742.754 ms
(2 rows)


scanorama=# CLUSTER hosts_pkey ON hosts ;
CLUSTER

----------------------------------------------------------------------
**** AFTER CLUSTER ****
----------------------------------------------------------------------

scanorama=# VACUUM VERBOSE ANALYZE hosts;

INFO:  vacuuming "public.hosts"
INFO:  index "hosts_pkey" now contains 16321 row versions in 65 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "hosts": found 0 removable, 16321 nonremovable row versions in
514 pages
DETAIL:  10 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_383759"
INFO:  index "pg_toast_383759_index" now contains 131 row versions in 2
pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_383759": found 0 removable, 131 nonremovable row
versions in 33 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.hosts"
INFO:  "hosts": scanned 514 of 514 pages, containing 16311 live rows and
10 dead rows; 16311 rows in sample, 16311 estimated total rows
VACUUM


scanorama=# SELECT pg_size_pretty(pg_relation_size('hosts'));
 pg_size_pretty
----------------
 4112 kB
(1 row)

scanorama=# SELECT count(*) from hosts;
 count
-------
 16311
(1 row)

scanorama=#  SELECT
relname,relpages,reltuples,reltoastrelid,reltoastidxid from pg_class
where relname = 'hosts';
 relname | relpages | reltuples | reltoastrelid | reltoastidxid
---------+----------+-----------+---------------+---------------
 hosts   |      514 |     16321 |        383763 |             0
(1 row)

scanorama=# SELECT * from pg_stat_all_tables where schemaname = 'public'
and relname = 'hosts';
 relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del

--------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------
 105805 | public     | hosts   |  2412669 |  39117480187 |  3244962 |
    9887752 |      1208 |   1685857 |      1088
(1 row)

scanorama=# EXPLAIN ANALYZE SELECT * from hosts;
                                                 QUERY PLAN

------------------------------------------------------------------------------------------------------------
 Seq Scan on hosts  (cost=0.00..678.53 rows=16353 width=314) (actual
time=0.006..32.143 rows=16311 loops=1)
 Total runtime: 57.408 ms
(2 rows)
----------------------------------------------------------------------


--
 Rafael Martinez, <r.m.guerrero@usit.uio.no>
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/

Re: Need to run CLUSTER to keep performance

From: Rafael Martinez
Rafael Martinez wrote:
>
> We have more information about this 'problem'.
>

Sending this just in case it can help ....

Checking all the log files from these vacuum jobs we have been running,
we found one that looks different from the rest, especially in the
amount of removed row versions.

We are also sending the output before and after the one we are talking
about:

###############################################
2007-11-11_0245.log
###############################################
COMMAND: /local/opt/pgsql-8.1/bin/psql -h /tmp/pg_sockets/dbpg-meridien
-p 5432 scanorama -c 'VACUUM VERBOSE ANALYZE hosts'
CODE: 0

OUTPUT:
INFO:  vacuuming "public.hosts"
INFO:  index "hosts_pkey" now contains 110886 row versions in 554 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.02s/0.00u sec elapsed 0.87 sec.
INFO:  "hosts": found 0 removable, 110886 nonremovable row versions in
3848 pages
DETAIL:  94563 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.05s/0.03u sec elapsed 0.94 sec.
INFO:  vacuuming "pg_toast.pg_toast_376272"
INFO:  index "pg_toast_376272_index" now contains 260 row versions in 2
pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_376272": found 0 removable, 260 nonremovable row
versions in 65 pages
DETAIL:  129 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  analyzing "public.hosts"
INFO:  "hosts": scanned 3848 of 3848 pages, containing 16323 live rows
and 94563 dead rows; 16323 rows in sample, 16323 estimated total rows
VACUUM

###############################################
2007-11-11_0301.log
###############################################
COMMAND: /local/opt/pgsql-8.1/bin/psql -h /tmp/pg_sockets/dbpg-meridien
-p 5432 scanorama -c 'VACUUM VERBOSE ANALYZE hosts'
CODE: 0

OUTPUT:
INFO:  vacuuming "public.hosts"
INFO:  index "hosts_pkey" now contains 16782 row versions in 556 pages
DETAIL:  94551 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.04s/0.09u sec elapsed 590.48 sec.
INFO:  "hosts": removed 94551 row versions in 3835 pages
DETAIL:  CPU 0.00s/0.03u sec elapsed 0.10 sec.
INFO:  "hosts": found 94551 removable, 16695 nonremovable row versions
in 3865 pages
DETAIL:  372 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.08s/0.16u sec elapsed 590.99 sec.
INFO:  vacuuming "pg_toast.pg_toast_376272"
INFO:  index "pg_toast_376272_index" now contains 131 row versions in 2
pages
DETAIL:  129 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_376272": removed 129 row versions in 33 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 32.05 sec.
INFO:  "pg_toast_376272": found 129 removable, 131 nonremovable row
versions in 65 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 51.96 sec.
INFO:  analyzing "public.hosts"
INFO:  "hosts": scanned 3875 of 3875 pages, containing 16323 live rows
and 576 dead rows; 16323 rows in sample, 16323 estimated total rows
VACUUM

###############################################
2007-11-11_0315.log
###############################################
COMMAND: /local/opt/pgsql-8.1/bin/psql -h /tmp/pg_sockets/dbpg-meridien
-p 5432 scanorama -c 'VACUUM VERBOSE ANALYZE hosts'
CODE: 0

OUTPUT:
INFO:  vacuuming "public.hosts"
INFO:  index "hosts_pkey" now contains 17363 row versions in 556 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.00u sec elapsed 1.39 sec.
INFO:  "hosts": found 0 removable, 17362 nonremovable row versions in
3875 pages
DETAIL:  1039 dead row versions cannot be removed yet.
There were 94074 unused item pointers.
0 pages are entirely empty.
CPU 0.02s/0.02u sec elapsed 1.43 sec.
INFO:  vacuuming "pg_toast.pg_toast_376272"
INFO:  index "pg_toast_376272_index" now contains 131 row versions in 2
pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_376272": found 0 removable, 131 nonremovable row
versions in 65 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 129 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.05 sec.
INFO:  analyzing "public.hosts"
INFO:  "hosts": scanned 3875 of 3875 pages, containing 16323 live rows
and 1040 dead rows; 16323 rows in sample, 16323 estimated total rows
VACUUM



After this last job the amount of dead rows just continued growing until
today.

--
 Rafael Martinez, <r.m.guerrero@usit.uio.no>
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/

Re: Need to run CLUSTER to keep performance

From: Heikki Linnakangas
Rafael Martinez wrote:
> DETAIL:  83623 dead row versions cannot be removed yet.

Looks like you have a long-running transaction in the background, so
VACUUM can't remove all dead tuples. I didn't see that in the vacuum
verbose outputs you sent earlier. Are there any backends in "idle in
transaction" state if you run ps?

In 8.1, CLUSTER will remove those tuples anyway, but it's actually not
correct. If the long-running transaction decides to do a select on
hosts-table later on, it will see an empty table because of that. That's
been fixed in 8.3, but it also means that CLUSTER might no longer help
you on 8.3. VACUUM FULL is safe in that sense in 8.1 as well.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Need to run CLUSTER to keep performance

From: Scott Marlowe
On Nov 12, 2007 10:11 AM, Rafael Martinez <r.m.guerrero@usit.uio.no> wrote:

> Sending this just in case it can help ....
>
> Checking all the log files from these vacuum jobs we have been running,
> we found one that looks different from the rest, especially in the
> amount of removed row versions.
>
> We are also sending the output before and after the one we are talking
> about:
>
> ###############################################
> 2007-11-11_0245.log
> ###############################################
> COMMAND: /local/opt/pgsql-8.1/bin/psql -h /tmp/pg_sockets/dbpg-meridien
> -p 5432 scanorama -c 'VACUUM VERBOSE ANALYZE hosts'
> CODE: 0
>
> OUTPUT:
> INFO:  vacuuming "public.hosts"
> INFO:  index "hosts_pkey" now contains 110886 row versions in 554 pages
> DETAIL:  0 index pages have been deleted, 0 are currently reusable.
> CPU 0.02s/0.00u sec elapsed 0.87 sec.
> INFO:  "hosts": found 0 removable, 110886 nonremovable row versions in
> 3848 pages
> DETAIL:  94563 dead row versions cannot be removed yet.
> There were 0 unused item pointers.

You see that right there?  You've got 94k dead rows that cannot be removed.

Then, later on, they can:

> CPU 0.04s/0.09u sec elapsed 590.48 sec.
> INFO:  "hosts": removed 94551 row versions in 3835 pages
> DETAIL:  CPU 0.00s/0.03u sec elapsed 0.10 sec.
> INFO:  "hosts": found 94551 removable, 16695 nonremovable row versions
> in 3865 pages

So, between the first and second vacuum you had a long running
transaction that finally ended and let you clean up the dead rows.

> After this last job the amount of dead rows just continued growing until
>  today.

I think you've got a long running transaction that's preventing you
from recovering dead rows.

Re: Need to run CLUSTER to keep performance

From: Bill Moran
In response to Heikki Linnakangas <heikki@enterprisedb.com>:

> Rafael Martinez wrote:
> > DETAIL:  83623 dead row versions cannot be removed yet.
>
> Looks like you have a long-running transaction in the background, so
> VACUUM can't remove all dead tuples. I didn't see that in the vacuum
> verbose outputs you sent earlier. Are there any backends in "idle in
> transaction" state if you run ps?
>
> In 8.1, CLUSTER will remove those tuples anyway, but it's actually not
> correct. If the long-running transaction decides to do a select on
> hosts-table later on, it will see an empty table because of that. That's
> been fixed in 8.3, but it also means that CLUSTER might no longer help
> you on 8.3. VACUUM FULL is safe in that sense in 8.1 as well.

Considering how small the table is, you may want to just program the
process holding the transaction open to do a vacuum full of that table
when it's done with its work.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023


Re: Need to run CLUSTER to keep performance

From: Heikki Linnakangas
Scott Marlowe wrote:
> On Nov 12, 2007 10:11 AM, Rafael Martinez <r.m.guerrero@usit.uio.no> wrote:
>
>> Sending this just in case it can help ....
>>
>> Checking all the log files from these vacuum jobs we have been running,
>> we found one that looks different from the rest, especially in the
>> amount of removed row versions.
>>
>> We are also sending the output before and after the one we are talking
>> about:
>>
>> ###############################################
>> 2007-11-11_0245.log
>> ###############################################
>> COMMAND: /local/opt/pgsql-8.1/bin/psql -h /tmp/pg_sockets/dbpg-meridien
>> -p 5432 scanorama -c 'VACUUM VERBOSE ANALYZE hosts'
>> CODE: 0
>>
>> OUTPUT:
>> INFO:  vacuuming "public.hosts"
>> INFO:  index "hosts_pkey" now contains 110886 row versions in 554 pages
>> DETAIL:  0 index pages have been deleted, 0 are currently reusable.
>> CPU 0.02s/0.00u sec elapsed 0.87 sec.
>> INFO:  "hosts": found 0 removable, 110886 nonremovable row versions in
>> 3848 pages
>> DETAIL:  94563 dead row versions cannot be removed yet.
>> There were 0 unused item pointers.
>
> You see that right there?  You've got 94k dead rows that cannot be removed.
>
> Then, later on, they can:
>
>> CPU 0.04s/0.09u sec elapsed 590.48 sec.
>> INFO:  "hosts": removed 94551 row versions in 3835 pages
>> DETAIL:  CPU 0.00s/0.03u sec elapsed 0.10 sec.
>> INFO:  "hosts": found 94551 removable, 16695 nonremovable row versions
>> in 3865 pages
>
> So, between the first and second vacuum you had a long running
> transaction that finally ended and let you clean up the dead rows.

No, before 8.3, CLUSTER throws away non-removable dead tuples. So the
long running transaction might still be there.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Need to run CLUSTER to keep performance

From: Scott Marlowe
On Nov 12, 2007 11:01 AM, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
>
> Scott Marlowe wrote:
> > So, between the first and second vacuum you had a long running
> > transaction that finally ended and let you clean up the dead rows.
>
> No, before 8.3, CLUSTER throws away non-removable dead tuples. So the
> long running transaction might still be there.

Wow, good to know.  Why would it have changed in 8.3?  Was it
considered broken behaviour?

Re: Need to run CLUSTER to keep performance

From: Heikki Linnakangas
Scott Marlowe wrote:
> On Nov 12, 2007 11:01 AM, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
>> Scott Marlowe wrote:
>>> So, between the first and second vacuum you had a long running
>>> transaction that finally ended and let you clean up the dead rows.
>> No, before 8.3, CLUSTER throws away non-removable dead tuples. So the
>> long running transaction might still be there.
>
> Wow, good to know.  Why would it have changed in 8.3?  Was it
> considered broken behaviour?

I certainly considered it broken, though it was a known issue all along.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Need to run CLUSTER to keep performance

From: Rafael Martinez
Heikki Linnakangas wrote:
> Rafael Martinez wrote:
>> DETAIL:  83623 dead row versions cannot be removed yet.
>
> Looks like you have a long-running transaction in the background, so
> VACUUM can't remove all dead tuples. I didn't see that in the vacuum
> verbose outputs you sent earlier. Are there any backends in "idle in
> transaction" state if you run ps?
>

I don't see any long transaction in progress (<IDLE> in transaction), and
if we run the vacuum job manually just after checking this, it still
cannot remove the dead rows.

Any suggestions as to why vacuum cannot remove these dead rows?

> In 8.1, CLUSTER will remove those tuples anyway, but it's actually not
> correct.

In other words .... we have to be very careful not to run CLUSTER on
a table being modified inside a transaction if we do not want to lose
data? ...

Does this mean that if we run a transaction which updates/deletes many
rows, run cluster before the transaction is finished, and then roll back
the transaction after cluster has been executed, all rows
updated/deleted by the transaction can not be rolled back because
their old versions are not there anymore?


--
 Rafael Martinez, <r.m.guerrero@usit.uio.no>
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/

Re: Need to run CLUSTER to keep performance

From: Rafael Martinez
Rafael Martinez wrote:
> Heikki Linnakangas wrote:
>
>> In 8.1, CLUSTER will remove those tuples anyway, but it's actually not
>> correct.
>
> In other words .... we have to be very careful not to run CLUSTER on
> a table being modified inside a transaction if we do not want to lose
> data? ...
>
> Does this mean that if we run a transaction which updates/deletes many
> rows, run cluster before the transaction is finished, and then roll back
> the transaction after cluster has been executed, all rows
> updated/deleted by the transaction can not be rolled back because
> their old versions are not there anymore?
>

Stupid question, I could have checked this myself. CLUSTER waits for the
transaction to finish before it executes. I have just checked this, as
sketched below.
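
A sketch of that check, using two concurrent psql sessions (the column
names in the UPDATE are hypothetical stand-ins; any update works):

   -- session 1:
   BEGIN;
   UPDATE hosts SET hostname = hostname WHERE id = 1;  -- holds a lock on hosts
   -- session 2: blocks here, because CLUSTER needs an exclusive lock
   CLUSTER hosts_pkey ON hosts;
   -- session 1:
   ROLLBACK;   -- session 2's CLUSTER now proceeds; the rolled-back
               -- changes are just dead tuples it can discard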


--
 Rafael Martinez, <r.m.guerrero@usit.uio.no>
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/