Thread: degenerate performance on one server of 3

From:
Erik Aronesty
Date:

I have 3 servers, all with identical databases, and each performing
very differently for the same queries.

www3 is my fastest, www2 is the worst, and www1 is in the middle...
even though www2 has more ram, faster CPU and faster drives (by far),
and is running a newer version of postgres. I have been reluctant to
post because I know it's something that I'm doing wrong in the
settings or something that I should be able to figure out.

Last year at this time www2 was the fastest... in fact, I bought the
machine to be my "primary" server, and it performed as such.... with
the striped volumes and higher RAM, it outpaced the other 2 in every
query.  It has, over time, "degenerated" to being so slow it
frequently has to be taken out of the load-balance set.   The only
major changes to the server have been "yum update (or ports upgrade)"
to the newer releases ... over time.

The query planner "knows" about the problem, but I'm not sure *why*
there's a difference... since the tables all have the same data ...
loaded from a dump nightly.

The planner shows a different number of "rows" even though the items
table has 22680 rows in all 3 instances.  I ran a vacuum analyze just
before these runs hoping to get them all into a similar "clean" state.

The difference is outlined below, with the query planner output from a
table-scan query that greatly exaggerates the differences in
performance, along with some info about the configuration and platform
differences.

QUERY: explain select count(*) from items where name like '%a%'

www3: psql (PostgreSQL) 8.1.14
www3: Linux www3 2.6.23.17-88.fc7 #1 SMP Thu May 15 00:02:29 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

www3: Mem:   1996288k total,  1537576k used,   458712k free,    23124k buffers
www3: Swap:        0k total,        0k used,        0k free,  1383208k cached

www3: shared_buffers = 10000                    # min 16 or max_connections*2, 8KB each

www3:                             QUERY PLAN
www3: ------------------------------------------------------------------
www3:  Aggregate  (cost=3910.07..3910.08 rows=1 width=0)
www3:    ->  Seq Scan on items  (cost=0.00..3853.39 rows=22671 width=0)
www3:          Filter: (name ~~ '%a%'::text)
www3: (3 rows)
www3:

www1: psql (PostgreSQL) 8.1.17
www1: Linux www1 2.6.26.8-57.fc8 #1 SMP Thu Dec 18 18:59:49 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

www1: Mem:   1019376k total,   973064k used,    46312k free,    27084k buffers
www1: Swap:  1959888k total,    17656k used,  1942232k free,   769776k cached

www1: shared_buffers = 6000                      # min 16 or max_connections*2, 8KB each

www1:                             QUERY PLAN
www1: ------------------------------------------------------------------
www1:  Aggregate  (cost=5206.20..5206.21 rows=1 width=0)
www1:    ->  Seq Scan on items  (cost=0.00..5149.50 rows=22680 width=0)
www1:          Filter: (name ~~ '%a%'::text)
www1: (3 rows)
www1:

www2: psql (PostgreSQL) 8.2.13
www2: FreeBSD www2 6.3-RELEASE-p7 FreeBSD 6.3-RELEASE-p7 #0: Sun Dec 21 03:24:04 UTC 2008 :/usr/obj/usr/src/sys/SMP  amd64

www2: Mem: 57M Active, 1078M Inact, 284M Wired, 88M Cache, 213M Buf, 10M Free
www2: Swap: 4065M Total, 144K Used, 4065M Free

www2: shared_buffers = 360MB                   # min 128kB or max_connections*16kB

www2:                             QUERY PLAN
www2: ------------------------------------------------------------------
www2:  Aggregate  (cost=17659.45..17659.46 rows=1 width=0)
www2:    ->  Seq Scan on items  (cost=0.00..17652.24 rows=2886 width=0)
www2:          Filter: (name ~~ '%a%'::text)
www2: (3 rows)
www2:

From:
Tom Lane
Date:

Erik Aronesty <> writes:
> I have 3 servers, all with identical databases, and each performing
> very differently for the same queries.

I'm betting on varying degrees of table bloat.  Have you tried vacuum
full, cluster, etc?

            regards, tom lane

From:
Craig Ringer
Date:

Tom Lane wrote:
> Erik Aronesty <> writes:
>> I have 3 servers, all with identical databases, and each performing
>> very differently for the same queries.
>
> I'm betting on varying degrees of table bloat.  Have you tried vacuum
> full, cluster, etc?

Or, if you have been using VACUUM FULL, try REINDEXing the tables,
because it could easily be index bloat. Clustering the table will take
care of index bloat as well as table bloat.

--
Craig Ringer

From:
Tom Lane
Date:

Craig Ringer <> writes:
> Tom Lane wrote:
>> I'm betting on varying degrees of table bloat.  Have you tried vacuum
>> full, cluster, etc?

> Or, if you have been using VACUUM FULL, try REINDEXing the tables,
> because it could easily be index bloat. Clustering the table will take
> care of index bloat as well as table bloat.

Index bloat wouldn't explain the slow-seqscan behavior the OP was
complaining of.  Still, you're right that if the tables are bloated
then their indexes probably are too ... and that VACUUM FULL alone
will not fix that.

            regards, tom lane

From:
Erik Aronesty
Date:

It was all vacuum full ... thanks.

The other 2 servers truncate and reload that table from time to time
... i.e., they are effectively always vacuumed.

As the "master", that server never does it ... hence the bloat.

But why wasn't autovac enough to reclaim at least *most* of the space?
That table *does* get updated every day ... but rows are not deleted
and re-inserted, just updated.  It seems that most of the pages should
be "reused" via autovac ....



On Sun, May 31, 2009 at 11:40 PM, Tom Lane <> wrote:
> Craig Ringer <> writes:
>> Tom Lane wrote:
>>> I'm betting on varying degrees of table bloat.  Have you tried vacuum
>>> full, cluster, etc?
>
>> Or, if you have been using VACUUM FULL, try REINDEXing the tables,
>> because it could easily be index bloat. Clustering the table will take
>> care of index bloat as well as table bloat.
>
> Index bloat wouldn't explain the slow-seqscan behavior the OP was
> complaining of.  Still, you're right that if the tables are bloated
> then their indexes probably are too ... and that VACUUM FULL alone
> will not fix that.
>
>                        regards, tom lane
>
> --
> Sent via pgsql-performance mailing list ()
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

From:
Tom Lane
Date:

Erik Aronesty <> writes:
> but why wasn't autovac enough to reclaim at least *most* of the space?

Autovac isn't meant to reclaim major amounts of bloat; it's more in the
line of trying to prevent it from happening in the first place.  To
reclaim bloat it would have to execute VACUUM FULL, or some other
operation that requires exclusive table lock, which doesn't seem like
a good idea for an automatic background operation.

            regards, tom lane

From:
Erik Aronesty
Date:

I think, perhaps, autovac wasn't running on that machine.

Is there any way to check to see if it's running?

I have enabled all the options, and I know it's running on my other
servers because I see

LOG: autovacuum....  entries (a profusion of them)

I suspect, perhaps, that it's just not showing up in the log since my
8.2 BSD box came with different settings by default.

current settings:

autovacuum = on
stats_start_collector = on              # needed for block or row stats
stats_row_level = on
log_min_error_statement = error
log_min_messages = notice
log_destination = 'syslog'
client_min_messages = notice

....should be enough to get it going and for me to see it, right?  Not
sure which setting controls logging of autovac, nor am I sure of a way
to *ask* the server if autovac is running.

On Mon, Jun 1, 2009 at 10:06 AM, Tom Lane <> wrote:
> Erik Aronesty <> writes:
>> but why wasn't autovac enough to reclaim at least *most* of the space?
>
> Autovac isn't meant to reclaim major amounts of bloat; it's more in the
> line of trying to prevent it from happening in the first place.  To
> reclaim bloat it would have to execute VACUUM FULL, or some other
> operation that requires exclusive table lock, which doesn't seem like
> a good idea for an automatic background operation.
>
>                        regards, tom lane

From:
Tom Lane
Date:

Erik Aronesty <> writes:
> I think, perhaps, autovac wasn't running on that machine.
> Is there any way to check to see if it's running?

> I have enabled all the options , and I know it's running on my other
> servers because I see

> LOG: autovacuum....  entries (a profusion of them)

> I suspect, perhaps, that it's just not showing up in the log since my
> 8.2 BSD box came with different settings by default.

8.2 has far crummier support for logging what autovacuum is doing than
8.3 does :-(.  The settings you show should mean that it's running, but
the only way to check specifically is to crank log_min_messages way up,
which will clutter your log with a lot of useless noise along with
autovacuum's messages.

            regards, tom lane

From:
Reid Thompson
Date:

Erik Aronesty wrote:
> I think, perhaps, autovac wasn't running on that machine.
>
> Is there any way to check to see if it's running?
>

since it looks like stats are on too....

http://www.network-theory.co.uk/docs/postgresql/vol3/ViewingCollectedStatistics.html

read the entry on pg_stat_all_tables
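
For example, a quick check might look like this (a sketch; the
last_vacuum/last_autovacuum columns exist in 8.2 and later, and 'items'
is the table from this thread):

```sql
-- A NULL last_autovacuum suggests autovacuum has never processed the table
SELECT relname, last_vacuum, last_autovacuum, last_analyze
FROM pg_stat_all_tables
WHERE relname = 'items';
```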

From:
Erik Aronesty
Date:

> read the entry on pg_stat_all_tables

yeah, it's running ... vacuum'ed last night

it's odd, to me, that the performance would degrade so extremely
(noticeably) over the course of one year on a table which has few
insertions, no deletions, and daily updates of a single non-null
integer column (stock level).

is there some way to view the level of "bloat that needs full" in each
table, so I could write a script that alerts me to the need of a
"vacuum full" without waiting for random queries to "get slow"?

looking at the results of the "bloat query", I still can't see how to
know whether bloat is getting bad in an objective manner.

        http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html

on the machines that perform well, 30MB of bloat seems to be fine, and
I don't know what the badly performing table's bloat was, since I
already vac'ed it.
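
One rough way to watch for this without the full bloat query is to
compare a table's on-disk size with the planner's estimates (a sketch,
not a definitive bloat metric; pg_relation_size and pg_size_pretty are
available in 8.1+):

```sql
-- A table whose on-disk pages far exceed what its live row count
-- should need is a bloat candidate.
SELECT relname,
       relpages,
       reltuples::bigint AS est_rows,
       pg_size_pretty(pg_relation_size(oid)) AS on_disk
FROM pg_class
WHERE relname = 'items';
```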

.......

there is one table I have with 2GB of bloat ... but its performance
(since all queries are on a clustered index) is more than adequate.
also, it's so big I'm afraid my server would be down for 24 hours on
that one vacuum.

it's a rolling "cookie table" with millions of random-id'ed entries
that expire after a few months ... I think I'm going to copy the most
recent 6 months' worth of rows to a new table, then just drop the old
one ... seems easier to me than the scary unknown of running "vacuum
full", and then I won't have to worry about the system being down on a
table lock.

Seems like "VACUUM FULL" could figure out how to do that too, depending
on the bloat-to-table-size ratio ...

   - copy all rows to new table
   - lock for a millisecond while renaming tables
   - drop old table.

Locking a whole table for a very long time is scary for admins.
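
A manual sketch of those steps (table and column names here are
illustrative, and note the caveat raised later in the thread: writes
arriving during the copy are lost unless the table is locked first):

```sql
BEGIN;
-- block writes (but not reads) while we copy
LOCK TABLE cookies IN SHARE MODE;
CREATE TABLE cookies_new AS
    SELECT * FROM cookies
    WHERE created > now() - interval '6 months';
-- recreate indexes and constraints on cookies_new here
ALTER TABLE cookies RENAME TO cookies_old;
ALTER TABLE cookies_new RENAME TO cookies;
COMMIT;
DROP TABLE cookies_old;
```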


- erik

From:
Robert Haas
Date:

On Thu, Jun 4, 2009 at 7:31 AM, Erik Aronesty <> wrote:
> Seems like "VACUUM FULL" could figure out to do that too depending on
> the bloat-to-table-size ratio ...
>
>   - copy all rows to new table
>   - lock for a millisecond while renaming tables
>   - drop old table.

You'd have to lock the table at least against write operations during
the copy; otherwise concurrent changes might be lost.

AIUI, this is pretty much what CLUSTER does, and I've heard that it
works as well as or better than VACUUM FULL for bloat reclamation.
However, it's apparently still pessimal:
http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php (I
had never heard this word before Greg Stark used it in this email, but
it's a great turn of phrase, so I'm reusing it.)

> Locking a whole table for a very long time is scary for admins.

Agreed.  It would be nice if we had some kind of "incremental full"
vacuum that would run for long enough to reclaim a certain number of
pages and then exit.  Then you could clean up this kind of problem
incrementally instead of in one shot.  It would be even nicer if the
lock strength could be reduced, but I'm guessing that's not easy to do
or someone would have already done it by now.  I haven't read the code
myself.

...Robert

From:
Scott Carey
Date:

On 6/4/09 4:31 AM, "Erik Aronesty" <> wrote:

>> read the entry on pg_stat_all_tables
>
> yeah, it's running ... vacuum'ed last night
>
> it's odd, to me, that the performance would degrade so extremely
> (noticeably) over the course of one year on a table which has few
> insertions, no deletions,and daily updates of an integer non null
> column (stock level).
>
> is there some way to view the level of "bloat that needs full" in each
> table, so i could write a script that alerts me to the need of a
> "vacuum full"  without waiting for random queries to "get slow"?
>
> looking at the results of the "bloat query", i still can't see how to
> know whether bloat is getting bad in an objective manner.
>
>         http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html
>
> on the machines that perform well 30MB of bloat seems to be fine, and
> i don't knwo what the badly performing table's bloat was, since i
> already vac'ed it.
>

Updates require space as well: MVCC requires that row versions remain
visible to all open transactions, so updates are not overwrites but
copies (a new version of the whole tuple is written, and the old one
is only reclaimed later by vacuum).

For heavily updated tables, adjusting the table (and maybe index)
fillfactor will help prevent bloat by reserving a constant fraction of
free space in each block for the new row versions that updates create.

See ALTER TABLE and CREATE TABLE  (and the Index variants).

ALTER TABLE foo SET (fillfactor=90);

This will leave, on average, 10% of every 8kB block empty, making it
more likely that an updated row's new version fits within the same
block.

Indexes have default fillfactor set to 90, I believe.
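
For completeness, the index-level setting looks like this (index name
illustrative; per-index fillfactor storage parameters appeared in 8.2):

```sql
-- leave 25% free space in the leaf pages of a heavily updated index
CREATE INDEX items_name_idx ON items (name) WITH (fillfactor = 75);
-- or adjust an existing index; takes effect on the next REINDEX
ALTER INDEX items_name_idx SET (fillfactor = 75);
```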


> .......
>
> there is one table i have with 2GB of bloat ... but it's performance
> (since all querys are on a clustered index) is more than adequate.
> also, it's so big i'm afraid my server would be down for 24 hours on
> that on vacuum
>
> it's a rolling "cookie table" with millions of random-id'ed entries
> that expire after a few months ... i think i'm going to copy the most
> recent 6 months worth of rows to a new table, then just drop the old
> one.....  seems easier to me.than the scary unknown of running "vaccum
> full", and then i won't have to worry about the system being down on a
> table lock.

Creating a new table as a select from the old and renaming, OR doing a
CLUSTER and REINDEX is almost always faster than VACUUM FULL for such large
tables.  But there are different implications on how long other queries are
locked out of access to the table.  CLUSTER will generally lock out other
queries for a long time, but the end result (especially combined with a
reasonable fillfactor setting) ends up best for long term performance and
reduction in bloat.
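
As a sketch (table and index names illustrative), the CLUSTER route
might look like this; note that CLUSTER rewrites the table's indexes as
part of the operation, so a separate REINDEX is usually unnecessary
afterwards:

```sql
-- 8.3+ syntax; on 8.2 and earlier it is: CLUSTER cookies_created_idx ON cookies;
CLUSTER cookies USING cookies_created_idx;
ANALYZE cookies;   -- refresh planner statistics after the rewrite
```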


> Seems like "VACUUM FULL" could figure out to do that too depending on
> the bloat-to-table-size ratio ...
>
>    - copy all rows to new table
>    - lock for a millisecond while renaming tables
>    - drop old table.
>
> Locking a whole table for a very long time is scary for admins.
>

You can do the above manually in a single transaction; however, any
updates or inserts made during the copy may be lost unless you lock
the table against writes first.


>
> - erik


From:
Scott Carey
Date:



On 6/4/09 6:16 AM, "Robert Haas" <> wrote:

> On Thu, Jun 4, 2009 at 7:31 AM, Erik Aronesty <> wrote:
>> Seems like "VACUUM FULL" could figure out to do that too depending on
>> the bloat-to-table-size ratio ...
>>
>>   - copy all rows to new table
>>   - lock for a millisecond while renaming tables
>>   - drop old table.
>
> You'd have to lock the table at least against write operations during
> the copy; otherwise concurrent changes might be lost.
>
> AIUI, this is pretty much what CLUSTER does, and I've heard that it
> works as well or better as VACUUM FULL for bloat reclamation.
> However, it's apparently still pessimal:
> http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php (I
> had never heard this word before Greg Stark used it in this email, but
> it's a great turn of phrase, so I'm reusing it.)
>

Interesting, I suppose a race between VACUUM FULL and CLUSTER will depend a
lot on the index and how much of the table already exists in RAM.

If the index is in RAM, and most of the table is, CLUSTER will be rather
fast.


>> Locking a whole table for a very long time is scary for admins.
>
> Agreed.  It would be nice if we had some kind of "incremental full"
> vacuum that would run for long enough to reclaim a certain number of
> pages and then exit.  Then you could clean up this kind of problem
> incrementally instead of in one shot.  It would be even nicer if the
> lock strength could be reduced, but I'm guessing that's not easy to do
> or someone would have already done it by now.  I haven't read the code
> myself.
>
> ...Robert


From:
Erik Aronesty
Date:

> See ALTER TABLE and CREATE TABLE  (and the Index variants).
>
> ALTER TABLE foo SET (fillfactor=90);

I'll try that.

> This will leave on average, 10% of every 8k block empty and allow updates to
> columns to more likely live within the same block.

Good for the items table.

Probably bad for the cookies table, with 6 million rows, and thousands
of inserts and deletes every day, but few updates.

Maybe I should have another way of doing it.  That table gets bloated
fast.  A vacuum full takes three and a half hours - which would be an
unacceptable amount of downtime if I didn't have working mirrors of
everything.

> Creating a new table as a select from the old and renaming, OR doing a
> CLUSTER and REINDEX is almost always faster than VACUUM FULL for such large
> tables.  But there are different implications on how long other queries are
> locked out of access to the table.  CLUSTER will generally lock out other
> queries for a long time, but the end result (especially combined with a
> reasonable fillfactor setting) ends up best for long term performance and
> reduction in bloat.

I'll try it on the other mirror server, which has the same specs and
size, see if CLUSTER/REINDEX is faster.

>>    - copy all rows to new table
>>    - lock for a millisecond while renaming tables
>>    - drop old table.
>>
>> Locking a whole table for a very long time is scary for admins.
>>
>
> You can do the above manually in a single transaction, however any updates
> or inserts during that time may be lost.

Postgres can have multiple row versions around for transactions, so
for a lockless vacuum full to work, some row versions would have to be
in the "new table".  I think that could be done at the expense of some
performance degradation, as you'd have to figure out which table to
look at (for reads: try the new one, and if nothing is there fall back
to the old one; for copies: if there's an update there, put the copy
"under" it) - some wacky logic like that.

I don't know postgres's internals well enough to do it for "all
cases", but I know my own DB well enough to get it to work for me:
have 2 tables with triggered timestamps, then juggle the queries that
hit the tables (check table a and table b, use the row with the newer
timestamp for reads, meanwhile a is copying to b but not overwriting
newer rows ... something like that).

Not sure whether I'd rather have a 7-hour performance-degraded
"table-copy" (which would reindex and recluster too) or a 3.5-hour
table-locked vacuum (which doesn't reindex or re-cluster).

From:
Robert Haas
Date:

On Thu, Jun 4, 2009 at 7:31 AM, Erik Aronesty<> wrote:
> is there some way to view the level of "bloat that needs full" in each
> table, so i could write a script that alerts me to the need of a
> "vacuum full"  without waiting for random queries to "get slow"?
>
> looking at the results of the "bloat query", i still can't see how to
> know whether bloat is getting bad in an objective manner.

One other thought on this...  I think the main thing to consider is
bloat as a percentage of table size.  When you go to sequential scan
the table, a table with as much bloat as data will take twice as long
to scan, one with twice as much bloat as data will take three times as
long to scan, and so on.

If you're only ever doing index scans, the effect will be less
noticeable, but in round figures comparing the amount of bloat to the
amount of data is a good place to start.  I usually find 3x is about
where the pain starts to hit.  Also, small tables can sometimes
tolerate a higher percentage of bloat than large ones, because those
table scans tend to be fast anyway.
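
One hedged way to script that ratio check (the n_live_tup/n_dead_tup
columns are 8.3+; this is a heuristic, not the full bloat query, and
note that already-vacuumed-but-unreclaimed free space won't show up as
dead tuples):

```sql
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup * 100.0 / nullif(n_live_tup + n_dead_tup, 0), 1)
           AS pct_dead
FROM pg_stat_user_tables
ORDER BY pct_dead DESC NULLS LAST;
```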

A lot of times bloat happens at one particular time and just never
goes away.  Leaving an open transaction around for an hour or two can
bloat all of your tables, and they'll never get de-bloated on their
own without help.  It would be nice if VACUUM had even a little bit of
capability for incrementally improving this situation, but currently
it doesn't.  So when you mention running for a year, it's not unlikely
that you had one bad day (or several days in a row) when you collected
all of that bloat, rather than accumulating it gradually over time.

...Robert