Thread: database bloat, but vacuums are done, and fsm seems to be setup ok

database bloat, but vacuums are done, and fsm seems to be setup ok

From
hubert depesz lubaczewski
Date:
hi
setup:
postgresql 8.0.3 put on debian on dual xeon, 8GB ram, hardware raid.

database just after recreation from dump takes 15gigabytes.
after some time (up to 3 weeks) it gets really slow and has to be dump'ed and restored.

as for fsm:
end of vacuum info:
INFO:  free space map: 248 relations, 1359140 pages stored; 1361856 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 10000000 pages = 58659 kB shared memory.

so it looks i have plenty of space in fsm.

vacuums run constantly.
4 different tasks, 3 of them doing:
while true
vacuum table
sleep 15m
done
with different tables (i have chooses the most updated tables in system).

and the fourth vacuum task does the same, but without specifying table - so it vacuumes whole database.

after last dump/restore cycle i noticed that doing reindex on all indices in database made it drop in side from 40G to about 20G - so it might be that i will be using reindex instead of drop/restore.
anyway - i'm not using any special indices - just some (117 to be exact) indices of btree type. we use simple, multi-column, partial and multi-column partial indices. we do not have functional indices.

database has quite huge load of updates, but i thought that vacum will guard me from database bloat, but from what i observed it means that vacuuming of b-tree indices is somewhat faulty.

any suggestions? what else can i supply you with to help you help me?

best regards

depesz

Re: [GENERAL] database bloat, but vacuums are done, and fsm seems to be setup ok

From
"Jim C. Nasby"
Date:
Looks like it's definately an issue with index bloat. Note that it's
normal to have some amount of empty space depending on vacuum and update
frequency, so 15G -> 20G isn't terribly surprising. I would suggest
using pg_autovacuum instead of the continuous vacuum; it's very possible
that some of your tables need more frequent vacuuming than they're
getting now. If you go this route, you might want to change the default
settings a bit to make pg_autovacuum more agressive.

Also, I'd suggest posting to -hackers about the index bloat. Would you
be able to make a filesystem copy (ie: tar -cjf database.tar.bz2
$PGDATA) available? It might also be useful to keep an eye on index size
in pg_class.relpages and see exactly what indexes are bloating.

On Wed, Sep 28, 2005 at 09:07:07AM +0200, hubert depesz lubaczewski wrote:
> hi
> setup:
> postgresql 8.0.3 put on debian on dual xeon, 8GB ram, hardware raid.
>
> database just after recreation from dump takes 15gigabytes.
> after some time (up to 3 weeks) it gets really slow and has to be dump'ed
> and restored.
>
> as for fsm:
> end of vacuum info:
> INFO: free space map: 248 relations, 1359140 pages stored; 1361856 total
> pages needed
> DETAIL: Allocated FSM size: 1000 relations + 10000000 pages = 58659 kB
> shared memory.
>
> so it looks i have plenty of space in fsm.
>
> vacuums run constantly.
> 4 different tasks, 3 of them doing:
> while true
> vacuum table
> sleep 15m
> done
> with different tables (i have chooses the most updated tables in system).
>
> and the fourth vacuum task does the same, but without specifying table - so
> it vacuumes whole database.
>
> after last dump/restore cycle i noticed that doing reindex on all indices in
> database made it drop in side from 40G to about 20G - so it might be that i
> will be using reindex instead of drop/restore.
> anyway - i'm not using any special indices - just some (117 to be exact)
> indices of btree type. we use simple, multi-column, partial and multi-column
> partial indices. we do not have functional indices.
>
> database has quite huge load of updates, but i thought that vacum will guard
> me from database bloat, but from what i observed it means that vacuuming of
> b-tree indices is somewhat faulty.
>
> any suggestions? what else can i supply you with to help you help me?
>
> best regards
>
> depesz

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: [GENERAL] database bloat, but vacuums are done, and fsm seems to be setup ok

From
hubert depesz lubaczewski
Date:
On 9/30/05, Jim C. Nasby <jnasby@pervasive.com> wrote:
Looks like it's definately an issue with index bloat. Note that it's
normal to have some amount of empty space depending on vacuum and update
frequency, so 15G -> 20G isn't terribly surprising. I would suggest
using pg_autovacuum instead of the continuous vacuum; it's very possible
that some of your tables need more frequent vacuuming than they're
getting now. If you go this route, you might want to change the default
settings a bit to make pg_autovacuum more agressive.


actually i have a very bad experience with autovaccum - of course it is because i dont know how to setup it correctly, but for me it's just easier to setup continuos vacuums. and i know which tables are frequently updated, so i setup additional vacuums on them.
 

Also, I'd suggest posting to -hackers about the index bloat. Would you
be able to make a filesystem copy (ie: tar -cjf database.tar.bz2
$PGDATA) available? It might also be useful to keep an eye on index size
in pg_class.relpages and see exactly what indexes are bloating.


i'm watching it right now (which indices are bloating), but i cannot send copy of pgdata - it contains very sensitive information.
 
depesz

Re: database bloat, but vacuums are done, and fsm seems

From
Simon Riggs
Date:
On Wed, 2005-09-28 at 09:07 +0200, hubert depesz lubaczewski wrote:
> database has quite huge load of updates, but i thought that vacum will
> guard me from database bloat, but from what i observed it means that
> vacuuming of b-tree indices is somewhat faulty.

No, thats perfectly normal.

Indices are packed tighter when they are first created and they spread
out a bit as you update the database. Blocks start at 90% full and end
up at 50% full for non-monotonic indexes (e.g. SERIAL) or 67% for
monotonic.

It's a long debated design feature on any DBMS that uses b-trees.

REINDEX or dump/restore should be identical.

Best Regards, Simon Riggs