Thread: new to postgres (and db management) and performance already a problem :-(

Hi,
We have a horribly designed postgres 8.1.0 database (not my fault!). I
am pretty new to database design and management and have really no idea
how to diagnose performance problems. The db has only 25-30 tables, and
half of them are only there because our codebase needs them (long story,
again not my fault!). Basically we have 10 tables that are being
accessed, and only a couple of queries that join more than 3 tables.
Most of the action takes place on two tables. One of the devs has done
some truly atrocious coding and is using the db as his data access
mechanism (instead of an in-memory array, and he only needs an
array/collection).
It is running on a P4 3000ish (desktop model) running an early Linux 2.6
(Mandrake 10.1) with 512 MB of RAM, so the hardware shouldn't be an issue,
as we are talking about only 20000 inserts a day. It probably gets queried
about 20000 times a day too (all VB6 via the pg ODBC driver).
So... seeing as I didn't really do any investigation as to setting
default sizes for storage and the like - I am wondering whether our
performance problems (a programme running 1.5x slower than two weeks
ago) might not be coming from the db (or rather, my maintaining of it).
I have turned on stats, so as to allow autovacuuming, but have no idea
whether that could be related. Is it better to schedule a cron job to do
it x times a day? I just left all the default values in postgres.conf...
could I do some tweaking?
Does anyone know of any practical resources that might guide me in
sorting out these sorts of problems? Something with practical examples
would be good so I could compare with what we have.
Thanks
Antoine
ps. I had a look with top and it didn't look like it was going much over
15% CPU, with memory usage negligible. There are usually about 10 open
connections. I couldn't find an easy way to check for disk access.
pps. The db is just one possible reason for our bottleneck so if you
tell me it is very unlikely I will be most reassured!

Re: new to postgres (and db management) and performance already a problem :-(

From: Andrew Sullivan
On Mon, Jan 16, 2006 at 11:07:52PM +0100, Antoine wrote:

> performance problems (a programme running 1.5x slower than two weeks
> ago) might not be coming from the db (or rather, my maintaining of it).
> I have turned on stats, so as to allow autovacuuming, but have no idea
> whether that could be related. Is it better to schedule a cron job to do
> it x times a day? I just left all the default values in postgres.conf...
> could I do some tweaking?

The first thing you need to do is find out where your problem is.
Are queries running slowly?  You need to do some EXPLAIN ANALYSE
queries to understand that.
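As a sketch (the table and column names here are invented, purely for illustration):

```sql
-- Run the slow query through EXPLAIN ANALYSE and compare the planner's
-- row estimates against the actual counts; big mismatches, or sequential
-- scans on large tables you expected to be indexed, are the usual suspects.
EXPLAIN ANALYSE
SELECT o.id, o.placed_at
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.region = 'EU';
```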

A


--
Andrew Sullivan  | ajs@crankycanuck.ca
The whole tendency of modern prose is away from concreteness.
        --George Orwell

Antoine <melser.anton@gmail.com> writes:
> So... seeing as I didn't really do any investigation as to setting
> default sizes for storage and the like - I am wondering whether our
> performance problems (a programme running 1.5x slower than two weeks
> ago) might not be coming from the db (or rather, my maintaining of it).

That does sound like a lack-of-vacuuming problem.  If the performance
goes back where it was after VACUUM FULL, then you can be pretty sure
of it.  Note that autovacuum is not designed to fix this for you: it
only ever issues regular vacuum not vacuum full.

> I couldn't find an easy way to check for disk accessings.

Watch the output of "vmstat 1" or "iostat 1" for info about that.
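Something like this (column meanings from memory; check your vmstat's header line):

```shell
# one sample per second; Ctrl-C to stop
vmstat 1   # "bi"/"bo" = blocks in/out per second, "wa" = CPU time waiting on I/O
iostat 1   # per-device transfer rates, if the sysstat package is installed
```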

            regards, tom lane

> That does sound like a lack-of-vacuuming problem.  If the performance
> goes back where it was after VACUUM FULL, then you can be pretty sure
> of it.  Note that autovacuum is not designed to fix this for you: it
> only ever issues regular vacuum not vacuum full.

in our db system (for a website), i notice performance boosts after a vacuum
full. but then, a VACUUM FULL takes 50min+ during which the db is not really
accessible to web-users. is there another way to perform maintenance tasks
AND leave the db fully operable and accessible?

thanks,
thomas



<me@alternize.com> writes:
> in our db system (for a website), i notice performance boosts after a vacuum
> full. but then, a VACUUM FULL takes 50min+ during which the db is not really
> accessible to web-users. is there another way to perform maintenance tasks
> AND leave the db fully operable and accessible?

You're not doing regular vacuums often enough.

            regards, tom lane

>> in our db system (for a website), i notice performance boosts after a
>> vacuum full. but then, a VACUUM FULL takes 50min+ during which the db is
>> not really accessible to web-users. is there another way to perform
>> maintenance tasks AND leave the db fully operable and accessible?
>
> You're not doing regular vacuums often enough.

well, shouldn't autovacuum take care of "regular" vacuums? in addition to
autovacuum, tables with data changes are vacuumed and reindexed once a day -
still performance seems to degrade slowly until a vacuum full is
initiated... could an additional daily vacuum over the entire db (even on
tables that only get data added, never changed or removed) help?

- thomas



Re: new to postgres (and db management) and performance already a problem :-(

From: Christopher Browne
>>> in our db system (for a website), i notice performance boosts after a
>>> vacuum full. but then, a VACUUM FULL takes 50min+ during which the db
>>> is not really accessible to web-users. is there another way to perform
>>> maintenance tasks AND leave the db fully operable and accessible?
>>
>> You're not doing regular vacuums often enough.

By the way, you can get that VACUUM FULL to be "less injurious" if you
collect a list of tables:
pubs=# select table_schema, table_name from information_schema.tables
where table_type = 'BASE TABLE';

And then VACUUM FULL table by table.  It'll take the same 50 minutes;
it'll be more sporadically "unusable" which may turn out better.  But
that's just one step better; you want more steps :-).
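A rough sketch of that from the shell (the database name and the exclusion of system schemas are placeholders/assumptions):

```shell
# Hypothetical sketch: VACUUM FULL one table at a time, so each
# exclusive lock is short instead of one 50-minute outage.
psql -d pubs -At -c "SELECT table_schema || '.' || table_name
                       FROM information_schema.tables
                      WHERE table_type = 'BASE TABLE'
                        AND table_schema NOT IN ('pg_catalog','information_schema');" |
while read tbl; do
    psql -d pubs -c "VACUUM FULL $tbl;"
done
```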

> well, shouldn't autovacuum take care of "regular" vacuums? in addition
> to autovacuum, tables with data changes are vacuumed and reindexed
> once a day -
> still performance seems to degrade slowly until a vacuum full is
> initiated... could an additional daily vacuum over the entire db (even
> on tables that only get data added, never changed or removed) help?

Tables which never see updates/deletes don't need to get vacuumed very
often.  They should only need to get a periodic ANALYZE so that the
query optimizer gets the right stats.

There are probably many tables where pg_autovacuum is doing a fine
job.  What you need to do is to figure out which tables *aren't*
getting maintained well enough, and see about doing something special
to them.

What you may want to do is to go table by table and, for each one, do
two things:

1) VACUUM VERBOSE, which will report some information about how much
dead space there is on the table.

2) Contrib function pgstattuple(), which reports more detailed info
about space usage (alas, for just the table).
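For instance (the table name is just a placeholder, and pgstattuple must be installed from contrib first):

```sql
-- 1) dead-row report in the psql output:
VACUUM VERBOSE mytable;

-- 2) more detail from contrib's pgstattuple;
--    dead_tuple_percent is the number to watch:
SELECT * FROM pgstattuple('mytable');
```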

You'll find, between these, that there are some tables that have a LOT
of dead space.  At that point, there may be three answers:

a) PG 8.1 pg_autovacuum allows you to modify how often specific tables
are vacuumed; upping the numbers for the offending tables may clear
things up

b) Schedule cron jobs to periodically (hourly?  several times per
hour?) VACUUM the "offending" tables

c) You may decide to fall back to VACUUM FULL; if you do so just for a
small set of tables, the "time of pain" won't be the 50 minutes you're
living with now...

Try a), b), and c) in order on the "offending" tables as they address
the problem at increasing cost...
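To make a) and b) concrete (the database and table names and all the threshold numbers are placeholders; the pg_autovacuum column order is from memory, so double-check it against the 8.1 docs):

```shell
# a) per-table autovacuum tuning via the 8.1 pg_autovacuum catalog:
#    (vacrelid, enabled, vac_base_thresh, vac_scale_factor,
#     anl_base_thresh, anl_scale_factor, vac_cost_delay, vac_cost_limit)
psql -d mydb -c "INSERT INTO pg_autovacuum VALUES
    ('public.hot_table'::regclass, true, 200, 0.1, 100, 0.05, -1, -1);"

# b) or a crontab entry that plain-VACUUMs the offending table hourly:
#    0 * * * *  psql -d mydb -c 'VACUUM ANALYZE public.hot_table;'
```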
--
(reverse (concatenate 'string "moc.liamg" "@" "enworbbc"))
http://linuxdatabases.info/info/x.html
"Listen,  strange women, lyin'  in ponds,  distributin' swords,  is no
basis  for a  system of  government. Supreme  executive  power derives
itself from a mandate from  the masses, not from some farcical aquatic
ceremony."  -- Monty Python and the Holy Grail

> Try a), b), and c) in order on the "offending" tables as they address
> the problem at increasing cost...

thanks a lot for the detailed information! the entire concept of vacuum isn't
yet that clear to me, so your explanations and hints are very much
appreciated. i'll definitely try these steps this weekend, when the next full
vacuum is scheduled :-)

best regards,
thomas





On 17/01/06, me@alternize.com <me@alternize.com> wrote:
> > Try a), b), and c) in order on the "offending" tables as they address
> > the problem at increasing cost...
>
> thanks a lot for the detailed information! the entire concept of vacuum
> isn't yet that clear to me, so your explanations and hints are very much
> appreciated. i'll definitely try these steps this weekend, when the next
> full vacuum is scheduled :-)

Thanks guys, that pretty much answered my question(s) too. I have a sneaking suspicion that vacuuming won't do too much for us however... now that I think about it - we do very little removing, pretty much only inserts and selects. I will give it a vacuum full and see what happens.
Cheers
Antoine




--
This is where I should put some witty comment.

Re: new to postgres (and db management) and performance already a problem :-(

From: Andrew Sullivan
On Tue, Jan 17, 2006 at 09:14:27AM +0100, Antoine wrote:
> think about it - we do very little removing, pretty much only inserts and
> selects. I will give it a vacuum full and see what happens.

UPDATES?  Remember that, in Postgres, UPDATE is effectively DELETE +
INSERT (from the point of view of storage, not the point of view of
the user).
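A tiny illustration (made-up table):

```sql
-- Every UPDATE leaves the old row version behind as a dead tuple,
-- which only VACUUM can reclaim:
CREATE TABLE counters (id int PRIMARY KEY, n int);
INSERT INTO counters VALUES (1, 0);
UPDATE counters SET n = n + 1 WHERE id = 1;  -- one dead row version
UPDATE counters SET n = n + 1 WHERE id = 1;  -- and another
VACUUM VERBOSE counters;  -- reports the removable row versions
```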

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The plural of anecdote is not data.
        --Roger Brinner

Re: new to postgres (and db management) and performance

From: Markus Schaber
Hi, Tom,

Tom Lane wrote:

>>in our db system (for a website), i notice performance boosts after a vacuum
>>full. but then, a VACUUM FULL takes 50min+ during which the db is not really
>>accessible to web-users. is there another way to perform maintenance tasks
>>AND leave the db fully operable and accessible?
>
> You're not doing regular vacuums often enough.

It may also help to increase the max_fsm_pages setting, so postmaster
has more memory to remember freed pages between VACUUMs.
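That's a postgresql.conf setting (the numbers below are only placeholders; the last lines of a database-wide VACUUM VERBOSE suggest how many pages and relations the free space map actually needs to track):

```
# postgresql.conf -- changing these requires a postmaster restart
max_fsm_pages     = 200000   # free pages tracked across the whole cluster
max_fsm_relations = 1000     # relations tracked in the free space map
```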

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: new to postgres (and db management) and performance

From: Markus Schaber
Hi, Thomas,

me@alternize.com wrote:
>> Try a), b), and c) in order on the "offending" tables as they address
>> the problem at increasing cost...
>
> thanks alot for the detailed information! the entire concept of vacuum
> isn't yet that clear to me, so your explanations and hints are very much
> appreciated. i'll defenitely try these steps this weekend when the next
> full vacuum was scheduled :-)

Basically, VACUUM scans the whole table and looks for pages containing
garbage rows (or row versions), deletes the garbage, and adds those
pages to the free space map (if there are free slots). When allocating
new rows / row versions, PostgreSQL first tries to fit them into pages
from the free space map before allocating new pages. This is why a high
max_fsm_pages setting can help when VACUUM frequency is low.

VACUUM FULL additionally moves rows between pages, trying to concentrate
all the free space at the end of the tables (aka "defragmentation"), so
it can then truncate the files and release the space to the filesystem.

CLUSTER basically rebuilds the table by copying all rows into a new
table in index order and then dropping the old one, which also reduces
fragmentation, though not as thoroughly as VACUUM FULL might.

ANALYZE creates statistics about the distribution of values in a column,
allowing the query optimizer to estimate the selectivity of query criteria.

(This explanation is rather simplified, and ignores indices as well as
the fact that a table can consist of multiple files. Also, I believe
that newer PostgreSQL versions allow VACUUM to truncate files when free
pages happen to appear at the very end of the file.)
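In command form, the four operations described above (table and index names are placeholders):

```sql
VACUUM mytable;                    -- reclaim dead rows into the free space map
VACUUM FULL mytable;               -- also compact pages and return space to the OS
CLUSTER mytable_pkey ON mytable;   -- rewrite the table in index order
ANALYZE mytable;                   -- refresh the planner's column statistics
```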


HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org