Thread: [Again] Postgres performance problem

[Again] Postgres performance problem

From
Ruben Rubio
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Hi,

I having the same problem I told here a few weeks before. Database is
using too much resources again.

I do a vacumm full each day, but seems it is not working. I am preparing
an update to postgres 8.2.4 (actually I am using at 8.1.3, and tests for
update will need several days)

Last time I had this problem i solved it stopping website,  restarting
database, vacuumm it, run again website. But I guess this is going to
happen again.

I would like to detect and solve the problem. Any ideas to detect it?

Thanks in advance,



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG5jbLIo1XmbAXRboRArcpAJ0YvoCT6KWv2fafVAtapu6nwFmKoACcD0uA
zFTx9Wq+2NSxijIf/R8E5f8=
=u0k5
-----END PGP SIGNATURE-----

Attachment

Re: [Again] Postgres performance problem

From
db@zigo.dhs.org
Date:
> Last time I had this problem i solved it stopping website,  restarting
> database, vacuumm it, run again website. But I guess this is going to
> happen again.
>
> I would like to detect and solve the problem. Any ideas to detect it?

Do you have very long transactions? Maybe some client that is connected
all the time that is idle in transaction?

/Dennis


Re: [Again] Postgres performance problem

From
Ruben Rubio
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

db@zigo.dhs.org escribió:
>> Last time I had this problem i solved it stopping website,  restarting
>> database, vacuumm it, run again website. But I guess this is going to
>> happen again.
>>
>> I would like to detect and solve the problem. Any ideas to detect it?
>
> Do you have very long transactions? Maybe some client that is connected
> all the time that is idle in transaction?

There should not be long transactions. I ll keep an eye on Idle transactions

I m detecting it using:

echo 'SELECT current_query  FROM pg_stat_activity;' |
/usr/local/pgsql/bin/psql vacadb  | grep IDLE | wc -l



>
> /Dennis
>
>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG5kiRIo1XmbAXRboRAj3sAKCH21zIhvdvPcmVQG71owiCye96xwCcDPe0
o/aArJF0JjUnTIFd1sMYD+Y=
=6zyY
-----END PGP SIGNATURE-----

Attachment

Re: [Again] Postgres performance problem

From
Decibel!
Date:
On Tue, Sep 11, 2007 at 09:49:37AM +0200, Ruben Rubio wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> db@zigo.dhs.org escribi?:
> >> Last time I had this problem i solved it stopping website,  restarting
> >> database, vacuumm it, run again website. But I guess this is going to
> >> happen again.
> >>
> >> I would like to detect and solve the problem. Any ideas to detect it?
> >
> > Do you have very long transactions? Maybe some client that is connected
> > all the time that is idle in transaction?
>
> There should not be long transactions. I ll keep an eye on Idle transactions
>
> I m detecting it using:
>
> echo 'SELECT current_query  FROM pg_stat_activity;' |
> /usr/local/pgsql/bin/psql vacadb  | grep IDLE | wc -l

If you're using VACUUM FULL, you're doing something wrong. :) Run lazy
vacuum frequently enough (better yet, autovacuum, but cut all of 8.1's
autovac parameters in half), and make sure your FSM is big enough
(periodic vacuumdb -av | tail is an easy way to check that).

Try a REINDEX. VACUUM FULL is especially hard on the indexes, and it's
easy for them to seriously bloat.
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Attachment

Re: [Again] Postgres performance problem

From
ruben@rentalia.com
Date:

Decibel! escribió:
> On Tue, Sep 11, 2007 at 09:49:37AM +0200, Ruben Rubio wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> db@zigo.dhs.org escribi?:
>>>> Last time I had this problem i solved it stopping website,  restarting
>>>> database, vacuumm it, run again website. But I guess this is going to
>>>> happen again.
>>>>
>>>> I would like to detect and solve the problem. Any ideas to detect it?
>>> Do you have very long transactions? Maybe some client that is connected
>>> all the time that is idle in transaction?
>> There should not be long transactions. I ll keep an eye on Idle transactions
>>
>> I m detecting it using:
>>
>> echo 'SELECT current_query  FROM pg_stat_activity;' |
>> /usr/local/pgsql/bin/psql vacadb  | grep IDLE | wc -l
>
> If you're using VACUUM FULL, you're doing something wrong. :)

I do a VACUUM FULL VERBOSE ANALYZE each day. I save all logs so I can
check if vacuum is done properly.(it is)

Run lazy
> vacuum frequently enough (better yet, autovacuum, but cut all of 8.1's
> autovac parameters in half), and make sure your FSM is big enough

I checked that there is no warnings about FSM in logs. (also in logs
from vacuum). Is it reliable?

What do u mean for "cut all of 8.1's autovac parameters in half" Maybe
default autovac parameters?

> (periodic vacuumdb -av | tail is an easy way to check that).

I ll keep an eye on it.

>
> Try a REINDEX. VACUUM FULL is especially hard on the indexes, and it's
> easy for them to seriously bloat.

Reindex is  done everyday after VACUUM FULL VERBOSE ANALYZE. I save also
the output averyday and save it into a log, and I can check that it is
done properly.



Re: [Again] Postgres performance problem

From
"Scott Marlowe"
Date:
On 9/12/07, ruben@rentalia.com <ruben@rentalia.com> wrote:
>
> Decibel! escribió:
> > On Tue, Sep 11, 2007 at 09:49:37AM +0200, Ruben Rubio wrote:
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA1
> >>
> >> db@zigo.dhs.org escribi?:
> >>>> Last time I had this problem i solved it stopping website,  restarting
> >>>> database, vacuumm it, run again website. But I guess this is going to
> >>>> happen again.
> >>>>
> >>>> I would like to detect and solve the problem. Any ideas to detect it?
> >>> Do you have very long transactions? Maybe some client that is connected
> >>> all the time that is idle in transaction?
> >> There should not be long transactions. I ll keep an eye on Idle transactions
> >>
> >> I m detecting it using:
> >>
> >> echo 'SELECT current_query  FROM pg_stat_activity;' |
> >> /usr/local/pgsql/bin/psql vacadb  | grep IDLE | wc -l
> >
> > If you're using VACUUM FULL, you're doing something wrong. :)
>
> I do a VACUUM FULL VERBOSE ANALYZE each day. I save all logs so I can
> check if vacuum is done properly.(it is)

Then, like Jim said, you're doing it wrong.  Regular vacuum full is
like rebuiling a piece of equipment every night when all it needs is
the filter changed.

> Run lazy
> > vacuum frequently enough (better yet, autovacuum, but cut all of 8.1's
> > autovac parameters in half), and make sure your FSM is big enough
>
> I checked that there is no warnings about FSM in logs. (also in logs
> from vacuum). Is it reliable?
>
> What do u mean for "cut all of 8.1's autovac parameters in half" Maybe
> default autovac parameters?

Yep.  ( I assume)

> > (periodic vacuumdb -av | tail is an easy way to check that).
>
> I ll keep an eye on it.
>
> >
> > Try a REINDEX. VACUUM FULL is especially hard on the indexes, and it's
> > easy for them to seriously bloat.
>
> Reindex is  done everyday after VACUUM FULL VERBOSE ANALYZE. I save also
> the output averyday and save it into a log, and I can check that it is
> done properly.

Then you're vacuum full is wasted.  A reindex accomplishes the same
thing, plus shrinks indexes (vacuum full can bloat indexes).

Just run regular vacuums, preferably by autovacuum, and keep an eye on
the vacuum analyze you run each night to see if your fsm is big
enough.

Occasionally vacuum full is absolutely the right answer.  Most the
time it's not.

I'm getting more and more motivated to rewrite the vacuum docs.  I
think a rewrite from the ground up might be best...  I keep seeing
people doing vacuum full on this list and I'm thinking it's as much
because of the way the docs represent vacuum full as anything.  Is
that true for you?

Re: [Again] Postgres performance problem

From
Brian Hurt
Date:
Scott Marlowe wrote:

>I'm getting more and more motivated to rewrite the vacuum docs.  I
>think a rewrite from the ground up might be best...  I keep seeing
>people doing vacuum full on this list and I'm thinking it's as much
>because of the way the docs represent vacuum full as anything.  Is
>that true for you?
>
>
>

It's true for me.

I turned off autovacuum as I was getting occassional hangs, which I
thought were the result of vacuums (and have signifigantly decreased
since I did that), and went nightly vacuum fulls, and vacuum
full/reindex/cluster on the weekends (which I now realize is redundant).

Brian


Re: [Again] Postgres performance problem

From
"Mikko Partio"
Date:


On 9/12/07, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On 9/12/07, ruben@rentalia.com <ruben@rentalia.com> wrote:
 

> > Try a REINDEX. VACUUM FULL is especially hard on the indexes, and it's
> > easy for them to seriously bloat.
>
> Reindex is  done everyday after VACUUM FULL VERBOSE ANALYZE. I save also
> the output averyday and save it into a log, and I can check that it is
> done properly.

Then you're vacuum full is wasted.  A reindex accomplishes the same
thing, plus shrinks indexes (vacuum full can bloat indexes).

Aren't you mixing up REINDEX and CLUSTER?

Regards

MP


Re: [Again] Postgres performance problem

From
"Scott Marlowe"
Date:
On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote:
>
>
> On 9/12/07, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> > On 9/12/07, ruben@rentalia.com <ruben@rentalia.com> wrote:
>
> > > > Try a REINDEX. VACUUM FULL is especially hard on the indexes, and it's
> > > > easy for them to seriously bloat.
> > >
> > > Reindex is  done everyday after VACUUM FULL VERBOSE ANALYZE. I save also
> > > the output averyday and save it into a log, and I can check that it is
> > > done properly.
> >
> > Then you're vacuum full is wasted.  A reindex accomplishes the same
> > thing, plus shrinks indexes (vacuum full can bloat indexes).
>
> Aren't you mixing up REINDEX and CLUSTER?

I don't think so.  reindex (which runs on tables and indexes, so the
name is a bit confusing, I admit) basically was originally a "repair"
operation that rewrote the whole relation and wasn't completely
transaction safe (way back, 7.2 days or so I think).  Due to the
issues with vacuum full bloating indexes, and being slowly replaced by
regular vacuum, reindex received some attention to make it transaction
/ crash safe and has kind of take the place of vacuum full in terms of
"how to fix bloated objects".

cluster, otoh, rewrites the table into index order.

Either one does what a vacuum full did / does, but generally does it better.

Re: [Again] Postgres performance problem

From
Frank Schoep
Date:
On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote:
> On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote:
>> …
>> Aren't you mixing up REINDEX and CLUSTER?
>
> …
> Either one does what a vacuum full did / does, but generally does
> it better.

On topic of REINDEX / VACUUM FULL versus a CLUSTER / VACUUM ANALYZE
I'd like to ask if CLUSTER is safe to run on a table that is in
active use.

After updating my maintenance scripts from a VACUUM FULL (add me to
the list) to CLUSTER (which improves performance a lot) I noticed I
was getting "could not open relation …" errors in the log while the
scripts ran so I reverted the change. This was on 8.1.9.

Am I hitting a corner case or is it generally not a good idea to
CLUSTER tables which are being queried? I haven't had problems with
the REINDEX / VACUUM FULL combination while CLUSTER / VACUUM ANALYZE
resulted in errors on the first run.

Can the "could not open relation…" error bring down the whole
database server? I'm really interested in using CLUSTER regularly as
it speeds up my system by a factor of two because of more efficient I/O.

Sincerely,

Frank

Re: [Again] Postgres performance problem

From
"Scott Marlowe"
Date:
On 9/12/07, Frank Schoep <frank@ffnn.nl> wrote:
> On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote:
> > On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote:
> >> …
> >> Aren't you mixing up REINDEX and CLUSTER?
> >
> > …
> > Either one does what a vacuum full did / does, but generally does
> > it better.
>
> On topic of REINDEX / VACUUM FULL versus a CLUSTER / VACUUM ANALYZE
> I'd like to ask if CLUSTER is safe to run on a table that is in
> active use.
>
> After updating my maintenance scripts from a VACUUM FULL (add me to
> the list) to CLUSTER (which improves performance a lot) I noticed I
> was getting "could not open relation …" errors in the log while the
> scripts ran so I reverted the change. This was on 8.1.9.
>
> Am I hitting a corner case or is it generally not a good idea to
> CLUSTER tables which are being queried? I haven't had problems with
> the REINDEX / VACUUM FULL combination while CLUSTER / VACUUM ANALYZE
> resulted in errors on the first run.
>
> Can the "could not open relation…" error bring down the whole
> database server? I'm really interested in using CLUSTER regularly as
> it speeds up my system by a factor of two because of more efficient I/O.

No, it won't bring it down.  Basically the query lost the relation is
was operating against because it disappeared when the cluster command
runs.

Re: [Again] Postgres performance problem

From
Alvaro Herrera
Date:
Scott Marlowe escribió:

> > Aren't you mixing up REINDEX and CLUSTER?
>
> I don't think so.  reindex (which runs on tables and indexes, so the
> name is a bit confusing, I admit) basically was originally a "repair"
> operation that rewrote the whole relation and wasn't completely
> transaction safe (way back, 7.2 days or so I think).  Due to the
> issues with vacuum full bloating indexes, and being slowly replaced by
> regular vacuum, reindex received some attention to make it transaction
> / crash safe and has kind of take the place of vacuum full in terms of
> "how to fix bloated objects".

Hmm, REINDEX does not rewrite tables.  If there are dead tuples, they
will still be there after REINDEX.


> cluster, otoh, rewrites the table into index order.

... excluding dead tuples, and then rewrites all the indexes.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: [Again] Postgres performance problem

From
Erik Jones
Date:
On Sep 12, 2007, at 2:19 PM, Frank Schoep wrote:

> On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote:
>> On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote:
>>> …
>>> Aren't you mixing up REINDEX and CLUSTER?
>>
>> …
>> Either one does what a vacuum full did / does, but generally does
>> it better.
>
> On topic of REINDEX / VACUUM FULL versus a CLUSTER / VACUUM ANALYZE
> I'd like to ask if CLUSTER is safe to run on a table that is in
> active use.
>
> After updating my maintenance scripts from a VACUUM FULL (add me to
> the list) to CLUSTER (which improves performance a lot) I noticed I
> was getting "could not open relation …" errors in the log while the
> scripts ran so I reverted the change. This was on 8.1.9.

You'd probably see the same behavior on 8.2.x.  CLUSTER is not
transactionally safe so you don't want to run CLUSTER on tables that
are actively being used.  I believe that's been fixed for 8.3.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: [Again] Postgres performance problem

From
Greg Smith
Date:
On Wed, 12 Sep 2007, Scott Marlowe wrote:

> I'm getting more and more motivated to rewrite the vacuum docs.  I think
> a rewrite from the ground up might be best...  I keep seeing people
> doing vacuum full on this list and I'm thinking it's as much because of
> the way the docs represent vacuum full as anything.

I agree you shouldn't start thinking in terms of how to fix the existing
documentation.  I'd suggest instead writing a tutorial leading someone
through what they need to know about their tables first and then going
into how vacuum works based on that data.

As an example, people throw around terms like "index bloat" and "dead
tuples" when talking about vacuuming.  The tutorial I'd like to see
somebody write would start by explaining those terms and showing how to
measure them--preferably with a good and bad example to contrast.  The way
these terms are thrown around right now, I don't expect newcomers to
understand either the documentation or the advice people are giving them;
I think it's shooting over their heads and what's needed are some
walkthroughs.  Another example I'd like to see thrown in there is what it
looks like when you don't have enough FSM slots.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: [Again] Postgres performance problem

From
El-Lotso
Date:
On Thu, 2007-09-13 at 01:58 -0400, Greg Smith wrote:
> On Wed, 12 Sep 2007, Scott Marlowe wrote:
>
> > I'm getting more and more motivated to rewrite the vacuum docs.  I think
> > a rewrite from the ground up might be best...  I keep seeing people
> > doing vacuum full on this list and I'm thinking it's as much because of
> > the way the docs represent vacuum full as anything.
>
> I agree you shouldn't start thinking in terms of how to fix the existing
> documentation.  I'd suggest instead writing a tutorial leading someone
> through what they need to know about their tables first and then going
> into how vacuum works based on that data.

I'm new to PG and it's true that I am confused.
As it stands this is a newbie's understanding of the various terms.

cluster -> rewrites a table according to index order so that IO is
ordered/sequential
reindex -> basically, rewrites the indexes adding new records/fixes up
old deleted records
vacuum -> does cleaning
vacuum analyse -> clean and update statistics (i run this mostly)
autovacuum - does vacuum analyse automatically per default setup or some
or cost based parameter

vacuum full -> I also do this frequently (test DB only) as a means to
retrieve back used spaces due to MVCC. (trying lots of different methods
of query/add new index/make concatenated join/unique keys and then
deleting them if it's not useful)


>
> As an example, people throw around terms like "index bloat" and "dead
> tuples" when talking about vacuuming.

I honestly have only the vaguest idea what these 2 mean. (i only grasped
recently that tuples = records/rows)

> The tutorial I'd like to see
> somebody write would start by explaining those terms and showing how to
> measure them--preferably with a good and bad example to contrast.  The way
> these terms are thrown around right now, I don't expect newcomers to
> understand either the documentation or the advice people are giving them;
> I think it's shooting over their heads and what's needed are some
> walkthroughs.  Another example I'd like to see thrown in there is what it
> looks like when you don't have enough FSM slots.


actually, an additional item I would like is to understand explain
analyse. The current docs written by tom only shows explain and not
explain analyse and I'm getting confuse as to the rows=xxx vs actual
rows=yyy where on some of my queries can be very far apart 1 vs 500x
ratio on some problematic query[1]. And googling doesn't give much doc
on the explain. (the only other useful doc I've seen is a presentation
given from oscon 2003)

[1](See my other post)


Re: [Again] Postgres performance problem

From
"Scott Marlowe"
Date:
On 9/13/07, Greg Smith <gsmith@gregsmith.com> wrote:
> On Wed, 12 Sep 2007, Scott Marlowe wrote:
>
> > I'm getting more and more motivated to rewrite the vacuum docs.  I think
> > a rewrite from the ground up might be best...  I keep seeing people
> > doing vacuum full on this list and I'm thinking it's as much because of
> > the way the docs represent vacuum full as anything.
>
> I agree you shouldn't start thinking in terms of how to fix the existing
> documentation.  I'd suggest instead writing a tutorial leading someone
> through what they need to know about their tables first and then going
> into how vacuum works based on that data.

I think both things are needed actually.  The current docs were
started back when pg 7.2 roamed the land, and they've been updated a
bit at a time.  The technical definitions of vacuum,  vacuum full,
analyze etc all show a bit too much history from back in the day, and
are confusing.  so, I think that 1: vacuum and analyze should have
their own sections.  analyze used to be a subcommand of vacuum but it
no longer is, but the docs still pretty much tie them together.  2:
The definition for vacuum full needs to include a caveat that vacuum
full should be considered more of a recovery operation than a way to
simply get back some space on your hard drives.

Which leads me to thinking that we then need a simple tutorial on
vacuuming to include the free space map, vacuum, vacuum analyze,
vacuum full, and the autovacuum daemon.  We can throw analyze in there
somewhere too, I just don't want it to seem like it's still married to
vacuum.

> As an example, people throw around terms like "index bloat" and "dead
> tuples" when talking about vacuuming.  The tutorial I'd like to see
> somebody write would start by explaining those terms and showing how to
> measure them--preferably with a good and bad example to contrast.

I agree.  I might rearrange it a bit but that's the way I'm looking at it too.

> The way
> these terms are thrown around right now, I don't expect newcomers to
> understand either the documentation or the advice people are giving them;
> I think it's shooting over their heads and what's needed are some
> walkthroughs.  Another example I'd like to see thrown in there is what it
> looks like when you don't have enough FSM slots.

OK.  Got something to start with.  I'm thinking I might work on a
vacuum tutorial first, then the tech docs...

Re: [Again] Postgres performance problem

From
"Gavin M. Roy"
Date:
How many backends do you have at any given time?  Have you tried using
something like pgBouncer to lower backend usage?  How about your IO
situation?  Have you run something like sysstat to see what iowait is
at?

On 9/11/07, Ruben Rubio <ruben@rentalia.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> Hi,
>
> I having the same problem I told here a few weeks before. Database is
> using too much resources again.
>
> I do a vacumm full each day, but seems it is not working. I am preparing
> an update to postgres 8.2.4 (actually I am using at 8.1.3, and tests for
> update will need several days)
>
> Last time I had this problem i solved it stopping website,  restarting
> database, vacuumm it, run again website. But I guess this is going to
> happen again.
>
> I would like to detect and solve the problem. Any ideas to detect it?
>
> Thanks in advance,
>
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFG5jbLIo1XmbAXRboRArcpAJ0YvoCT6KWv2fafVAtapu6nwFmKoACcD0uA
> zFTx9Wq+2NSxijIf/R8E5f8=
> =u0k5
> -----END PGP SIGNATURE-----
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>
>
>

Re: [Again] Postgres performance problem

From
Erik Jones
Date:
On Sep 13, 2007, at 12:58 AM, Greg Smith wrote:

> On Wed, 12 Sep 2007, Scott Marlowe wrote:
>
>> I'm getting more and more motivated to rewrite the vacuum docs.  I
>> think a rewrite from the ground up might be best...  I keep seeing
>> people doing vacuum full on this list and I'm thinking it's as
>> much because of the way the docs represent vacuum full as anything.
>
> I agree you shouldn't start thinking in terms of how to fix the
> existing documentation.  I'd suggest instead writing a tutorial
> leading someone through what they need to know about their tables
> first and then going into how vacuum works based on that data.
>
> As an example, people throw around terms like "index bloat" and
> "dead tuples" when talking about vacuuming.  The tutorial I'd like
> to see somebody write would start by explaining those terms and
> showing how to measure them--preferably with a good and bad example
> to contrast.  The way these terms are thrown around right now, I
> don't expect newcomers to understand either the documentation or
> the advice people are giving them; I think it's shooting over their
> heads and what's needed are some walkthroughs.  Another example I'd
> like to see thrown in there is what it looks like when you don't
> have enough FSM slots.

Isn't that the point of the documentation?  I mean, if the existing,
official manual has been demonstrated (through countless mailing list
help requests) to not sufficiently explain a given topic, shouldn't
it be revised?  One thing that might help is a hyperlinked glossary
so that people reading through the documentation can go straight to
the postgres definition of dead tuple, index bloat, etc.


Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: [Again] Postgres performance problem

From
"Scott Marlowe"
Date:
On 9/13/07, Erik Jones <erik@myemma.com> wrote:
> On Sep 13, 2007, at 12:58 AM, Greg Smith wrote:
>
> > On Wed, 12 Sep 2007, Scott Marlowe wrote:
> >
> >> I'm getting more and more motivated to rewrite the vacuum docs.  I
> >> think a rewrite from the ground up might be best...  I keep seeing
> >> people doing vacuum full on this list and I'm thinking it's as
> >> much because of the way the docs represent vacuum full as anything.
> >
> > I agree you shouldn't start thinking in terms of how to fix the
> > existing documentation.  I'd suggest instead writing a tutorial
> > leading someone through what they need to know about their tables
> > first and then going into how vacuum works based on that data.
> >
> > As an example, people throw around terms like "index bloat" and
> > "dead tuples" when talking about vacuuming.  The tutorial I'd like
> > to see somebody write would start by explaining those terms and
> > showing how to measure them--preferably with a good and bad example
> > to contrast.  The way these terms are thrown around right now, I
> > don't expect newcomers to understand either the documentation or
> > the advice people are giving them; I think it's shooting over their
> > heads and what's needed are some walkthroughs.  Another example I'd
> > like to see thrown in there is what it looks like when you don't
> > have enough FSM slots.
>
> Isn't that the point of the documentation?  I mean, if the existing,
> official manual has been demonstrated (through countless mailing list
> help requests) to not sufficiently explain a given topic, shouldn't
> it be revised?  One thing that might help is a hyperlinked glossary
> so that people reading through the documentation can go straight to
> the postgres definition of dead tuple, index bloat, etc.

Yes and no.  The official docs are more of a technical specification.
Short, simple and to the point so that if you know mostly what you're
doing you don't have to wade through a long tutorial to find the
answer.  I find MySQL's documentation frustrating as hell because I
can never find just the one thing I wanna look for.  Because it's all
written as a tutorial.  I.e. I have to pay the "stupid tax" when I
read their docs.

What I want to do is two fold.  1: fix the technical docs so they have
better explanations of each of the topics, without turning them into
huge tutorials.  2:  Write a vacuuming tutorial that will be useful
should someone be new to postgresql and need to set up their system.
I think the tutorial should be broken into at least two sections, a
quick start guide and an ongoing maintenance and tuning section.

Re: [Again] Postgres performance problem

From
Greg Smith
Date:
On Thu, 13 Sep 2007, Scott Marlowe wrote:

> I think both things are needed actually.  The current docs were
> started back when pg 7.2 roamed the land, and they've been updated a
> bit at a time...

No argument here that ultimately the documentation needs to be updated as
well.  I was just suggesting what I've been thinking about as the path of
least resistance to move in that direction.  Updating the documentation is
harder to do because of the build process involved.  It's easier to write
something new that addresses the deficiencies, get that right, and then
merge it into the documentation when it's stable.  After the main new
content is done, then it's easier to sweep back through the existing
material and clean things up.

> Which leads me to thinking that we then need a simple tutorial on
> vacuuming to include the free space map, vacuum, vacuum analyze,
> vacuum full, and the autovacuum daemon.

Right, that's the sort of thing that's missing right now, and I think that
would be more useful to newbies than correcting the documentation that's
already there.

Also:  if you don't have a public working area to assemble this document
at, I've set a precedent of sorts that it's OK to put working material
like this onto the PG developer's wiki at http://developer.postgresql.org/
as long as your stated intention is ultimately to move it off of there
once it's complete.  In addition to providing a nice set of tools for
working the text (presuming you're comfortable with Wiki syntax) that will
get you a pool of reviewers/contributors who can make changes directly
rather than you needing to do all the work yourself.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: [Again] Postgres performance problem

From
Adam Tauno Williams
Date:
> > Isn't that the point of the documentation?  I mean, if the existing,
> > official manual has been demonstrated (through countless mailing list
> > help requests) to not sufficiently explain a given topic, shouldn't
> > it be revised?

Or it proves that no one bothers to read the docs.

> > One thing that might help is a hyperlinked glossary
> > so that people reading through the documentation can go straight to
> > the postgres definition of dead tuple, index bloat, etc.
> Yes and no.  The official docs are more of a technical specification.
> Short, simple and to the point so that if you know mostly what you're
> doing you don't have to wade through a long tutorial to find the
> answer.  I find MySQL's documentation frustrating as hell because I
> can never find just the one thing I wanna look for.

Yes!  MySQL documentation is maddening.

This is why, I suspect, for products like Informix and DB2 IBM publishes
two manuals (or roughly equivalent to two manuals):  a "guide" and a
"reference".

> written as a tutorial.  I.e. I have to pay the "stupid tax" when I
> read their docs.

Yep.

> What I want to do is two fold.  1: fix the technical docs so they have
> better explanations of each of the topics, without turning them into
> huge tutorials.  2:  Write a vacuuming tutorial that will be useful
> should someone be new to postgresql and need to set up their system.
> I think the tutorial should be broken into at least two sections, a
> quick start guide and an ongoing maintenance and tuning section.



Re: [Again] Postgres performance problem

From
Ruben Rubio
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Gavin M. Roy escribió:
> How many backends do you have at any given time?  Have you tried using
> something like pgBouncer to lower backend usage?  How about your IO
> situation?  Have you run something like sysstat to see what iowait is
> at?

backends arround 50 -100 I don't use pgBouncer yet.
Sysstat reports veeery low io.

Right now Im checking out fsm parameter, as Scott recomended. Seems
there is the problem.



>
> On 9/11/07, Ruben Rubio <ruben@rentalia.com> wrote:
>
> Hi,
>
> I having the same problem I told here a few weeks before. Database is
> using too much resources again.
>
> I do a vacumm full each day, but seems it is not working. I am preparing
> an update to postgres 8.2.4 (actually I am using at 8.1.3, and tests for
> update will need several days)
>
> Last time I had this problem i solved it stopping website,  restarting
> database, vacuumm it, run again website. But I guess this is going to
> happen again.
>
> I would like to detect and solve the problem. Any ideas to detect it?
>
> Thanks in advance,
>
>
>
>>
>>
- ---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@postgresql.org so that your
       message can get through to the mailing list cleanly
>>
>>
>>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG7mx7Io1XmbAXRboRAn0VAJ4sGc1KCNlsbrybVbY/WfB+3XWBbwCfb7Z/
WNGyJCRo6zd26uR6FB6SA8o=
=SYzs
-----END PGP SIGNATURE-----

Attachment

Re: [Again] Postgres performance problem

From
Decibel!
Date:
On Wed, Sep 12, 2007 at 03:01:12PM -0500, Erik Jones wrote:
>
> On Sep 12, 2007, at 2:19 PM, Frank Schoep wrote:
>
> >On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote:
> >>On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote:
> >>>?
> >>>Aren't you mixing up REINDEX and CLUSTER?
> >>
> >>?
> >>Either one does what a vacuum full did / does, but generally does
> >>it better.
> >
> >On topic of REINDEX / VACUUM FULL versus a CLUSTER / VACUUM ANALYZE
> >I'd like to ask if CLUSTER is safe to run on a table that is in
> >active use.
> >
> >After updating my maintenance scripts from a VACUUM FULL (add me to
> >the list) to CLUSTER (which improves performance a lot) I noticed I
> >was getting "could not open relation ?" errors in the log while the
> >scripts ran so I reverted the change. This was on 8.1.9.
>
> You'd probably see the same behavior on 8.2.x.  CLUSTER is not
> transactionally safe so you don't want to run CLUSTER on tables that
> are actively being used.  I believe that's been fixed for 8.3.

Actually, that's a bit over-conservative... what happens prior to 8.3 is
that CLUSTER rewrites the table using it's XID for everything. That can
break semantics for any transactions that are running in serializable
mode; if you're just using the default isolation level of read
committed, you're fine with CLUSTER.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Attachment

Re: [Again] Postgres performance problem

From
Decibel!
Date:
On Thu, Sep 13, 2007 at 01:58:10AM -0400, Greg Smith wrote:
> On Wed, 12 Sep 2007, Scott Marlowe wrote:
>
> >I'm getting more and more motivated to rewrite the vacuum docs.  I think
> >a rewrite from the ground up might be best...  I keep seeing people
> >doing vacuum full on this list and I'm thinking it's as much because of
> >the way the docs represent vacuum full as anything.
>
> I agree you shouldn't start thinking in terms of how to fix the existing
> documentation.  I'd suggest instead writing a tutorial leading someone
> through what they need to know about their tables first and then going
> into how vacuum works based on that data.

Take a look at the stuff at http://decibel.org/~decibel/pervasive/, it'd
hopefully provide a useful starting point.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Attachment

[OT] Re: [Again] Postgres performance problem

From
Ow Mun Heng
Date:
On Mon, 2007-09-17 at 07:27 -0500, Decibel! wrote:

> Take a look at the stuff at http://decibel.org/~decibel/pervasive/, it'd
> hopefully provide a useful starting point.


A bit offtrack, but I was reading the articles and noticed this in the
bottom. Is this a typo or ...


Making PostreSQL pervasive© 2005 Pervasive Software Inc
       ^^^^^^^^^