Thread: VACUUM ANALYZE extremely slow

VACUUM ANALYZE extremely slow

From
Sergei Shelukhin
Date:
This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.
This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.

Is there any way to speed up ANALYZE? Without it all the queries run
so slow that I want to cry after a couple of hours of operation and
with it system has to go down for hours per day and that is
unacceptable.

The same database running on mysql on basically the same server used
to run optimize table on every table every half an hour without any
problem, I am actually pondering scraping half the work on the
conversion and stuff and going back to mysql but I wonder if there's
some way to improve it.


Re: VACUUM ANALYZE extremely slow

From
Steve Atkins
Date:
On Jun 17, 2007, at 2:15 PM, Sergei Shelukhin wrote:

> This is my first (and, by the love of the God, last) project w/pgsql
> and everything but the simplest selects is so slow I want to cry.
> This is especially bad with vacuum analyze - it takes several hours
> for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
> and virtually no workload at the moment. Maintenance work mem is set
> to 512 Mb.
>
> Is there any way to speed up ANALYZE? Without it all the queries run
> so slow that I want to cry after a couple of hours of operation and
> with it system has to go down for hours per day and that is
> unacceptable.

There's no need to stop the database to run vacuum or analyze (heck,
with autovacuum in modern versions of postgresql you can get away
without running them at all, sometimes).

I suspect you're doing the wrong thing (perhaps running vacuum full,
rather than plain vacuum). Can you tell us what version of postgresql
you're running and how you're vacuuming it (what commands you're
running)?

Cheers,
   Steve



Re: VACUUM ANALYZE extremely slow

From
Alvaro Herrera
Date:
Sergei Shelukhin escribió:

> The same database running on mysql on basically the same server used
> to run optimize table on every table every half an hour without any
> problem, I am actually pondering scraping half the work on the
> conversion and stuff and going back to mysql but I wonder if there's
> some way to improve it.

Why waste time on it?  Just move it to mysql since it's so good, and be
done with it.

--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude."                              (Brian Kernighan)

Re: VACUUM ANALYZE extremely slow

From
David Wall
Date:
On Jun 17, 2007, at 2:15 PM, Sergei Shelukhin wrote:
>> This is my first (and, by the love of the God, last) project w/pgsql
>> and everything but the simplest selects is so slow I want to cry.
>> This is especially bad with vacuum analyze - it takes several hours
>> for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
>> and virtually no workload at the moment. Maintenance work mem is set
>> to 512 Mb.
I have a  23 GB PG 8.1 db running on a 3 year old dual processor Dell
running Red Hat Linux, 2GB RAM.  My tests ran on our production database
while it was running (albeit not under a heavy load) with no obvious
slowdown for users using our application.  PG was using 1.2% of the CPU,
and noted the server had been running for 322 days.  I'm not sure if
there's a way to make vacuum use more processor time to speed it up or not.

"ANALYZE" took 1 minute 14 seconds.

"VACUUM" took significantly longer at 24 minutes 22 seconds.

I noted that we hadn't set maintenance_work_mem so it presumably is
using the default of 16384 (16MB).

I'm sure much depends on how many tables, how many deleted rows, etc.
are in your db, but "several hours" certainly is nothing like what we see.

David


Re: VACUUM ANALYZE extremely slow

From
Bill Moran
Date:
In response to Sergei Shelukhin <realgeek@gmail.com>:

> This is my first (and, by the love of the God, last) project w/pgsql

One has to ask, are you actually looking for help, or trolling?

If you honestly want help, I would suggest you work on your communication
skills first.  If you're a troll, go away.

--
Bill Moran
http://www.potentialtech.com

Re: VACUUM ANALYZE extremely slow

From
Greg Smith
Date:
On Sun, 17 Jun 2007, Sergei Shelukhin wrote:

> Is there any way to speed up ANALYZE? Without it all the queries run
> so slow that I want to cry after a couple of hours of operation and
> with it system has to go down for hours per day and that is
> unacceptable.

I've found I cry a lot less if I actually spend a minute educating myself
about things instead of complaining to a mailing list in a fashion
offensive to the people who might help me.  You should start with
http://www.postgresql.org/docs/current/static/sql-vacuum.html where you'll
discover the amazing fact that VACUUM ANALYZE doesn't require taking down
the system at all.  The way you get it to take less time is to do it more
often.  In fact, if you're using a PostgreSQL version where you can set up
auto-vacuum correctly, you might not ever need to run it manually at all.

Here are the things to consider if you actually want some help here:

0) You might as well keep this threading going, but next time, post to the
performance list instead of the general one; it's more appropriate and
you'll get a better mix of people familiar with this sort of topic.

1) Give some more details about the non-default values in your
postgresql.conf file.
http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm will get
you started on the most important ones to worry about and pointers to
additional resources.  You say you've increased the maintenance working
mem; that's a good start, but if everything else is at the default you're
not taking advantage of all the memory in your system.  MySQL manages
memory very differently, and those differences may be the root cause of
your issue.

2) State what version of PostgreSQL you're running.  If it's older than a
recent 8.1 release, there are many known and unresolvable performance
issues and you may have to upgrade to get what you're looking for.

3) Give some information about your disk configuration, and some tests
results to confirm they're working normally if possible.
http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm has
an outline of how to do those tests.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: VACUUM ANALYZE extremely slow

From
Scott Marlowe
Date:
Sergei Shelukhin wrote:
> This is my first (and, by the love of the God, last) project w/pgsql
> and everything but the simplest selects is so slow I want to cry.
> This is especially bad with vacuum analyze - it takes several hours
> for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
> and virtually no workload at the moment. Maintenance work mem is set
> to 512 Mb.
>
>
I noticed you didn't mention your disk subsystem.  PostgreSQL tends to
use a fair bit of disk I/O when running vacuum and / or analyze.  If you
check with top / iostat while vacuum analyze is running, I'm betting
you'll see a lot of waiting on I/O going on.

You do know those two commands (vacuum and analyze) aren't married
anymore, right?  You can run analyze all by itself if you want?

And I hope you're not running vacuum analyze full all the time, cause
there's usually no need for that.

Look up pg_autovacuum.  Saves a lot of har pulling.
> Is there any way to speed up ANALYZE?
Analyze is usually REALY fast.  Even on my rather pokey workstation,
with a single SATA hard drive and other things to do, I can run analyze
on a 31 Gig database in
>  Without it all the queries run
> so slow that I want to cry after a couple of hours of operation and
> with it system has to go down for hours per day and that is
> unacceptable.
>
You should only need to run analyze every so often.  You should only
need vacuum after lots of updates / deletes.  You should not need to
take the system down to vacuum, as vacuum doesn't block.  Vacuum full
does block, but if you need that you either aren't vacuuming often
enough or you don't have the autovacuum daemon configured.
> The same database running on mysql on basically the same server used
> to run optimize table on every table every half an hour without any
> problem, I am actually pondering scraping half the work on the
> conversion and stuff and going back to mysql but I wonder if there's
> some way to improve it.
>
And when you ran optimize on those tables, were they not locked for
regular users the whole time?

There may be a way to improve it.  Tell us, what OS are you running,
what are your non-default postgresql.conf settings, what ACTUAL commands
are you running here?  Vacuum, vacuum analyze, vacuum full analyze?  Are
you inserting / deleting / updating tons of rows between vacuums and /
or analyzes?


>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>


Re: VACUUM ANALYZE extremely slow

From
Sergei Shelukhin
Date:
errr... workmem is 256Mb of course, and 5m for explain analyze costs.


Re: VACUUM ANALYZE extremely slow

From
Sergei Shelukhin
Date:
Hi. Sorry for being a bit emotional, I was pretty constructive in my
earlier posts (the earlier, the more constructive if you care to
search) but I am progressively getting pissed off :(

Thanks for the initial tip, running ANALYZE w/o vacuum is faster. Are
frequent vacuums even necessary if there are virtually no deletes in
the system and updates are much less frequent compared to inserts?

Now for the answers.
* What other non-default configuration settings do you have?
I played w/shared buffers, setting them between 16k and 32k,~ 24k
seems to be the best but the difference is minimal. The work_mem
setting is 256kb, and I increased effective cache size to ~700Mb (~35%
ram) based on a tip from some article.
max_fsm_* is increased too to accomodate vacuum analyze runs.

* What version of PostgreSQL are you using and on what OS?
8.1 and Debian

What kind of disks and controllers do you have?
Umm... Unfortunately I do not know, and do not have access to check :)
THe development server probably has a fast consumer grade HDD, not a
RAID I suppose.

* My "system has to go down" remark was probably a bit exaggerated;
the system is noticeably slower when the vacuum is running and that is
basically without workload; it will probably become unusable under
load.

* Specific queries that were slow: there are generally no specific
queries, everything runs slowly, mostly due to estimates being grossly
inaccurate (like 800k cost for a complex query based on Explain and
then 5, actual cost based on e.analyze)

I was not able to find any special bottlenecks in the queries. I come
from MSSQL (and a bit of MySQL) background, and the only thing I
noticed is that there are a lot of seqscans in the places where
indexes should be used (imo), when I turn seqscans off indexes are
ridiculously slow (I have already posted about it), presumably due to
random access used to build a big result set; however, result set is
not in fact big in real queries and postgres supposedly doesn't do
some magic that MSSQL does (join order or something) that makes the
join that should utilize the index narrower and thus faster.

There are also some weird development-stage-specific bottlenecks like
adding a column to 34m row table and updating it to default value (I
wasn't able to wait till the end of that query, it probably runs for
days); I think I understand why it is slow based on forum posts, but
again, MSSQL and MySQL do it better.

Also there's a general impression of everything being slow even after
MySQL on the same server; starting from seqscans to aggregate
operations; I won't be able to supply concrete evidence tho (not yet
anyway), it's just the gut feeling.




Re: VACUUM ANALYZE extremely slow

From
Alvaro Herrera
Date:
Sergei Shelukhin escribió:

> * What other non-default configuration settings do you have?
> I played w/shared buffers, setting them between 16k and 32k,~ 24k
> seems to be the best but the difference is minimal. The work_mem
> setting is 256kb, and I increased effective cache size to ~700Mb (~35%
> ram) based on a tip from some article.
> max_fsm_* is increased too to accomodate vacuum analyze runs.

work_mem = 256 MB is probably too much; you might be filling your RAM
with it, causing the system to swap.  Try decreasing it.  This would be
particularly noticeable under load.  Keep in mind that work_mem is per
sort/hash and per backend, i.e. if you have a query which does 3 sorts
and is being executed by 5 processes in parallel, you will have 3 * 5 *
256 MB = 3840 MB of RAM in active use.

> * My "system has to go down" remark was probably a bit exaggerated;
> the system is noticeably slower when the vacuum is running and that is
> basically without workload; it will probably become unusable under
> load.

You can set the vacuum_cost_* settings in order to reduce the impact of
vacuum in the available I/O.


> * Specific queries that were slow: there are generally no specific
> queries, everything runs slowly, mostly due to estimates being grossly
> inaccurate (like 800k cost for a complex query based on Explain and
> then 5, actual cost based on e.analyze)

You might want to try increasing statistic targets for the problematic
columns.  It would be useful to choose one of these and send them along
for dissection here on the lists, to investigate the actual problems.
Misestimations are a problem at times, but most of the time there are
workarounds.

--
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)