Thread: Is Vacuum/analyze destroying my performance?

Is Vacuum/analyze destroying my performance?

From
"Carlo Stonebanks"
Date:
I have always been frustrated by the wildly erratic performance of our
postgresql 8 server. We run a heuristics-based program that does heavy data
importing. Sometimes records being imported would just fly by, sometimes
they would crawl. The import program reads records from a flat table and
uses heuristics to normalise and dedupe. This is done via a sequence of
updates and inserts bracketed by a start-end transaction.
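
In outline, each imported row goes through something like this (the table
and column names here are invented for illustration, not our real schema):

```sql
BEGIN;
-- try to match the incoming row against an already-normalised record
UPDATE target.entity
   SET last_seen = now()
 WHERE source_key = 'ABC123';
-- if no match was found, insert a fresh normalised record
INSERT INTO target.entity (source_key, name)
     VALUES ('ABC123', 'J. Smith');
COMMIT;
```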

At a certain checkpoint, representing about 1,000,000 rows read and imported,
I ran a vacuum/analyze on all of the tables in the target schema. To my
horror, performance dropped to less than TEN percent of what it was before
the vacuum/analyze. I thought that turning autovacuum off and doing my own
vacuuming would improve performance, but it seems to be killing it.
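
For the record, the maintenance pass was nothing exotic; per table it was
just along these lines (schema and table names are placeholders):

```sql
-- repeated for every table in the import target schema
VACUUM ANALYZE import_target.entity;
VACUUM ANALYZE import_target.entity_name;
```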

I have since turned autovacuum back on and am tearing my hair out watching
the imported records crawl by. I have tried vacuuming the entire DB as well
as rebuilding indexes. Nothing. Any ideas what could have happened? What is
the right thing to do?

Carlo



Re: Is Vacuum/analyze destroying my performance?

From
"Carlo Stonebanks"
Date:
Update on this issue, I "solved" my problem by doing the following:

1) Stopped the import, and did a checkpoint backup on my import target
schema
2) Dropped the import target schema
3) Restored a backup from a previous checkpoint when the tables were much
smaller
4) Performed a VACUUM/ANALYZE on all of the tables in the import target
schema in that smaller state
5) Dropped the import target schema again
6) Restored the checkpoint backup of the larger data set referred to in step
1
7) Restarted the import from where it left off

The result: the import is flying again, with 10-20 times the performance.
The import runs as 4 different Tcl scripts in parallel, importing different
segments of the table. The problem I have when the import runs at this
speed is that I have to constantly watch for lock-ups. Previously I had
reported that when these multiple processes are running at high speed,
PostgreSQL occasionally freezes one or more of the processes by never
returning from a COMMIT. I look at the target tables, and it seems that the
commit has gone through.

This used to be a disaster, because I thought I had to restart every frozen
process by killing the script and restarting at the last imported row.

Now I have found a way to un-freeze the program: I find the frozen process
via PgAdmin III and send a CANCEL. To my surprise, the import continues as if
nothing happened. Still incredibly inconvenient and laborious, but at least
it's a little less tedious.
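
For anyone else stuck doing this, the psql equivalent of the PgAdmin cancel
is roughly the following (8.x system views; the pid below is only an example):

```sql
-- find the backend that appears stuck (8.x column names)
SELECT procpid, current_query
  FROM pg_stat_activity
 WHERE current_query <> '<IDLE>';

-- cancel that backend's current statement; the script then resumes
SELECT pg_cancel_backend(12345);  -- substitute the real procpid
```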

Could these two problems - the weird slowdowns after a VACUUM/ANALYZE and
the frequent lockups when the import process is running quickly - be
related?

Carlo



Re: Is Vacuum/analyze destroying my performance?

From
Matthew O'Connor
Date:
Just a wild guess, but the performance problem sounds like it could be that,
as your data changes, the planner eventually moves some query from an index
scan to a sequential scan. Do you have any details on which queries are
taking so long when things are running slow?  You can turn on the GUC
variable "log_min_duration_statement" to see which queries are slow, and
then check them manually with EXPLAIN ANALYZE; that might help.
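
Concretely, something along these lines; the one-second threshold and the
query itself are only examples:

```sql
-- log any statement that runs longer than 1000 ms
SET log_min_duration_statement = 1000;

-- then take a slow query from the log and run it by hand, e.g.:
EXPLAIN ANALYZE
SELECT * FROM import_target.entity WHERE source_key = 'ABC123';
```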

Matt


Carlo Stonebanks wrote:
> Update on this issue, I "solved" my problem by doing the following:
>
> 1) Stopped the import, and did a checkpoint backup on my import target
> schema
> 2) Dropped the import target schema
> 3) Restored a backup from a previous checkpoint when the tables were much
> smaller
> 4) Performed a VACUUM/ANALYZE on all of the tables in the import target
> schema in that smaller state
> 5) Dropped the import target schema again
> 6) Restored the checkpoint backup of the larger data set referred to in step
> 1
> 7) Restarted the import from where it left off
>
> The result: the import is flying again, with 10-20 times the performance.
> The import runs as 4 different Tcl scripts in parallel, importing different
> segments of the table. The problem I have when the import runs at this
> speed is that I have to constantly watch for lock-ups. Previously I had
> reported that when these multiple processes are running at high speed,
> PostgreSQL occasionally freezes one or more of the processes by never
> returning from a COMMIT. I look at the target tables, and it seems that the
> commit has gone through.
>
> This used to be a disaster, because I thought I had to restart every frozen
> process by killing the script and restarting at the last imported row.
>
> Now I have found a way to un-freeze the program: I find the frozen process
> via PgAdmin III and send a CANCEL. To my surprise, the import continues as if
> nothing happened. Still incredibly inconvenient and laborious, but at least
> it's a little less tedious.
>
> Could these two problems - the weird slowdowns after a VACUUM/ANALYZE and
> the frequent lockups when the import process is running quickly - be
> related?
>
> Carlo

Re: Is Vacuum/analyze destroying my performance?

From
"Carlo Stonebanks"
Date:
"Matthew O'Connor" <matthew@zeut.net> wrote in message
news:45743240.7050302@zeut.net...
> Just a wild guess, but the performance problem sounds like it could be that,
> as your data changes, the planner eventually moves some query from an index
> scan to a sequential scan. Do you have any details on which queries are
> taking so long when things are running slow?  You can turn on the GUC
> variable "log_min_duration_statement" to see which queries are slow, and
> then check them manually with EXPLAIN ANALYZE; that might help.
>
> Matt

This is pretty much what I think is happening - I expect all queries to
eventually move from seq scans to index scans. I actually have a SQL logging
option built into the import app.

I just can't figure out how the planner can be so wrong. We are running a
4-CPU server (two dual-core 3.2 GHz Xeons) with 4 GB RAM, Windows Server
2003 x64, and a PERC RAID subsystem (I don't know the RAID type). I know that
the metrics for the planner can be changed - is the default config for
postgresql not suitable for our setup? For this server, we would like to be
optimised for high speed over a few connections, rather than the classic
balanced speed over many connections.
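
For reference, by "metrics" I mean the usual planner and memory knobs in
postgresql.conf; the values below are only the kind of thing I have in mind
for a 4 GB box, not our actual configuration:

```
# postgresql.conf -- illustrative starting points, not a recommendation
shared_buffers = 50000          # in 8 kB pages on 8.1 (~400 MB)
work_mem = 65536                # per-sort/hash memory, in kB
maintenance_work_mem = 262144   # helps VACUUM and index builds, in kB
effective_cache_size = 350000   # OS cache the planner may assume, in pages
random_page_cost = 3            # lower if the RAID makes random I/O cheap
```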



Re: Is Vacuum/analyze destroying my performance?

From
"Matthew T. O'Connor"
Date:
Carlo Stonebanks wrote:
>> Just a wild guess, but the performance problem sounds like it could be that,
>> as your data changes, the planner eventually moves some query from an index
>> scan to a sequential scan. Do you have any details on which queries are
>> taking so long when things are running slow?  You can turn on the GUC
>> variable "log_min_duration_statement" to see which queries are slow, and
>> then check them manually with EXPLAIN ANALYZE; that might help.
>>
> This is pretty much what I think is happening - I expect all queries to
> eventually move from seq scans to index scans. I actually have a SQL logging
> option built into the import app.
>
> I just can't figure out how the planner can be so wrong. We are running a
> 4-CPU server (two dual-core 3.2 GHz Xeons) with 4 GB RAM, Windows Server
> 2003 x64, and a PERC RAID subsystem (I don't know the RAID type). I know that
> the metrics for the planner can be changed - is the default config for
> postgresql not suitable for our setup? For this server, we would like to be
> optimised for high speed over a few connections, rather than the classic
> balanced speed over many connections.

If it is the planner choosing a very bad plan, then I don't think your
hardware has anything to do with it.  And we can't diagnose why the
planner is doing what it's doing without a lot more detail.  I suggest
you figure out which queries are taking so long, then send us the
EXPLAIN ANALYZE output for them; that might shine some light on the subject.