Thread: Server vacuuming the same table again and again

Server vacuuming the same table again and again

From
Дмитрий Шалашов
Date:
Hi!

Half a day ago one of our production PG servers (arguably the busiest one) became very slow; I went to investigate and found that it was simultaneously running '(auto)VACUUM ANALYZE recommendations' - the largest table on that server - and a checkpoint, giving 100% disk load, which resulted in a queue of queries that only made things worse, of course.
For a while I tried setting different ionice levels on the wal writer and checkpointer processes (-c 2 -n [5-7]) with no visible effect. Then I cancelled the autovacuum and that seemed to help.
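
(For reference, finding and cancelling the autovacuum worker from SQL looks roughly like this - the LIKE pattern is only an illustration:)

SELECT pg_cancel_backend(pid)
  FROM pg_stat_activity
 WHERE query LIKE 'autovacuum:%recommendations%';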

When things settled down and the day was nearing its end, I started a VACUUM ANALYZE of this table by hand and continued observing.
The vacuum finished in about two and a half hours. But soon I noticed that the server had started another autovacuum of the same table...
The problems returned and resolved after it finished (not 100% sure it was the cause though).

In the morning the autovacuum was back. Then it finished and I went to work. And now I'm here and there is an autovacuum running again %)
And the load too. But I have to say, sometimes there is an autovacuum and no load. I'm not really sure autovacuum is the culprit, but there is a correlation and it behaves strangely anyway.
Nothing has changed in the app code, I believe.

Any recommendations where to dig further?

PG version: 9.2.8

Server hardware: E5-2690 x 2, 96GB RAM, 146GB 15k SAS x 8, HP P420i 2G RAID controller, raid 1 for system and raid 50 for DB.

Performance settings changed:
shared_buffers = 24GB
temp_buffers = 128MB
work_mem = 16MB
maintenance_work_mem = 1GB
effective_cache_size = 48GB
effective_io_concurrency = 6 (I just realised I have to set it to 4, right?)
synchronous_commit = off
checkpoint_segments = 64
checkpoint_timeout = 10min
checkpoint_completion_target = 0.8
checkpoint_warning = 3600s

Plus I set vm.dirty_background_bytes to 134217728 and vm.dirty_bytes to 1073741824.

Also I now believe that RAID 1 for the system might have been a mistake. Maybe dedicate it to the WAL instead?

Best regards,
Dmitriy Shalashov

Re: Server vacuuming the same table again and again

From
Ilya Kosmodemiansky
Date:
Hi Dmitry,

On Fri, Apr 25, 2014 at 9:47 AM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
> cancelled the autovacuum and that seemed to help.

> In the morning the autovacuum was back. Then it finished and I went to work.

Actually, those two things are tightly bound: there is no way to avoid
vacuum, you can only postpone it, and that work eventually has to be
done.

The first thing you really need to do is configure your autovacuum
aggressively enough, and then maybe ionice the autovacuum workers
instead of the mission-critical checkpointer or bgwriter.

What exact values do you have for the following settings:

 autovacuum_analyze_scale_factor
 autovacuum_analyze_threshold
 autovacuum_freeze_max_age
 autovacuum_max_workers
 autovacuum_naptime
 autovacuum_vacuum_cost_delay
 autovacuum_vacuum_cost_limit
 autovacuum_vacuum_scale_factor
 autovacuum_vacuum_threshold
 log_autovacuum_min_duration

?

Best regards, Ilya



--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: Server vacuuming the same table again and again

From
Дмитрий Шалашов
Date:
Hi Ilya!

> Actually, those two things are tightly bound: there is no way to avoid vacuum, you can only postpone it, and that work eventually has to be done.

I understand that autovacuum has to be done, but surely not right after the previous autovacuum? And then again and again.
And after cancelling that first autovacuum I started another one by hand; since then no autovacuum has been cancelled.

> ionice the autovacuum workers instead of the mission-critical checkpointer or bgwriter
Yeah, that was a desperate measure. I restarted the server when I had a chance, to drop my ionice settings back to defaults.

> What exact values do you have for the following settings:

autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 50
autovacuum_freeze_max_age = 200000000
autovacuum_max_workers = 3
autovacuum_naptime = 60
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = -1
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 50
log_autovacuum_min_duration = 0

All defaults except the last one, I believe.


Meanwhile I noticed in the night logs:
checkpoints are occurring too frequently (138 seconds apart)
Consider increasing the configuration parameter "checkpoint_segments".

Increased checkpoint_segments to 256 and reloaded config.


Best regards,
Dmitriy Shalashov



Re: Server vacuuming the same table again and again

From
Дмитрий Шалашов
Date:
And right now we have a new kind of problem.
Previously, during load, the disk was 100% busy; now we have around 100 queries in 'active' state, the CPU is 100% loaded, but the disk is virtually idle... Normally we have under 10 active queries.
Any hints on that?


Best regards,
Dmitriy Shalashov




Re: Server vacuuming the same table again and again

From
Ilya Kosmodemiansky
Date:
Dmitry,

How exactly is the filesystem under the database mounted? (mount -l) And
just in case: while increasing checkpoint_segments, it is better to also
increase checkpoint_timeout, otherwise checkpoints will still happen as
often as checkpoint_timeout dictates, because the segment threshold will
never be reached. You can monitor pg_stat_bgwriter to see which type of
checkpoint happens more often.
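
Something like this is enough (checkpoints_timed vs checkpoints_req tells
you whether checkpoints are triggered by checkpoint_timeout or by hitting
checkpoint_segments):

SELECT checkpoints_timed, checkpoints_req,
       buffers_checkpoint, buffers_backend
  FROM pg_stat_bgwriter;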

On Fri, Apr 25, 2014 at 10:22 AM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
> I understand that autovacuum has to be done, but surely not right after the
> previous autovacuum? And then again and again.

That is exactly what is happening: your autovacuum is not aggressive
enough, and that is why it runs constantly instead of doing its job in
small portions.

You should try something like this:

 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.05
 autovacuum_analyze_threshold        | 5
 autovacuum_freeze_max_age           | 200000000
 autovacuum_max_workers              | 10 # set 10 for example; then you
could see whether they are all working constantly - maybe you need more,
or fewer if not.
 autovacuum_multixact_freeze_max_age | 400000000
 autovacuum_naptime                  | 1
 autovacuum_vacuum_cost_delay        | 10
 autovacuum_vacuum_cost_limit        | -1
 autovacuum_vacuum_scale_factor      | 0.01
 autovacuum_vacuum_threshold         | 10
 log_autovacuum_min_duration         | -1
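
For your biggest table you could also set per-table thresholds on top of
the global ones, roughly like this (the numbers are only an example,
adjust them for your recommendations table):

ALTER TABLE recommendations SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_vacuum_threshold     = 1000,
    autovacuum_analyze_scale_factor = 0.05
);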


Best regards, Ilya




--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: Server vacuuming the same table again and again

From
Ilya Kosmodemiansky
Date:
On Fri, Apr 25, 2014 at 10:29 AM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
> Previously, during load, the disk was 100% busy; now we have around 100
> queries in 'active' state, the CPU is 100% loaded, but the disk is virtually idle...

Did that happen after changing the checkpoint_segments setting?




--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: Server vacuuming the same table again and again

From
Дмитрий Шалашов
Date:
> How exactly is the filesystem under the database mounted?
ext3 (rw)

Thanks, we'll try new autovacuum settings!

> Did that happen after changing the checkpoint_segments setting?

First, I have to say that the load comes and goes in waves - we don't yet understand why.
All the new waves have that behaviour - free disk, idle CPU.
The first such wave was before the checkpoint_segments change, the next waves after it. No more warnings about too-frequent checkpoints though.


Best regards,
Dmitriy Shalashov



Re: Server vacuuming the same table again and again

From
Ilya Kosmodemiansky
Date:
Dmitry,

On Fri, Apr 25, 2014 at 11:31 AM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
> Thanks, we'll try new autovacuum settings!

I think things with vacuum will be much better.

If not, try to find out whether you have long-running transactions
(several minutes or more) and try to avoid them.
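
Something like this against pg_stat_activity (just a sketch) shows the
oldest open transactions:

SELECT pid, state, now() - xact_start AS xact_age, query
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
 ORDER BY xact_age DESC;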

>
> First, I have to say that the load comes and goes in waves - we don't yet
> understand why.
> All the new waves have that behaviour - free disk, idle CPU.
> The first such wave was before the checkpoint_segments change, the next waves after it.

That could be a complicated problem caused by many things, from
suboptimal SQL queries to network issues; it may not be easy to guess.

- how many locks do you have during a wave, compared with the normal
workload (a quick way to check is sketched below)?
- do you use any connection pooling (pgbouncer etc.)?
- how about the long-running transactions I mentioned above?
- are you using pg_stat_statements or any other method for detecting
slow queries?
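
For the locks, a rough count from pg_locks is usually enough, for example
(just a sketch - run it during a wave and during a quiet period and
compare):

SELECT mode, granted, count(*)
  FROM pg_locks
 GROUP BY mode, granted
 ORDER BY count(*) DESC;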




--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com


Re: Server vacuuming the same table again and again

From
Torsten Förtsch
Date:
On 25/04/14 09:47, Дмитрий Шалашов wrote:
> Half a day ago one of our production PG servers (arguably the busiest one)
> became very slow; I went to investigate and found that it was simultaneously
> running '(auto)VACUUM ANALYZE recommendations' - the largest table on that
> server - and a checkpoint, giving 100% disk load

Maybe the table has reached the state where it needs a VACUUM FREEZE.
Autovacuum does that for you but it requires a complete scan of the table.
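
You can check how close it is with something like this (a rough sketch;
compare age(relfrozenxid) with your autovacuum_freeze_max_age):

SELECT relname, age(relfrozenxid)
  FROM pg_class
 WHERE relkind = 'r'
 ORDER BY age(relfrozenxid) DESC
 LIMIT 10;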

Torsten


Re: Server vacuuming the same table again and again

From
Дмитрий Шалашов
Date:
It turns out that yesterday we fixed a bug and introduced a new one, which had previously been hidden by yet another bug that we fixed last week... %)
As a result, the latest fix led to a greatly increased number of requests to the database.

But still, thanks to this we found out about the too-frequent checkpoints and that our recommendations table has three times more dead tuples than live ones.
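
(A quick look at pg_stat_user_tables shows it, roughly like this:)

SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
  FROM pg_stat_user_tables
 ORDER BY n_dead_tup DESC;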

We will fix our autovacuum configuration.

As for the other problem - 100% CPU load with idle disks - it is no longer reproducing and we don't want it to :)

Thanks Ilya and Torsten!


Best regards,
Dmitriy Shalashov

