Thread: Vacuum goes worse

Vacuum goes worse

From
Stéphane Schildknecht
Date:
Hi,

For some times, we have a vacuuming process on a specific table that
goes slower and slower. In fact, it took some 3 minutes a month ago, and
now it take almost 20 minutes. But, if one day it take so many time, it
is possible that on the day after it will only take 4 minutes...

I know the table in concern had 450000 tuples two months ago and now has
more than 700000 tuples in it.

I wonder vacuum verbose would tell me if fsm parameters were not too
badly configured, but I can't get the 4 last lines of the output...

Is there another way to get these info ? Or is it a parameter badly
configured ?

For information, it's on AIX, PG8.1.9.

Some configuration parameters :
client_min_messages : notice
log_error_verbosity : default
log_min_error_statement : panic
log_min_messages : notice.

Whats's more, I wonder what we could monitor to get some explanation of
the recent time increase, and then have a quite-sure way of configuring
the server.

I have to say the database is hosted, accessed in production on a 24/7
basis and then every change in configuration has to be scheduled.

Some more information you may ask:
chackpoint_segments : 32
checkpoint_timeout    : 180
checkpoint_warning   : 30
wal_buffers                 : 64
maintenance_work_mem : 65536
max_fsm_pages                : 400000
max_fsm_relations           : 1000
shared_buffers                : 50000
temp_bufers                    : 1000

We also have 4Gb RAM.

Isn't checkpoint_segments too low as all files in pg_xlogs seem to be
recycled within a few minutes. (In fact among the 60 files, at least 30
have been modified during the few minutes of that particular vacuum).

Thanks for any advice you could give me.

Best regards,

--
Stéphane SCHILDKNECHT
Président de PostgreSQLFr
http://www.postgresqlfr.org


Re: Vacuum goes worse

From
Heikki Linnakangas
Date:
Stéphane Schildknecht wrote:
> I wonder vacuum verbose would tell me if fsm parameters were not too
> badly configured, but I can't get the 4 last lines of the output...

Why not?

> Whats's more, I wonder what we could monitor to get some explanation of
> the recent time increase, and then have a quite-sure way of configuring
> the server.

sar or iostat output would be a good start, to determine if it's waiting
for I/O or what.

> I have to say the database is hosted, accessed in production on a 24/7
> basis and then every change in configuration has to be scheduled.
>
> Some more information you may ask:
> chackpoint_segments : 32
> checkpoint_timeout    : 180
> checkpoint_warning   : 30
> wal_buffers                 : 64
> maintenance_work_mem : 65536
> max_fsm_pages                : 400000
> max_fsm_relations           : 1000
> shared_buffers                : 50000
> temp_bufers                    : 1000
>
> We also have 4Gb RAM.
>
> Isn't checkpoint_segments too low as all files in pg_xlogs seem to be
> recycled within a few minutes. (In fact among the 60 files, at least 30
> have been modified during the few minutes of that particular vacuum).

Increasing checkpoint_segments seems like a good idea then. You should
increase checkpoint_timeout as well, 180 is just 3 minutes. How much
concurrent activity is there in the database? 30 pg_xlog files equals
512 MB of WAL; that's quite a lot.

Have you changed the vacuum cost delay settings from the defaults?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: Vacuum goes worse

From
Stéphane Schildknecht
Date:
Heikki Linnakangas a écrit :
> Stéphane Schildknecht wrote:
>
>> I wonder vacuum verbose would tell me if fsm parameters were not too
>> badly configured, but I can't get the 4 last lines of the output...
>>
>
> Why not?
>

I would like to know... Seems like vacuum does not want me to see these
precious line. I really don't know why.
>
>> Whats's more, I wonder what we could monitor to get some explanation of
>> the recent time increase, and then have a quite-sure way of configuring
>> the server.
>>
>
> sar or iostat output would be a good start, to determine if it's waiting
> for I/O or what.
>

Ok, I'll try that.
>
> Increasing checkpoint_segments seems like a good idea then. You should
> increase checkpoint_timeout as well, 180 is just 3 minutes. How much
> concurrent activity is there in the database? 30 pg_xlog files equals
> 512 MB of WAL; that's quite a lot.
>

I don't know exactly how far, but yes, activity is high.
> Have you changed the vacuum cost delay settings from the defaults?
>

Not yet.


--
Stéphane SCHILDKNECHT
Président de PostgreSQLFr
http://www.postgresqlfr.org


Re: Vacuum goes worse

From
Tom Lane
Date:
=?ISO-8859-1?Q?St=E9phane_Schildknecht?= <stephane.schildknecht@postgresqlfr.org> writes:
> For some times, we have a vacuuming process on a specific table that
> goes slower and slower. In fact, it took some 3 minutes a month ago, and
> now it take almost 20 minutes. But, if one day it take so many time, it
> is possible that on the day after it will only take 4 minutes...

> I know the table in concern had 450000 tuples two months ago and now has
> more than 700000 tuples in it.

The real question is how often do rows get updated?  I suspect you
probably need to vacuum this table more than once a day.

            regards, tom lane

Re: Vacuum goes worse

From
Stéphane Schildknecht
Date:
Tom Lane a écrit :
> =?ISO-8859-1?Q?St=E9phane_Schildknecht?= <stephane.schildknecht@postgresqlfr.org> writes:
>
>> For some times, we have a vacuuming process on a specific table that
>> goes slower and slower. In fact, it took some 3 minutes a month ago, and
>> now it take almost 20 minutes. But, if one day it take so many time, it
>> is possible that on the day after it will only take 4 minutes...
>>
>
>
>> I know the table in concern had 450000 tuples two months ago and now has
>> more than 700000 tuples in it.
>>
>
> The real question is how often do rows get updated?  I suspect you
> probably need to vacuum this table more than once a day.
>
>

To be honest, I suspect it too. But, I have been told by people using
that database they can't do vacuum more frequently than once in a day as
it increases the time to achieve concurrent operations.
That's also why they don't want to hear about autovacuum.

And finally that's why I'm looking for everything I can monitor to
obtain information to convince them they're wrong and I'm right ;-)

That's also why I am so disappointed vacuum doesn't give me these 4
hints lines.

Regards,

--
Stéphane SCHILDKNECHT
Président de PostgreSQLFr
http://www.postgresqlfr.org


Re: Vacuum goes worse

From
Tom Lane
Date:
=?ISO-8859-1?Q?St=E9phane_Schildknecht?= <stephane.schildknecht@postgresqlfr.org> writes:
> Tom Lane a �crit :
>> The real question is how often do rows get updated?  I suspect you
>> probably need to vacuum this table more than once a day.

> To be honest, I suspect it too. But, I have been told by people using
> that database they can't do vacuum more frequently than once in a day as
> it increases the time to achieve concurrent operations.

vacuum_cost_delay can help here.

            regards, tom lane

Re: Vacuum goes worse

From
"Scott Marlowe"
Date:
On 10/16/07, Stéphane Schildknecht
<stephane.schildknecht@postgresqlfr.org> wrote:
> Tom Lane a écrit :
> > =?ISO-8859-1?Q?St=E9phane_Schildknecht?= <stephane.schildknecht@postgresqlfr.org> writes:
> >
> >> For some times, we have a vacuuming process on a specific table that
> >> goes slower and slower. In fact, it took some 3 minutes a month ago, and
> >> now it take almost 20 minutes. But, if one day it take so many time, it
> >> is possible that on the day after it will only take 4 minutes...
> >>
> >
> >
> >> I know the table in concern had 450000 tuples two months ago and now has
> >> more than 700000 tuples in it.
> >>
> >
> > The real question is how often do rows get updated?  I suspect you
> > probably need to vacuum this table more than once a day.
> >
> >
>
> To be honest, I suspect it too. But, I have been told by people using
> that database they can't do vacuum more frequently than once in a day as
> it increases the time to achieve concurrent operations.
> That's also why they don't want to hear about autovacuum.

Sounds like somebody there is operating on the belief that vacuums
always cost the same amount i/o wise.  With the vacuum_cost_delay
setting Tim mentioned this is not true.  Their concern shouldn't be
with how you accomplish your job, but with you meeting certain
performance criteria, and with vacuum cost delay, it is quite possible
to vacuum midday with affecting the db too much.

> And finally that's why I'm looking for everything I can monitor to
> obtain information to convince them they're wrong and I'm right ;-)

Good luck with that.  I still have a boss who thinks "vacuum's not
fast enough".  His last experience with pgsql was in the 7.2 days.
Generally he's a pretty smart guy, but he's convinced himself that
PostgreSQL 8.3 and 7.2 are pretty much the same beasts.

> That's also why I am so disappointed vacuum doesn't give me these 4
> hints lines.

What kind of error, or output, does it give you at the end?  Any hint
as to why they're missing?

Re: Vacuum goes worse

From
Tom Lane
Date:
"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> On 10/16/07, St=E9phane Schildknecht
> <stephane.schildknecht@postgresqlfr.org> wrote:
>> That's also why I am so disappointed vacuum doesn't give me these 4
>> hints lines.

> What kind of error, or output, does it give you at the end?  Any hint
> as to why they're missing?

If you're talking about the FSM statistics display, that only gets
printed by a database-wide VACUUM (one that doesn't name a specific
table).

            regards, tom lane

Re: Vacuum goes worse

From
Brian Herlihy
Date:
Would it make sense to show the FSM stats for individual table vaccums as well?  I'm wondering if the reason they
aren'tshown is because it wouldn't be useful or isn't practical, or just that it hasn't been done. 

Brian

----- Original Message ----
From: Tom Lane <tgl@sss.pgh.pa.us>

If you're talking about the FSM statistics display, that only gets
printed by a database-wide VACUUM (one that doesn't name a specific
table).

            regards, tom lane




Re: Vacuum goes worse

From
Tom Lane
Date:
Brian Herlihy <btherl@yahoo.com.au> writes:
> Would it make sense to show the FSM stats for individual table vaccums as w=
> ell?  I'm wondering if the reason they aren't shown is because it wouldn't =
> be useful or isn't practical, or just that it hasn't been done.

It was intentionally omitted in the original design, on the grounds that
after a single-table VACUUM there's no very good reason to think that
the global FSM stats are sufficiently complete to be accurate.  Of
course, in a multi-database installation the same charge could be
leveled against the situation after a single-database VACUUM, so maybe
there's not a lot of strength in the argument.

IIRC the code change would be trivial, it's just a matter of judgment
whether the extra output is useful/trustworthy.

            regards, tom lane

Re: Vacuum goes worse

From
"Joshua D. Drake"
Date:
On Tue, 16 Oct 2007 17:03:39 -0700 (PDT)
Brian Herlihy <btherl@yahoo.com.au> wrote:

> Would it make sense to show the FSM stats for individual table
> vaccums as well?  I'm wondering if the reason they aren't shown is
> because it wouldn't be useful or isn't practical, or just that it
> hasn't been done.

I am not sure how useful it would be as the FSM is global. However what
would be useful is something like VACUUM SUMMARY, where I could get
"just" the stats instead of all the other output that comes along with
VERBOSE.

Joshua D. Drake

>
> Brian
>
> ----- Original Message ----
> From: Tom Lane <tgl@sss.pgh.pa.us>
>
> If you're talking about the FSM statistics display, that only gets
> printed by a database-wide VACUUM (one that doesn't name a specific
> table).
>
>             regards, tom lane
>
>
>
>
> ---------------------------(end of
> broadcast)--------------------------- TIP 5: don't forget to increase
> your free space map settings
>


--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Attachment

Re: Vacuum goes worse

From
Stéphane Schildknecht
Date:
Tom Lane a écrit :
> "Scott Marlowe" <scott.marlowe@gmail.com> writes:
>
>> On 10/16/07, St=E9phane Schildknecht
>> <stephane.schildknecht@postgresqlfr.org> wrote:
>>
>>> That's also why I am so disappointed vacuum doesn't give me these 4
>>> hints lines.
>>>
>
>
>> What kind of error, or output, does it give you at the end?  Any hint
>> as to why they're missing?
>>
>
> If you're talking about the FSM statistics display, that only gets
> printed by a database-wide VACUUM (one that doesn't name a specific
> table).
>

Yes, I am. The command line is (in a shell script whom ouput is
redirected in a global file) :

vacuumdb -d $DBNAME -p $DBPORT -U $DBUSR -z -v


That does not explain why we don't get FSM statitics display. The output
ends with:
INFO:  vacuuming "public.sometable"
INFO:  "sometable": removed 62 row versions in 3 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "sometable": found 62 removable, 5 nonremovable row versions in 5
pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 534 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.sometable"
INFO:  "sometable": scanned 5 of 5 pages, containing 5 live rows and 0
dead rows; 5 rows in sample, 5 estimated total rows
VACUUM

Best regards,

Stéphane

Re: Vacuum goes worse

From
Alvaro Herrera
Date:
Joshua D. Drake wrote:
> On Tue, 16 Oct 2007 17:03:39 -0700 (PDT)
> Brian Herlihy <btherl@yahoo.com.au> wrote:
>
> > Would it make sense to show the FSM stats for individual table
> > vaccums as well?  I'm wondering if the reason they aren't shown is
> > because it wouldn't be useful or isn't practical, or just that it
> > hasn't been done.
>
> I am not sure how useful it would be as the FSM is global. However what
> would be useful is something like VACUUM SUMMARY, where I could get
> "just" the stats instead of all the other output that comes along with
> VERBOSE.

What would be really useful is to remove all that noise from vacuum and
make it appear on a view.  8.4 material all of this, of course.

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
"Estoy de acuerdo contigo en que la verdad absoluta no existe...
El problema es que la mentira sí existe y tu estás mintiendo" (G. Lama)

Re: Vacuum goes worse

From
"Stefano Dal Pra"
Date:
Your first post says vacuum goes worse (slower).
I see that you do not issue the -f option (FULL VACUUM).

I had a similar situation with a server (with frequent update)
performing nightly vacuumdb. After a few many days it went
slower and slower.

The first solution was to add the -f switch.
Note that it leads to table lock (see docs :-)

the FULL option completely rewrite the table on disk making it much
more compact
(i think of it similar to a "defrag" on windows). I had a dramatic
speed improvement
after the first vacuum full.

latest solution (psql 8.0.1) was a perl script which selectively
chooses tables to
full vacuum basing on results from this select:

SELECT a.relname, a.relpages FROM pg_class a ,pg_stat_user_tables b
WHERE a.relname = b.relname order by relpages desc;

this was to see how much a table's size grows through time.

With psql 8.2.x we adopted pg_autovacuum which seems to perform good,
even thought
i do not clearly understand whether it occasionally performs a full
vacuum (i think he does'nt).

Stefano



On 10/17/07, Stéphane Schildknecht
<stephane.schildknecht@postgresqlfr.org> wrote:
> Tom Lane a écrit :
> > "Scott Marlowe" <scott.marlowe@gmail.com> writes:
> >
> >> On 10/16/07, St=E9phane Schildknecht
> >> <stephane.schildknecht@postgresqlfr.org> wrote:
> >>
> >>> That's also why I am so disappointed vacuum doesn't give me these 4
> >>> hints lines.
> >>>
> >
> >
> >> What kind of error, or output, does it give you at the end?  Any hint
> >> as to why they're missing?
> >>
> >
> > If you're talking about the FSM statistics display, that only gets
> > printed by a database-wide VACUUM (one that doesn't name a specific
> > table).
> >
>
> Yes, I am. The command line is (in a shell script whom ouput is
> redirected in a global file) :
>
> vacuumdb -d $DBNAME -p $DBPORT -U $DBUSR -z -v
>
>
> That does not explain why we don't get FSM statitics display. The output
> ends with:
> INFO:  vacuuming "public.sometable"
> INFO:  "sometable": removed 62 row versions in 3 pages
> DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
> INFO:  "sometable": found 62 removable, 5 nonremovable row versions in 5
> pages
> DETAIL:  0 dead row versions cannot be removed yet.
> There were 534 unused item pointers.
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> INFO:  analyzing "public.sometable"
> INFO:  "sometable": scanned 5 of 5 pages, containing 5 live rows and 0
> dead rows; 5 rows in sample, 5 estimated total rows
> VACUUM
>
> Best regards,
>
> Stéphane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>

Re: Vacuum goes worse

From
Tom Lane
Date:
=?ISO-8859-1?Q?St=E9phane_Schildknecht?= <stephane.schildknecht@postgresqlfr.org> writes:
> Yes, I am. The command line is (in a shell script whom ouput is
> redirected in a global file) :

> vacuumdb -d $DBNAME -p $DBPORT -U $DBUSR -z -v

> That does not explain why we don't get FSM statitics display.

Is $DBUSR a superuser?  If not, some tables are likely getting skipped.

            regards, tom lane

Re: Vacuum goes worse

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> What would be really useful is to remove all that noise from vacuum and
> make it appear on a view.

Well, if you want something decoupled from VACUUM there's already
contrib/pg_freespacemap.

> 8.4 material all of this, of course.

I am hoping that we rewrite FSM into the distributed DSM structure
that's been talked about, so that the whole problem goes away in 8.4.

            regards, tom lane

Re: Vacuum goes worse

From
Alvaro Herrera
Date:
Stefano Dal Pra escribió:
> Your first post says vacuum goes worse (slower).
> I see that you do not issue the -f option (FULL VACUUM).
>
> I had a similar situation with a server (with frequent update)
> performing nightly vacuumdb. After a few many days it went
> slower and slower.

When you have that problem, the solution is to issue more plain vacuum
(not full) more frequently.  If it's a highly updated table, then maybe
once per hour or more.  It depends on the update rate.

> With psql 8.2.x we adopted pg_autovacuum which seems to perform good,
> even thought
> i do not clearly understand whether it occasionally performs a full
> vacuum (i think he does'nt).

It doesn't because it's normally not necessary.  Also, we don't want to
be acquiring exclusive locks in a background automatic process, so if
you really need vacuum full (and I question your need to) then you must
issue it yourself.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Vacuum goes worse

From
"Scott Marlowe"
Date:
On 10/17/07, Stéphane Schildknecht
<stephane.schildknecht@postgresqlfr.org> wrote:
> Tom Lane a écrit :
>
> Yes, I am. The command line is (in a shell script whom ouput is
> redirected in a global file) :
>
> vacuumdb -d $DBNAME -p $DBPORT -U $DBUSR -z -v
>
>
> That does not explain why we don't get FSM statitics display.

Hmmm.  Have you tried running that command interactively? I'm just
wondering if your redirect is somehow dropping bits of the output.

Re: Vacuum goes worse

From
Stéphane Schildknecht
Date:
Tom Lane a écrit :
> =?ISO-8859-1?Q?St=E9phane_Schildknecht?= <stephane.schildknecht@postgresqlfr.org> writes:
>
>> Yes, I am. The command line is (in a shell script whom ouput is
>> redirected in a global file) :
>>
>
>
>> vacuumdb -d $DBNAME -p $DBPORT -U $DBUSR -z -v
>>
>
>
>> That does not explain why we don't get FSM statitics display.
>>
>
> Is $DBUSR a superuser?  If not, some tables are likely getting skipped.
>
>             regards, tom lane
>
No it's not a superuser as some pg_tables are skipped, according to logs.

So that's why these information are not diplayed in 8.19. Shame on me!
Thanks for all the advice. In fact, I did not take care of that as on
8.2.x, these information are displayed whenever you run it with
superuser or not...
What's more it seems (here on my linux box) it does not depends on the
client but on the server version.

Best regards,

--
Stéphane SCHILDKNECHT
Président de PostgreSQLFr
http://www.postgresqlfr.org


Re: Vacuum goes worse

From
Stéphane Schildknecht
Date:
Scott Marlowe a écrit :
> On 10/17/07, Stéphane Schildknecht
> <stephane.schildknecht@postgresqlfr.org> wrote:
>
>> Tom Lane a écrit :
>>
>> Yes, I am. The command line is (in a shell script whom ouput is
>> redirected in a global file) :
>>
>> vacuumdb -d $DBNAME -p $DBPORT -U $DBUSR -z -v
>>
>>
>> That does not explain why we don't get FSM statitics display.
>>
>
> Hmmm.  Have you tried running that command interactively? I'm just
> wondering if your redirect is somehow dropping bits of the output.
>

I tried a few combinations of redirection tests to verify that, and
always get all of the output or none of it.
According to Tom and tests I made on another box, superuser is the key.
Thanks anyway.

Best regards,

--
Stéphane SCHILDKNECHT
Président de PostgreSQLFr
http://www.postgresqlfr.org