Thread: Vacuum problem

Vacuum problem

From
S H
Date:
Hi, 

I have a database on which I am running vacuum manually every hour.
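(For context, the hourly vacuum is driven by a cron entry roughly like the following; the database name and log path here are only placeholders:)

    # run a database-wide VACUUM ANALYZE at the top of every hour
    0 * * * *  psql -d mydb -c "VACUUM ANALYZE;" >> /var/log/pg_vacuum.log 2>&1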

On my production database, sometimes when vacuum is running it takes a long time to open new connections.

My current version is 8.1. Is there any known issue with opening connections while vacuum is running?

I found something, but I am not sure whether it applies to 8.1 too.
 
Vacuum of pg_catalog tables causes huge delay in opening new connections.


I think you're hitting the problem that was fixed here:

Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Branch: master Release: REL9_2_BR [532fe28da] 2012-05-26 19:09:52 -0400
Branch: REL9_1_STABLE Release: REL9_1_4 [6c1bf45ea] 2012-05-26 19:09:59 -0400
Branch: REL9_0_STABLE Release: REL9_0_8 [2ce097e6e] 2012-05-26 19:10:05 -0400
Branch: REL8_4_STABLE Release: REL8_4_12 [35cc2be6f] 2012-05-26 19:10:13 -0400
Branch: REL8_3_STABLE Release: REL8_3_19 [422022b12] 2012-05-26 19:10:19 -0400
    Prevent synchronized scanning when systable_beginscan chooses a heapscan.

    The only interesting-for-performance case wherein we force heapscan here
    is when we're rebuilding the relcache init file, and the only such case
    that is likely to be examining a catalog big enough to be syncscanned is
    RelationBuildTupleDesc.  But the early-exit optimization in that code gets
    broken if we start the scan at a random place within the catalog, so that
    allowing syncscan is actually a big deoptimization if pg_attribute is large
    (at least for the normal case where the rows for core system catalogs have
    never been changed since initdb).  Hence, prevent syncscan here.  Per my
    testing pursuant to complaints from Jeff Frost and Greg Sabino Mullane,
    though neither of them seem to have actually hit this specific problem.

    Back-patch to 8.3, where syncscan was introduced.

Regards,
SH

Re: Vacuum problem

From
Scott Marlowe
Date:
On Mon, May 13, 2013 at 8:05 AM, S H <msq001@live.com> wrote:
> Hi,
>
> I have a database on which I am running vacuum manually every hour.

vacuum or vacuum full?

> On my production database, sometimes when vacuum is running it takes a long
> time to open new connections.

What is the db server doing when this happens? What does top, vmstat,
iostat etc have to say about it?
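For example, something along these lines while the problem is happening
(the device name is just an example):

    top                  # look for postgres backends stuck in D (uninterruptible I/O) state
    vmstat 5             # run queue, swapping and %iowait, sampled every 5 seconds
    iostat -x sda 5      # per-device utilization, queue depth and await every 5 seconds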

> My current version is 8.1. Is there any known issue with opening connections
> while vacuum is running?

8.1 is out of support and has been for some time.

> I found something, but I am not sure whether it applies to 8.1 too.

Tom wrote:
>     broken if we start the scan at a random place within the catalog, so that
>     allowing syncscan is actually a big deoptimization if pg_attribute is large
SNIP
>     Back-patch to 8.3, where syncscan was introduced.

It was added in 8.3 so no, it is likely not your issue.

Running 8.1 means you're asking about a system no one else on this
list is likely to still be using much. The hackers aren't gonna be
interested in fixing it either, since it's out of support.


Re: Vacuum problem

From
S H
Date:

> vacuum or vacuum full?

vacuum analyze.

> What is the db server doing when this happens? What does top, vmstat,
> iostat etc have to say about it?

It is a high-end server with 96 GB of RAM and 16 cores, but there are many other applications running on it. This DB is used for monitoring the performance of this server, inserting/updating data into 10-20 tables at short, regular intervals.

I am able to reproduce this issue: if I run vacuum on my database and in parallel try to open 30 connections, the client sometimes takes a long time to open a connection. Is there any workaround, or is there already a known issue for this? If there is a known issue, it will be easier to persuade my customers to upgrade.

The system-call trace of the client is as follows (note the 35-second wait in poll()):

sendto(3, "p\0\0\0(md5de8bdf202e563b11a4384ba5"..., 41, 0, NULL, 0) = 41 <0.000012>
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000005>
poll([{fd=3, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=3, revents=POLLIN}]) <35.027745>
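(The trace above was captured with something along these lines -- the exact invocation, host and database names are placeholders, but -T is what prints the per-syscall timing such as the 35-second poll() above:)

    # trace the connecting client, timestamping each syscall (-tt) and
    # recording the time spent inside it (-T)
    strace -tt -T -o client.trace psql -h dbhost -d mydb -c "SELECT 1"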


> Running 8.1 means you're asking about a system no one else on this
> list is likely to still be using much. The hackers aren't gonna be
> interested in fixing it either, since it's out of support.


I am migrating new customers to a newer version, but for old existing customers it would require significant time. Is there any workaround in the meantime, for example tuning some DB parameters?
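(For example, would the cost-based vacuum delay settings, which do exist in 8.1, help here? Something along these lines -- the values are only placeholders:)

    # current values
    psql -d mydb -c "SHOW ALL" | grep -E "vacuum_cost|maintenance_work_mem"
    # then, hypothetically, in postgresql.conf (followed by a reload):
    #   vacuum_cost_delay = 20        # sleep 20 ms whenever the cost limit is hit,
    #   vacuum_cost_limit = 200       # throttling vacuum's I/O impact
    #   maintenance_work_mem = 256MB  # lets each vacuum pass finish faster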


Re: Vacuum problem

From
Scott Marlowe
Date:
On Mon, May 13, 2013 at 8:37 AM, S H <msq001@live.com> wrote:

>> What is the db server doing when this happens? What does top, vmstat,
>> iostat etc have to say about it?
>
> It is a high-end server with 96 GB of RAM and 16 cores, but there are many
> other applications running on it. This DB is used for monitoring the
> performance of this server, inserting/updating data into 10-20 tables at
> short, regular intervals.
>
> I am able to reproduce this issue: if I run vacuum on my database and in
> parallel try to open 30 connections, the client sometimes takes a long time
> to open a connection. Is there any workaround, or is there already a known
> issue for this? If there is a known issue, it will be easier to persuade my
> customers to upgrade.

I wonder if you've got a bloated pg catalog then. Certainly sounds
like it's a possibility.
So other than vacuuming when you recreate this, is the server working
hard? What is vacuum vacuuming when this happens (pg_stat_activity
should show that).
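Something like this should show it (on 8.1 the columns are procpid and
current_query rather than the later pid/query, and current_query is only
populated if stats_command_string is on):

    # what vacuum is working on while a connection attempt is hanging
    psql -d mydb -c "SELECT procpid, usename, query_start, current_query
                     FROM pg_stat_activity
                     WHERE current_query ILIKE 'vacuum%'
                     ORDER BY query_start;"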

>> Running 8.1 means you're asking about a system no one else on this
>> list is likely to still be using much. The hackers aren't gonna be
>> interested in fixing it either, since it's out of support.
>
> I am migrating to new version, for new customers , but for old existing
> customer, it would require significant time. Is there any workaround for the
> same. Like improving some DB parameters.

Probably not. If the tables that make up the pg catalogs are bloated,
you might have to take some outage time to run a vacuum full /
reindex on them. Some of them, I believe, require single-user mode for
that, but I'm not sure, and definitely not sure whether that applies to
8.1 or not.
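If you want a rough idea of how big the catalogs are, something like this is
a starting point (relpages/reltuples are only estimates, refreshed by the
last vacuum/analyze of each catalog):

    # the ten largest system catalogs by on-disk pages
    psql -d mydb -c "SELECT c.relname, c.relpages, c.reltuples
                     FROM pg_class c
                     JOIN pg_namespace n ON n.oid = c.relnamespace
                     WHERE n.nspname = 'pg_catalog' AND c.relkind = 'r'
                     ORDER BY c.relpages DESC LIMIT 10;"

A pg_attribute or pg_class that is huge relative to its tuple count is the
kind of thing that slows down connection startup.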


Re: Vacuum problem

From
S H
Date:
My disk is shared by many other components. Is there a minimum disk-speed recommendation for Postgres to perform adequately? The current throughput of my disk is around 1-5 MB/sec. Is it sufficient?

Or is the disk too slow, and could that be the cause of the slow vacuuming?


Re: Vacuum problem

From
John R Pierce
Date:
On 5/13/2013 7:10 PM, S H wrote:
> My disk is shared by many other components. Is there a minimum disk-speed
> recommendation for Postgres to perform adequately? The current throughput
> of my disk is around 1-5 MB/sec. Is it sufficient?

how are you measuring this?   that's painfully slow by today's standards;
even my desktop SATA drives can sustain well over 100 MB/second on
sequential reads or writes
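if you want a quick sanity check of sequential throughput, a typical test is
something like this (the path is just a placeholder; oflag=direct bypasses
the page cache, and dropping caches needs root):

    # ~1 GB sequential write to the disk in question
    dd if=/dev/zero of=/path/on/that/disk/ddtest bs=1M count=1024 oflag=direct
    # drop the page cache, then read the file back sequentially
    sync && echo 3 > /proc/sys/vm/drop_caches
    dd if=/path/on/that/disk/ddtest of=/dev/null bs=1M iflag=direct
    rm /path/on/that/disk/ddtest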

what counts in a database server like postgres is NOT the sequential
transfer speed, it's the random IO operations/second.   I'm
benchmarking an 8 disk RAID10 right now and seeing around 2000-4000
writes/sec and as high as 1000-2400 reads/sec during this TPC-B style
transaction benchmark.    iostat -xm during this operation looks like...

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
            3.34    0.00    1.93   39.43    0.00   55.31

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.10  153.50 4901.30     1.89    40.69    17.25   211.50   42.57   0.20 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           10.95    0.00    3.98   36.35    0.00   48.72

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.10  583.40 2955.40     7.10    28.75    20.75   215.77   63.01   0.28  99.97

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           44.36    0.00   14.28   24.61    0.00   16.75

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     1.10 2377.30 1062.90    29.23    29.97    35.24    29.53    8.59   0.29  99.95

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           48.39    0.00   16.45   21.18    0.00   13.98

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00 2564.30 1182.40    31.15    32.42    34.75    28.81    7.69   0.27  99.97

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           44.59    0.00   15.02   25.38    0.00   15.02

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.10 2346.10 1097.50    28.66    29.28    34.46    29.58    8.60   0.29  99.98

(these are 10 second averages)

in general, the solution to more storage performance for a database
server is to use more disks in a raid10 configuration.  I have some
raids that are as many as 20 disks, dedicated to database use (everything
else on the server uses other storage).  These are all 15000rpm SAS2
server drives, on a raid controller with 1GB of flash-backed write-back cache.




--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: Vacuum problem

From
S H
Date:
> I wonder if you've got a bloated pg catalog then. Certainly sounds
> like it's a possibility.
> So other than vacuuming when you recreate this, is the server working
> hard? What is vacuum vacuuming when this happens (pg_stat_activity
> should show that)

Is vacuum full required to avoid bloating? I am running vacuum analyze regularly but not vacuum full.

Could that be the cause of the bloating?

Re: Vacuum problem

From
Scott Marlowe
Date:
On Tue, May 14, 2013 at 7:27 AM, S H <msq001@live.com> wrote:
>> I wonder if you've got a bloated pg catalog then. Certainly sounds
>> like it's a possibility.
>> So other than vacuuming when you recreate this, is the server working
>> hard? What is vacuum vacuuming when this happens (pg_stat_activity
>> should show that)
>
> Is vacuum full required to avoid bloating? I am running vacuum analyze
> regularly but not vacuum full.

Normally vacuum full is NOT required on a regular basis. However, if
you did something like create 100M tables and then drop them, or
did it one at a time really fast, you could outrun your autovacuum
daemons and get bloat in the pg catalog tables.
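Just as an illustration (the table names are made up), a workload like this
churns pg_class, pg_attribute and friends fast enough that vacuum may not
keep up:

    # create and immediately drop a large number of tables; every iteration
    # inserts and then deletes rows in the system catalogs
    for i in $(seq 1 100000); do
      psql -d mydb -c "CREATE TABLE scratch_$i (x int); DROP TABLE scratch_$i;"
    done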

Just offering a possibility for why a connection might be taking a
long time. There's plenty of other possibilities I'm sure.


Re: Vacuum problem

From
S H
Date:
I am doing regular insertions/deletions on the same tables, and running vacuum every hour.

I suspect there is bloating in my tables, but it is not clear to me how bloating in pg_catalog would be happening.
> Normally vacuum full is NOT required on a regular basis. However, if
> you did something like create 100M tables and then drop them, or
> did it one at a time really fast, you could outrun your autovacuum
> daemons and get bloat in the pg catalog tables.
>
> Just offering a possibility for why a connection might be taking a
> long time. There's plenty of other possibilities I'm sure.

Re: Vacuum problem

From
Scott Marlowe
Date:
Not saying you HAVE bloating there, saying you MIGHT.


Re: Vacuum problem

From
S H
Date:
Unfortunately, I confirmed that there is no bloating.

I did some experiments, and it seems that opening connections is always slow while vacuum is running.

The experiment was as follows:
- Run vacuum frequently.
- Open xx connections every minute and close them after one second.
- While vacuum is running, opening a connection takes time.

I can reproduce it almost 100% of the time even when my database is only 30 MB with a bloat factor of 1.5.
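The test loop is roughly this (connection count, host and database are placeholders):

    # kick off a manual vacuum in the background
    psql -d mydb -c "VACUUM ANALYZE;" &
    # in parallel, time how long each new connection takes to open
    for i in $(seq 1 30); do
      ( time psql -h dbhost -d mydb -c "SELECT 1" > /dev/null ) 2>&1 | grep real &
    done
    wait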
