Thread: BUG #5774: VACCUM & REINDEX kills production environement

BUG #5774: VACCUM & REINDEX kills production environement

From
"Bala Murugan"
Date:
The following bug has been logged online:

Bug reference:      5774
Logged by:          Bala Murugan
Email address:      b2m@a-cti.com
PostgreSQL version: 8.3.7
Operating system:   openSUSE 10.3 (X86-64) - Kernel \r (\l).
Description:        VACCUM & REINDEX kills production environement
Details:

Iam running postgres 8.3 version for more than 2 yrs on Amazon EC2 Instance,
in recent days Vaccum and reindex make the application down for more than
2hrs. I am not sure this because of my configuration or postgres.

My hardware details : High-CPU Extra Large Instance

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform

conf :


## Connections
port                       = 5432
max_connections            = 1000
listen_addresses           = '*'

## Memory usage
shared_buffers             = '1 GB'
maintenance_work_mem       = '448 MB'
max_fsm_pages              = 220000
max_fsm_relations          = 1000
max_prepared_transactions  = 0

## Performance
checkpoint_segments        = 16
effective_cache_size       = '5 GB'
## To fix bad plans caused by partitioning 3/8/2010
random_page_cost           = 6

## Logging
# Where and how:
log_destination            =  'stderr'
logging_collector          = on
log_directory              = 'pg_log'
log_filename               = 'postgresql-%Y-%m-%d.log'
#log_line_prefix            = '%t %r[%p]: [%l-1]'
log_line_prefix = 'user=%u,db=%d '
# What:
log_min_duration_statement = 200
log_connections            = on
log_disconnections         = on
log_duration               = off
log_statement              = 'ddl'
# Cleanup
log_rotation_age           = '1d'
log_rotation_size          = 0
log_temp_files             = 0

## Statistics
#default_statistics_target  = 700
default_statistics_target  = 100

## Autovacuum
#autovacuum                     = on
#autovacuum_vacuum_scale_factor = 0.1

## Bulkloading - keep these on!
fsync                      = on
full_page_writes           = on


# We're partitioning the historysystem.history table

constraint_exclusion = on
checkpoint_completion_target = 0.9
wal_buffers = '8 MB'


# Changed 3/2/2010 in response to historySystem.history queries
work_mem = '44 MB'
statement_timeout = 0

thanks,
Bala

Re: BUG #5774: VACCUM & REINDEX kills production environement

From
hubert depesz lubaczewski
Date:
On Sun, Nov 28, 2010 at 07:25:52AM +0000, Bala Murugan wrote:
>
> The following bug has been logged online:
>
> Bug reference:      5774
> Logged by:          Bala Murugan
> Email address:      b2m@a-cti.com
> PostgreSQL version: 8.3.7
> Operating system:   openSUSE 10.3 (X86-64) - Kernel \r (\l).
> Description:        VACCUM & REINDEX kills production environement
> Details:
>
> Iam running postgres 8.3 version for more than 2 yrs on Amazon EC2 Instance,
> in recent days Vaccum and reindex make the application down for more than
> 2hrs. I am not sure this because of my configuration or postgres.

is it normal vacuum?

or are you using "vacuum full"?

generally - voth vacuum full and reindex do lock tables for exclusive
access.

that's why you generally don't use them!

vacuum full is especially frowned upon.

as for reindex - if you *really* need it (are you sure? what makes you
think you need it), then there are ways to do reindex without actually
using "reindex" command, which are mostly transparent for users, but you
should check if you really need to run reindex at all.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: BUG #5774: VACCUM & REINDEX kills production environement

From
Balamurugan Mahendran
Date:
This is how I used to do maintenance through cron job. Please advise if i
need to change my method, Also I used to run re-index all the time while
doing Vacuum (cron job updated twice a week).

queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.brands
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.
contactmethodtype
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.contacttype
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.skillset
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.skillsettype
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.location
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.contactmethod
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.contact2
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.contact
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.people
Return code for VaccumDb: 0
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.brands
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contactmethodtype
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contacttype
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.skillset
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.skillsettype
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.location
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contactmethod
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contact2
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contact

Thanks,
Bala


On Sun, Nov 28, 2010 at 9:53 PM, hubert depesz lubaczewski <
depesz@depesz.com> wrote:

> On Sun, Nov 28, 2010 at 07:25:52AM +0000, Bala Murugan wrote:
> >
> > The following bug has been logged online:
> >
> > Bug reference:      5774
> > Logged by:          Bala Murugan
> > Email address:      b2m@a-cti.com
> > PostgreSQL version: 8.3.7
> > Operating system:   openSUSE 10.3 (X86-64) - Kernel \r (\l).
> > Description:        VACCUM & REINDEX kills production environement
> > Details:
> >
> > Iam running postgres 8.3 version for more than 2 yrs on Amazon EC2
> Instance,
> > in recent days Vaccum and reindex make the application down for more than
> > 2hrs. I am not sure this because of my configuration or postgres.
>
> is it normal vacuum?
>
> or are you using "vacuum full"?
>
> generally - voth vacuum full and reindex do lock tables for exclusive
> access.
>
> that's why you generally don't use them!
>
> vacuum full is especially frowned upon.
>
> as for reindex - if you *really* need it (are you sure? what makes you
> think you need it), then there are ways to do reindex without actually
> using "reindex" command, which are mostly transparent for users, but you
> should check if you really need to run reindex at all.
>
> Best regards,
>
> depesz
>
> --
> Linkedin: http://www.linkedin.com/in/depesz  /  blog:
> http://www.depesz.com/
> jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl /
> gg:6749007
>

Re: BUG #5774: VACCUM & REINDEX kills production environement

From
Heikki Linnakangas
Date:
On 29.11.2010 17:16, Balamurugan Mahendran wrote:
> This is how I used to do maintenance through cron job. Please advise if i
> need to change my method, Also I used to run re-index all the time while
> doing Vacuum (cron job updated twice a week).

Most likely you can just remove the reindex commands. Routing reindexing
is not normally needed.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: BUG #5774: VACCUM & REINDEX kills production environement

From
Heikki Linnakangas
Date:
On 29.11.2010 18:48, Balamurugan Mahendran wrote:
> Thanks, I'll remove it. But I still get down time because of Vacuum(table
> lock). Is there any other better way to do this?

Non-full vacuum doesn't lock out concurrent access. If you're getting
downtime, there's got to be some other explanation.

> I don't mind to switch to bigger instance with more Hardware.

Based on the information you've provided, it won't make a difference.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: BUG #5774: VACCUM & REINDEX kills production environement

From
Balamurugan Mahendran
Date:
Thanks, I'll remove it. But I still get down time because of Vacuum(table
lock). Is there any other better way to do this? I don't mind to switch to
bigger instance with more Hardware.

Current Instance :

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High

I can go for this : (Only if this helps me from Outage)

34.2 GB of memory
13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
850 GB of instance storage
64-bit platform
I/O Performance: High


Thanks,
Bala


On Mon, Nov 29, 2010 at 10:01 PM, Heikki Linnakangas <
heikki.linnakangas@enterprisedb.com> wrote:

> On 29.11.2010 17:16, Balamurugan Mahendran wrote:
>
>> This is how I used to do maintenance through cron job. Please advise if i
>> need to change my method, Also I used to run re-index all the time while
>> doing Vacuum (cron job updated twice a week).
>>
>
> Most likely you can just remove the reindex commands. Routing reindexing is
> not normally needed.
>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>

Re: BUG #5774: VACCUM & REINDEX kills production environement

From
Balamurugan Mahendran
Date:
Is there a recommended link/documentation to make sure my configuration? for
my current hardware

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform

Thanks for all your help!!

Thanks,
Bala


On Mon, Nov 29, 2010 at 10:22 PM, Heikki Linnakangas <
heikki.linnakangas@enterprisedb.com> wrote:

> On 29.11.2010 18:48, Balamurugan Mahendran wrote:
>
>> Thanks, I'll remove it. But I still get down time because of Vacuum(table
>> lock). Is there any other better way to do this?
>>
>
> Non-full vacuum doesn't lock out concurrent access. If you're getting
> downtime, there's got to be some other explanation.
>
>
>  I don't mind to switch to bigger instance with more Hardware.
>>
>
> Based on the information you've provided, it won't make a difference.
>
>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>

Re: BUG #5774: VACCUM & REINDEX kills production environement

From
Alvaro Herrera
Date:
Excerpts from Balamurugan Mahendran's message of lun nov 29 13:48:43 -0300 2010:
> Thanks, I'll remove it. But I still get down time because of Vacuum(table
> lock). Is there any other better way to do this? I don't mind to switch to
> bigger instance with more Hardware.

I wonder if you're doing something other than INSERT/UPDATE/DELETE that
causes the service to "block" in the presence of VACUUM.  Maybe your
application is trying to do ALTER TABLE too frequently (or something
else entirely), which *would* block.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support