BUG #5774: VACCUM & REINDEX kills production environement

Started by Bala Muruganover 15 years ago8 messagesbugs

b2m@a-cti.com

over 15 years ago

The following bug has been logged online:

Bug reference: 5774
Logged by: Bala Murugan
Email address: b2m@a-cti.com
PostgreSQL version: 8.3.7
Operating system: openSUSE 10.3 (X86-64) - Kernel \r (\l).
Description: VACCUM & REINDEX kills production environement
Details:

Iam running postgres 8.3 version for more than 2 yrs on Amazon EC2 Instance,
in recent days Vaccum and reindex make the application down for more than
2hrs. I am not sure this because of my configuration or postgres.

My hardware details : High-CPU Extra Large Instance

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform

conf :

## Connections
port = 5432
max_connections = 1000
listen_addresses = '*'

## Memory usage
shared_buffers = '1 GB'
maintenance_work_mem = '448 MB'
max_fsm_pages = 220000
max_fsm_relations = 1000
max_prepared_transactions = 0

## Performance
checkpoint_segments = 16
effective_cache_size = '5 GB'
## To fix bad plans caused by partitioning 3/8/2010
random_page_cost = 6

## Logging
# Where and how:
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d.log'
#log_line_prefix = '%t %r[%p]: [%l-1]'
log_line_prefix = 'user=%u,db=%d '
# What:
log_min_duration_statement = 200
log_connections = on
log_disconnections = on
log_duration = off
log_statement = 'ddl'
# Cleanup
log_rotation_age = '1d'
log_rotation_size = 0
log_temp_files = 0

## Statistics
#default_statistics_target = 700
default_statistics_target = 100

## Autovacuum
#autovacuum = on
#autovacuum_vacuum_scale_factor = 0.1

## Bulkloading - keep these on!
fsync = on
full_page_writes = on

# We're partitioning the historysystem.history table

constraint_exclusion = on
checkpoint_completion_target = 0.9
wal_buffers = '8 MB'

# Changed 3/2/2010 in response to historySystem.history queries
work_mem = '44 MB'
statement_timeout = 0

thanks,
Bala

hubert depesz lubaczewski

depesz@depesz.com

over 15 years ago

In reply to: Bala Murugan (#1)

Re: BUG #5774: VACCUM & REINDEX kills production environement

On Sun, Nov 28, 2010 at 07:25:52AM +0000, Bala Murugan wrote:

The following bug has been logged online:

Bug reference: 5774
Logged by: Bala Murugan
Email address: b2m@a-cti.com
PostgreSQL version: 8.3.7
Operating system: openSUSE 10.3 (X86-64) - Kernel \r (\l).
Description: VACCUM & REINDEX kills production environement
Details:

Iam running postgres 8.3 version for more than 2 yrs on Amazon EC2 Instance,
in recent days Vaccum and reindex make the application down for more than
2hrs. I am not sure this because of my configuration or postgres.

is it normal vacuum?

or are you using "vacuum full"?

generally - voth vacuum full and reindex do lock tables for exclusive
access.

that's why you generally don't use them!

vacuum full is especially frowned upon.

as for reindex - if you *really* need it (are you sure? what makes you
think you need it), then there are ways to do reindex without actually
using "reindex" command, which are mostly transparent for users, but you
should check if you really need to run reindex at all.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Balamurugan Mahendran

balamurugan@adaptavant.com

over 15 years ago

In reply to: hubert depesz lubaczewski (#2)

Re: BUG #5774: VACCUM & REINDEX kills production environement

This is how I used to do maintenance through cron job. Please advise if i
need to change my method, Also I used to run re-index all the time while
doing Vacuum (cron job updated twice a week).

queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.brands
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.
contactmethodtype
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.contacttype
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.skillset
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.skillsettype
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.location
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.contactmethod
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.contact2
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.contact
queryfor Vaccum:VACUUM VERBOSE ANALYZE contactmanagementsystem.people
Return code for VaccumDb: 0
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.brands
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contactmethodtype
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contacttype
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.skillset
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.skillsettype
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.location
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contactmethod
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contact2
queryfor Vaccum:REINDEX TABLE contactmanagementsystem.contact

Thanks,
Bala

On Sun, Nov 28, 2010 at 9:53 PM, hubert depesz lubaczewski <
depesz@depesz.com> wrote:

Show quoted text

On Sun, Nov 28, 2010 at 07:25:52AM +0000, Bala Murugan wrote:

The following bug has been logged online:

Bug reference: 5774
Logged by: Bala Murugan
Email address: b2m@a-cti.com
PostgreSQL version: 8.3.7
Operating system: openSUSE 10.3 (X86-64) - Kernel \r (\l).
Description: VACCUM & REINDEX kills production environement
Details:

Iam running postgres 8.3 version for more than 2 yrs on Amazon EC2

Instance,

in recent days Vaccum and reindex make the application down for more than
2hrs. I am not sure this because of my configuration or postgres.

is it normal vacuum?

or are you using "vacuum full"?

generally - voth vacuum full and reindex do lock tables for exclusive
access.

that's why you generally don't use them!

vacuum full is especially frowned upon.

as for reindex - if you *really* need it (are you sure? what makes you
think you need it), then there are ways to do reindex without actually
using "reindex" command, which are mostly transparent for users, but you
should check if you really need to run reindex at all.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz / blog:
http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl /
gg:6749007

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 15 years ago

In reply to: Balamurugan Mahendran (#3)

Re: BUG #5774: VACCUM & REINDEX kills production environement

On 29.11.2010 17:16, Balamurugan Mahendran wrote:

This is how I used to do maintenance through cron job. Please advise if i
need to change my method, Also I used to run re-index all the time while
doing Vacuum (cron job updated twice a week).

Most likely you can just remove the reindex commands. Routing reindexing
is not normally needed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Balamurugan Mahendran

balamurugan@adaptavant.com

over 15 years ago

In reply to: Heikki Linnakangas (#4)

Re: BUG #5774: VACCUM & REINDEX kills production environement

Thanks, I'll remove it. But I still get down time because of Vacuum(table
lock). Is there any other better way to do this? I don't mind to switch to
bigger instance with more Hardware.

Current Instance :

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High

I can go for this : (Only if this helps me from Outage)

34.2 GB of memory
13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
850 GB of instance storage
64-bit platform
I/O Performance: High

Thanks,
Bala

On Mon, Nov 29, 2010 at 10:01 PM, Heikki Linnakangas <
heikki.linnakangas@enterprisedb.com> wrote:

Show quoted text

On 29.11.2010 17:16, Balamurugan Mahendran wrote:

This is how I used to do maintenance through cron job. Please advise if i
need to change my method, Also I used to run re-index all the time while
doing Vacuum (cron job updated twice a week).

Most likely you can just remove the reindex commands. Routing reindexing is
not normally needed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 15 years ago

In reply to: Balamurugan Mahendran (#5)

Re: BUG #5774: VACCUM & REINDEX kills production environement

On 29.11.2010 18:48, Balamurugan Mahendran wrote:

Thanks, I'll remove it. But I still get down time because of Vacuum(table
lock). Is there any other better way to do this?

Non-full vacuum doesn't lock out concurrent access. If you're getting
downtime, there's got to be some other explanation.

I don't mind to switch to bigger instance with more Hardware.

Based on the information you've provided, it won't make a difference.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Balamurugan Mahendran

balamurugan@adaptavant.com

over 15 years ago

In reply to: Heikki Linnakangas (#6)

Re: BUG #5774: VACCUM & REINDEX kills production environement

Is there a recommended link/documentation to make sure my configuration? for
my current hardware

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform

Thanks for all your help!!

Thanks,
Bala

On Mon, Nov 29, 2010 at 10:22 PM, Heikki Linnakangas <
heikki.linnakangas@enterprisedb.com> wrote:

Show quoted text

On 29.11.2010 18:48, Balamurugan Mahendran wrote:

Thanks, I'll remove it. But I still get down time because of Vacuum(table
lock). Is there any other better way to do this?

Non-full vacuum doesn't lock out concurrent access. If you're getting
downtime, there's got to be some other explanation.

I don't mind to switch to bigger instance with more Hardware.

Based on the information you've provided, it won't make a difference.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Alvaro Herrera

alvherre@2ndquadrant.com

over 15 years ago

In reply to: Balamurugan Mahendran (#5)

Re: BUG #5774: VACCUM & REINDEX kills production environement

Excerpts from Balamurugan Mahendran's message of lun nov 29 13:48:43 -0300 2010:

Thanks, I'll remove it. But I still get down time because of Vacuum(table
lock). Is there any other better way to do this? I don't mind to switch to
bigger instance with more Hardware.

I wonder if you're doing something other than INSERT/UPDATE/DELETE that
causes the service to "block" in the presence of VACUUM. Maybe your
application is trying to do ALTER TABLE too frequently (or something
else entirely), which *would* block.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support