Re: Reindex taking forever, and 99% CPU - Mailing list pgsql-general

From Jeff Janes
Subject Re: Reindex taking forever, and 99% CPU
Date
Msg-id CAMkU=1wOX4pCVEUNVfXMcXifEUn=t7bKn+xF-sJ=dpf50OjJVQ@mail.gmail.com
Whole thread Raw
In response to Reindex taking forever, and 99% CPU  (Phoenix Kiula <phoenix.kiula@gmail.com>)
Responses Re: Reindex taking forever, and 99% CPU  (Shaun Thomas <sthomas@optionshouse.com>)
List pgsql-general
On Saturday, August 2, 2014, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
Hi. I've been patient. PG is 9.0.17, updated via Yum yesterday.

One of my large tables (101 GB on disk, about 1.1 billion rows) used
to take too long to vacuum.

Too long for what?  Rome wasn't build in a day, it might not get vacuumed in a day either.  So what?
 
Not sure if it's an index corruption
issue. But I tried VACUUM FULL ANALYZE as recommended in another
thread yesterday, which took 5 hours on the two times I tried, without
finishing.

Now the REINDEX TABLE has taken over 6 hours as I decided to be
patient and just let something finish. Not sure this is normal though!
How do production level DBAs do this if it takes so long?

Generally speaking, we don't.  
 

If I open another SSH window to my server and try "select * from
pg_stats_activity" it just hangs there, as the REINDEX I presume is
taking up all the memory? I basically can't do anything else on this
server.

Is this large table one of the system tables?
 

Just in case it helps, a segment of my postgresql.conf is below. Would
appreciate any tips on what I can do.

(I did a pg_dump of just this table, which also took about 2 hours,
then I renamed the original table in the database, and tried to
pg_restore just the table, but it gave me an error message about the
archive being in the wrong format !!! So REINDEX or something like it
seems to be the only idea?)

The only idea in order to DO WHAT?  So far the only problems we know about are the ones you are causing yourself, in an effort to fix some problem which we know nothing about, and which might not actually exist in the first place.



Thanks for any help!

PS: CentOS 6 64 bit, 4 GB of RAM, Raid 1 Raptor disks. Postgresql.conf
and TOP output during the running of the REINDEX are below..

Does RAID 1 mean you only have 2 disks in your RAID?  If so, that is woefully inadequate to your apparent workload. The amount of RAM doesn't inspire confidence, either.  If you want to use this hardware, you need to re-calibrate what "patience" means.  Do a "vacuum verbose" (NOT "full") of the large table, and let it run over a weekend, at least.
 
----POSTGRESQL.CONF-----

max_connections                 = 180

That's probably absurd.  If you have an application that loses track of it's connections and doesn't actually try to make use of them and you can't fix that application and you have no evidence of other problems, then this might sense, kind of, as defensive measure.  But since you are in an emergency, or think you are, you should lower this.

maintenance_work_mem            = 320MB

If the only thing running is the vacuum, you could give it a lot more memory than this, like 2 or 3 GB.  But you should probably do that only in the session doing the "emergency" vacuum, not globally.

autovacuum_vacuum_cost_delay    = 20ms

Is vacuum_cost_delay still the default of 0?  

Cheers,

Jeff 

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Reindex taking forever, and 99% CPU
Next
From: David G Johnston
Date:
Subject: Minor pg_stat_activity query/message improvement request