Thread: REINDEX takes half a day (and still not complete!)
I have a large table but not as large as the kind of numbers that get discussed on this list. It has 125 million rows.

REINDEXing the table takes half a day, and it's still not finished.

To write this post I did "SELECT COUNT(*)", and here's the output -- so long!

select count(*) from links;
   count
-----------
 125418191
(1 row)

Time: 1270405.373 ms

That's 1270 seconds!

I suppose the vacuum analyze is not doing its job? As you can see from the settings below, I have autovacuum set to ON, and there's also a cronjob every 10 hours to do a manual vacuum analyze on this table, which is the largest.

PG is version 8.2.9.

Any thoughts on what I can do to improve performance!?

Below are my settings:

max_connections = 300
shared_buffers = 500MB
effective_cache_size = 1GB
max_fsm_relations = 1500
max_fsm_pages = 950000

work_mem = 100MB
temp_buffers = 4096
authentication_timeout = 10s
ssl = off
checkpoint_warning = 3600
random_page_cost = 1

autovacuum = on
autovacuum_vacuum_cost_delay = 20
vacuum_cost_delay = 20
vacuum_cost_limit = 600
autovacuum_naptime = 10
stats_start_collector = on
stats_row_level = on
autovacuum_vacuum_threshold = 75
autovacuum_analyze_threshold = 25
autovacuum_analyze_scale_factor = 0.02
autovacuum_vacuum_scale_factor = 0.01

wal_buffers = 64
checkpoint_segments = 128
checkpoint_timeout = 900
fsync = on
maintenance_work_mem = 512MB
On Fri, Mar 18, 2011 at 9:07 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> I have a large table but not as large as the kind of numbers that get
> discussed on this list. It has 125 million rows.
>
> REINDEXing the table takes half a day, and it's still not finished.
>
> max_fsm_relations = 1500
> max_fsm_pages = 950000
>
> work_mem = 100MB

What is the output of running VACUUM VERBOSE as a superuser? (You can run it on the postgres database so it returns fast.) We're looking for the output that looks like this:

INFO:  free space map contains 1930193 pages in 749 relations
DETAIL:  A total of 1787744 page slots are in use (including overhead).
1787744 page slots are required to track all free space.
Current limits are:  10000000 page slots, 3000 relations, using 58911 kB.

If the page slots needed exceed the configured page slots, then you need to crank up your free space map (max_fsm_pages). If the relations tracked exceed the available relations, then you'll need to crank up max_fsm_relations.
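A minimal sketch of that check from psql (assuming you can connect as the postgres superuser; the free-space-map summary is printed in the last few INFO/DETAIL lines of the output):

  -- run as a superuser; connecting to the small "postgres" database keeps it quick
  VACUUM VERBOSE;

  -- compare the "page slots are required" number from those final lines
  -- against the currently configured limits:
  SHOW max_fsm_pages;
  SHOW max_fsm_relations;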
On Fri, Mar 18, 2011 at 9:07 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> autovacuum = on
> autovacuum_vacuum_cost_delay = 20
>
> vacuum_cost_delay = 20
> vacuum_cost_limit = 600
>
> autovacuum_naptime = 10

Also, if vacuum can't keep up you can increase the vacuum cost limit and lower the cost delay. Anything above 1ms is still quite a wait compared to 0, and most systems don't have the timer granularity to go that low anyway, so 5ms is about as low as you can go and still see a difference from 0. Also, if you've got a lot of large relations you might need to increase the max workers as well.
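As a rough illustration only (the numbers are placeholders to tune against your own I/O headroom, not a recommendation for this particular box), a more aggressive setup in postgresql.conf might look like:

  # illustrative values only
  vacuum_cost_delay = 0              # manual VACUUM runs unthrottled
  vacuum_cost_limit = 2000
  autovacuum_vacuum_cost_delay = 5   # ms; lower = more aggressive autovacuum
  #autovacuum_max_workers = 3        # 8.3+ only; 8.2 has a single autovacuum worker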
Thanks Scott.

> What is the output of running vacuum verbose as a superuser (you can
> run it on the postgres database so it returns fast.)

Here's the output for the postgres DB:

INFO:  free space map contains 110614 pages in 33 relations
DETAIL:  A total of 110464 page slots are in use (including overhead).
110464 page slots are required to track all free space.
Current limits are:  950000 page slots, 1500 relations, using 5665 kB.
VACUUM

Does running it on the postgres database also show the relevant info for the other databases? From the above it seems fine, right?

> also, if vacuum can't keep up you can increase the vacuum cost limit,
> and lower the cost delay.

I'm not sure I understand this.

(1) I should increase "max workers"? But I am on version 8.2.9 -- did this version have "autovacuum_max_workers"? It seems to be a more recent thing: http://sn.im/27nxe1

(2) The big table in my database (with 125 million rows) has about 5,000 rows that get DELETEd every day, about 100,000 new INSERTs, and about 12,000 UPDATEs.

(3) What's that thing about cost delay? Which values from vacuum should I check to determine the cost delay -- what's the specific formula?

Thanks!
On Fri, Mar 18, 2011 at 10:07 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> REINDEXing the table takes half a day, and it's still not finished.
>
> PG is version 8.2.9.
>
> maintenance_work_mem = 512MB

How much memory do you have? You might want to consider raising maintenance_work_mem to 1GB. Are other things going on in the database while you are rebuilding your indexes? Is it possible you are blocked waiting on a lock for a while?

How much index data is there? Can we see the table definition along with the create index statements?

merlin
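For reference, one way to pull the index definitions and their on-disk sizes for the table (a sketch only; "links" is the table name from the original post, and these catalog/size functions are available on 8.2):

  -- in psql, \d links shows the columns plus the index definitions

  SELECT c2.relname                                AS index_name,
         pg_size_pretty(pg_relation_size(c2.oid))  AS index_size,
         pg_get_indexdef(i.indexrelid)             AS definition
  FROM pg_class c
  JOIN pg_index i  ON i.indrelid = c.oid
  JOIN pg_class c2 ON c2.oid = i.indexrelid
  WHERE c.relname = 'links';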
On Mon, Mar 21, 2011 at 8:14 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Thanks Merlin, Scott.
>
> First, yes, I can increase maintenance_work_mem. I have 8GB RAM in
> total, and sure, I can dedicate 1GB of it to PG. Currently PG is the
> most intensive software here.

If we're talking maintenance_work_mem, then you can set it for a single connection:

set maintenance_work_mem='1000MB';
reindex yada yada;

So it's not global, just local to that session.

> Second, how can I check if there are other things going on in the
> database while I REINDEX? Maybe some kind of vacuum is going on, but
> isn't that supposed to wait while REINDEX is happening, at least for
> this table?

OK, my main point has been that if autovacuum is running well enough, then you don't need REINDEX. If you are running it, it's a maintenance thing you shouldn't have to schedule all the time -- only run it until you get autovacuum tuned up enough to handle your db during the day. However, I know sometimes you're stuck with what you're stuck with.

You can see what else is running with the pg_stat_activity view, which will show you all running queries. That and iotop can show you which processes are chewing up how much IO. The other pg_stat_* views can give you a good idea of what's happening to your tables in the database. iostat and vmstat can give you an idea of how much IO bandwidth you're using.

If a vacuum starts after the reindex it will either wait or abort and not get in the way. If a vacuum is already running I'm not sure if it will get killed or not.
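Putting both suggestions together, a sketch of what such a session might look like (the index name new_idx_userid is taken from later in this thread; procpid and current_query are the 8.2-era pg_stat_activity column names):

  -- bump maintenance memory for this connection only, then rebuild
  SET maintenance_work_mem = '1000MB';
  REINDEX INDEX new_idx_userid;

  -- in another session: what else is running right now?
  SELECT procpid, usename, query_start, current_query
  FROM pg_stat_activity
  WHERE current_query <> '<IDLE>'
  ORDER BY query_start;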
Sorry, rejuvenating a thread that was basically unanswered.

I closed the database to any kind of access to focus on maintenance operations, and killed all earlier processes so that my maintenance is the only stuff going on.

REINDEX is still taking 3 hours -- and it is still not finished!

Similarly, if I cancel the REINDEX and issue a VACUUM ANALYZE VERBOSE, this too seems to just hang there on my big table.

I changed maintenance_work_mem to 2GB for this operation. It's highly worrisome -- the above slow times are with 2GB of my server dedicated to Postgresql!!!!

Surely this is not tenable for enterprise environments? I am on a 64bit RedHat server with dual CPU Intel Woodcrest or whatever that was called. Postgres is 8.2.9.

How do DB folks do this with small maintenance windows? This is for a very high traffic website so it's beginning to get embarrassing.

Would appreciate any thoughts or pointers. Thanks!
On Sun, Apr 17, 2011 at 9:30 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Sorry, rejuvenating a thread that was basically unanswered.
>
> REINDEX is still taking 3 hours -- and it is still not finished!
>
> How do DB folks do this with small maintenance windows? This is for a
> very high traffic website so it's beginning to get embarrassing.

Upgrade to something more modern than 8.2.x -- autovacuum was still very much in its infancy back then. 9.0 or higher is a good choice.

What do "iostat -xd 10", "vmstat 10" and top say about these processes when they're running? "It's taking a really long time and seems like it's hanging" tells us nothing useful. Your OS has tools to let you figure out what's bottlenecking your operations, so get familiar with them and let us know what they tell you. These are all suggestions I made before, which you have now classified as "not answering your questions", so I'm getting a little tired of helping you when you don't seem interested in helping yourself.

What are your vacuum and autovacuum costing values set to? Can you make vacuum and/or autovacuum more aggressive?
> How do DB folks do this with small maintenance windows? This is for a
> very high traffic website so it's beginning to get embarrassing.
Normally there is no need to issue REINDEX at all. What's your reason for needing it?
Jesper
On Sun, Apr 17, 2011 at 9:44 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Upgrade to something more modern than 8.2.x. Autovacuum was still
> very much in its infancy back then. 9.0 or higher is a good choice.
>
> What are your vacuum and autovacuum costing values set to? Can you
> make vacuum and/or autovacuum more aggressive?

Also a few more questions: what are you using for storage? How many drives, RAID controller if any, RAID configuration, etc.?
Thanks Scott. I have shared huge amounts of info in my emails to Merlin and you, intentionally not shared in public. Apologies if you are feeling tired.

The reason I need to REINDEX is that a simple SELECT query based on the indexed column is taking ages. It used to take less than a second. I want to make sure that the index is properly in place, at least.

We went through some bloat reports. Apparently Merlin told me there's no significant bloat. A manual VACUUM right now takes ages too. AUTOVACUUM settings are below.

It's a RAID 1 setup. Two Raptor 10000rpm disks.

TOP does not show much beyond "postmaster". How should I use TOP and what info can I give you? This is what it looks like:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
14231 root      18   0  4028  872  728 R 93.8  0.0 28915:37  exim_dbmbuild
11001 root      25   0  4056  864  716 R 93.8  0.0 23111:06  exim_dbmbuild
16400 root      25   0  4824  864  720 R 92.5  0.0 33843:52  exim_dbmbuild
 4799 postgres  15   0  532m  94m  93m D  0.7  1.2   0:00.14  postmaster
12292 nobody    15   0 48020  14m 5088 S  0.7  0.2   0:00.06  httpd
12943 root      17   0  2828 1224  776 R  0.7  0.0   0:00.04  top
 7236 mysql     16   0  224m  64m 3692 S  0.3  0.8  26:43.46  mysqld
31421 postgres  15   0  530m  12m  12m S  0.3  0.2   0:03.08  postmaster
31430 postgres  15   0 10456  576  224 S  0.3  0.0   0:00.08  postmaster
  955 postgres  15   0  532m  91m  90m S  0.3  1.1   0:00.15  postmaster
 1054 postgres  15   0  532m 196m 195m S  0.3  2.4   0:00.37  postmaster
 1232 postgres  15   0  532m  99m  98m D  0.3  1.2   0:00.27  postmaster
 1459 postgres  15   0  532m  86m  85m S  0.3  1.1   0:00.12  postmaster
 4552 postgres  15   0  532m  86m  85m S  0.3  1.1   0:00.08  postmaster
 7187 postgres  15   0  532m 157m 155m S  0.3  1.9   0:00.19  postmaster
 7587 postgres  15   0  532m 175m 173m D  0.3  2.2   0:00.23  postmaster
 8131 postgres  15   0  532m 154m 152m S  0.3  1.9   0:00.15  postmaster
 9473 nobody    16   0 48268  15m 5800 S  0.3  0.2   0:00.34  httpd
 9474 nobody    15   0 48096  14m 5472 S  0.3  0.2   0:00.27  httpd
10688 nobody    16   0     0    0    0 Z  0.3  0.0   0:00.20  httpd <defunct>
12261 nobody    15   0 47956  13m 4296 S  0.3  0.2   0:00.08  httpd
12278 nobody    15   0 47956  13m 4052 S  0.3  0.2   0:00.04  httpd
12291 nobody    15   0 47972  14m 4956 S  0.3  0.2   0:00.07  httpd
12673 nobody    15   0 47912  13m 4180 S  0.3  0.2   0:00.02  httpd
12674 nobody    15   0 47936  13m 4924 S  0.3  0.2   0:00.02  httpd
12678 nobody    16   0 47912  13m 4060 S  0.3  0.2   0:00.01  httpd
12727 nobody    15   0 47912  13m 4024 S  0.3  0.2   0:00.03  httpd
12735 nobody    15   0 47912  13m 4144 S  0.3  0.2   0:00.02  httpd

VMSTAT 10 shows this:

 r  b   swpd   free  buff   cache   si so   bi  bo   in     cs us sy id wa
 3 14  99552  17900 41108 7201712   0  0   42  11    0      0  8 34 41 16
 2 17  99552  16468 41628 7203012   0  0 1326  84 1437 154810  7 66 12 15
 3  7  99476  16796 41056 7198976   0  0 1398  96 1453 156211  7 66 21  6
 3 17  99476  17228 39132 7177240   0  0 1325  68 1529 156111  8 65 16 11

The results of "iostat -xd 10" are:

Device: rrqm/s wrqm/s    r/s   w/s  rsec/s wsec/s   rkB/s  wkB/s avgrq-sz avgqu-sz await svctm  %util
sda       0.24  24.55   9.33  4.41  111.31 231.75   55.65 115.88    24.97     0.17 12.09  6.67   9.17
sdb       0.06  97.65   2.21  3.97   91.59 389.58   45.80 194.79    77.84     0.06  9.95  2.73   1.69
sdc       1.46  62.71 187.20 29.13  132.43 311.72   66.22 155.86     2.05     0.36  1.65  1.12  24.33

Device: rrqm/s wrqm/s    r/s   w/s  rsec/s wsec/s   rkB/s  wkB/s avgrq-sz avgqu-sz await svctm  %util
sda       0.00   7.41   0.30  3.50    2.40  87.29    1.20  43.64    23.58     0.13 32.92 10.03   3.81
sdb       0.00   0.00   0.00  0.00    0.00   0.00    0.00   0.00     0.00     0.00  0.00  0.00   0.00
sdc       0.00  18.32 158.26  4.10 2519.32 180.98 1259.66  90.49    16.63    13.04 79.91  6.17 100.11

Device: rrqm/s wrqm/s    r/s   w/s  rsec/s wsec/s   rkB/s  wkB/s avgrq-sz avgqu-sz await svctm  %util
sda       0.00   6.21   0.00  1.40    0.00  60.86    0.00  30.43    43.43     0.03 20.07 15.00   2.10
sdb       0.00   0.00   0.00  0.00    0.00   0.00    0.00   0.00     0.00     0.00  0.00  0.00   0.00
sdc       0.10  10.31 159.06  2.50 2635.44 101.70 1317.72  50.85    16.94    12.82 79.44  6.20 100.12

8GB memory in total. 1GB devoted to PGSQL during these operations. Otherwise, my settings are as follows (and yes, I did make the vacuum settings more aggressive based on your email, which has had no apparent impact):

max_connections = 350
shared_buffers = 500MB
effective_cache_size = 1250MB
max_fsm_relations = 1500
max_fsm_pages = 950000
work_mem = 100MB
maintenance_work_mem = 200MB
temp_buffers = 4096
authentication_timeout = 10s
ssl = off
checkpoint_warning = 3600
random_page_cost = 1

What else can I share? Thanks much for offering to help.
On April 17, 2011, Phoenix <phoenix.kiula@gmail.com> wrote:
> >> Surely this is not tenable for enterprise environments? I am on a
> >> 64bit RedHat server with dual CPU Intel Woodcrest or whatever that was
> >> called. Postgres is 8.2.9.
> >>
>> it's a RAID 1 setup. Two Raptor 10000rpm disks.

.. and you have essentially 1 disk drive. Your hardware is not sized for a database server.
On Sun, Apr 17, 2011 at 10:59 AM, Phoenix <phoenix.kiula@gmail.com> wrote:
> TOP does not show much beyond "postmaster". How should I use TOP and
> what info can I give you? This is what it looks like:

We're basically looking to see if the postmaster process doing the vacuuming or reindexing is stuck in a D state, which means it's waiting on IO. Hit the "c" key while top is running and you should get a little more info on which processes are what.

>  4799 postgres  15   0  532m  94m  93m D  0.7  1.2   0:00.14  postmaster

That is likely the postmaster that is waiting on IO.

> VMSTAT 10 shows this:
>
>  r  b   swpd   free  buff   cache   si so   bi  bo   in     cs us sy id wa
>  3 14  99552  17900 41108 7201712   0  0   42  11    0      0  8 34 41 16
>  2 17  99552  16468 41628 7203012   0  0 1326  84 1437 154810  7 66 12 15
>  3  7  99476  16796 41056 7198976   0  0 1398  96 1453 156211  7 66 21  6
>  3 17  99476  17228 39132 7177240   0  0 1325  68 1529 156111  8 65 16 11

So we're at 11 to 15% IO wait. I'm gonna guess you have 8 cores / threads in your CPUs, and 1/8th of 100% is 12%, so it looks like you're probably IO bound here. iostat tells us more:

> The results of "iostat -xd 10" are:
> Device: rrqm/s wrqm/s    r/s   w/s  rsec/s wsec/s   rkB/s  wkB/s avgrq-sz avgqu-sz await svctm  %util
> sda       0.00   7.41   0.30  3.50    2.40  87.29    1.20  43.64    23.58     0.13 32.92 10.03   3.81
> sdb       0.00   0.00   0.00  0.00    0.00   0.00    0.00   0.00     0.00     0.00  0.00  0.00   0.00
> sdc       0.00  18.32 158.26  4.10 2519.32 180.98 1259.66  90.49    16.63    13.04 79.91  6.17 100.11

100% IO utilization on sdc, so yes, it's likely that your sdc drive is your bottleneck. Given how little data is actually moving through the sdc drive, it's not very fast.

> 8GB memory in total. 1GB devoted to PGSQL during these operations.
> Otherwise, my settings are as follows (and yes, I did make the vacuum
> settings more aggressive based on your email, which has had no
> apparent impact):

Yeah, as it gets more aggressive it can use more of your IO bandwidth. Since you

> What else can I share?

That's a lot of help. I'm assuming you're running software or motherboard fake-raid on this RAID-1 set? I'd suggest buying a $500 or so battery-backed caching RAID controller first; the improvements in performance are huge with such a card. You might wanna try testing the current RAID-1 set with bonnie++ to get an idea of how fast it is.
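If you do want a rough baseline of what the array can actually deliver, something along these lines is a common approach (the paths, file sizes and target directory are placeholders -- the test files should live on the same filesystem as the data directory and be larger than RAM to defeat caching):

  # quick-and-dirty sequential write test with direct I/O (writes an 8 GB file)
  dd if=/dev/zero of=/var/lib/pgsql/ddtest bs=8192 count=1000000 oflag=direct

  # fuller picture with bonnie++: run as the postgres user against an existing
  # directory on the data filesystem, with a 16 GB file set on an 8 GB box
  bonnie++ -d /var/lib/pgsql/bonnie -s 16384 -u postgres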
Thanks for these suggestions.

I am beginning to wonder if the issue is deeper.

I set autovacuum to off, then turned off all the connections to the database, and did a manual vacuum just to see how long it takes. That was last night my time. I woke up this morning and it has still not finished.

The maintenance_work_mem given to the DB for this process was 2GB.

There is nothing else going on on the server! Now, even REINDEX is just failing in the middle:

# REINDEX INDEX new_idx_userid;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

What else could be wrong?
Btw, hardware is not an issue. My db has been working fine for a while. Smaller, poorer systems around the web run InnoDB databases. I wouldn't touch that with a barge pole.

I have a hardware RAID controller, not "fake". It's a good quality battery-backed 3Ware:
http://192.19.193.26/products/serial_ata2-9000.asp
On Mon, Apr 18, 2011 at 7:14 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> # REINDEX INDEX new_idx_userid;
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> What else could be wrong?

That's hardly enough information to guess, but since we're trying to guess, maybe your maintenance_work_mem went overboard and your server ran out of RAM. Or disk space.

Aside from a bug, that's the only reason I can think of for a pg backend to bail out like that. Well, the connection could have been cut off by other means (i.e. someone tripped on the cable or something), but let's not dwell on those options.
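If RAM or disk exhaustion is the suspicion, the quickest checks on the box itself would look something like this (the log location and data directory path are assumptions for a stock CentOS install):

  # did the kernel OOM killer take out a backend?
  dmesg | grep -i -E 'out of memory|oom'
  grep -i 'out of memory' /var/log/messages

  # current memory and disk headroom
  free -m
  df -h /var/lib/pgsql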
On Mon, Apr 18, 2011 at 8:39 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> Aside from a bug, that's the only reason I can think of for a pg backend
> to bail out like that.

Sorry for the double-post, but I should add: you really should set up some kind of monitoring, like cacti[0] with snmp or a similar setup, so you can monitor the state of your server in detail without having to stare at it.

[0] http://www.cacti.net/
People are running larger InnoDB databases on poorer hardware. Note that I wouldn't dream of it because I care about data integrity and stability, but this discussion is purely about performance and I know it is possible. I am sure throwing hardware at it is not the solution; I'm just trying to highlight what the root cause is. Raptor disks are not that bad, even if there's just "one" disk with RAID1, especially for a SELECT-heavy web app.

Scott's idea of upgrading to 9.x is a good one, but it's not been easy in the past. There have been issues related to UTF-8, after the whole RPM stuff on CentOS has been sorted out.

QUESTION: If autovacuum is ON and I'm running a manual vacuum, will they conflict with each other, or will one of them basically wait for the other to finish?

On Mon, Apr 18, 2011 at 2:13 AM, Alan Hodgson <ahodgson@simkin.ca> wrote:
> .. and you have essentially 1 disk drive. Your hardware is not sized for a
> database server.
On Sun, Apr 17, 2011 at 11:19 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Btw, hardware is not an issue. My db has been working fine for a
> while. Smaller, poorer systems around the web run InnoDB databases. I
> wouldn't touch that with a barge pole.

Did you or someone in an earlier post say that you didn't have problems with table bloat? I can't remember for sure.

Anyway, if it's not hardware then it's drivers or your OS. The output of iostat is abysmally bad: 100% utilization but actual throughput is pretty low. Have you used the CLI utility for your RAID card to check for possible problems or errors? Maybe your battery is dead or non-functioning? Don't just rule out hardware until you're sure yours is working well.
On Mon, Apr 18, 2011 at 1:26 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Anyway, if it's not hardware then it's drivers or your OS. The output
> of iostat is abysmally bad: 100% utilization but actual throughput is
> pretty low.

For instance, here is what I get from iostat on my very CPU-bound 8-core Opteron machine with a battery-backed caching controller:

Device: rrqm/s  wrqm/s    r/s   w/s   rsec/s  wsec/s avgrq-sz avgqu-sz await svctm %util
sda       0.00    9.50   0.30 11.20     2.40 1826.40   159.03     0.01  0.54  0.50  0.58
sdb      42.40  219.80 114.60 41.10 27982.40 2088.80   193.14     0.26  1.67  1.42 22.16

Note that sda is the system / pg_xlog drive, and sdb is the /data/base dir, minus pg_xlog. I'm reading ~19MB/s and writing ~1MB/s on sdb, and that's using approximately 22% of the IO. My CPUs are all pegged at 100% and I'm getting ~2500 tps. I'm betting pgbench on your system will get something really low, like 200 tps, and be maxing out your %util.
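For comparison purposes, a rough sketch of such a pgbench run (the scale factor and client count are arbitrary placeholders; the 8.2-era contrib pgbench takes a transaction count with -t rather than a run time):

  # build a throwaway benchmark database at scale factor 100 (roughly 1.5 GB)
  createdb bench
  pgbench -i -s 100 bench

  # 8 concurrent clients, 10000 transactions each; watch iostat/vmstat while it runs
  pgbench -c 8 -t 10000 bench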
On Sun, Apr 17, 2011 at 11:19 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Btw, hardware is not an issue. My db has been working fine for a
> while. Smaller, poorer systems around the web run InnoDB databases. I
> wouldn't touch that with a barge pole.
>
> I have a hardware RAID controller, not "fake". It's a good quality
> battery-backed 3Ware:
> http://192.19.193.26/products/serial_ata2-9000.asp

(please stop top posting)

Also, when you run top and hit "c", what do those various postgres processes say they're doing? bgwriter, SELECT, VACUUM etc.?
On Mon, Apr 18, 2011 at 3:38 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Also, when you run top and hit "c", what do those various postgres
> processes say they're doing? bgwriter, SELECT, VACUUM etc.?

Thanks. But let me do the "top" stuff later. I think I have a bigger problem now.

While doing a PG dump, I seem to get this error:

ERROR: invalid memory alloc request size 4294967293

Upon googling, this seems to be a data corruption issue!

One of the older messages suggests that I do a "file level backup and restore of the data" -- http://archives.postgresql.org/pgsql-admin/2008-05/msg00191.php

How does one do this -- should I copy the data folder? What are the specific steps to restore from there? Would I simply copy the files from the data folder back to the new install or something? Can't find these steps in the PG documentation.

I'm on PG 8.2.9, CentOS 5, with 8GB of RAM.
On Mon, Apr 18, 2011 at 1:45 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> While doing a PG dump, I seem to get this error:
>
> ERROR: invalid memory alloc request size 4294967293
>
> Upon googling, this seems to be a data corruption issue!

I wonder if you've got a drive going bad (or both of them). What does your RAID card have to say about the drives?

To do a file-level backup, set up another machine on the same network, with enough space on a drive with write access for the account you want to back up to. Shut down the Postgres server (sudo /etc/init.d/postgresql stop or something like that), then use

rsync -avl /data/pgdir remoteserver:/newdatadir/

to back it up. You want to start with that so you can at least get back to where you are now if things go wrong. Also, after that, run memtest86+ to make sure you don't have memory errors.
Do you mean maintenance rather than recovery? If yes:

The following types of administration commands are not accepted during recovery mode:
Data Definition Language (DDL) - e.g. CREATE INDEX
Privilege and Ownership - GRANT, REVOKE, REASSIGN
Maintenance commands - ANALYZE, VACUUM, CLUSTER, REINDEX
> While doing a PG dump, I seem to get this error:
>
> ERROR: invalid memory alloc request size 4294967293
>
> Upon googling, this seems to be a data corruption issue!
>
> How does one do this -- should I copy the data folder? What are the
> specific steps to restore from there? Would I simply copy the files
> from the data folder back to the new install or something? Can't find
> these steps in the PG documentation.

Just stop the database and copy the 'data' directory somewhere else (to a different machine, preferably). You can then start the database from this directory copy (not sure how that works on CentOS, but you can always run "postmaster -D directory").

> I'm on PG 8.2.9, CentOS 5, with 8GB of RAM.

This is a massive thread (and part of the important info is in another thread and other mailing lists), so maybe I've missed something important, but it seems like:

1) You're I/O bound (according to the 100% utilization reported by iostat).

2) You're running a RAID1 setup, which basically means it's 1 drive (and you're doing reindex, which means a lot of reads/writes).

3) The RAID controller should handle this, unless it's broken, the battery is empty (and thus the writes are not cached) or something like that. I'm not that familiar with 3ware -- is there any diagnostic tool you can use to check the health of the controller / drives?

4) I know you've mentioned there is no bloat (according to the off-list discussion with Merlin) -- is this true for the table only? Because if the index is not bloated, then there's no point in running reindex ...

BTW, what is the size of the database and of that big table? I know it's 125 million rows, but how much is that -- 1GB, 1TB, ... how much? What does this return:

SELECT reltuples FROM pg_class WHERE relname = 'links';

Do you have any pg_dump backups? What size are they, compared to the live database? Have you tried to rebuild the database from those backups? That would give you fresh indexes, so you could see how a 'perfectly clean' database looks (whether the indexes are bloated, what speed is expected etc.).

regards
Tomas
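To put numbers on the size questions, something along these lines should work on 8.2 (using the table name from earlier in the thread):

  -- whole database
  SELECT pg_size_pretty(pg_database_size(current_database()));

  -- heap only vs. heap + indexes + TOAST for the big table
  SELECT pg_size_pretty(pg_relation_size('links'))       AS table_only,
         pg_size_pretty(pg_total_relation_size('links')) AS table_with_indexes;

  -- planner's row estimate, to compare against the known ~125 million rows
  SELECT reltuples FROM pg_class WHERE relname = 'links';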
On Apr 17, 2011, at 11:30 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Sorry, rejuvenating a thread that was basically unanswered.
>
> REINDEX is still taking 3 hours -- and it is still not finished!
>
> How do DB folks do this with small maintenance windows? This is for a
> very high traffic website so it's beginning to get embarrassing.

An upgrade would probably help you a lot, and as others have said it sounds like your hardware is failing, so you probably want to deal with that first.

I am a bit surprised, however, that no one seems to have mentioned using CLUSTER rather than VACUUM or REINDEX. Sometimes that's worth a try...

...Robert
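For reference, on 8.2 that would use the older syntax (index first, then table); the names below are simply the ones mentioned earlier in this thread. CLUSTER takes an exclusive lock and rewrites the table and its indexes (so it needs enough free disk space for a second copy of the table), and should be followed by an ANALYZE:

  -- 8.2-era syntax: CLUSTER <index> ON <table>
  CLUSTER new_idx_userid ON links;
  ANALYZE links;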
On 04/23/2011 03:44 PM, Robert Haas wrote:
> On Apr 17, 2011, at 11:30 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
>
>> Postgres is 8.2.9.
>
> An upgrade would probably help you a lot, and as others have said it sounds
> like your hardware is failing, so you probably want to deal with that first.
>
> I am a bit surprised, however, that no one seems to have mentioned using
> CLUSTER rather than VACUUM or REINDEX. Sometimes that's worth a try...

Don't know if that's the reason others didn't mention it, but CLUSTER isn't so great in 8.2. The whole "not MVCC-safe" bit does not inspire confidence on a production server.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
On Sat, Apr 30, 2011 at 4:07 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Don't know if that's the reason others didn't mention it, but CLUSTER
> isn't so great in 8.2. The whole "not MVCC-safe" bit does not inspire
> confidence on a production server.

To everyone: thanks so much for everything, truly. We have managed to salvage the data by exporting it in bits and pieces:

1. First the schema only
2. Then pg_dump of specific small tables
3. Then pg_dump of timed bits of the big mammoth table

Not to jinx it, but the newer hardware seems to be doing well. I am on 9.0.4 now and it's pretty fast.

Also, as has been mentioned in this thread and in other discussions on the list, just doing a dump and then a fresh reload has compacted the DB to nearly 1/3rd of its previously reported size!

I suppose that's what I am going to do on a periodic basis from now on. There is a lot of DELETE/UPDATE activity. But I wonder if the vacuum stuff really should do something that's similar in function? What do the high-end enterprise folks do -- surely they can't be dumping/restoring every quarter or so ... or are they?

Anyway, many many thanks to the lovely folks on this list. Much appreciated!
On Sat, Apr 30, 2011 at 05:26:36PM +0800, Phoenix Kiula wrote:
> Also, as has been mentioned in this thread and in other discussions on
> the list, just doing a dump and then a fresh reload has compacted the DB
> to nearly 1/3rd of its previously reported size!
>
> I suppose that's what I am going to do on a periodic basis from now
> on. There is a lot of DELETE/UPDATE activity. But I wonder if the
> vacuum stuff really should do something that's similar in function?
> What do the high-end enterprise folks do -- surely they can't be
> dumping/restoring every quarter or so ... or are they?

The autovacuum and space management in 9.0 are dramatically more effective and efficient than those of 8.2. Unless you have an odd corner-case, there really should be no reason for a periodic dump/restore. This is not your grandmother's Oldsmobile... :)

Regards,
Ken
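Since the database is on 9.0 now, autovacuum can also be tuned per table with storage parameters (available since 8.4), which is often enough to keep one hot table under control without any dump/reload. A sketch with purely illustrative numbers:

  -- visit the big table after ~0.5% of its rows change, with less throttling
  ALTER TABLE links SET (
      autovacuum_vacuum_scale_factor  = 0.005,
      autovacuum_analyze_scale_factor = 0.005,
      autovacuum_vacuum_cost_delay    = 5
  );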
On Apr 30, 2011, at 9:34 AM, Kenneth Marshall wrote:
>> I suppose that's what I am going to do on a periodic basis from now
>> on. There is a lot of DELETE/UPDATE activity. But I wonder if the
>> vacuum stuff really should do something that's similar in function?
>> What do the high-end enterprise folks do -- surely they can't be
>> dumping/restoring every quarter or so ... or are they?
>
> The autovacuum and space management in 9.0 are dramatically more effective
> and efficient than those of 8.2. Unless you have an odd corner-case, there
> really should be no reason for a periodic dump/restore. This is not your
> grandmother's Oldsmobile... :)

In 10+ years of using Postgres, I've never come across a case where you actually *need* to dump and restore on a regular basis. However, you can certainly run into scenarios where vacuum simply can't keep up. If your restored database is 1/3 the size of the original, then that is certainly what was happening on your 8.2 setup.

As Kenneth mentioned, 9.0 is far better in this regard than 8.2, though it's still possible that you're doing something that will give it fits. I suggest that you run a weekly "vacuumdb -av", capture that output and run it through pgFouine. That will give you a ton of useful information about the amount of bloat you have in each table. I would definitely look at anything with over 20% bloat.

BTW, in case you're still questioning using Postgres in an enterprise setting: all of our production OLTP databases run on Postgres. The largest one is ~1.5TB and does over 650 TPS on average (with peaks that are much higher). Unplanned downtime on that database would cost us well over $100k/hour, and we're storing financial information, so data quality issues are not an option (data quality was one of the primary reasons we moved away from MySQL in 2006). So yes, you can absolutely run very large Postgres databases in a high-workload environment. BTW, that's also on version 8.3.

--
Jim C. Nasby, Database Architect    jim@nasby.net
512.569.9461 (cell)                 http://jim.nasby.net
On Sat, Apr 30, 2011 at 4:26 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> I suppose that's what I am going to do on a periodic basis from now
> on. There is a lot of DELETE/UPDATE activity. But I wonder if the
> vacuum stuff really should do something that's similar in function?
> What do the high-end enterprise folks do -- surely they can't be
> dumping/restoring every quarter or so ... or are they?

The pg_reorg tool (google it) can rebuild a live table without taking major locks. It's better to try to engineer your database so that you have enough spare I/O to manage 1-2 continuously running vacuums, but if things get really out of whack, it's there.

merlin