Thread: Postgres performance slowly gets worse over a month
I've been through the archives a lot, trying to figure this one out. We are running PostgreSQL 7.2.1 on a RedHat 7.2 system with kernel 2.4.9-12smp on a two-processor system. What we have noticed is that performance slowly gets worse over the course of a month, to the point that we have to pg_dump the databases, wipe the data dir, initdb, and restore the databases. This takes as long as five hours (depending on whether or not we remember to turn off 'fsync' during the pg_restore). After the restore, performance is again stellar.

Nightly, we're doing a 'vacuumdb -a -z' after stopping and restarting the database. The performance will get so bad after a month that we start to see load spikes in excess of 30. Normally, we don't see load over 2.5 during the heaviest activity and generally less than 1.0 most of the time.

We are using the database as part of a web application that is used fairly intensively every day. We are using persistent connections with the web server, which lives on another machine. Apache's MaxRequestsPerChild is set to 100 to prevent the child processes from growing to insane sizes during the course of the day.

We built 7.2.1 from source using the config:

    ./configure --prefix=/usr/local/pgsql --enable-multibyte

Any insights would be most appreciated.

Thanks...

Bob
--
Robert M. Meyer
Sr. Network Administrator
DigiVision Satellite Services
14 Lafayette Sq, Ste 410
Buffalo, NY 14203-1904
(716)332-1451
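For context, the nightly routine described above boils down to something like the following cron entry. This is only a sketch: the schedule, paths, and restart mechanism are guesses for illustration, not taken from the poster's actual setup.

    # Hypothetical crontab entry (run as the postgres user) approximating the
    # nightly maintenance described above; paths and restart method are assumptions.
    30 2 * * *  /etc/rc.d/init.d/postgresql restart && /usr/local/pgsql/bin/vacuumdb -a -z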
> Nightly, we're doing a 'vacuumdb -a -z' after stopping and restarting
> the database. The performance will get so bad after a month that we
> start to see load spikes in excess of 30. Normally, we don't see load
> over 2.5 during the heaviest activity and generally less than 1.0 most
> of the time.

Typically, performance is linked to your I/O, but my first guess in this case has to do with your indices. As a test, next time performance gets really rotten, drop your indices and rebuild them. It cannot hurt, and might just help.

The trick here is to see what is happening while it is tanking. What does your disk configuration look like? Is it a raid or stripe where reads are spread out among more than one controller? Do sar reports point to anything in particular?

--------------------------------------------------------------------------------
Naomi Walker
Eldorado Computing, Inc
Chief Information Officer
nwalker@eldocomp.com
602-604-3100 x242
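For reference, "drop your indices and rebuild them" usually means something like the following; the table and index names here are made up for illustration, not taken from the thread.

    -- Drop and recreate one index explicitly (hypothetical names).
    -- The rebuilt index comes back compact, without accumulated dead entries.
    DROP INDEX orders_customer_idx;
    CREATE INDEX orders_customer_idx ON orders (customer_id);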
Well, we're running a hardware three-disk RAID5 on a sym53c896-0-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s interface. This is a Compaq 3500 system with a CR3500 raid controller. An md5sum of a 1.2Gig file takes less than two minutes.

We tried rebuilding the indices, but that didn't seem to help. We had an outside consultant do the rebuild and he's not available now, so I don't know what command he ran to do it.

I've never used 'sar'. If you can believe it, I've been poking around with Unix for the last 20 years and I've never even seen the 'man' page for 'sar'. I probably should look into it. What flags would give me the most information to help figger out what's going on here?

Of course the troubleshooting increment is going to be a month or more, so this will probably take some time to resolve :-)

Cheers!

Bob

On Tue, 2002-07-23 at 14:08, Naomi Walker wrote:
> Typically, performance is linked to your I/O, but my first guess in this
> case has to do with your indices. As a test, next time performance gets
> really rotten, drop your indices and rebuild them. It cannot hurt, and
> might just help.
>
> The trick here is to see what is happening while it is tanking. What does
> your disk configuration look like? Is it a raid or stripe where reads are
> spread out among more than one controller? Do sar reports point to
> anything in particular?

--
Robert M. Meyer
Sr. Network Administrator
DigiVision Satellite Services
14 Lafayette Sq, Ste 410
Buffalo, NY 14203-1904
(716)332-1451
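On the sar question: on Red Hat it ships in the sysstat package, and a handful of flags cover most of what matters when the load spikes. A rough sketch only; flag details can vary a little between sysstat versions.

    # Sample every 5 seconds, 12 times, while the problem is actually happening.
    sar -u 5 12    # CPU utilization breakdown
    sar -b 5 12    # overall I/O and transfer rates
    sar -d 5 12    # per-block-device activity
    sar -r 5 12    # memory and swap usage
    sar -q 5 12    # run queue length and load average

    # If sar isn't installed, vmstat and iostat give a similar quick read:
    vmstat 5
    iostat 5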
I have the same problem. My database is getting slower and slower.

I realized that the disk space used by the database is also increasing, and for me this is the reason for the bad performance.

What I have tried:

- vacuumdb 4 times per day
- drop and create the indexes

The only solution I know is to make a dump and a restore. In my case I can't stop my service, but I'm forced to do it.

To check that the problem wasn't in my code I made the following test:

------------------------------
AUTOCOMMIT = 1

    FOR 1 to 120000 {
        INSERT RECORD
        UPDATE RECORD
    }

    DELETE ALL INSERTED RECORDS
------------------------------

I have run this test program several times, and after each time I have run vacuumdb. What I have realized is that every time I run this program the database grows and grows, and running vacuumdb only recovers some space.

How can this problem be solved???

Thanks,

M.P.Garcia

On Tue, 2002-07-23 at 19:29, Robert M. Meyer wrote:
> Well, we're running a hardware three-disk RAID5 on a
> sym53c896-0-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s interface. This is a
> Compaq 3500 system with a CR3500 raid controller. An md5sum of a 1.2Gig
> file takes less than two minutes.
> [...]
> Of course the troubleshooting increment is going to be a month or more,
> so this will probably take some time to resolve :-)

--
M.P.Garcia
PT Inovação, SA
Serviços e Redes Móveis
Rua José Ferreira Pinto Basto - 3810 Aveiro
Tel: 234 403 253 - Fax: 234 424 160
E-mail: marcos-p-garcia@ptinovacao.pt
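The test above is easy to reproduce against a throwaway table. A minimal sketch of that kind of insert/update/delete churn (database, table, and column names are invented for illustration), after which VACUUM VERBOSE shows how much of the dead space is merely marked reusable rather than returned to the filesystem:

    # Rough sketch of the bloat test described above (names are hypothetical).
    createdb bloattest
    psql bloattest -c "CREATE TABLE t (id int4, payload text);"

    # Generate 120,000 insert+update pairs; with autocommit each statement is its
    # own transaction, which is exactly what creates the dead-tuple churn.
    i=0
    while [ $i -lt 120000 ]; do
        echo "INSERT INTO t VALUES ($i, 'x');"
        echo "UPDATE t SET payload = 'y' WHERE id = $i;"
        i=$((i + 1))
    done | psql -q bloattest

    psql bloattest -c "DELETE FROM t;"

    # A plain VACUUM marks dead tuples reusable but does not shrink the file on
    # disk; the 'UnUsed' count in the verbose output is the space kept for reuse.
    psql bloattest -c "VACUUM VERBOSE ANALYZE t;"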
We recently ran into a similar problem with 7.1. Let me start by saying: knowing the SQL you are running, knowing "EXPLAIN", and knowing pgmonitor go a LONG way in helping you identify the problem.

In our case, the problem came down to this: we had a table ("TABLE_A") with ~200,000 rows that was heavily used (1000 rows added per day). The most common access to this table was via a foreign key of type int4, where any select for a specific value would return normally 1 and never more than 10 rows. Obviously, there was an index on this foreign key, and the index was used (resulting in an "INDEX SCAN") in most cases.

However, the problem was that half the rows in "TABLE_A" had no foreign key relationship and so had this field set to zero. No one would ever do a select asking for all rows where foreign key = 0, as that didn't make "business sense". But, since the database isn't aware of the business meaning of our data, the stats gathered by ANALYZE would think that the overall distribution of the column for a unique value was between 2000 and 3000 rows. Mathematically correct, but in reality not useful.

The result was that as the table grew, and/or based on differences in the sample of rows from the most recent nightly VACUUM, the cost of an INDEX SCAN (# of estimated rows * random_page_cost) would exceed the cost of a straight SEQ SCAN (# of data pages in table). Thus, many applications would start scanning the entire table. While any one scan would take 4 seconds (vs. fractions of a second using the index) and that wasn't too bad, the result of many users doing this many times quickly made our machine exhibit the same behavior that you describe.

Our short term fix was to turn down the value of random_page_cost. However, as Tom Lane very rightly noted in response to a similar posting, this is a total hack. Our goal is to switch to 7.2 in the hopes that the "WHERE" extension to the "CREATE INDEX" command, coupled with greater control of the sample space used in statistics, will be the true answer. We also intend to look into whether setting this optional foreign key to be nullable may affect things in 7.1.

Overall, my point is that we were in a position where Postgres was operating exactly as it was intended. There really wasn't any tuning to be corrected or log file with a silver bullet message stating what to change. It was all a matter of seeing what the load on the system was and why it was taking as long as it was.

My advice: EXPLAIN is your friend. pgmonitor is your friend.

Of course, that's just my opinion - I could be wrong...

Marc Mitchell - Senior Technical Architect
Enterprise Information Solutions, Inc.
4910 Main Street
Downers Grove, IL 60515
marcm@eisolution.com

----- Original Message -----
From: "Robert M. Meyer" <rmeyer@installs.com>
To: "Naomi Walker" <nwalker@eldocomp.com>
Cc: <pgsql-admin@postgresql.org>
Sent: Tuesday, July 23, 2002 1:29 PM
Subject: Re: [ADMIN] Postgres performance slowly gets worse over a month

> Well, we're running a hardware three-disk RAID5 on a
> sym53c896-0-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s interface. This is a
> Compaq 3500 system with a CR3500 raid controller. An md5sum of a 1.2Gig
> file takes less than two minutes.
>
> We tried rebuilding the indices, but that didn't seem to help. We had an
> outside consultant do the rebuild and he's not available now, so I don't
> know what command he ran to do it.
> [...]
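As a concrete illustration of what Marc describes (the table and column names below are hypothetical, not his actual schema): EXPLAIN shows which plan the optimizer picks for the typical lookup, and 7.2's partial-index support lets the never-queried zero rows be left out of the index entirely.

    -- Check which plan the planner chooses for a typical lookup.
    EXPLAIN SELECT * FROM table_a WHERE fk_id = 12345;

    -- 7.2 partial index: index only the rows that are ever searched for,
    -- leaving out the ~50% of rows where the foreign key holds the dummy value 0.
    -- Queries may need to repeat the fk_id <> 0 condition in their WHERE clause
    -- for the planner to match the index predicate.
    CREATE INDEX table_a_fk_idx ON table_a (fk_id) WHERE fk_id <> 0;

    -- Refresh the statistics so the planner sees the current distribution.
    ANALYZE table_a;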
Check this value in the postgresql.conf file:

    max_fsm_pages = 100000

I had the same problem with the db growing, and the index no longer being used, despite vacuums each night. Somewhere, there is a thread on this.

Anyway, if you look at the vacuum stats each time you run vacuum, look to see how many pages are being updated between vacuums--I looked at the "removed x tuples in y pages" value. Then, set this value to be greater than the number of pages changed between vacuums. If more pages are being updated between vacuums than what max_fsm_pages allows, the extra pages won't be marked to be re-used--from what I understand. This then results in the db growing, and the optimizer starts to choose full table scans since the db spans so many pages on the disk--at least this is what happened in my db.

My db was growing greatly--like 10's of thousands of tuples being added to the unused value each day. After bumping up max_fsm_pages, my db has grown much slower--as expected.

Do a vacuum full to get the database back to its normal size, then play with the max_fsm_pages value to find a good value for you based on your vacuum verbose output.

Hope that helps. It worked for me.

--Michael

Marcos Garcia wrote:
> I have the same problem. My database is getting slower and slower.
>
> I realized that the disk space used by the database is also increasing,
> and for me this is the reason for the bad performance.
>
> What I have tried:
>
> - vacuumdb 4 times per day
> - drop and create the indexes
> [...]
> What I have realized is that every time I run this program the database
> grows and grows, and running vacuumdb only recovers some space.
>
> How can this problem be solved???
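Michael's suggestion amounts to a small tuning loop: raise the free space map limits in postgresql.conf (the postmaster has to be restarted for the change to take effect), then watch what VACUUM VERBOSE reports per table. The values below are placeholders only.

    # postgresql.conf -- free space map sizing (7.2 ships with max_fsm_pages = 10000).
    # Set max_fsm_pages higher than the number of pages touched between vacuums.
    max_fsm_pages = 100000
    max_fsm_relations = 1000

Then, from psql, compare the "Removed N tuples in M pages" and "UnUsed" figures from one run to the next:

    VACUUM VERBOSE ANALYZE ua_user_data_exp;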
"Michael G. Martin" <michael@vpmonline.com> wrote: > Check this value in the postgresql.con file: > > max_fsm_pages = 100000 > > I had the same problem with the db growing, index no longer being used, > despite vacuums each night. Somewhere, there is a thread on this. > > Anyway, If you look at the vacuum stats each time your run vacuum, looks > to see how many pages are being updated between vacuums--i looked at the > removed x tuples in y pages value. Then, set this value to be greater > than the number of pages changed between vacuums. If more pages are > being updated between vacuums than what max_fsm_pages allows, the extra > pages won't be marked to be re-used--from what I understand. This then > results in the db growing and the optimizer starts to chose full table > scans since the db spans so many pages on the disk--at least this is > what happened in my db. Can you explain me this line that I obatin in the log after a vacuum analyze ? --Relation ua_user_data_exp-- 2002-07-21 05:00:02 [28492] DEBUG: Pages 1402: Changed 2, reaped 1192, Empty 0, New 0; Tup 4277: Vac 16207, Keep/VTL 0/0, Crash 0, UnUsed 1, MinLen 393, MaxLen 680; Re-using: Free/Avail. Space 9148944/9141356; EndEmpty/Avail. Pages 0/1191. CPU 0.00s/0.03u sec. I'm wondering about "Re-using: Free/Avail. Space 9148944/9141356" Ciao Gaetano
I believe, from a quick glance at the file vacuum.c, the first number is the total amount of free pages the database now occupies. The second number is how much is actually available for new tuples to be created in.
--Michael
This is before my morning coffee, so someone please correct me if I am wrong, but that should be the amount of space in your table that is available for reuse before new pages need to be added. Whether it is actually used is determined by a combination of factors including the amount of insert/update activity on your system and the size of your free space map. If you want to recover all of that space, you'll need to do a vacuum full. As it looks now you could probably start vacuuming this table more frequently to keep this number from growing so quickly. Robert Treat On Wed, 2002-07-24 at 07:14, Gaetano Mendola wrote: > > "Michael G. Martin" <michael@vpmonline.com> wrote: > > Check this value in the postgresql.con file: > > > > max_fsm_pages = 100000 > > > > I had the same problem with the db growing, index no longer being used, > > despite vacuums each night. Somewhere, there is a thread on this. > > > > Anyway, If you look at the vacuum stats each time your run vacuum, looks > > to see how many pages are being updated between vacuums--i looked at the > > removed x tuples in y pages value. Then, set this value to be greater > > than the number of pages changed between vacuums. If more pages are > > being updated between vacuums than what max_fsm_pages allows, the extra > > pages won't be marked to be re-used--from what I understand. This then > > results in the db growing and the optimizer starts to chose full table > > scans since the db spans so many pages on the disk--at least this is > > what happened in my db. > > > Can you explain me this line that I obatin in the log > after a vacuum analyze ? > > --Relation ua_user_data_exp-- > 2002-07-21 05:00:02 [28492] DEBUG: Pages 1402: Changed 2, reaped 1192, > Empty 0, New 0; Tup 4277: Vac 16207, Keep/VTL 0/0, Crash 0, UnUsed 1, MinLen > 393, MaxLen 680; Re-using: Free/Avail. Space 9148944/9141356; > EndEmpty/Avail. Pages 0/1191. CPU 0.00s/0.03u sec. > > I'm wondering about "Re-using: Free/Avail. Space 9148944/9141356" > > > Ciao > Gaetano > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
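If the goal is to shrink the file itself rather than just stop the growth, the command Robert mentions looks like the following (using the table name from Gaetano's log); note that it holds an exclusive lock on the table for the duration.

    -- "vacuum full" compacts the table and gives space back to the operating
    -- system, but locks the table exclusively while it runs.
    VACUUM FULL VERBOSE ua_user_data_exp;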
"Marc Mitchell" <marcm@eisolution.com> writes: > However, the problem > was that half the rows in "TABLE_A" had no foreign key relationship and so > had this field set to zero. No one would ever do a select asking for all > rows where foreign key = 0 as that didn't make "Business sense". But, > since the database isn't aware of the business meaning of our data, the > stats gathered by the ANALYSE would think that the overall distribution of > the column for a unique value was between 2000 and 3000 rows. > Our short term fix was to turn down the value of random_page_cost. > However, as Tom Lane very rightly noted in response to a similar posting, > this is a total hack. Our goal is to switch to 7.2 in the hopes that the > "WHERE" extension to the "CREATE INDEX" command coupled with greater > control of the sample space used in statistics will be the true > answer. Actually, 7.2 should fix this without any need for messing with partial indexes. The new statistics code will realize that zero is an outlier, and will estimate scans for other target values differently. regards, tom lane
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Marc Mitchell" <marcm@eisolution.com>
Cc: "Robert M. Meyer" <rmeyer@installs.com>; "Naomi Walker" <nwalker@eldocomp.com>; <pgsql-admin@postgresql.org>
Sent: Wednesday, July 24, 2002 10:21 AM
Subject: Re: [ADMIN] Postgres performance slowly gets worse over a month

> "Marc Mitchell" <marcm@eisolution.com> writes:
> > Our short term fix was to turn down the value of random_page_cost.
> > However, as Tom Lane very rightly noted in response to a similar posting,
> > this is a total hack. Our goal is to switch to 7.2 in the hopes that the
> > "WHERE" extension to the "CREATE INDEX" command coupled with greater
> > control of the sample space used in statistics will be the true
> > answer.
>
> Actually, 7.2 should fix this without any need for messing with partial
> indexes. The new statistics code will realize that zero is an outlier,
> and will estimate scans for other target values differently.

That's welcome news. Two follow-up questions if you could:

1) Is there any inherent value or "dis-value" to using a partial index in this case? Given that 50% of the table has a value that we have no need to support via an index, is there any reason not to use a partial index?

2) If we stay in 7.1, would changing the foreign key field from "NOT NULL" (where we currently populate with zero) to nullable (where we would populate instead with null) have any effect on performance? The hope would be that nulls might be treated differently within the 7.1 stats-gathering process than non-null values. This would be a better, albeit still temporary, solution in our current environment than the random_page_cost hack we chose for now.

Thanks,
Marc Mitchell
Enterprise Information Solutions, Inc.
4910 Main Street
Downers Grove, IL 60515
marcm@eisolution.com
"Marc Mitchell" <marcm@eisolution.com> writes: > 1) Is there any inherit value or "dis-value" to using a partial index in > this case? Given that 50% of the table has a value the we have no need to > be supported via an index, is there any reason not to use a partial index? It probably would be a win, just on the basis of reducing the size and amount of update activity for the index. > 2) If we stay in 7.1, would changing the foriegn key field from "NOT NULL" > (where we currently populate with zero) to nullable (where we would > populate instead with null) have any effect on performace? The hope would > be that nulls might be treated differently within the 7.1 stats gathering > process than non-null values. In fact they are, and this very likely would be a cleaner solution anyway. Seems like NULL might be a semantically clearer representation of the row's status than zero. You could still do something with a partial index excluding the nulls in 7.2, I think. regards, tom lane
At 11:29 AM 7/23/02, Robert M. Meyer wrote:
> I've never used 'sar'. If you can believe it, I've been poking around
> with Unix for the last 20 years and I've never even seen the 'man' page
> for 'sar'. I probably should look into it. What flags would give me the
> most information to help figger out what's going on here?

SAR stands for System Activity Reporting and is a SysVish thing. You'll find it on Solaris and SCO Unix. I don't know if it exists under Red Hat.

-crl
--
Chad R. Larson (CRL22) chad@eldocomp.com
Eldorado Computing, Inc. 602-604-3100
5353 North 16th Street, Suite 400
Phoenix, Arizona 85016-3228
Yep, it's on RedHat 7.2 but I've never used it. I didn't use it on Solaris, either. I've never needed it (lest-wise, I didn't THINK I needed it) until now so I suppose I need to check it out. Cheers! Bob On Wed, 2002-07-24 at 14:30, Chad R. Larson wrote: > At 11:29 AM 7/23/02 , Robert M. Meyer wrote: > >I've never used 'sar'. If you can believe it, I've been poking around > >with Unix for the last 20 years and I've never even seen the 'man' page > >for 'sar'. I probably should look into it. What flags would give me the > >most information to help figger out what's going on here? > > SAR stands for System Activity Reporting and is a SysVish think. You'll > find it on Solaris and SCO Unix. I don't know if it exists under Red Hat. > > > -crl > -- > Chad R. Larson (CRL22) chad@eldocomp.com > Eldorado Computing, Inc. 602-604-3100 > 5353 North 16th Street, Suite 400 > Phoenix, Arizona 85016-3228 > -- Robert M. Meyer Sr. Network Administrator DigiVision Satellite Services 14 Lafayette Sq, Ste 410 Buffalo, NY 14203-1904 (716)332-1451
Since this conversation started, the solutions presented for the disk space problem (disk space of tables and indexes) have been:

- reindex -> no space recovered

- frequent vacuums -> some space recovered, meanwhile the database keeps growing

- vacuum full -> some space recovered, meanwhile the database keeps growing, quite similar to a simple vacuum. We also have to keep in mind that the "full" option locks the tables. Therefore, the services around the database lock too, and as the database grows the time spent on "vacuum full" increases, as well as the downtime of the services around the database.

So, my doubts are:

- There's nothing we can do to avoid the growth of the database, only slow down that growth.

- We, application developers that use PostgreSQL databases, have to live with this problem.

Sorry if I'm being a little rude, but I'm in real trouble.

Thanks in advance,

M.P.Garcia

On Wed, 2002-07-24 at 15:04, Robert Treat wrote:
> This is before my morning coffee, so someone please correct me if I am
> wrong, but that should be the amount of space in your table that is
> available for reuse before new pages need to be added.
> [...]
> If you want to recover all of that space, you'll need to do a vacuum full.
> As it looks now you could probably start vacuuming this table more
> frequently to keep this number from growing so quickly.

--
M.P.Garcia
PT Inovação, SA
Serviços e Redes Móveis
Rua José Ferreira Pinto Basto - 3810 Aveiro
Tel: 234 403 253 - Fax: 234 424 160
E-mail: marcos-p-garcia@ptinovacao.pt
I played with this tonight, writing a small insert/update routine and frequent vacuums. Here is what I came up with (PostgreSQL 7.2.1).

Once a table reaches a certain size, it will not auto-shrink. This behavior is common for performance reasons. The way to re-use space is to run vacuum, which will mark deleted tuples as unused. Once your table has unused tuples, inserts and updates will go into the unused tuple areas before additional space is allocated. (There may be some additional restrictions on this--like does the new tuple fit and not cross page boundaries--but I don't know enough to say exactly.)

In addition, max_fsm_pages has an impact on how many pages will be available to be marked as re-usable. If you have a huge table and changes are impacting more than the default 10,000 pages this is set to, you will want to bump this number up. My problem was that I saw my UnUsed tuples always growing and not being re-used until I bumped this value up. As I watched the vacuum verbose output each run, I noticed more than 10k pages were in fact changing between vacuums.

Your table will grow to an optimal size until the amount of inserts, updates, and deletes balances out. Of course, if you insert more than you delete, your database will always grow. But, if the number of tuples at some point becomes fixed in the table, and you vacuum enough in between all the inserts and updates, your table size will stop growing. If you are vacuuming 4 times a day, yet the amount of tuples inserted and updated exceeds the current UnUsed tuples, your table will continue to grow.

With the addition of the lazy vacuum, there should be no problem increasing the frequency of your vacuums to accommodate the excessive changes being done to the table. Get the changed tuples marked unused as often as you can, so as not to create a new tuple with an update or insert.

Just how big are your tables getting? How often are you vacuuming now, and how many tuples are marked as UnUsed and how many tuples are changing between vacuums?

I have a large table with 20+ million records which flattened out around 3.5 gigs. It only really gets updated once each day, with a vacuum following. Since I add to the table daily, I expect some growth. I noticed the problem with this table growing to 2x that size when the unused tuples would not go down, but kept growing each day. After changing max_fsm_pages as I mentioned above, the unused tuples stay fairly fixed and the database only grows a little each day as expected.

Hope this helps get you out of trouble.

--Michael

Marcos Garcia wrote:
> Since this conversation started, the solutions presented for the disk
> space problem (disk space of tables and indexes) have been:
>
> - reindex -> no space recovered
>
> - frequent vacuums -> some space recovered, meanwhile the database keeps growing
>
> - vacuum full -> some space recovered, meanwhile the database keeps growing
> [...]
>
> Sorry if I'm being a little rude, but I'm in real trouble.
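A quick way to put numbers on Michael's questions about table size and growth between vacuums is the planner's own statistics, which VACUUM and ANALYZE keep up to date. The table name below is the one from Gaetano's log; substitute your own.

    -- Pages (8 kB each by default) and estimated row count
    -- as of the last VACUUM/ANALYZE of this table.
    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname = 'ua_user_data_exp';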
> I played with this tonight writing a small insert/update routine and > frequent vacuums. Here is what I came up with ( (PostgreSQL) 7.2.1 ) > This is some great info, thanks. > In addition, max_fsm_pages has an impact on how many pages will be > available to be marked as re-usable. If you have a huge table and > changes are impacting more than the default 10,000 pages this is set to, > you will want to bump this number up. My problem was I saw my UnUsed > tuples always growing and not being re-used until I bumped this value > up. As I watched the vacuum verbose output each run, I notices more > than 10k pages were in fact changing between vacuums. > This has made me think about something we've been doing. We've got one db that is used basically read-only; every day ~15000 records are added, but very rarely are any deleted. What we've been doing is just letting it sit until it gets close to too big for the filesystem, then lopping off the earliest 6 months worth of records. The question is, is it best to do this then set the max_fsm_pages to a huge number and vacuum full? Or should I change it so scripts remove the oldest day and vacuum before adding the next days? Or just rebuild the db every time. :) __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com
I believe the more frequently you vacuum, the faster it will go, so that may be the driving factor in deciding. Personally, each day, I'd add the new tuples, then remove the no-longer-needed tuples, make sure max_fsm_pages is large enough to handle all the pages removed in the largest table, then run a vacuum analyze on the table or entire database. Run it each night and it will be nice and fast, and you shouldn't ever need to worry about locking the entire table with a vacuum full or spend time to re-create the table and indices.
That's what I do, which I think is the most automated, maintenance-free solution. I currently run a lazy vacuum analyze each night after making my large changes. My tables don't change enough during the day to require mid-day vacuums.
--Michael
Perhaps I wasn't clear. There really aren't any (daily) "no-longer needed tuples", just added ones. I am under the impression that vacuum is just for freeing up tuples to be re-used, so the only time it needs to be run is after the 6-monthly tuple massacre, at which time I would also need to set max_fsm_pages to a huge number. --- "Michael G. Martin" <michael@vpmonline.com> wrote: > I believe the more frequently you vacuum, the faster it will go, so that > may be the driving factor in deciding. Personally, each day, I'd add > the new tuples then remove the no-longer needed tuples, make sure > max_fsm_pages is large enough to handle all the pages removed in the > largest table, then run a vacuum analyze on the table or entire > database. Run it each night and it will be nice and fast and you > shouldn't ever need to worry about locking the entire table with a > vacuum full or spend time to re-create the table and indicies. > > That's what I do which I think is the most automated,maintainance-free > solution. I currently run a lazy vacuum analyze each night after making > my large changes. My tables don't change enough during the day to > require mid-day vacuums. > > --Michael > > Joshua Daniel Franklin wrote: > > > >>In addition, max_fsm_pages has an impact on how many pages will be > >>available to be marked as re-usable. If you have a huge table and > >>changes are impacting more than the default 10,000 pages this is set to, > >>you will want to bump this number up. My problem was I saw my UnUsed > >>tuples always growing and not being re-used until I bumped this value > >>up. As I watched the vacuum verbose output each run, I notices more > >>than 10k pages were in fact changing between vacuums. > >> > >This has made me think about something we've been doing. We've got one > >db that is used basically read-only; every day ~15000 records are added, > >but very rarely are any deleted. What we've been doing is just letting it > >sit until it gets close to too big for the filesystem, then lopping off > >the earliest 6 months worth of records. The question is, is it best > >to do this then set the max_fsm_pages to a huge number and vacuum full? > >Or should I change it so scripts remove the oldest day and vacuum before > >adding the next days? > > > >Or just rebuild the db every time. :) > > __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com
Yea, you're correct. I think you'll be able to avoid the vacuum full and re-use the tuples by making sure max_fsm_pages is large enough to handle the number of pages changed by the 6-month massacre. After your vacuum, note the unused tuples and page size of the table. Then, as you incrementally add new stuff over the next 6 months, you should see the unused tuples decrease while the page size remains fairly fixed. The only other thing you may want to do more frequently is analyze, if the new tuples might change some statistics during the 6-month interval.
--Michael
Joshua Daniel Franklin <joshuadfranklin@yahoo.com> writes: > Perhaps I wasn't clear. There really aren't any (daily) "no-longer needed > tuples", just added ones. I am under the impression that vacuum is just for > freeing up tuples to be re-used, so the only time it needs to be run is after > the 6-monthly tuple massacre, at which time I would also need to set > max_fsm_pages to a huge number. If you do VACUUM FULL after each "tuple massacre" (which you'd better, since the point AFAICT is to cut the total disk space used by the file) then there's not really any need for bumping up max_fsm_pages. The post-vacuum-full state of the table isn't going to have a whole lot of embedded free space ... regards, tom lane
I look at it like this:
Your database takes up space X after a full vacuum and is ready for the next 6 months of inserts. Then, over the next 6 months it grows by space Y; it now occupies X+Y space.
You then remove a bunch of old tuples. Space is still X+Y. You now have 2 basic options:
1. Run a vacuum full -- this locks the entire table and de-fragments all unused space, so space is now back to X. The table will grow incrementally by Y over the next 6 months again.
2. Run a lazy vacuum-- no lock, no de-fragment, space is still X+Y. Assuming max_fsm_pages was large enough to hold all the changed pages, over the next 6 months, the space remains fixed at about X+Y. You are now re-using the unused table space.
Either solution will work. If you really want to cut disk space, choose 1. If you want to keep the space at about its optimal size and avoid any downtime, choose 2.
--Michael
Thanks, this is exactly what I was thinking.

--- "Michael G. Martin" <michael@vpmonline.com> wrote:
> You then remove a bunch of old tuples. Space is still X+Y. You now
> have 2 basic options:
>
> 1. Run a vacuum full -- this locks the entire table and de-fragments
> all unused space, so space is now back to X. The table will grow
> incrementally by Y over the next 6 months again.
> 2. Run a lazy vacuum -- no lock, no de-fragment, space is still X+Y.
> Assuming max_fsm_pages was large enough to hold all the changed pages,
> over the next 6 months, the space remains fixed at about X+Y. You are
> now re-using the unused table space.
>
> Either solution will work. If you really want to cut disk space, choose
> 1. If you want to keep the space at about its optimal size and avoid any
> downtime, choose 2.
>
> --Michael
On Fri, Jul 26, 2002 at 09:11:25AM -0700, Joshua Daniel Franklin wrote:
> Thanks, this is exactly what I was thinking.
>
> --- "Michael G. Martin" <michael@vpmonline.com> wrote:
> > You then remove a bunch of old tuples. Space is still X+Y. You now
> > have 2 basic options:
> >
> > 1. Run a vacuum full -- this locks the entire table and de-fragments
> > all unused space, so space is now back to X. The table will grow
> > incrementally by Y over the next 6 months again.
> > 2. Run a lazy vacuum -- no lock, no de-fragment, space is still X+Y.
> > Assuming max_fsm_pages was large enough to hold all the changed pages,
> > over the next 6 months, the space remains fixed at about X+Y. You are
> > now re-using the unused table space.
> >
> > Either solution will work. If you really want to cut disk space, choose
> > 1. If you want to keep the space at about its optimal size and avoid any
> > downtime, choose 2.

Do not forget to reindex the db after the delete; indexes do not manage themselves (if I remember correctly). The index will continue to grow until it eats your file system, as it did with me. Also, if you do not reindex regularly it can take a looong time to do, much like vacuum. Also, bigger indexes mean slower queries.

marc
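A sketch of the reindex step marc recommends after a big delete. The table name is hypothetical, and in 7.2 REINDEX holds a lock on the table while it runs, so it is best scheduled alongside the other maintenance.

    -- Rebuild every index on the table in one shot; the recreated indexes
    -- come back compact, without the bloat accumulated since the last rebuild.
    REINDEX TABLE big_table;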
Hello,
I am using PostgreSQL with PostGIS and the Minnesota MapServer on a Beowulf cluster for web-mapping applications. It runs fine on one node, producing very fast interactive maps for an Apache/PHP web server. However, the cluster consists of 45 nodes, all using a shared user file system. Is it possible to start up a postmaster on every node, using the same database? The backend processes themselves would be completely autonomous, but they would have to share their data from the same source. To simplify things, only read-access is necessary. Would this be possible, and if so, how can the different postmasters be made to use a different postmaster.pid file (which is located in the shared data directory)?
It would be an interesting way to use the cluster, as the individual map layers can be constructed independently on different nodes and only have to be put together into a complete map at the end. Essentially, MapServer constructs a map from layers, where each layer originates from an individual PostgreSQL connection, even when using only one database. In a cluster solution, therefore, no communication between the nodes would be required. Even the data could be distributed over the nodes and put into different databases, but this would inevitably lead to much duplication and a set of databases that would be very difficult to administer. In the archives, I saw some mention of work in progress on distributed databases, but for this I don't need much in the way of distributed facilities, just reading shared data.
Any help would be very much appreciated. It certainly would be a great example of PostgreSQL's advanced geometrical capabilities!
Jan Hartmann
Department of Geography
University of Amsterdam
"Jan Hartmann" <jhart@frw.uva.nl> writes: > cluster consists of 45 nodes, all using a shared user file system. Is it > possible to start up a postmaster on every node, using the same database? No. > The backend processes themselves would be completely autonomous, but they > would have to share their data from the same source. To simplify things, > only read-access is necessary. In that case, make 45 copies of the database ... regards, tom lane
Completely aside from the database issues, MapServer would need some heavy work to make all the layer processes "independent" and amenable to Beowulf-style parallelization.

It is worth noting that layers are not really all that independent from a display point of view. If you are drawing lake polygons on top of jurisdiction polygons, for example, you are first drawing all the jurisdictions into the end image, then drawing all the lakes into that image. The process of creating the final visual product is the result of sequential application of layers. Individually rendering and then serially combining the layers may or may not be more efficient than simply serially rendering them, depending on the render complexity of the layers.

A little work on multi-threading MapServer on the data-access side would not hurt, though, since reads from the source data files are completely independent actions.

As far as parallelizing database reads goes, Tom can correct me, but a multi-processor database server would help (one postmaster, many postgres backends). I wonder also if a Mosix system would work as a way of using a cluster for database work? Having the database process working on a network-mounted datastore just can't be good mojo, though...

P.

Tom Lane wrote:
>
> "Jan Hartmann" <jhart@frw.uva.nl> writes:
> > cluster consists of 45 nodes, all using a shared user file system. Is it
> > possible to start up a postmaster on every node, using the same database?
>
> No.
>
> > The backend processes themselves would be completely autonomous, but they
> > would have to share their data from the same source. To simplify things,
> > only read-access is necessary.
>
> In that case, make 45 copies of the database ...
>
> 			regards, tom lane
Aside from the problems associated with making multiple PostgreSQL processes access a single data store, wouldn't you be dealing with a severe performance penalty? Since disk is typically the slowest part of any system, I would imagine that 45 nodes, all beating on one network file system (or a multiport filesystem, for that matter), would tend to slow things down dramatically. I would think that it would be better to make 45 separate copies of the database and then, if there are updates, make some kind of process to pass all of the transactions to each instantiation of the DB. Granted, the disk space would increase to 45X the original estimate. How much updating/changing goes on in the DB?

Cheers!

Bob

On Sat, 2002-07-27 at 11:24, Jan Hartmann wrote:
> Hello,
>
> I am using PostgreSQL with PostGIS and the Minnesota MapServer on a Beowulf
> cluster for web-mapping applications. It runs fine on one node, producing
> very fast interactive maps for an Apache/PHP web server. However, the
> cluster consists of 45 nodes, all using a shared user file system. Is it
> possible to start up a postmaster on every node, using the same database?
> The backend processes themselves would be completely autonomous, but they
> would have to share their data from the same source. To simplify things,
> only read-access is necessary. Would this be possible, and if so, how can
> the different postmasters be made to use a different postmaster.pid file
> (which is located in the shared data directory)?
>
> It would be an interesting way to use the cluster, as the individual map
> layers can be constructed independently on different nodes and only have
> to be put together into a complete map at the end. Essentially, MapServer
> constructs a map from layers, where each layer originates from an
> individual PostgreSQL connection, even when using only one database. In a
> cluster solution, therefore, no communication between the nodes would be
> required. Even the data could be distributed over the nodes and put into
> different databases, but this would inevitably lead to much duplication
> and a set of databases that would be very difficult to administer. In the
> archives, I saw some mention of work in progress on distributed databases,
> but for this I don't need much in the way of distributed facilities, just
> reading shared data.
>
> Any help would be very much appreciated. It certainly would be a great
> example of PostgreSQL's advanced geometrical capabilities!
>
>
> Jan Hartmann
> Department of Geography
> University of Amsterdam
> jhart@frw.uva.nl

--
Robert M. Meyer
Sr. Network Administrator
DigiVision Satellite Services
14 Lafayette Sq, Ste 410
Buffalo, NY 14203-1904
(716)332-1451
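A sketch of the "45 separate copies" approach using pg_dump (hostnames, paths, and the database name are hypothetical; each node is assumed to run its own postmaster with a local data directory):

#!/bin/sh
# Dump the master database once, then load it into a local copy on each node.
pg_dump mapdb > /shared/mapdb.dump

for node in node01 node02 node03; do      # ... through node45
    ssh $node "createdb mapdb && psql -d mapdb -f /shared/mapdb.dump"
done

Any updates would then have to be replayed against every copy, which is the administrative cost Bob mentions.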
Thanks a lot for the reactions. I experimented a bit further with the answers in mind and got the following results:

(Tom Lane)
> In that case, make 45 copies of the database ...

Without expecting much, I created 3 data directories and made symbolic links to everything in the original data directory except postmaster.pid. Next I started PostgreSQL on 3 nodes with PGDATA set to a different directory and PGPORT to a different port. Surprisingly, it worked! The first startup gave a message on each node about not having had a proper shutdown, but afterwards everything ran OK from all servers, even restarting PostgreSQL. MapServer didn't have a problem at all producing a map from layers requested from different nodes, although without any time gain (see next point). Probably this gives PostgreSQL wizards the creeps, and I wouldn't advise it to anyone, but just for curiosity's sake: what dangers am I running (given that only read access is needed)?

(Paul Ramsey)
> It is worth noting that layers are not really all that independent from a
> display point of view. (...) The process of creating the final visual
> product is the result of sequential application of layers.

Yes, I didn't realise that MapServer waits until a layer has been returned from PostgreSQL before starting on the next, essentially doing nothing in the meantime. I thought it worked like a web browser retrieving images, which is done asynchronously. It should work, however, when asking for complete maps from different browser frames, or retrieving a map in one frame and getting statistics for it in another, using separate PHP scripts targeted at different nodes. This would already help me enormously.

(Bob Meyer)
> Since disk is typically the slowest part of
> any system, I would imagine that 45 nodes, all beating on one network
> file system (or a multiport filesystem, for that matter), would tend to
> slow things down dramatically. I would think that it would be better to
> make 45 separate copies of the database and then, if there are updates,
> make some kind of process to pass all of the transactions to each
> instantiation of the DB. Granted, the disk space would increase to 45X
> the original estimate. How much updating/changing goes on in the DB?

I am trying this out for population statistics on Dutch municipalities within specified distances (1, 2, 5, 10, 25 km, etc.) from the railway network. Number of railway lines: 419 (each having numerous line segments); number of municipalities: 633; size of the municipality map about 10 MB. It takes about 30 seconds wall time to produce a map (good compared to desktop GIS systems; I have no experience with Oracle Spatial). The next step would be using the road network (much larger of course, but still in the range of tens of megabytes, perhaps a hundred), plus data from very diverse sources and years, including raster bitmaps, none of it excessively large. Lots of different buffers have to be put around all kinds of selections (road types, geographical selections) and compared with each other. The last step is animating the results in Flash movies: MapServer will be supporting Flash in the very near future, and I already have some preliminary results. This will require even more computation of intermediate results to get smoothly flowing movies. So the problem is not data access, it is computing power and the administration of a very disparate bunch of data.
I certainly have enough computing power, and probably also enough disk space for a 45-fold duplication of the data, but I know from experience how error-prone this is, even with duplication scripts. Even so, unless I am very much mistaken, the MapServer-PostgreSQL-Beowulf combination should offer some very interesting prospects in GIS.

Thanks for the answers

Jan

Jan Hartmann
Department of Geography
University of Amsterdam
jhart@frw.uva.nl
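For the record, the symlink experiment described above amounts to something like the following (paths and ports are illustrative; as noted, this is unsupported and only plausible for strictly read-only use):

#!/bin/sh
# Prepare one symlinked data directory per extra node on the shared filesystem.
SHARED=/shared/pgsql/data          # the real, initdb'ed data directory (illustrative)
for N in 1 2 3; do
    DIR=/shared/pgsql/data$N
    mkdir $DIR
    ln -s $SHARED/* $DIR/          # link everything from the real data dir ...
    rm -f $DIR/postmaster.pid      # ... except the postmaster.pid lock file
done

# Then, on node N, point a postmaster at its own directory and port, e.g.:
#   PGDATA=/shared/pgsql/data1 PGPORT=5433 pg_ctl start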
My apologies to all who told me that reindexing was the solution to prevent the growth of the database. I had a script that I thought would reindex my database, but that wasn't true.

I figured out that the growth of my database was due to the indexes, with the following SQL command:

dbname=# select relname, relpages, reltuples from pg_class order by relpages desc limit 50;
             relname             | relpages | reltuples
---------------------------------+----------+-----------
 sms_recv_unique_idclimsgidclien |    55800 |     39259
 sms_recv_pkey                   |    39810 |     38948
 sms_send_pkey                   |    22828 |     52048

relpages * 8 kB gives approximately the disk space used by the index/table (correct me if I'm wrong). As we can see from the query result, the disk space used by the indexes is very high. After reindexing the tables, those values dropped, as did the space used by the database.

Query result after reindex:

dbname=# select relname, relpages, reltuples from pg_class order by relpages desc limit 50;
             relname             | relpages | reltuples
---------------------------------+----------+-----------
 sms_recv_unique_idclimsgidclien |      222 |     40581
 sms_recv_pkey                   |      174 |     40581
 sms_send_pkey                   |      430 |     54004

As a conclusion: in databases with high turn-around, regular reindexing is imperative, combined with vacuums.

Thanks to all,

M.P.Garcia

---------------------------------------------------------------------
Do not forget to reindex the db after the delete; indexes do not manage
themselves (if I remember correctly). The index will continue to grow
until it eats your file system, as it did with me.
	By Marc Spitzer
---------------------------------------------------------------------

On Thu, 2002-07-25 at 01:17, Marcos Garcia wrote:
> Since this conversation started, the solutions presented for the disk
> space problem (disk space of tables and indexes) were:
>
> - reindex -> no space recovered
>
> - frequent vacuums -> some space recovered, but meanwhile the database
> keeps growing
>
> - vacuum full -> some space recovered, but meanwhile the database keeps
> growing, quite similar to a simple vacuum. We also have to keep in mind
> that the "full" option locks the tables. Therefore, the services around
> the database lock too, and as the database grows, the time spent on
> "vacuum full" increases, as does the downtime of those services.
>
> So, my doubts are:
>
> - There's nothing we can do to avoid the growth of the database, only
> slow down that growth.
>
> - We, application developers that use PostgreSQL databases, have to
> live with this problem.
>
> Sorry if I'm being a little rude, but I'm in real trouble.
>
> Thanks in advance,
>
> M.P.Garcia

--
M.P.Garcia
PT Inovação, SA
Serviços e Redes Móveis
Rua José Ferreira Pinto Basto - 3810 Aveiro
Tel: 234 403 253 - Fax: 234 424 160
E-mail: marcos-p-garcia@ptinovacao.pt
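As a rough check on that arithmetic: 55800 pages * 8 kB is roughly 436 MB for the first index before the REINDEX, versus 222 * 8 kB, about 1.7 MB, afterwards. A small monitoring/rebuild sketch along the same lines (only the index name sms_recv_pkey comes from the message above; the database name is a placeholder):

#!/bin/sh
# List the biggest relations with their approximate on-disk size in kB
# (relpages counts 8 kB blocks).
psql -d dbname -c "select relname, relpages, relpages * 8 as approx_kb
                   from pg_class order by relpages desc limit 10;"

# Rebuild a bloated index spotted above, e.g.:
psql -d dbname -c "REINDEX INDEX sms_recv_pkey;"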