Thread: Performance advice
I'd like to get some feedback on my setup to see if I can optimize my
database performance. My setup consists of two separate applications:

The first application connects to websites and records the statistics in the
database. Websites are monitored every 5 or 10 minutes (depending on the
client); there are 900 monitors, which comes out to 7,800 monitorings per
hour. The monitor table has columns "nextdate" and "status" which are
updated with every monitoring, and a row is also inserted into the status
table and the status item table. For my performance testing (we're just
about to go live) I've loaded the database with a month of data (we don't
plan to keep data longer than 1 month), so my status table has 6 million
records and my status item table has 6 million records as well. One key
point is that the system is multithreaded, so up to 32 processes are
accessing the database at the same time, updating the "nextdate" before the
monitoring and inserting the status and status item records after. There is
a serious performance constraint here because, unlike a webserver, this
application cannot slow down. If it slows down, we won't be able to monitor
our sites at 5 minute intervals, which will make our customers unhappy.

The second application is a web app (Tomcat) which lets customers check
their status. Both of these applications are deployed on the same server, a
4 CPU (Xeon) machine with 1.5 gigs of RAM. The OS (RedHat Linux 7.3) and the
servers are running on an 18 gig 10,000 RPM SCSI disk that is mirrored to a
2nd disk. The database data directory is on a separate 36 gig 10,000 RPM
SCSI disk (we're trying to buy a 2nd disk to mirror it). I'm using Postgres
7.3.2.

Issue #1 - Vacuum => Overall the system runs pretty well and seems stable.
Last night I did a "vacuum full analyze" and then ran my app overnight, and
first thing in the morning I did a "vacuum analyze", which took 35 minutes.
I'm not sure if this is normal for a database this size (there are 15,000
updates per hour). During the vacuum my application slows down quite a bit,
and afterwards it slowly speeds back up. I've attached the vacuum output to
this mail. I'm using Java Data Objects (JDO), so if table/column names look
weird it's because the schema is automatically generated.

Issue #2 - postgresql.conf => I'd love to get some feedback on these
settings. I've read the archives and no one seems to agree, I know, but with
the above description of my app I hope someone can at least point me in the
right direction:

max_connections = 200

#
#       Shared Memory Size
#
shared_buffers = 3072           # min max_connections*2 or 16, 8KB each
#max_fsm_relations = 1000       # min 10, fsm is free space map, ~40 bytes
#max_fsm_pages = 10000          # min 1000, fsm is free space map, ~6 bytes
#max_locks_per_transaction = 64 # min 10
#wal_buffers = 8                # min 4, typically 8KB each

#
#       Non-shared Memory Sizes
#
sort_mem = 8192                 # min 64, size in KB
vacuum_mem = 24576              # min 1024, size in KB

The rest are left uncommented (using the defaults).

Issue #3 - server hardware =>

- Is there anything I can do with the hardware to increase performance?

- Should I increase the ram to 2 gigs? top shows that it is using the swap a
bit (about 100k only).

- I have at my disposal one other server which has 2 Xeons and a 10,000 RPM
SCSI drive. Would it make sense to put Postgres on it and leave my apps
running on the more powerful 4 CPU server?

- Would a RAID setup make the disk faster? Because top rarely shows the CPUs
above 50%, I suspect maybe the disk is the bottleneck.
I'm thrilled to be able to use Postgres instead of a commercial database and
I'm looking forward to putting this into production. Any help with the above
questions would be greatly appreciated.

Michael Mattox
On 24 Jun 2003 at 9:39, Michael Mattox wrote:
> I'd like to get some feedback on my setup to see if I can optimize my
> database performance. My setup consists of two separate applications:
>
> The first application connects to websites and records the statistics in
> the database. Websites are monitored every 5 or 10 minutes (depending on
> the client); there are 900 monitors, which comes out to 7,800 monitorings
> per hour. The monitor table has columns "nextdate" and "status" which are
> updated with every monitoring, and a row is also inserted into the status
> table and the status item table. For my performance testing (we're just
> about to go live) I've loaded the database with a month of data (we don't
> plan to keep data longer than 1 month), so my status table has 6 million
> records and my status item table has 6 million records as well. One key
> point is that the system is multithreaded, so up to 32 processes are
> accessing the database at the same time, updating the "nextdate" before
> the monitoring and inserting the status and status item records after.
> There is a serious performance constraint here because, unlike a
> webserver, this application cannot slow down. If it slows down, we won't
> be able to monitor our sites at 5 minute intervals, which will make our
> customers unhappy.
>
> The second application is a web app (Tomcat) which lets customers check
> their status. Both of these applications are deployed on the same server,
> a 4 CPU (Xeon) machine with 1.5 gigs of RAM. The OS (RedHat Linux 7.3)
> and the servers are running on an 18 gig 10,000 RPM SCSI disk that is
> mirrored to a 2nd disk. The database data directory is on a separate 36
> gig 10,000 RPM SCSI disk (we're trying to buy a 2nd disk to mirror it).
> I'm using Postgres 7.3.2.

I recommend that you use the latest kernel with the pre-empt + low latency +
O(1) patches. The first two are said to affect desktops only, but I believe
a loaded server needs them as well. I suggest you get the latest kernel from
kernel.org and apply Con Kolivas's patches from
http://members.optusnet.com.au/ckolivas/kernel/. That is the easiest way
around.

Furthermore, if I/O throughput is an issue and you are ready to experiment
at this stage, try FreeBSD. Many out here believe that it has superior IO
scheduling, and of course VM. If you move your database server off to
another machine, you might get a chance to play with it.

> Issue #1 - Vacuum => Overall the system runs pretty well and seems
> stable. Last night I did a "vacuum full analyze" and then ran my app
> overnight, and first thing in the morning I did a "vacuum analyze", which
> took 35 minutes. I'm not sure if this is normal for a database this size
> (there are 15,000 updates per hour). During the vacuum my application
> slows down quite a bit, and afterwards it slowly speeds back up. I've
> attached the vacuum output to this mail. I'm using Java Data Objects
> (JDO), so if table/column names look weird it's because the schema is
> automatically generated.

That is expected, given how much data you have inserted overnight. Going
through the changes in the status and status item tables takes some time.
Vacuum is an IO-intensive process. On FreeBSD, if you lower the nice
priority of a process its IO priority is also lowered, so a vacuum running
at a lower priority will not hog disk bandwidth. Unfortunately that is not
so on Linux. So the slowdown you are seeing is probably due to disk
bandwidth congestion.

Clearly, with a load like this you cannot rely upon scheduled vacuums.
I recommend you use pgavd, in the contrib directory of the PostgreSQL CVS
tree. It vacuums the database whenever needed; it's much better than a
scheduled vacuum. If you cannot use it immediately, do an hourly vacuum
analyze, maybe even more frequently. A nightly vacuum simply will not do.

> Issue #2 - postgresql.conf => I'd love to get some feedback on these
> settings. I've read the archives and no one seems to agree, I know, but
> with the above description of my app I hope someone can at least point me
> in the right direction:
>
> max_connections = 200
>
> #
> #       Shared Memory Size
> #
> shared_buffers = 3072           # min max_connections*2 or 16, 8KB each

I would say something on the order of 10K would be good. You need to
experiment a bit to find out what works best for you.

> #max_fsm_relations = 1000       # min 10, fsm is free space map, ~40 bytes
> #max_fsm_pages = 10000          # min 1000, fsm is free space map, ~6 bytes

You may bump these two up as well. See past discussions for reference.
Doubling them would be a good start.

> #max_locks_per_transaction = 64 # min 10
> #wal_buffers = 8                # min 4, typically 8KB each
>
> #
> #       Non-shared Memory Sizes
> #
> sort_mem = 8192                 # min 64, size in KB
> vacuum_mem = 24576              # min 1024, size in KB
>
> The rest are left uncommented (using the defaults).

Not good. You need to tune effective_cache_size so that PostgreSQL accounts
for the 1.5GB of RAM your machine has. I would say set it to around 800MB.

Secondly, with SCSI in place, lower random_page_cost. The default is 4; 1
might be too aggressive, 2 might be OK. Experiment and decide.

> Issue #3 - server hardware =>
>
> - Is there anything I can do with the hardware to increase performance?
>
> - Should I increase the ram to 2 gigs? top shows that it is using the
> swap a bit (about 100k only).

That means it hardly needs swap at all. Linux has a habit of touching swap
for no reason. So memory is not the bottleneck.

> - I have at my disposal one other server which has 2 Xeons and a 10,000
> RPM SCSI drive. Would it make sense to put Postgres on it and leave my
> apps running on the more powerful 4 CPU server?
>
> - Would a RAID setup make the disk faster? Because top rarely shows the
> CPUs above 50%, I suspect maybe the disk is the bottleneck.

Yes it is. You need to move the WAL to a different disk, even if it is IDE.
(OK, that was an exaggeration, but you get the point.) If your data
directories and WAL logs are on physically different disks, that should bump
up performance plenty.

HTH

Bye
 Shridhar

--
Ambidextrous, adj.:
	Able to pick with equal skill a right-hand pocket or a left.
		-- Ambrose Bierce, "The Devil's Dictionary"
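As a concrete sketch of the settings discussed above (the values are
illustrative for Postgres 7.3, assuming roughly half of the machine's 1.5GB
ends up in the OS disk cache; effective_cache_size is counted in 8KB pages):

shared_buffers = 10000          # ~80MB, the "order of 10K" suggested above
effective_cache_size = 100000   # 100000 * 8KB = ~800MB of expected OS cache
random_page_cost = 2            # default 4; somewhere in 1-4 for SCSI, experiment
max_fsm_relations = 2000        # doubled from the commented-out value
max_fsm_pages = 20000           # doubled from the commented-out value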
On 24 Jun 2003 at 13:29, Shridhar Daithankar wrote:
> > - I have at my disposal one other server which has 2 Xeons and a 10,000
> > RPM SCSI drive. Would it make sense to put Postgres on it and leave my
> > apps running on the more powerful 4 CPU server?

Argh.. forgot it the first time.

With Java running on the same machine, I would not trust that machine to
have free RAM all the time, no matter how much RAM you put into it.
Secondly, you are running Linux, which is known to have weird behaviour
problems when it runs low on memory.

For both these reasons, I suggest you put your database on another machine.
A dual CPU machine is more than enough. Put in a good deal of RAM, around a
GB, and two SCSI disks, one for data and another for WAL. If you get RAID
for the data, great, but it should suffice otherwise as well.

> > - Would a RAID setup make the disk faster? Because top rarely shows the
> > CPUs above 50%, I suspect maybe the disk is the bottleneck.
>
> Yes it is. You need to move the WAL to a different disk, even if it is
> IDE. [snip]

In addition to that, on Linux it matters a lot which filesystem you use.
IMO ext3 is a strict no-no. Go for either reiserfs or XFS. There is no
agreement as to which filesystem is best on Linux, so you need to experiment
if you need every ounce of performance.

And for that you have got to try FreeBSD. That would give you plenty of idea
about the performance differences. (Especially, I love man hier and man
tuning on FreeBSD. Nothing on Linux comes anywhere near that.)

Bye
 Shridhar

--
"Who is General Failure and why is he reading my hard disk ?"
Microsoft spel chekar vor sail, worgs grate !!
(By leitner@inf.fu-berlin.de, Felix von Leitner)
On Tue, 24 Jun 2003, Shridhar Daithankar wrote:
> On 24 Jun 2003 at 13:29, Shridhar Daithankar wrote:
> > > - I have at my disposal one other server which has 2 Xeons and a
> > > 10,000 RPM SCSI drive. Would it make sense to put Postgres on it and
> > > leave my apps running on the more powerful 4 CPU server?
>
> Argh.. forgot it the first time.
>
> With Java running on the same machine, I would not trust that machine to
> have free RAM all the time, no matter how much RAM you put into it.

There are always the -Xmx, -Xss, -Xms JVM switches, to control stack (per
thread) and heap sizes.

> Secondly, you are running Linux, which is known to have weird behaviour
> problems when it runs low on memory.
>
> For both these reasons, I suggest you put your database on another
> machine. A dual CPU machine is more than enough. Put in a good deal of
> RAM, around a GB, and two SCSI disks, one for data and another for WAL.
> If you get RAID for the data, great, but it should suffice otherwise as
> well.

I think the DB on another machine could be anything from helpful, to
overkill, to shooting yourself in the foot. Depending on the type of the
majority of queries and the network speed, this deserves some extra thought.

> > > - Would a RAID setup make the disk faster? Because top rarely shows
> > > the CPUs above 50%, I suspect maybe the disk is the bottleneck.
> >
> > Yes it is. You need to move the WAL to a different disk. [snip]
>
> In addition to that, on Linux it matters a lot which filesystem you use.
> IMO ext3 is a strict no-no. Go for either reiserfs or XFS.
> [snip]
> And for that you have got to try FreeBSD. That would give you plenty of
> idea about the performance differences. (Especially, I love man hier and
> man tuning on FreeBSD. Nothing on Linux comes anywhere near that.)

It's like comparing a Mazda with VVT-i. Would you expect to find the
furniture fabric specs in the main engine manual?

Besides all that, I must note that JDK 1.4.1 runs pretty nicely on FreeBSD,
and some efforts to run Java over the KSE libs have been done with success.

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill at matrix dot gatewaynet dot com
        mantzios at softlab dot ece dot ntua dot gr
On 24 Jun 2003 at 12:10, Achilleus Mantzios wrote:
> On Tue, 24 Jun 2003, Shridhar Daithankar wrote:
> > With Java running on the same machine, I would not trust that machine
> > to have free RAM all the time, no matter how much RAM you put into it.
>
> There are always the -Xmx, -Xss, -Xms JVM switches, to control stack (per
> thread) and heap sizes.

OK. I am not familiar with any of them. Are they related to Java? I have
never worked on Java myself. I was talking about the OOM killer behaviour,
which was beaten to death here over the last few days..

> > For both these reasons, I suggest you put your database on another
> > machine. A dual CPU machine is more than enough. Put in a good deal of
> > RAM, around a GB, and two SCSI disks, one for data and another for WAL.
> > If you get RAID for the data, great, but it should suffice otherwise as
> > well.
>
> I think the DB on another machine could be anything from helpful, to
> overkill, to shooting yourself in the foot. Depending on the type of the
> majority of queries and the network speed, this deserves some extra
> thought.

I agree, but with the input provided, I think that remains a viable option.

> > And for that you have got to try FreeBSD. That would give you plenty of
> > idea about the performance differences. (Especially, I love man hier
> > and man tuning on FreeBSD. Nothing on Linux comes anywhere near that.)
>
> It's like comparing a Mazda with VVT-i.

What are they? My guess is that they are cars. Anyway, I drive a tiny
utility bike in a far country like India.. :-)

> Would you expect to find the furniture fabric specs in the main engine
> manual?

Well, I agree they are different, but not that much..:-) And besides, man
tuning is much more helpful w.r.t. tuning a box. I still think it is
relevant, and that was just one example of why FreeBSD is a better server
OS, out of the box, compared to Linux.

No flame wars.. Peace..

Bye
 Shridhar

--
Lieberman's Law:	Everybody lies, but it doesn't matter since nobody listens.
On Tuesday 24 Jun 2003 8:39 am, Michael Mattox wrote:
> I'd like to get some feedback on my setup to see if I can optimize my
> database performance. My setup consists of two separate applications:
>
> The first application connects to websites and records the statistics in
> the database. Websites are monitored every 5 or 10 minutes (depending on
> the client); there are 900 monitors, which comes out to 7,800 monitorings
> per hour.
[snip]
> There is a serious performance constraint here because, unlike a
> webserver, this application cannot slow down. If it slows down, we won't
> be able to monitor our sites at 5 minute intervals, which will make our
> customers unhappy.

Others are discussing the performance/tuning stuff, but can I make one
suggestion?

Don't log your monitoring info directly into the database; log straight to
one or more text files and sync them every few seconds. Rotate the files
once a minute (or whatever seems suitable). Then have a separate process
that reads "old" files and processes them into the database.

The big advantage: you can take the database down for a short period and the
monitoring goes on. Useful for those small maintenance tasks.

--
  Richard Huxton
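A minimal sketch of the catch-up loader Richard describes. The table and
file names here are hypothetical, and it assumes the monitor writes its
rotated files in Postgres' tab-delimited COPY text format:

-- run once per rotated ("old") log file; one COPY is far cheaper
-- than thousands of single-row INSERTs
BEGIN;
COPY statusx FROM '/var/monitor/spool/status.20030624-0905.log';
COMMIT;
-- on success, the loader archives or deletes the file

Note that a server-side COPY FROM 'file' runs as the superuser and reads the
file on the database host; psql's \copy is the unprivileged client-side
alternative.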
> Don't log your monitoring info directly into the database; log straight
> to one or more text files and sync them every few seconds. Rotate the
> files once a minute (or whatever seems suitable). Then have a separate
> process that reads "old" files and processes them into the database.
>
> The big advantage: you can take the database down for a short period and
> the monitoring goes on. Useful for those small maintenance tasks.

This is a good idea, but it'd take a bit of redesign to make it work.
Here's my algorithm now:

- Every 10 seconds I get a list of monitors whose nextdate is due
- I put the id numbers of the monitors into a queue
- A thread from a thread pool (32 active threads) retrieves the monitor
  from the database by its id, updates the nextdate timestamp, executes the
  monitor, and stores the status in the database

So I have two transactions: one to update the monitor's nextdate and another
to update its status. Now that I've written that out, I see a possibility to
streamline the last step: I can wait until I update the status to update the
nextdate as well. That would cut the number of transactions in two. The only
problem is I have to be sure not to add a monitor to the queue while it's
currently executing. This shouldn't be hard; I have a hashtable containing
all the active monitors.

Thanks for the suggestion, I'm definitely going to give this some more
thought.

Michael
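A sketch of the streamlined, single-transaction version Michael describes.
All table and column names below are hypothetical (the real schema is
JDO-generated):

-- one transaction per monitoring instead of two: the nextdate bump and
-- the status writes happen together, after the check completes
BEGIN;
UPDATE monitorx
   SET nextdate = now() + interval '5 minutes',
       statusx  = 'OK'
 WHERE jdoidx = 42;
INSERT INTO statusx (monitorid, checkedat, result)
     VALUES (42, now(), 'OK');
INSERT INTO statusitemx (monitorid, checkedat, item, value)
     VALUES (42, now(), 'response_ms', '153');
COMMIT;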
"Shridhar Daithankar" <shridhar_daithankar@persistent.co.in> writes: >> - Would a RAID setup make the disk faster? Because top rarely shows the >> CPUs above 50%, I suspect maybe the disk is the bottleneck. > Yes it is. You need to move WAL to a different disk. For an update-intensive setup, putting WAL on its own disk is definitely your biggest win. You might then find it rewarding to fool with the wal_sync_method and perhaps to bump up wal_buffers a little. A small number of people have had luck with putting a nonzero commit_delay but I have little faith in that. regards, tom lane
I want to thank everyone for their help and post a status update. I've made
quite a bit of improvement. Here's what I did:

I refactored my algorithm. Instead of updating the timestamp, monitoring the
website, and then updating the status (two transactions), I wait and update
the timestamp and status at the same time (one transaction). This required
using a hashtable of active monitors so that I don't add a monitor to the
queue while it's executing (I check that it's not in the queue and not
executing before adding it). This cut my transactions in half.

I changed the postgresql.conf settings as suggested by several people. I've
attached it to this email; please let me know if you see anything else I can
tweak. top still says I have plenty of ram, so should I increase the buffers
and/or effective_cache even more?

Mem:  1547572K av, 1537212K used,   10360K free,   0K shrd,  107028K buff
Swap: 1044216K av,   14552K used, 1029664K free              1192280K cached

I moved the WAL (pg_xlog directory) to another drive. There are two drives
in the system, so one has the OS, servers, all files, and the WAL, and the
other has nothing but the data. It'd be best to put the WAL on a separate
drive from the OS too, but I don't know if I can get another drive added
just for that due to our limited budget.

I learned that I only need to vacuum tables that are changed frequently. My
app doesn't do any deletes, and only one table changes: the monitor table,
several times a second. So I only need to vacuum that table. Vacuuming the
entire database is slow and unnecessary; if I only do the monitor table, it
takes only a few seconds. Much better than the 35 minutes for the entire
database that it was taking this morning.

Result of all this? Before, a monitor operation (update timestamp, download
webpage, update status) was taking 5-6 seconds each, and up to a minute
during a vacuum. Now it takes less than 1 second. Part of this is because I
can run 8 threads instead of 32 thanks to the other optimizations.

I want to thank everyone for their input. I've heard Postgres is slow and
doesn't scale, but now I see it's really just a matter of learning to
configure it properly, plus trial & error. I do think the documentation
could be enhanced a bit here, and I'm sure there are some users who don't
make this effort and end up switching to another database, which is bad for
Postgres' image. Anyway, I hope my summary can help others who may find this
email in the archives.

Regards,
Michael

> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Michael Mattox
> Sent: Tuesday, June 24, 2003 9:40 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Performance advice
>
> I'd like to get some feedback on my setup to see if I can optimize my
> database performance. [snip]
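The targeted vacuum from the summary above, as plain SQL; monitorx is the
table name shown in the vacuum output earlier in the thread. Run it from
cron every few minutes against just the hot table, keeping the full-database
vacuum as an off-peak job:

VACUUM ANALYZE public.monitorx;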
Michael,

> I changed the postgresql.conf settings as suggested by several people.
> I've attached it to this email; please let me know if you see anything
> else I can tweak. top still says I have plenty of ram, so should I
> increase the buffers and/or effective_cache even more?

Effective cache, yes. Buffers, no. Even if you have RAM available,
increasing buffers beyond an optimal but hard-to-locate point decreases
performance. I'd advise you to start playing with buffers only after you are
done playing with other memory-eating params.

I would suggest, though, increasing max_fsm_relations even more, until your
daily VACUUM FULL does almost no work. This will improve index usage and
speed up queries.

> I moved the WAL (pg_xlog directory) to another drive. There are two
> drives in the system, so one has the OS, servers, all files, and the WAL,
> and the other has nothing but the data. It'd be best to put the WAL on a
> separate drive from the OS too, but I don't know if I can get another
> drive added just for that due to our limited budget.

A high-speed IDE drive might be adequate for WAL, except that Linux has
booting issues with a mix of IDE & SCSI on many motherboards.

> I learned that I only need to vacuum tables that are changed frequently.
> My app doesn't do any deletes, and only one table changes: the monitor
> table, several times a second. So I only need to vacuum that table.
> [snip] If I only do the monitor table, it takes only a few seconds. Much
> better than the 35 minutes for the entire database that it was taking
> this morning.

Increasing max_fsm_relations will also make vacuums more efficient.

> I want to thank everyone for their input. I've heard Postgres is slow and
> doesn't scale, but now I see it's really just a matter of learning to
> configure it properly, plus trial & error. I do think the documentation
> could be enhanced a bit here, and I'm sure there are some users who don't

Absolutely. I'm working on it. Look to Techdocs next week.

--
Josh Berkus
Aglio Database Solutions
San Francisco
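A sketch of free-space-map settings scaled to this kind of workload; the
numbers are illustrative only. Roughly, max_fsm_pages should cover the pages
dirtied between vacuums and max_fsm_relations the number of tables and
indexes:

max_fsm_relations = 4000        # ample headroom over the actual relation count
max_fsm_pages = 100000          # ~6 bytes each, so well under 1MB of shared memory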
> configure it properly, plus trial & error. I do think the documentation
> could be enhanced a bit here, and I'm sure there are some users who don't
> make

Do you have any specific thoughts about documentation? Areas of confusion?
Was it difficult to find the information in question, or was it simply
unavailable?

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc
> > configure it properly, plus trial & error. I do think the documentation
> > could be enhanced a bit here, and I'm sure there are some users who
> > don't make
>
> Do you have any specific thoughts about documentation? Areas of
> confusion? Was it difficult to find the information in question, or was
> it simply unavailable?

I think the biggest area of confusion for me was that the various parameters
are very briefly described and no context is given for their values. For
example, from:
http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=runtime-config.html

MAX_FSM_RELATIONS (integer)
    Sets the maximum number of relations (tables) for which free space will
    be tracked in the shared free-space map. The default is 100. This option
    can only be set at server start.

There's not enough information there to properly tune Postgres. A few people
suggested increasing this, so I set mine to 4000. I have little idea whether
that's too high, too low, or just right.

What would be nice is if these were put into context. Maybe come up with a
matrix of the settings against various server configs. We could come up with
the 5-10 most common server configurations, so a user with 256MB of ram and
a single IDE disk gets a different range from a user with 2 gigs of ram and
a SCSI RAID.

The next thing that really needs improving is the optimization section of
the FAQ (http://www.postgresql.org/docs/faqs/FAQ.html#3.6). This is a very
important section of the documentation and it's pretty empty. One thing that
was suggested to me is to move the WAL directory to another drive; that
could be in this FAQ section. effective_cache isn't mentioned either. It'd
be great to talk about server hardware as well, such as memory, and whether
to put Postgres on a dedicated server or keep it on the same server as the
apps/webapps.

Please don't misunderstand, the Postgres documentation is excellent. Some
improvements to the performance sections would make a huge difference.

Regards,
Michael
[ This was written offline yesterday. Now I see that most of it has already
been covered. I send it anyway ... ]

On Tue, 24 Jun 2003 09:39:32 +0200, "Michael Mattox"
<michael.mattox@verideon.com> wrote:
> Websites are monitored every 5 or 10 minutes (depending on the client);
> there are 900 monitors, which comes out to 7,800 monitorings per hour.

So your server load - at least INSERT, UPDATE, DELETE - is absolutely
predictable. This is good. It enables you to design a cron-driven VACUUM
strategy.

|INFO:  --Relation public.jdo_sequencex--
|INFO:  Pages 28: Changed 1, Empty 0; Tup 1: Vac 5124, Keep 0, UnUsed 0.
              ^^                                 ^^^^
This table could stand more frequent VACUUMs, every 15 minutes or so.

BTW, from the name of this table and from the fact that there is only one
live tuple, I guess that you are using it to keep track of a sequence
number. By using a real sequence you could get what you need with less
contention; and you don't have to VACUUM a sequence.

|INFO:  --Relation public.monitorx--
|INFO:  Removed 170055 tuples in 6036 pages.
|        CPU 0.52s/0.81u sec elapsed 206.26 sec.
|INFO:  Pages 6076: Changed 0, Empty 0; Tup 2057: Vac 170055, Keep 568, UnUsed 356.
|        Total CPU 6.28s/13.23u sec elapsed 486.07 sec.

The Vac : Tup ratio for this table is more than 80. You have to VACUUM this
table more often. How long is "overnight"? Divide this by 80 and use the
result as the interval between
	VACUUM [VERBOSE] [ANALYSE] public.monitorx;
Thus you'd have approximately as many dead tuples as live tuples, and the
table size should not grow far beyond 150 pages (after an initial VACUUM
FULL, of course). Then a VACUUM of this table should take no more than 20
seconds.

Caveat: Frequent ANALYSEs might trigger the need to VACUUM
pg_catalog.pg_statistic.

> The monitor table has columns "nextdate" and "status" which are updated
> with every monitoring, [...] updating the "nextdate" before the
> monitoring and inserting the status and status item records after.

Do you mean updating monitor.nextdate before the monitoring and
monitor.status after the monitoring? Can you combine these two UPDATEs into
one?

> During the vacuum my application slows down quite a bit

Yes, because VACUUM does lots of I/O.

> and afterwards it slowly speeds back up.

... because the working set is slowly fetched into the cache after having
been flushed out by VACUUM.

Your five largest relations are monitorstatus_statusitemsx,
monitorstatusitemlistd8ea58a5x, monitorstatusitemlistx, monitorstatusitemx,
and monitorstatusx. The heap relations alone (without indexes) account for
468701 pages, almost 4GB. VACUUMing these five relations takes 23 minutes
for freeing less than 200 out of 6 million tuples for each relation. This
isn't worth it. Unless always the same tuples are updated over and over,
scheduling a VACUUM for every half million deletions/updates should be
sufficient.

> shared_buffers = 3072           # min max_connections*2 or 16, 8KB each
> sort_mem = 8192                 # min 64, size in KB
> vacuum_mem = 24576              # min 1024, size in KB
>
> The rest are left uncommented (using the defaults).

As has already been said, don't forget effective_cache_size. I'm not so sure
about random_page_cost. Try to find out which queries are too slow. EXPLAIN
ANALYSE is your friend.

One more thing: I see 2 or 3 UPDATEs and 5 INSERTs per monitoring. Are these
changes wrapped into a single transaction?

Servus
 Manfred
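A worked instance of Manfred's rule of thumb, assuming "overnight" means
roughly a 12-hour run: with a Vac : Tup ratio of about 80,

    interval = 12 hours / 80 = 9 minutes

so a cron job issuing VACUUM ANALYSE public.monitorx every 9-10 minutes
would keep the dead tuples roughly on par with the 2057 live ones.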
PM4JI, but from my point of view this has been a most useful thread. I too
have found it difficult to find the right bit of documentation on
performance. I *think* what is needed is some sort of a route map: Poor
Performance - start here. Then some questions, with sections of the
documentation you should go to.

Hilary

At 13:00 24/06/2003 -0400, you wrote:
>> configure it properly, plus trial & error. I do think the documentation
>> could be enhanced a bit here, and I'm sure there are some users who
>> don't make
>
>Do you have any specific thoughts about documentation? Areas of confusion?
>Was it difficult to find the information in question, or was it simply
>unavailable?
>
>--
>Rod Taylor <rbt@rbt.ca>
>
>PGP Key: http://www.rbt.ca/rbtpub.asc

Hilary Forbes
-------------
DMR Computer Limited:   http://www.dmr.co.uk/
Direct line:  01689 889950
Switchboard:  (44) 1689 860000  Fax: (44) 1689 860330
E-mail:  hforbes@dmr.co.uk

**********************************************************
I agree a route map would really help.

> -----Original Message-----
> From: Hilary Forbes [SMTP:hforbes@dmr.co.uk]
> Sent: 25 June 2003 10:12
> To: Rod Taylor
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Performance advice
>
> PM4JI, but from my point of view this has been a most useful thread. I
> too have found it difficult to find the right bit of documentation on
> performance. I *think* what is needed is some sort of a route map: Poor
> Performance - start here. Then some questions, with sections of the
> documentation you should go to.
> [snip]
> [ This was written offline yesterday. Now I see that most of it has
> already been covered. I send it anyway ... ]

It's still great advice, with slightly different explanations - very useful.

> |INFO:  --Relation public.jdo_sequencex--
> |INFO:  Pages 28: Changed 1, Empty 0; Tup 1: Vac 5124, Keep 0, UnUsed 0.
>               ^^                                 ^^^^
> This table could stand more frequent VACUUMs, every 15 minutes or so.

Can you explain what the "Vac" is and how you knew that it should be
vacuumed more often? I'd like to understand how to interpret my vacuum log.
I looked in the vacuum section of the docs and there's nothing about the
vacuum output <hint>.

> BTW, from the name of this table and from the fact that there is only one
> live tuple, I guess that you are using it to keep track of a sequence
> number. By using a real sequence you could get what you need with less
> contention; and you don't have to VACUUM a sequence.

I'm using Java Data Objects (JDO), which is an O/R mapper. It generated the
schema from my object model, and by default it used a table for a sequence.
I just finished configuring it to use a real Postgres sequence. The way they
have it designed, it opens and closes a connection each time it retrieves a
sequence value. Would I get a performance increase if I modified their code
to retrieve multiple sequence numbers in one connection? For example, I
could have it grab 50 at a time, which would replace 50 connections with 1.

> > The monitor table has columns "nextdate" and "status" which are
> > updated with every monitoring, [...] updating the "nextdate" before
> > the monitoring and inserting the status and status item records after.
>
> Do you mean updating monitor.nextdate before the monitoring and
> monitor.status after the monitoring? Can you combine these two UPDATEs
> into one?

I was doing this to prevent the monitor from being added to the queue while
it was executing. But I fixed this, effectively reducing my transactions by
half.

> As has already been said, don't forget effective_cache_size. I'm not so
> sure about random_page_cost. Try to find out which queries are too slow.
> EXPLAIN ANALYSE is your friend.
>
> One more thing: I see 2 or 3 UPDATEs and 5 INSERTs per monitoring. Are
> these changes wrapped into a single transaction?

These were in 2 transactions, but now I have them in a single transaction.

Thanks,
Michael
On 25 Jun 2003 at 11:47, Michael Mattox wrote:
> I'm using Java Data Objects (JDO), which is an O/R mapper. It generated
> the schema from my object model, and by default it used a table for a
> sequence. I just finished configuring it to use a real Postgres sequence.
> The way they have it designed, it opens and closes a connection each time
> it retrieves a sequence value. Would I get a performance increase if I
> modified their code to retrieve multiple sequence numbers in one
> connection? For example, I could have it grab 50 at a time, which would
> replace 50 connections with 1.

You need to use the sequence functions like setval, currval, and nextval.
Maybe you can write your own wrapper function to "grab" as many sequence
values as you want, but it would be good if you design/maintain locking
around it as appropriate.

See
http://developer.postgresql.org/docs/postgres/sql-createsequence.html
http://developer.postgresql.org/docs/postgres/functions-sequence.html

HTH

Bye
 Shridhar

--
Velilind's Laws of Experimentation:
	(1) If reproducibility may be a problem, conduct the test only once.
	(2) If a straight line fit is required, obtain only two data points.
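One way to grab values in blocks without writing any locking yourself,
sketched with a hypothetical sequence name: give the sequence a large
INCREMENT and let the client hand out the sub-range locally.

-- each nextval() reserves a block of 50 ids:
-- if nextval() returns n, this client owns n .. n+49
CREATE SEQUENCE monitor_id_seq INCREMENT 50;
SELECT nextval('monitor_id_seq');

A CACHE 50 clause achieves a similar per-backend effect, at the cost of gaps
in the numbering when connections close.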
On Wed, 2003-06-25 at 04:12, Hilary Forbes wrote:
> PM4JI, but from my point of view this has been a most useful thread. I
> too have found it difficult to find the right bit of documentation on
> performance. I *think* what is needed is some sort of a route map: Poor
> Performance - start here. Then some questions, with sections of the
> documentation you should go to.

Do you have any examples where this has worked well (for reference)? The
only real example I have is MS's help, which never gave me the right answer.

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc
> I think the biggest area of confusion for me was that the various
> parameters are very briefly described and no context is given for their
> values.
>
> improvements to the performance sections of the documentation would make
> a huge difference.

Agreed.. Josh has done some work recently re-arranging things to make them
easier to find, but the content hasn't changed much.

Thanks for your thoughts!

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc
I agree that a "directed graph"-like performance map would be difficult to
write or to understand.

What I think would be ideal (helpful/feasible) is some kind of documentation
of the algorithms involved in the planner/optimizer, along with some
pointers to postgresql.conf parameters where applicable.

This way we will know
- Why something is happening
- If it is the best plan
- What tuning is possible

On 25 Jun 2003, Rod Taylor wrote:
> > I think the biggest area of confusion for me was that the various
> > parameters are very briefly described and no context is given for
> > their values.
>
> > improvements to the performance sections of the documentation would
> > make a huge difference.
>
> Agreed.. Josh has done some work recently re-arranging things to make
> them easier to find, but the content hasn't changed much.
>
> Thanks for your thoughts!

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill at matrix dot gatewaynet dot com
        mantzios at softlab dot ece dot ntua dot gr
On 25/06/2003 10:47 Michael Mattox wrote:
> I'm using Java Data Objects (JDO), which is an O/R mapper. It generated
> the schema from my object model, and by default it used a table for a
> sequence. [snip] Would I get a performance increase if I modified their
> code to retrieve multiple sequence numbers in one connection? For
> example, I could have it grab 50 at a time, which would replace 50
> connections with 1.

For best performance, you really should consider using a connection pool, as
it removes the overhead of creating and closing connections.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+
On Wed, 25 Jun 2003, Achilleus Mantzios wrote:
> What I think would be ideal (helpful/feasible) is some kind of
> documentation of the algorithms involved in the planner/optimizer, along
> with some pointers to postgresql.conf parameters where applicable.
>
> This way we will know
> - Why something is happening
> - If it is the best plan
> - What tuning is possible

I agree. In combination with this, I would find case studies very useful.
Have the documentation team solicit a few volunteers with different setups
(w/r/t db size, db traffic, and hardware). Perhaps these folks are running
with the default postgresql.conf or have done little tuning. Via the
performance list, work through the tuning process with each volunteer:

1. Gathering information about your setup that affects tuning.
2. Measuring initial performance as a baseline.
3. Making initial adjustments based on your setup.
4. Identifying poorly-written SQL.
5. Identifying poorly-indexed tables.
6. Measuring the effects of each adjustment, and tuning accordingly.

(Note: I am certainly no performance expert -- these steps are meant to be
examples only.)

Solicit a list member to monitor the discussion and document each case study
in a consistent fashion. Run completed case studies by the performance and
docs lists for review.

I would be happy to join the docs team to work on such a project.

michael

p.s. Should this discussion be moved to pgsql-docs?
On Wed, 25 Jun 2003 11:47:48 +0200, "Michael Mattox"
<michael.mattox@verideon.com> wrote:
>> |INFO:  --Relation public.jdo_sequencex--
>> |INFO:  Pages 28: Changed 1, Empty 0; Tup 1: Vac 5124, Keep 0, UnUsed 0.
>>               ^^                                 ^^^^
>> This table could stand more frequent VACUUMs, every 15 minutes or so.
>
> Can you explain what the "Vac" is

That's a long story. Where shall I start? Search for MVCC in the docs and in
the list archives. So you know that every DELETE and every UPDATE leaves
behind old versions of tuples. The space occupied by these cannot be used
immediately. VACUUM is responsible for finding dead tuples which are so old
that there is no active transaction that could be interested in their
contents, and for reclaiming the space. The number of such tuples is
reported as "Vac".

> and how you knew that it should be vacuumed more often?

jdo_sequencex stores (5000 old versions and 1 active version of) a single
row in 28 pages. Depending on when you last ANALYSEd it and depending on the
SQL statement, the planner might think that a sequential scan is the most
efficient way to access this single row. A seq scan has to read 28 pages
instead of a single page. Well, probably all 28 pages are in the OS cache or
even in PG's shared buffers, but 27 pages are just wasted and push out pages
you could make better use of. And processing those 28 pages is not free in
CPU terms either. If you VACUUM frequently enough, this relation never grows
beyond one page.

> I'm using Java Data Objects (JDO), which is an O/R mapper. It generated
> the schema from my object model, and by default it used a table for a
> sequence. [snip] Would I get a performance increase if I modified their
> code to retrieve multiple sequence numbers in one connection? For
> example, I could have it grab 50 at a time, which would replace 50
> connections with 1.

Better yet, you modify the code to use the normal access functions for
sequences.

Servus
 Manfred
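What "the normal access functions" look like in plain SQL, with a
hypothetical sequence replacing the JDO-generated single-row table:

-- one-time setup; START merely has to clear the highest id already issued
CREATE SEQUENCE jdo_seq START 10000;

-- per-id call: no UPDATE, hence no dead tuple and nothing to VACUUM
SELECT nextval('jdo_seq');

-- currval('jdo_seq') re-reads the value last issued to this session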