Thread: Help: how to tune up my database
Hi,
I am a newbie here and just starting to use PostgreSQL. My problem is how to tune up my server, because it is too slow.
We just ported from DBF to PostgreSQL.
These are my PC's specs: P4, 512 MB RAM, Linux 9.
Because I am working in a statistical organization, we have a very large data volume.
These are my data:
Table 1 with 60 million rows but only 10 fields
Table 2 with 30 million rows with 15 fields
Table 3 with 30 million rows with 10 fields
I will only use this server for querying. I have already read and applied the advice from the articles in the archives, but the performance is still not good.
I am planning to add another 512 MB of RAM. Another question: how do I calculate the right shared_buffers size?
Thanks a lot and hoping for your kind answers ..
Michael Puncia
Philippines
On Fri, 7 May 2004, Michael Ryan S. Puncia wrote:

> Hi,
>
> I am a newbie here and just starting to use PostgreSQL. My problem is
> how to tune up my server, because it is too slow.

First, read this:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

> These are my PC's specs: P4, 512 MB RAM, Linux 9.

Get more RAM. Hard drives: what interface, how many, RAID? For a mostly-read database, IDE drives are pretty good. Having multiple drives in a RAID-5 or RAID 1+0 works well on a mostly-read database too. Keep the stripe size small when setting up a RAID array for a database.

> Because I am working in a statistical organization, we have a very
> large data volume.
>
> These are my data:
>
> Table 1 with 60 million rows but only 10 fields
> Table 2 with 30 million rows with 15 fields
> Table 3 with 30 million rows with 10 fields

That's not really that big, but it's big enough that you have to make sure your server is tuned properly.

> I will only use this server for querying. I have already read and
> applied the advice from the articles in the archives, but the
> performance is still not good.
>
> I am planning to add another 512 MB of RAM. Another question: how do I
> calculate the right shared_buffers size?

I'm assuming you've recently run VACUUM FULL and ANALYZE on your database...

shared_buffers should probably be between 1000 and 10000 on about 98% of all installations. Setting it higher than 25% of memory is usually a bad idea. Since buffers are in 8k blocks (unless you compiled with a custom block size; you'd know if you did, it's not something you can accidentally do by leaning on the wrong switch), you probably want about 10000 blocks or so to start, which will give you about 80 megs of shared buffer. PostgreSQL doesn't really cache as well as the kernel, so it's better to leave more memory available for the kernel cache than you allocate to the buffer cache. On a machine with only 512 megs, I'm guessing you'll get about 128 to 200 megs of kernel cache if you're only running PostgreSQL and you have it set to 10000 buffers.

The important things to check / set are things like effective_cache_size. It too is measured in 8k blocks, and reflects the approximate amount of kernel cache being dedicated to PostgreSQL. Assuming a single-service, PostgreSQL-only box, that will be the number that a server that's been up for a while shows under top, like so:

    9:50am  up 12:16,  4 users,  load average: 0.00, 0.00, 0.00
    104 processes: 102 sleeping, 2 running, 0 zombie, 0 stopped
    CPU states:  0.7% user,  0.3% system,  0.0% nice,  1.7% idle
    Mem:  512924K av, 499248K used,  13676K free, 0K shrd, 54856K buff
    Swap: 2048248K av,   5860K used, 2042388K free       229572K cached

The 229572K cached entry shows about 230 megs; divided by 8192 we get about 28000.

sort_mem might do with a small bump, especially if you're only handling a few connections at a time. Be careful: it's per sort, and measured in megs, so it's easy for folks to set it too high and make their machine start flushing too much kernel cache, which will slow down the other backends that have to go to disk for data. A good starting point for testing is anywhere from 8192 to 32768. 32768 is 32 megs, which can starve a machine as small as yours if there are a couple of queries each running a couple of sorts on large sets at the same time.

Lastly, using explain analyze <your query here> you can see if PostgreSQL is making a bad plan choice. Compare estimated rows to actual rows. Look for things like nested loops being run on what the planner thinks will be 80 rows but is, in fact, 8000 rows.
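Putting the numbers from that message in one place, here is a sketch of the relevant postgresql.conf lines, assuming a dedicated, read-mostly 512 MB box like the one described. The values are illustrative starting points taken from the reasoning above, not measurements from Michael's server:

    # postgresql.conf sketch (PostgreSQL 7.x-era settings)
    # shared_buffers and effective_cache_size are counted in 8 kB blocks.

    # 10000 blocks * 8 kB = ~80 MB of shared buffers, under 25% of RAM.
    shared_buffers = 10000

    # top showed 229572K of kernel cache; 229572K / 8192 bytes per block
    # comes to roughly 28000 blocks.
    effective_cache_size = 28000

    # Per-sort memory. (As clarified later in the thread, this setting is
    # in kB, so 8192 here means 8 MB per sort, not 8 GB or 8192 blocks.)
    sort_mem = 8192

Note that shared_buffers only takes effect on a restart, while effective_cache_size and sort_mem can also be changed per session with SET, as the next message explains.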
You can change random_page_cost to change the tendency of the server to favor seq scans over index scans. Lower = greater tendency towards index scans. The default is 4, but most production servers with enough memory to cache most of their data will run well on a setting of 1.2 to 1.4. My dual 2800 with 2 gigs of RAM runs quite well at 1.3 to 1.4.

Besides random_page_cost, you can also turn off individual options to the planner with the following settings:

enable_hashagg
enable_hashjoin
enable_indexscan
enable_mergejoin
enable_nestloop
enable_seqscan
enable_sort
enable_tidscan

They are all on by default, and for the most part shouldn't be turned off as a default. But for an individual session, to figure out whether the query planner is making the right plan, you can set them to off and see if another plan works better. So, if you've got a nested loop running over 80000 rows that the planner thought was going to be 80 rows, you can force it to stop using the nested loop for your session with:

    set enable_nestloop = off;

and use explain analyze to see if it runs faster.

You can set effective_cache_size and sort_mem on the fly for a single connection, or set them in postgresql.conf and restart or reload to change the default. shared_buffers is set at PostgreSQL startup and can't be changed without restarting the database; reloading won't do it.
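As a concrete version of that per-session experiment, here is a hypothetical psql session. The tables and query are made-up stand-ins, not from Michael's schema:

    -- Look at the plan, and compare estimated rows to actual rows.
    EXPLAIN ANALYZE
    SELECT h.region_id, count(*)
      FROM households h
      JOIN regions r ON r.id = h.region_id
     GROUP BY h.region_id;

    -- If a nested loop was chosen on a badly underestimated row count,
    -- disable nested loops for this session only and re-run:
    SET enable_nestloop = off;

    EXPLAIN ANALYZE
    SELECT h.region_id, count(*)
      FROM households h
      JOIN regions r ON r.id = h.region_id
     GROUP BY h.region_id;

    -- The cache and cost settings can be tried per session the same way:
    SET effective_cache_size = 28000;   -- in 8 kB blocks
    SET random_page_cost = 1.4;

    -- Turn the planner option back on when done experimenting:
    SET enable_nestloop = on;

If the second plan is faster, the longer-term fix is usually better statistics (ANALYZE) or adjusted cost settings rather than leaving the planner option off globally.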
scott.marlowe wrote:

> sort_mem might do with a small bump, especially if you're only handling
> a few connections at a time. Be careful: it's per sort, and measured in
> megs, so it's easy for folks to set it too high and make their machine
> start flushing too much kernel cache, which will slow down the other
> backends that have to go to disk for data.

<snip>

At http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html (under "Memory"), it says that sort_mem is set in KB. Is this document wrong (or outdated)?
Sorry about that, I meant kbytes, not megs. My point being it's NOT measured in 8k blocks, like a lot of other settings. Sorry for the mixup.

On Fri, 7 May 2004, Bricklen wrote:

> scott.marlowe wrote:
> > sort_mem might do with a small bump, especially if you're only
> > handling a few connections at a time. Be careful: it's per sort, and
> > measured in megs, so it's easy for folks to set it too high and make
> > their machine start flushing too much kernel cache, which will slow
> > down the other backends that have to go to disk for data.
> <snip>
> At http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
> (under "Memory"), it says that sort_mem is set in KB. Is this document
> wrong (or outdated)?
scott.marlowe wrote:

> Sorry about that, I meant kbytes, not megs. My point being it's NOT
> measured in 8k blocks, like a lot of other settings. Sorry for the
> mixup.

No worries, I just wanted to sort that out for my own benefit, and for anyone else who may not have caught that.
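To put the corrected units in one place, a small worked example with illustrative numbers:

    -- sort_mem is measured in kB (not MB, and not 8 kB blocks like
    -- shared_buffers and effective_cache_size). It was renamed work_mem
    -- in PostgreSQL 8.0.
    SET sort_mem = 32768;   -- 32768 kB = 32 MB per sort operation

    -- The limit applies per sort, not per connection: for example, 4
    -- concurrent queries each running 2 large sorts at 32 MB could use
    -- up to 4 * 2 * 32 MB = 256 MB, half the RAM on a 512 MB box.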