Thread: Beginner optimization questions, esp. regarding Tsearch2 configuration
I'm trying to optimize a resume search engine that is using Tsearch2 indexes. It's running on a dual-Opteron 165 system with 4GB of RAM and a RAID1 3Gb/sec SATA array. Each text entry is about 2-3K of text, and there are about 23,000 rows in the search table, with a goal of reaching about 100,000 rows eventually. I'm running Ubuntu 6.06 amd64 server edition. The RAID array is a software-based Linux array with LVM on top of it, and the file system for the database mount point is XFS.

The only optimization I've done so far is to put the following in /etc/sysctl.conf:

kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536

And in postgresql.conf I set the following parameters:

shared_buffers = 131072
work_mem = 65536
max_stack_depth = 4096
max_fsm_pages = 40000
max_fsm_relations = 2000

These probably aren't ideal, but I was hoping they would perform a little better than the defaults. I got the following results from a pgbench script I picked up off the web:

CHECKPOINT
===== sync ======
10 concurrent users...
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 10
number of transactions per client: 100
number of transactions actually processed: 1000/1000
tps = 632.146016 (including connections establishing)
tps = 710.474526 (excluding connections establishing)

Once again, I don't know if these results are good or not for my hardware. I have a couple of questions:

- Does anyone have some good advice for optimizing Postgres for Tsearch2 queries?

- I noticed that there are six different postmaster daemons running. Only one of them is taking up a lot of RAM (1076m virtual and 584m resident). The second one is using 181m resident while the others are less than 20m each. Is it normal to have multiple postmaster processes? Even the biggest process doesn't seem to be using near as much RAM as I have on this machine. Is that bad? What percentage of my physical memory should I expect Postgres to use for itself? How can I encourage it to cache more query results in memory?

Thanks in advance for your time.

Carl Youngblood
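P.S. For anyone not familiar with tsearch2: my setup follows the standard contrib/tsearch2 pattern, roughly the sketch below. The table and column names are simplified placeholders, not my real schema, and this assumes contrib/tsearch2 is already installed in the database.

-- placeholder table: resume text in "body", tsvector kept in "idxfti"
CREATE TABLE resumes (
    id     serial PRIMARY KEY,
    body   text,
    idxfti tsvector
);

-- build the tsvector column and a GiST index over it
UPDATE resumes SET idxfti = to_tsvector('default', body);
CREATE INDEX resumes_idxfti_idx ON resumes USING gist (idxfti);

-- keep idxfti current using the trigger function shipped with tsearch2
CREATE TRIGGER resumes_tsvectorupdate
    BEFORE INSERT OR UPDATE ON resumes
    FOR EACH ROW EXECUTE PROCEDURE tsearch2(idxfti, body);

-- a typical search, ranked by relevance
SELECT id
  FROM resumes
 WHERE idxfti @@ to_tsquery('default', 'java & postgresql')
 ORDER BY rank(idxfti, to_tsquery('default', 'java & postgresql')) DESC
 LIMIT 50;

-- refresh planner statistics afterwards
VACUUM ANALYZE resumes;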
Carl Youngblood wrote:
> - I noticed that there are six different postmaster daemons running.
> Only one of them is taking up a lot of RAM (1076m virtual and 584m
> resident). The second one is using 181m resident while the others are
> less than 20m each. Is it normal to have multiple postmaster
> processes?

You should have one master backend process and one per connection. PG is a classic multi-process designed server.

> Even the biggest process doesn't seem to be using near as
> much RAM as I have on this machine. Is that bad? What percentage of
> my physical memory should I expect postgres to use for itself? How
> can I encourage it to cache more query results in memory?

OK - one of the key things with PostgreSQL is that it relies on the O.S. to cache its disk files. So, allocating too much memory to PG can be counterproductive.

From your figures, you're allocating about 64MB to work_mem, which is per sort. So, a complex query could use several times that amount. If you don't have many concurrent queries that might be what you want.

Also, you've allocated 1GB to your shared_buffers which is more than I'd use as a starting point.

You've only mentioned one main table with 100,000 rows, so presumably you're going to cache the entire DB in RAM. So, you'll want to increase effective_cache_size and reduce random_page_cost.

--
Richard Huxton
Archonet Ltd
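P.S. To make that concrete, the change would look something like this in postgresql.conf. The numbers below are purely illustrative placeholders to show which knobs are involved, not tuned recommendations, and they assume 8.1-style units (8kB buffers for shared_buffers and effective_cache_size):

shared_buffers = 65536          # 512MB as a more modest starting point than 1GB
effective_cache_size = 350000   # ~2.7GB - roughly what the OS can cache on a 4GB box
random_page_cost = 2.0          # down from the default 4.0 for a mostly-cached database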
Hi, Richard and Carl,

Richard Huxton wrote:
> Carl Youngblood wrote:
>> - I noticed that there are six different postmaster daemons running.
>> Only one of them is taking up a lot of RAM (1076m virtual and 584m
>> resident). The second one is using 181m resident while the others are
>> less than 20m each. Is it normal to have multiple postmaster
>> processes?
>
> You should have one master backend process and one per connection. PG is
> a classic multi-process designed server.

There may be some additional background processes, such as the background writer, stats collector or autovacuum, depending on your version and configuration.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
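P.S. If the server runs as the "postgres" system user, something like

    ps auxww | grep ^postgres

should show which process is which. On 8.1 the extra processes typically identify themselves as "writer process", "stats buffer process" and "stats collector process" in the command column, alongside one "postgres: <user> <database> ..." line per client connection.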
Thanks a lot for the advice Richard. I will try those things out and report back to the list.

Carl

On 8/10/06, Richard Huxton <dev@archonet.com> wrote:
> From your figures, you're allocating about 64MB to work_mem, which is
> per sort. So, a complex query could use several times that amount. If
> you don't have many concurrent queries that might be what you want.
>
> Also, you've allocated 1GB to your shared_buffers which is more than I'd
> use as a starting point.
>
> You've only mentioned one main table with 100,000 rows, so presumably
> you're going to cache the entire DB in RAM. So, you'll want to increase
> effective_cache_size and reduce random_page_cost.
On Thu, Aug 10, 2006 at 10:23:55AM +0100, Richard Huxton wrote:
> Carl Youngblood wrote:
> > - I noticed that there are six different postmaster daemons running.
> > Only one of them is taking up a lot of RAM (1076m virtual and 584m
> > resident). The second one is using 181m resident while the others are
> > less than 20m each. Is it normal to have multiple postmaster
> > processes?
>
> You should have one master backend process and one per connection. PG is
> a classic multi-process designed server.
>
> > Even the biggest process doesn't seem to be using near as
> > much RAM as I have on this machine. Is that bad? What percentage of
> > my physical memory should I expect postgres to use for itself? How
> > can I encourage it to cache more query results in memory?
>
> OK - one of the key things with PostgreSQL is that it relies on the O.S.
> to cache its disk files. So, allocating too much memory to PG can be
> counterproductive.
>
> From your figures, you're allocating about 64MB to work_mem, which is
> per sort. So, a complex query could use several times that amount. If
> you don't have many concurrent queries that might be what you want.
>
> Also, you've allocated 1GB to your shared_buffers which is more than I'd
> use as a starting point.

See the recent thread about how old rules of thumb for shared_buffers are now completely bunk. With 4G of memory, setting shared_buffers to 2G could easily be reasonable. The OP really needs to test several different values with their actual workload and see what works best.

> You've only mentioned one main table with 100,000 rows, so presumably
> you're going to cache the entire DB in RAM. So, you'll want to increase
> effective_cache_size and reduce random_page_cost.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
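P.S. Even a crude loop over a few candidate values will tell you something, though your real tsearch2 queries are a much better benchmark than pgbench's TPC-B-ish workload. Something along these lines (the database name is just an example):

    createdb pgbench
    pgbench -i -s 10 pgbench      # initialize once, scaling factor 10 as in your earlier run
    # then for each candidate shared_buffers value:
    #   edit postgresql.conf, restart the server, and run
    pgbench -c 10 -t 1000 pgbench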
I tried setting it to 2GB and postgres wouldn't start. Didn't investigate in much greater detail as to why it wouldn't start, but after switching it back to 1GB it started fine.

On 8/15/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> See the recent thread about how old rules of thumb for shared_buffers
> are now completely bunk. With 4G of memory, setting shared_buffers to 2G
> could easily be reasonable. The OP really needs to test several
> different values with their actual workload and see what works best.
By the way, can you please post a link to that thread?

On 8/15/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> See the recent thread about how old rules of thumb for shared_buffers
> are now completely bunk. With 4G of memory, setting shared_buffers to 2G
> could easily be reasonable. The OP really needs to test several
> different values with their actual workload and see what works best.
On Tue, Aug 15, 2006 at 12:47:54PM -0600, Carl Youngblood wrote:
> I tried setting it to 2GB and postgres wouldn't start. Didn't
> investigate in much greater detail as to why it wouldn't start, but
> after switching it back to 1GB it started fine.

Most likely because you didn't set the kernel's shared memory settings high enough.

To answer your other question:
http://archives.postgresql.org/pgsql-performance/2006-08/msg00095.php

> On 8/15/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> > See the recent thread about how old rules of thumb for shared_buffers
> > are now completely bunk. With 4G of memory, setting shared_buffers to 2G
> > could easily be reasonable. The OP really needs to test several
> > different values with their actual workload and see what works best.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
On Tue, 15 Aug 2006 12:47:54 -0600
"Carl Youngblood" <carl@youngbloods.org> wrote:

> I tried setting it to 2GB and postgres wouldn't start. Didn't
> investigate in much greater detail as to why it wouldn't start, but
> after switching it back to 1GB it started fine.
>
> On 8/15/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> > See the recent thread about how old rules of thumb for
> > shared_buffers are now completely bunk. With 4G of memory, setting
> > shared_buffers to 2G could easily be reasonable. The OP really
> > needs to test several different values with their actual workload
> > and see what works best.

Sounds like you need to increase your kernel's maximum amount of shared memory. This is typically why an increase in shared_buffers causes PostgreSQL not to start. Check out this page in the docs for more information:

http://www.postgresql.org/docs/8.1/static/kernel-resources.html

---------------------------------
  Frank Wiles <frank@wiles.org>
  http://www.wiles.org
---------------------------------
The relevant portion of my sysctl.conf file looks like this:

kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536

I understood it was a good idea to set shmmax to half of available memory (2GB in this case). I assume that I need to set shared_buffers slightly lower than 2GB for postgresql to start successfully.

Carl

On 8/15/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> On Tue, Aug 15, 2006 at 12:47:54PM -0600, Carl Youngblood wrote:
> > I tried setting it to 2GB and postgres wouldn't start. Didn't
> > investigate in much greater detail as to why it wouldn't start, but
> > after switching it back to 1GB it started fine.
>
> Most likely because you didn't set the kernel's shared memory settings
> high enough.
On Wed, Aug 16, 2006 at 09:34:24AM -0600, Carl Youngblood wrote:
> The relevant portion of my sysctl.conf file looks like this:
>
> kernel.shmall = 2097152
> kernel.shmmax = 2147483648
> kernel.shmmni = 4096
> kernel.sem = 250 32000 100 128
> fs.file-max = 65536
>
> I understood it was a good idea to set shmmax to half of available
> memory (2GB in this case). I assume that I need to set shared_buffers

I don't see any reason to do that, so long as you have control over what's being run on the system. Just set it to 3000000000 or so.

> slightly lower than 2GB for postgresql to start successfully.
>
> Carl
>
> On 8/15/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> > On Tue, Aug 15, 2006 at 12:47:54PM -0600, Carl Youngblood wrote:
> > > I tried setting it to 2GB and postgres wouldn't start. Didn't
> > > investigate in much greater detail as to why it wouldn't start, but
> > > after switching it back to 1GB it started fine.
> >
> > Most likely because you didn't set the kernel's shared memory settings
> > high enough.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
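P.S. Concretely, that just means something like this in /etc/sysctl.conf (the numbers are an example, not a tuned recommendation):

    kernel.shmmax = 3000000000    # largest single shared memory segment, in bytes
    kernel.shmall = 2097152       # total shared memory, in 4kB pages - yours is already ample

followed by "sysctl -p" to apply it without a reboot. After that a shared_buffers setting of around 2GB (262144 8kB buffers on 8.1) should be able to start.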